Abstract: Malware, or malicious software, infects target computers, injecting malicious instructions and compromising system integrity. Malware’s construction allows it to integrate malicious actions into legitimate software. Such malicious actions may let attackers steal passwords or damage sensitive data. Malware may be used on an individual level against users or on an organizational level against institutions or businesses, as exemplified by the ransomware attack against the University of California, San Francisco. While the specific infection mechanism varies greatly depending on the malware type, malware as a whole uses advanced software techniques to stay hidden from detection software. Simple methods like encryption hide malware from the traditional detection methods used by most anti-malware software. More advanced techniques like polymorphism and metamorphism allow malware to rewrite parts of itself to increase obfuscation, much like a biological virus. The ever-increasing complexity of malware heightens the importance of developing countermeasures to the growing problem. Users must also maintain vigilance and follow good computing practices to counter malware’s development.
If you feel worthless, last place in life’s rat race, if you feel fortune has cast you aside, I want you to know: You are valuable. Even with not a cent to your name, you have worth. Scans of your driver’s license, to start, might go for a few bucks on the dark web. Your social media logins fetch about $25. Some of the better money is in your bank account: not in your account balance, but in the credentials. They’re worth closer to $60. Even if you’re not aware of this value, others certainly are, and they’ll employ every technique to capture it [1].
Malware is one such technique, carefully crafted and refined to separate you from your value. Short for malicious software, malware is best defined as “software which intentionally executes malicious payloads on victim machines” [2]. Malware isn’t just computer code thrown together; it’s crafted, engineered software. Malware authors apply advanced software engineering techniques to secretly execute malicious payloads, all while eluding your antivirus’ watchful eye. Code obfuscation, polymorphic structure, encryption—these are all tools in the malware engineer’s toolbox to bypass security systems and exploit you. To explore how this occurs, let’s understand the basic mechanics of a common malware type.
Basic Mechanisms of a Virus
In the early days of malware, malicious software often took the form of viruses. Broadly speaking, viruses are self-replicating pieces of code [3]. They usually attach themselves to executables, the files that run as programs on a computer (on Windows, typically ending in “.exe”) [4]. But a computer typically doesn’t come with a virus preloaded from the factory: something the user did likely let the virus in. Perhaps the user plugged a malware-loaded flash drive into their PC. Or perhaps the user was enticed by a fake download button, as seen in Figure 1. Less reputable file-sharing websites are often inundated with fake download links that, when clicked, may download a virus or other form of malware onto your computer. But supposing the initial damage has been done and a user has acquired malicious software, let’s look at how the software actually harms the user.
Figure 1, an image of a fake download button
Executables, the type of file with which viruses typically associate, are essentially sequences of instructions for a computer to follow. Viruses work by modifying this sequence of instructions to execute malicious code. To illustrate how a virus might work, suppose we had the set of instructions to create a peanut butter and jelly sandwich. These instructions might look like those listed in Figure 2, on the left. However, we can also see a set of infected instructions on the right. These instructions represent our virus-infected executable. For the infected instructions, we’ve modified the first instruction, or entry point, to jump to our malicious instruction. The malicious line instructs us to pay Ben Wiencko a large sum of money, clearly not the intention of the PB&J maker. Now, you might notice if a malicious actor has penned in their own demands on your favorite PB&J recipe, but computers don’t understand the instructions they execute in the way that humans do. They execute what they’ve been told to. Following the infected instructions like a computer, we would pay Ben Wiencko his $1,000 before even starting our sandwich. After executing the malicious instruction, we return to executing our original sandwich-making instructions. Computer viruses do the same thing with machine code to trick a computer into executing their payload. Through the technique illustrated here, a virus can execute its malicious code and then hand control back to the original code, so the program still appears to run normally and the virus stays hidden from the user [4].
Uninfected Instructions | Infected Instructions |
1. On one slice of bread, spread peanut butter evenly over the bread. | 1. Jump to instruction 5 (Injected instruction) |
2. On the other slice of bread, spread the jelly evenly over the bread. | 2. On the other slice of bread, spread the jelly evenly over the bread. |
3. Put the two slices of bread together with the peanut butter and jelly facing in. | 3. Put the two slices of bread together with the peanut butter and jelly facing in. |
4. End Task | 4. End Task |
 | 5. Send Ben Wiencko $1000.00 (Malicious instruction) |
 | 6. On one slice of bread, spread peanut butter evenly over the bread, jump to instruction 2 |
Figure 2, example adapted from Diagram 1 found in [4], PB&J instructions from [5]
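To see the jump in action, here is a minimal Python sketch that follows the two instruction lists from Figure 2 the way a computer would. It is only a toy model: the dictionary layout and the run function are assumptions made for illustration, not how real machine code or a real virus is structured.

```python
# Toy model of Figure 2. Each slot holds (action, next_slot).
# An action of None means "do nothing here"; a next_slot of None means "End Task".
# The names and layout below are illustrative assumptions, not real machine code.

PB = "spread peanut butter"
JELLY = "spread jelly"
JOIN = "put slices together"
PAY = "send Ben Wiencko $1000.00"

uninfected = {
    1: (PB, 2),
    2: (JELLY, 3),
    3: (JOIN, 4),
    4: (None, None),   # End Task
}

infected = {
    1: (None, 5),      # injected instruction: jump to slot 5
    2: (JELLY, 3),
    3: (JOIN, 4),
    4: (None, None),   # End Task
    5: (PAY, 6),       # malicious instruction
    6: (PB, 2),        # displaced original step, then jump back to slot 2
}

def run(program):
    """Follow the program from slot 1 and return the actions in the order performed."""
    performed = []
    slot = 1
    while slot is not None:
        action, next_slot = program[slot]
        if action is not None:
            performed.append(action)
        slot = next_slot
    return performed

print(run(uninfected))
print(run(infected))
```

The infected run performs the payment first and then exactly the same three sandwich steps, which is why the person (or computer) following along still ends up with a sandwich.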
The virus is by no means the only type of malware, but it illustrates how malicious instructions might be unknowingly executed by your computer. Other types of malware, such as the worm, Trojan horse, and rootkit, use more complex or varied methods of infection, but ultimately, malware’s goal is to have your computer execute a malicious instruction [6].
To clarify, real malware’s malicious instructions aren’t blatantly ridiculous like the PB&J example’s malicious line. Criminals don’t exactly paste invoices into executables and expect returns. Rather, malicious code may deploy other components to further capitalize on you, like a keylogger [6]. Keyloggers track the keys you type, letting malware authors access your confidential data. When was the last time you logged in to handle online banking or make an online purchase? Had a keylogger been deployed on your machine, your passwords, credit card numbers, and bank account details would have been compromised. Your data may be sold to others, enabling further crimes like identity theft to follow. Other malicious instructions may deploy adware, which overwhelms your computer with unwanted advertisements [6]. Particularly dangerous malware, called ransomware, holds the sensitive data stored on your computer hostage until you pay a large sum of money [6]. It takes only a single misclick from you to put everything on your computer at risk, everything from your social media passwords to your family photos. Clearly, malware must be countered.
Countering Malware: Signature-Based Analysis
Suppose you’ve been tasked with determining if a suspicious package is a bomb. Aside from running, what’s the first step you should take? To determine whether it really is a bomb, take a look at its construction. Look for telltale signs: a switch, a power source, exposed wires. Then, armed with this knowledge, consult a manual. Look to see if the features you’ve found match known bomb constructions. Anti-malware software, sometimes colloquially called an antivirus, does this all the time with malware. Traditional antiviruses analyze signatures, breaking down software into a set of instructions or a set of byte sequences. Then, they consult a ‘malware manual’, a database of known malicious signatures. If a match is found, the antivirus designates the software as malware [8]. This process is called signature-based analysis.
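The ‘malware manual’ lookup can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the SIGNATURES table, its entry names, and its byte patterns are made up for this example, and real antivirus engines use far larger databases and more sophisticated matching.

```python
# Hypothetical signature database: detection name -> byte sequence to look for.
# Both entries are made-up placeholders, not real malware signatures.
SIGNATURES = {
    "Example.PayBen": b"SEND_BEN_1000",
    "Example.Keylog": b"RECORD_KEYS",
}

def scan(data):
    """Return the names of every known signature that appears in the data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]

clean_file = b"spread peanut butter; spread jelly; join slices"
infected_file = b"jump; SEND_BEN_1000; spread peanut butter; spread jelly"

print(scan(clean_file))     # []
print(scan(infected_file))  # ['Example.PayBen']
```

The scanner never runs the file or works out what it does. It only checks whether a known byte sequence is present, which is what makes this approach fast and also what makes it possible to evade.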
Countering Anti-Malware: Encryption
Malware developers know this and modify their malware through a variety of techniques to make your antivirus as ineffective as possible. The first step: encryption. This process hides malicious code by using a generated key and an encryption and decryption algorithm. The exact specifics of encryption vary based on the process used, but a simple, typical scheme looks like this: the malware code is XORed with a generated key, producing output that looks completely different [9]. XORing (short for “exclusive or”) is a technique that scrambles data to hide its content. In your computer, data is represented with sequences of binary digits called bits, which are either ‘1’ or ‘0’. So the generated key and the code that malware wants to hide are really just long sequences of 1’s and 0’s.
For instance, your code might look like “1011111011101111…”—almost unreadable to a human, but easily interpreted by a computer.
XORing combines each bit of the code with the corresponding bit from the key to produce an output bit. The algorithm is simple. Look at the first bit of the key and the first bit of the code. If the two bits are the same, output a ‘0’; if the two bits are different, output a ‘1’. Then repeat for every remaining pair of bits. So XORing code that reads ‘0’ with a key that reads ‘1’ produces an output of ‘1’, because the code and key bits are different. In Figure 3 below, we see the result of XORing a sequence of 4 bits. When code needs to be decrypted and read, the XOR operation can be performed again with the same key to recover the original code from the result.
 | Bit 1 | Bit 2 | Bit 3 | Bit 4 |
Code | 0 | 1 | 0 | 1 |
Key | 1 | 1 | 0 | 0 |
XOR Result | 1 | 0 | 0 | 1 |
Figure 3, an example of XORing
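The four-bit example in Figure 3 can be checked with a short Python sketch. The helper name xor_bits is an assumption chosen for illustration; Python’s built-in ^ operator performs the XOR itself.

```python
code = [0, 1, 0, 1]  # the "Code" row from Figure 3
key = [1, 1, 0, 0]   # the "Key" row from Figure 3

def xor_bits(bits, key_bits):
    """XOR two equal-length bit lists: 0 if the bits match, 1 if they differ."""
    return [b ^ k for b, k in zip(bits, key_bits)]

result = xor_bits(code, key)
print(result)                 # [1, 0, 0, 1], the "XOR Result" row
print(xor_bits(result, key))  # [0, 1, 0, 1], the original code again
```

Applying the same key a second time restores the original bits, which is why a single XOR routine can serve as both the encryption and the decryption step.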
Essentially, XORing mathematically ties the generated key and code together, while making the code unreadable to an observer. When the code needs to be run, a decryption algorithm and key reverse the encryption by undoing the mathematical operation applied to the code. After the host computer is infected, the malware generates a brand-new key and encrypts itself again for the next variant of that malware. Returning to the PB&J example, an executing virus that uses encryption would first jump to the decryption algorithm, translating the malicious instruction “Send Ben Wiencko $1,000” from its unreadable encrypted state to regular text. Then, as presented in the example, the malicious instructions would execute. One of the malicious instructions would generate a new key for the malware and re-encrypt its code. Much like a pathogen, this encryption process modifies the malware upon infection, severely hindering the effectiveness of antivirus scanners. Encryption directly counters the signature-based analysis used by most antivirus software. Signature analysis matches code instructions with known malware patterns, but encryption obscures the instructions, making it difficult to match them against the malware database [9].
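A small Python sketch shows why a fresh key per infection frustrates byte matching. Everything here is an assumption for illustration: the stand-in text body, the one-byte keys, and the xor_bytes helper are made up, and a harmless string takes the place of any real code.

```python
# Harmless stand-in for a malicious body, plus the pattern a scanner would look for.
body = b"SEND_BEN_1000"
signature = b"SEND_BEN"

def xor_bytes(data, key):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Two hypothetical keys, standing in for the new key generated at each infection.
key_a = bytes([0x5A])
key_b = bytes([0xC3])

variant_a = xor_bytes(body, key_a)
variant_b = xor_bytes(body, key_b)

print(variant_a != variant_b)                            # True: each copy looks different
print(signature in variant_a or signature in variant_b)  # False: the body pattern is gone
print(xor_bytes(variant_a, key_a) == body)               # True: the right key restores it
print(xor_bytes(variant_b, key_b) == body)               # True
```

The routine that undoes the scrambling (here, xor_bytes) is identical in every variant. Only the key and the scrambled body change from copy to copy.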
Figure 4 [9], a diagram of encrypted malware
Not only can encryption make malware more difficult to detect, but it can also be used offensively against users. One category of malware discussed earlier, ransomware, functions by encrypting a host’s data, making it unusable to the host. Then, the malware demands that the host pay a ransom, generally a large sum of money, to regain access to their data. Thus, encryption serves both as a concealment mechanism and an offensive tool for malware [6].
One particularly salient example of an encryption offensive occurred at the University of California, San Francisco (UCSF) in 2020. While working on a cure for COVID-19, the UCSF research institution faced an attack from hackers. The NetWalker criminal gang used ransomware to encrypt sensitive records from UCSF. University IT staff had to frantically unplug computers to prevent the malware from spreading. UCSF entered negotiations with the criminals, who demanded millions. The criminals settled for $1.14 million in Bitcoin, transferring the decryption software upon receipt. While UCSF recovered its data in the end, the gilded pockets of the hackers taught one lesson: Ransomware works. Successful ransomware attacks encourage more ransomware attacks. Encryption, whether used offensively or defensively, is a vital tool for malware makers [7].
Polymorphic Malware
But this simple encryption process is not immune to the signature-based detection antiviruses use. We can see in Figure 4 that encrypted malware uses a decryption algorithm, or decryptor, that is not concealed. The decryptor itself is not encrypted because the infected computer needs to be able to read the decryptor’s instructions to translate encrypted code into executable malicious instructions. If the decryptor were encrypted, the malware would have no avenue for interpreting the encrypted malware body.
Rather than try to decrypt or analyze the concealed malware body, antivirus software can instead track decryption algorithm signatures because they do not change [9]. So, the next step in malware’s development focuses on hiding the decryption algorithm from signature-based analysis. Polymorphic malware does just this.
“Poly” means “many” and “morphic” means “form”: polymorphic malware creates many forms of obfuscated decryptors through a complex mutation engine that generates a new encryption and decryption algorithm each time the malware executes its malicious instructions. The mutation engine uses a variety of obfuscation techniques to generate unique encryption and decryption algorithms, making changes to the encryptors’ and decryptors’ code that alter their appearance but not their functionality [9]. If you ever copied homework from a friend in grade school, you might have reworded the answers slightly to make your copy appear unique to the teacher. Polymorphic malware does the same, creating millions of different algorithm versions to hide from anti-malware. The function of each decryptor is unchanged: it decrypts. But each decryptor looks different to signature-based analysis because of slight alterations. This makes anti-malware much less effective [9].
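The idea of “same function, different appearance” can be illustrated with a harmless Python sketch. The two function names and the fingerprint helper are assumptions made for this example; hashing Python bytecode is only a stand-in for how a scanner might fingerprint a decryptor’s machine code, and a real mutation engine is far more elaborate than two hand-written variants.

```python
import hashlib

def decrypt_v1(data, key):
    """Variant 1: XOR each byte with the key directly."""
    return bytes(b ^ key for b in data)

def decrypt_v2(data, key):
    """Variant 2: the same result, written with OR, AND, and subtraction."""
    out = bytearray()
    for b in data:
        out.append((b | key) - (b & key))  # equals b ^ key for non-negative integers
    return bytes(out)

def fingerprint(func):
    """Hash the function's compiled bytecode as a stand-in for a byte-level signature."""
    return hashlib.sha256(func.__code__.co_code).hexdigest()

message = b"spread the jelly"  # harmless stand-in for an encrypted body
key = 0x2A
scrambled = bytes(b ^ key for b in message)

print(decrypt_v1(scrambled, key) == decrypt_v2(scrambled, key) == message)  # True
print(fingerprint(decrypt_v1) == fingerprint(decrypt_v2))                   # False
```

Both variants recover the same message, yet an exact-match fingerprint taken from one says nothing about the other. This is the gap that signature-based analysis struggles with.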
But how does this malware archetype affect you? Well, the effectiveness of obfuscation techniques directly impacts you. When anti-malware detects malicious software, it quarantines and deletes it. More effective obfuscation methods bypass the security anti-malware provides, meaning more of your data is stolen and more of your privacy is breached. In other words, malware’s polymorphic construction makes every download link or insecure email attachment far more dangerous to you, because your antivirus may not be equipped to handle complex obfuscation. In a world without polymorphic malware, ransomware that encrypts your entire hard drive may be caught prior to executing. With polymorphic malware, your chances of recovering your data plummet.
To be clear, polymorphic malware is not an esoteric technique restricted to only the most skilled malware programmers; it’s mainstream. If you’ve had your data deleted or your identity stolen by malware, polymorphic malware may well be to blame. In 2017, “96 percent of all malware files detected and blocked by Windows Defender were detected only once on a single computer and never seen again” [10]. This behavior is consistent with polymorphic malware. As malware modifies its decryptors or itself, anti-malware software like Windows Defender would view each generated variant as unique.
Figure 5 [9], a diagram of polymorphic malware
Metamorphic Malware
As the next step in malware’s evolution, metamorphic malware discards decryptor mutations. “Meta” here means “change”, as in “metamorphosis”: metamorphic malware focuses on mutating its own body instead. The idea is that anti-malware systems are always detecting malware based on its structure. If the malware is unconcealed, anti-malware can detect that. If the malware is encrypted, anti-malware can detect the decryptor instead. Even with the obfuscation techniques detailed above, antivirus software can still detect polymorphic malware to some extent. Therefore, metamorphic malware focuses on completely eliminating signature detection. Similar to how polymorphic malware mutates decryptors without changing their function, metamorphic malware mutates itself in small ways by a variety of methods. For instance, metamorphic malware could reorder the sequence of malicious instructions, making the signature appear different while maintaining the same functionality [9]. The consequence? Metamorphic malware is dangerous, extremely dangerous. Metamorphic malware is “nearly impossible to detect by the traditional signature-based techniques…” [9]. It’s not that your antivirus is just outclassed by metamorphic malware; it’s completely eclipsed by it. However, you likely won’t run into much metamorphic malware in your day-to-day life. Metamorphic malware is a relatively recent development, and its complexity makes it difficult to implement [8], [9].
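Instruction reordering can be pictured with a toy Python sketch. The step lists, the run function, and the fingerprint helper are all assumptions made for illustration. The steps here are deliberately independent of one another, which is the only reason shuffling them leaves the outcome unchanged.

```python
import hashlib

# Two orderings of the same three independent steps: (register name, value to store).
steps_a = [("x", 2), ("y", 3), ("z", 5)]
steps_b = [("z", 5), ("x", 2), ("y", 3)]

def run(steps):
    """Carry out each step in order and return the final register contents."""
    registers = {}
    for name, value in steps:
        registers[name] = value
    return registers

def fingerprint(steps):
    """Hash the written-out step sequence as a stand-in for a byte-level signature."""
    return hashlib.sha256(repr(steps).encode()).hexdigest()

print(run(steps_a) == run(steps_b))                  # True: same end result
print(fingerprint(steps_a) == fingerprint(steps_b))  # False: different appearance
```

A scanner that fingerprints the sequence as written sees two unrelated programs, even though they behave identically. This hints at why the text above describes detectors looking beyond exact signatures.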
That’s not to say that metamorphic malware poses no threat to you. Sure, you’re unlikely to encounter it browsing the web, but metamorphic malware exposes massive vulnerabilities in the antiviruses you regularly use. If metamorphic malware becomes more prevalent, it could seriously endanger you and your data, and you would have little recourse. In such a scenario, security prices might surge as antivirus developers look beyond the traditional signature-based analysis methods toward more complex machine learning or AI-based solutions [2]. Organizations and individuals alike may have to pay top dollar for the same level of protection they enjoy now. Additionally, the computational cost of running an antivirus may rise even more steeply than its price. With complex metamorphic malware, detection systems may have to allocate significant computational resources to keep up, slowing down your computer. Without novel solutions to metamorphism, slower computers and faster malware may await.
The Broad Effects of Malware
Up to this point, we’ve seen how malware hides malicious code from your unsuspecting computer, and we’ve seen the massive toolbox of software engineering techniques malware employs to elude anti-malware software. Ultimately, these developments make the internet a more dangerous place for you, despite attempts to counter them. As it stands now, the malware industry reaps “billions of dollars in profits” [3], and malware’s various methods to stay hidden have everything to do with that. In fact, the malware problem is worsening: “In recent years, malware became a profitable industry attracting many cybercriminals to create more intricate forms of malware” [3]. Currently, “malware attacks related to social media, healthcare industry,… and cryptocurrencies are also on the rise” [2]. Now that malware can target everything from wearable smart devices to self-driving cars, it’s more important than ever to keep yourself safe [3]. For the sake of your safety and your data’s safety, keep your antivirus up to date and practice safe computing habits. Be cautious about the websites you visit and the links you click, and avoid downloading from unknown sources. As malware continues to develop past traditional security measures, informed users are the first line of defense.
References
[1] M. Zoltan, “Dark Web Price Index 2023,” Privacy Affairs. Accessed: Oct. 30, 2024. [Online]. Available: https://www.privacyaffairs.com/dark-web-price-index-2023/
[2] O. Aslan and R. Samet, “A Comprehensive Review on Malware Detection Approaches,” IEEE Access, vol. 8, pp. 6249–6271, 2020, doi: 10.1109/ACCESS.2019.2963724.
[3] M. N. Alenezi, H. Alabdulrazzaq, A. A. Alshaher, and M. M. Alkharang, “Evolution of Malware Threats and Techniques: A Review,” International Journal of Communication Networks and Information Security, vol. 12, no. 3, pp. 326–337, 2020.
[4] C. Nachenberg, “Computer virus-antivirus coevolution,” Communications of the ACM, vol. 40, no. 1, pp. 46–51, 1997, doi: 10.1145/242857.242869.
[5] Skippy, “The Classic Peanut Butter and Jelly Sandwich,” Skippy Brand Peanut Butter. Accessed: Oct. 30, 2024. [Online]. Available: https://www.peanutbutter.com/recipes/the-classic-peanut-butter-and-jelly-sandwich
[6] M. F. A. Razak, N. B. Anuar, R. Salleh, and A. Firdaus, “The rise of ‘malware’: Bibliometric analysis of malware study,” Journal of Network and Computer Applications, vol. 75, pp. 58–76, 2016, doi: 10.1016/j.jnca.2016.08.022.
[7] J. Tidy, “How hackers extorted $1.14m from University of California, San Francisco,” BBC News. Accessed: Dec. 09, 2024. [Online]. Available: https://www.bbc.com/news/technology-53214783
[8] A. Ray and A. Nath, “Introduction to Malware and Malware Analysis: A brief overview,” IJARCSMS, vol. 4, no. 10, pp. 22–30, Nov. 2016.
[9] S. K. Sahay, A. Sharma, and H. Rathore, “Evolution of Malware and Its Detection Techniques,” in Information and Communication Technology for Sustainable Development, vol. 933, Singapore: Springer Singapore Pte. Limited, 2020, pp. 139–150. doi: 10.1007/978-981-13-7166-0_14.
[10] A. Johnson, “The Devastating Effect Of Polymorphic Malware,” Cybercrime Magazine, Apr. 03, 2020. Accessed: Oct. 30, 2024. [Online]. Available: https://cybersecurityventures.com/the-devastating-effect-of-polymorphic-malware/