What is a Zip Bomb?
November 20, 2022
The classic zip bomb is a tiny archive, usually only a few kilobytes in size. When the file is unpacked, however, its contents are larger than the system can handle. Typically this means hundreds of gigabytes of data, and the more advanced bombs may reach petabytes (millions of gigabytes) or even exabytes (billions of gigabytes). So, yes, to be clear, we’re talking about fitting exabytes into kilobytes.
The first mention of a zip bomb dates to 1996, when a user of the then-popular Fidonet messaging network posted a malicious archive on a bulletin board, and an unsuspecting administrator opened it.
When such an archive is opened, the program starts unpacking all the data. This causes the program, or even the entire system, to crash, as there is simply not enough space to hold that amount of data.
42.zip - A Classic Zip Bomb
The most common zip bomb you can find on the Internet is "42.zip". It weighs only 42 KB in packed form.
However, if you unpack it completely, you get about 4.5 petabytes (roughly 4,500,000 GB) of data!
This is achieved with a system of recursively nested zip files, where each file at the lowest level decompresses to 4.3 GB. The construction uses DEFLATE, the most common zip compression algorithm, so it is compatible with most zip parsers.
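The 4.5 PB figure can be checked with a little arithmetic. The sketch below assumes the layout commonly reported for 42.zip: five nested layers of 16 archives each, with a single file of 4,294,967,295 bytes (4 GiB minus one byte, the "4.3 GB" above) at the bottom. The constant and function names are illustrative only.

```python
# Commonly reported layout of 42.zip (assumed here, not read from the file):
# 5 nested layers of 16 archives, each bottom archive holding one
# 4,294,967,295-byte file (4 GiB - 1 byte).
ARCHIVES_PER_LAYER = 16
LAYERS = 5
BOTTOM_FILE_BYTES = 2**32 - 1


def total_unpacked_bytes(per_layer: int, layers: int, bottom_bytes: int) -> int:
    """Total size once every nested archive has been fully extracted."""
    return per_layer**layers * bottom_bytes


total = total_unpacked_bytes(ARCHIVES_PER_LAYER, LAYERS, BOTTOM_FILE_BYTES)
print(total)           # 4503599626321920 bytes
print(total / 10**15)  # about 4.5 petabytes
```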
Zip Bomb Definition: How Does It Work?
The principle of the zip bomb is simple. First, a file is created, for example a text file filled with nothing but the same symbol repeated over and over, and it is archived. Because the content is so repetitive, the archive is far smaller than the original file. Then 16 copies of this archive are packed into a new archive. Because the copies are identical, the resulting archive is again highly repetitive and adds very little to the overall size. That archive is then copied 16 times and packed again, and so on. In 42.zip there are five such layers of 16 archives above the 4.3 GB file, six levels in total, so fully unpacking it produces 16^5 = 1,048,576 copies of the bottom file.
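The effect of repetition on archive size can be seen with a harmless experiment. The following is a minimal sketch using Python's standard `zlib` module (which implements DEFLATE) on 10 MB of identical bytes. It is not how 42.zip itself was built, only an illustration of why repetitive data, and the compressed form of repetitive data, shrink so dramatically.

```python
import zlib

# 10 MB of identical bytes: a harmless stand-in for a bomb's payload.
original = b"0" * (10 * 1024 * 1024)

# First pass: DEFLATE collapses the long run of repeated bytes.
once = zlib.compress(original, 9)

# Second pass: the compressed stream is itself very repetitive,
# so compressing it again shrinks it further.
twice = zlib.compress(once, 9)

print(len(original), len(once), len(twice))

# Decompressing twice restores the original data exactly.
assert zlib.decompress(zlib.decompress(twice)) == original
```

A single DEFLATE pass cannot shrink data by much more than about a thousand times, which is why zip bombs of this kind rely on several nested layers rather than one.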
What is compression?
Compression is the reduction of the number of bits required to represent data. Let’s look at this in more detail:
| xxxyyyyxxxyxxxyxxx
This string is 18 characters long, and the sequence xxx occurs in it several times. This is what’s known as statistical redundancy. The idea is to take the longest common sequences in the data and represent them using as few bits as possible. Here, compressing the string means representing the same information in fewer than 18 characters. Replace every occurrence of ‘xxx’ with a symbol, say ‘$’, and see what happens.
We now have an intermediate (compressed) form of the string, along with an instruction for recovering the original:
| $yyyy$y$y$
| $=xxx
The first line is our compressed data, and the second is the instruction: a dictionary telling us that, to decompress the data, we should replace every occurrence of $ with xxx to get back the original. Now let's count the total number of characters.
We now need only 10 + 5 = 15 characters, instead of 18, to represent the same information.
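The substitution above can be reproduced in a few lines. This is a toy sketch of the idea only; real compressors such as DEFLATE find repeated sequences automatically and encode them in bits rather than characters. The `compress` and `decompress` helpers are hypothetical names introduced for this example.

```python
def compress(text: str, pattern: str, symbol: str) -> tuple[str, str]:
    """Replace every occurrence of `pattern` with `symbol` and return
    the compressed data together with the dictionary rule."""
    if symbol in text:
        raise ValueError("symbol must not already occur in the text")
    return text.replace(pattern, symbol), f"{symbol}={pattern}"


def decompress(data: str, rule: str) -> str:
    """Undo `compress` by expanding the symbol back into the pattern."""
    symbol, pattern = rule.split("=", 1)
    return data.replace(symbol, pattern)


original = "xxxyyyyxxxyxxxyxxx"
data, rule = compress(original, "xxx", "$")

print(data)                   # $yyyy$y$y$
print(rule)                   # $=xxx
print(len(data) + len(rule))  # 15, down from 18
assert decompress(data, rule) == original
```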
What is a Zip Bomb Used For?
Since the zip bomb does not directly damage the system, it is most often used to crash or disable the program that tries to open it. In particular, it can be used to disable antivirus software, clearing the way for other, more typical malware.
Instead of hijacking the regular operation of the program, the zip bomb lets the program work exactly as intended. The archive is carefully designed, however, so that unpacking it (for example, when an antivirus scans it for viruses) takes an excessive amount of time, disk space, or memory (or all three). During this time, the attacker may try to infect the system with a real virus. Sometimes the antivirus, in its attempt to scan the attachment, consumes all the resources of the PC and loads the system so heavily that the device becomes unusable.
Where Does the Zip Bomb Come From?
It’s almost impossible to run into a zip bomb by accident these days. Most modern antivirus programs have learned to recognize and neutralize zip bombs, so in practice the effectiveness of such an attack is minimal.
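One simple way a scanner might recognize a suspicious archive is to read the sizes declared in the zip metadata before extracting anything. The sketch below uses Python's standard `zipfile` module and is only an illustration, not a description of how any particular antivirus works. The function name `looks_like_zip_bomb` and both thresholds are hypothetical values chosen for this example.

```python
import io
import zipfile

# Hypothetical limits chosen for illustration only.
MAX_TOTAL_BYTES = 100 * 1024 * 1024  # refuse more than 100 MB unpacked
MAX_RATIO = 100                      # refuse ratios above 100:1


def looks_like_zip_bomb(archive, max_total=MAX_TOTAL_BYTES, max_ratio=MAX_RATIO) -> bool:
    """Inspect declared sizes in the zip metadata without extracting."""
    with zipfile.ZipFile(archive) as zf:
        infos = zf.infolist()
        unpacked = sum(info.file_size for info in infos)
        packed = sum(info.compress_size for info in infos)
    if unpacked > max_total:
        return True
    return packed > 0 and unpacked / packed > max_ratio


# Build a small, harmless test archive in memory: 5 MB of zero bytes.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("zeros.bin", b"\0" * (5 * 1024 * 1024))
buf.seek(0)

print(looks_like_zip_bomb(buf))  # True: the ratio is far above 100:1
```

A check like this looks only at the top level and trusts the sizes recorded in the archive, so a real scanner would also need to limit nesting depth and cap the number of bytes actually written during extraction.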