OBFUSCATION

Malware authors use a number of methods to obfuscate their programs, and the level of sophistication varies widely. One of the most pedestrian ways to slow down the reverse engineer is to use a packer. A packer compresses an existing program on disk and then inserts executable code that reverses the process, uncompressing the program into memory when it is run. Because the file on disk is smaller, packing can in some cases also speed up the loading process.

The Ultimate Packer for Executables (UPX) is a very popular cross-platform packer available at http://upx.sourceforge.net. If executing the command grep UPX <file> generates any hits, then you might be dealing with a file packed by UPX. If you get a hit, download the UPX package and decompress the file with the -d option. The first bytes in a file packed with UPX are shown in Figure 10.38.
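The grep check described above can be approximated in a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and it does nothing more than look for the ASCII string UPX anywhere in the file, exactly as the grep command does.

```python
def count_upx_markers(data: bytes) -> int:
    """Count occurrences of the ASCII string 'UPX' in a blob of bytes."""
    return data.count(b"UPX")


def might_be_upx_packed(path: str) -> bool:
    """Return True if the file at path contains at least one 'UPX' string.

    A hit is only a hint, not proof: any file can contain these bytes.
    """
    with open(path, "rb") as fh:
        return count_upx_markers(fh.read()) > 0
```

As with grep, a hit only means the file deserves a closer look. Attempting to decompress it with the -d option remains the real test.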

FIGURE 10.38

The first part of a file packed with UPX.

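In the past, clever malware authors might have written self-modifying code that changes as it is executed. This was quite easily done on DOS systems that had no memory protection whatsoever. In the modern world, even Windows will mark executable memory blocks as read-only by default, so a program cannot simply overwrite its own instructions without first asking the operating system to change the permissions on that memory. This makes the obfuscation method largely a thing of the past.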

Modern-day compilers benefit from decades of research and do a great job of optimizing code. Optimized code is also very uniform, which allows it to be more easily reverse engineered. As a result, obfuscated code is likely handwritten in Assembly. This is both good and bad. The good thing is that you have to be a skilled Assembly coder to write malware this way. The bad thing is that you have to be a skilled Assembly coder to interpret and follow the code! Again, for the purposes of incident response, if you encounter code using the obfuscation techniques discussed in this section, it is probably malware. There are some paranoid companies that obfuscate their products in order to discourage reverse engineering, but those products are few and far between on the Linux platform.

So what sorts of things might one do to obfuscate Assembly code? How about using obscure Assembly instructions? In this chapter, we have covered just a handful of Assembly instructions, yet this is enough to get a high-level view of what is happening in most programs. Start using uncommon operations, and even experienced Assembly coders will be running to Google and their reference manuals.

Compilers are smart enough to replace calculations involving only constants with the answers. For example, if I want to set a flag in position 18 in a bitvector, and I write x = 2^17 or x = 1 << 17, this will be replaced with x = 0x20000. If you see calculations involving only constants that are known at compile time, suspect obfuscation (or poorly written Assembly).
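When you meet such a calculation in a disassembly, you end up doing the compiler's job by hand. The sketch below is one hypothetical way to do that folding in Python; the fold function and its operator table are illustrative assumptions, not part of any tool used in this chapter. Note that Python writes exponentiation as ** and uses ^ for XOR, so the power of two is spelled 2 ** 17 here.

```python
import ast
import operator

# Binary operators this sketch knows how to fold (an illustrative subset).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Pow: operator.pow,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.BitOr: operator.or_,
    ast.BitAnd: operator.and_,
    ast.BitXor: operator.xor,
}


def fold(expr: str) -> int:
    """Reduce an expression built only from integer constants to one value."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("not a constant integer expression")

    return walk(ast.parse(expr, mode="eval"))


# Both spellings from the text reduce to the same constant.
print(hex(fold("1 << 17")))   # 0x20000
print(hex(fold("2 ** 17")))   # 0x20000
```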

Authors may also intentionally insert dead code that is never called in order to throw the reverse engineer off track. I once worked for a company that had written their PC software product in COBOL (yes, I was desperate for a job when I took that one). The primary author of their main product had inserted thousands of lines of COBOL that did absolutely nothing. I discovered this when I ported the program to C++. Incidentally, the complete COBOL listing required an entire box of paper. The C++ program was less than 200 pages long, despite running on three operating systems in graphical or console mode (the old program was DOS and console only).

Authors might also insert several lines of code that are easily replaced by a single line. One of the techniques is to employ an intermediate variable in every calculation even when this is unnecessary. Another trick is to use mathematical identities when making assignments.
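To make these two tricks concrete, here is a small sketch in Python rather than Assembly. Both function names are hypothetical. The obfuscated version pushes every step through an intermediate variable and leans on identities such as -(~x) = x + 1, x * 1 = x, x + 0 = x, and x ^ 0 = x, yet it computes nothing more than x + 1.

```python
def plain_increment(x: int) -> int:
    """What the author actually wants: one line."""
    return x + 1


def obfuscated_increment(x: int) -> int:
    """The same result, padded with temporaries and identities."""
    t1 = ~x        # bitwise NOT: equals -x - 1
    t2 = -t1       # negating that gives x + 1
    t3 = t2 * 1    # multiplying by 1 changes nothing
    t4 = t3 + 0    # adding 0 changes nothing
    t5 = t4 ^ 0    # XOR with 0 changes nothing
    return t5
```

When you suspect this kind of padding, working through each temporary on paper and collapsing the identities usually reveals the single line hiding underneath.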

One of the few techniques that still works when programming in a high-level language is function inlining. If you look back in this chapter, you will see that a lot of the information we gleaned from our unknown binaries was based on tracing which functions were called locally (looking at disassembly), in libraries (ltrace), and as system calls (strace). Inlining turns a program into one big function. The one big function will still have library and system calls but will be noticeably harder to grasp.
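The effect of inlining can be illustrated by hand. The sketch below uses Python purely for illustration, and every name in it is hypothetical; in practice the inlining would be done by a compiler or by the author of a compiled program. The structured version exposes two small, nameable steps, while the inlined version does the same work with no internal calls left to trace.

```python
def xor_byte(b: int, key: int) -> int:
    """XOR one byte with a one-byte key."""
    return b ^ key


def rotate_left3(b: int) -> int:
    """Rotate one byte left by three bits."""
    return ((b << 3) | (b >> 5)) & 0xFF


def scramble_structured(data: bytes, key: int) -> bytes:
    """Easy to follow: each step is a separate, named call."""
    return bytes(rotate_left3(xor_byte(b, key)) for b in data)


def scramble_inlined(data: bytes, key: int) -> bytes:
    """Same result with the helper bodies pasted in place."""
    out = bytearray()
    for b in data:
        v = b ^ key
        v = ((v << 3) | (v >> 5)) & 0xFF
        out.append(v)
    return bytes(out)
```

With only two helpers the difference is small, but the same transformation applied to an entire program removes the call graph that made the earlier binaries in this chapter readable.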
