Using the GNU Debugger for reverse engineering

The GNU debugger, gdb, is the standard debugger for Linux programmers. It is very powerful, but that power comes with a bit of a learning curve. Several Graphical User Interface (GUI) front ends to gdb exist, but the tool itself is command line based. I will only hit the highlights in this book. For a full course on how to get the most from gdb, I recommend the GNU Debugger Megaprimer at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=4).

Before we get started, I feel that I should point out that gdb was not designed to be used for reverse engineering. There are other debuggers tailor made for reverse engineering. Of these, IDA Pro is perhaps the most popular. With pricing that starts in excess of US$1100, IDA Pro is not for the casual reverse engineer, however.

To load a program into gdb, simply type gdb <executable>. You should see some messages about your file, concerning whether or not it contained debugging symbols (most files you examine likely lack the extra debugging information), a statement that says to type “help” to get help, and then a (gdb) prompt. Typing help leads to the screen shown in Figure 10.26.

FIGURE 10.26

The gdb main help screen.

If you type help info in gdb, you will get a long list of things that the debugger will report on. One of these items is functions. Running info functions with xingyi_bindshell loaded in the debugger produces a long list, some of which is shown in Figure 10.27. Functions are displayed along with their addresses. Incidentally, gdb commands can be abbreviated as long as they are not ambiguous. Typing inf fun would have returned the same results.

FIGURE 10.27

Partial results from running info functions in gdb.
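The abbreviation rule can be pictured as unique-prefix matching. The following Python sketch is only an illustration of that idea, not gdb's actual implementation; the function name resolve_abbreviation and the two short command lists are made up for this example.

```python
def resolve_abbreviation(prefix, commands):
    """Return the one command that starts with prefix, or raise ValueError."""
    matches = [c for c in commands if c.startswith(prefix)]
    if prefix in matches:
        # An exact name always wins, even if longer names share the prefix.
        return prefix
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"undefined command: {prefix!r}")
    raise ValueError(f"ambiguous command {prefix!r}: {', '.join(sorted(matches))}")


# Small illustrative subsets of command names (not gdb's full tables).
TOP_LEVEL = ["info", "inspect", "break", "backtrace", "disassemble", "delete", "run"]
INFO_SUBCOMMANDS = ["functions", "frame", "registers", "breakpoints"]

print(resolve_abbreviation("inf", TOP_LEVEL))         # info
print(resolve_abbreviation("fun", INFO_SUBCOMMANDS))  # functions
```

Real gdb also predefines short aliases, such as b for break and c for continue, so some single letters work even though several commands begin with them.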

As mentioned previously, gdb can be used to disassemble a program. The command for disassembling a function is just disassemble followed by the function name, e.g. disassemble main. The command can be shortened to disas. Before running this command, you might wish to switch from the default AT&T syntax to the more common Intel syntax. To do so, issue the command set disassembly-flavor intel. Partial results from disassembling main are shown in Figure 10.28. Note that, unlike the output from objdump shown in Figure 10.15, the main function ends at 0x401312.

FIGURE 10.28

Partial results from disassembling main in gdb.

If, after viewing the disassembly of various functions, you decide to run the program (in your sandbox!), you may wish to set at least one breakpoint. The command to set a breakpoint is simply break followed by a location. If you supply a function name, the breakpoint is set at the beginning of the function. The command info break lists breakpoints. To delete a breakpoint, type delete followed by the breakpoint number. The run command will start the program and run everything up to the first breakpoint (if it exists). Typing disassemble with no name or address after it will disassemble a few lines after the current execution point. These commands are illustrated in Figure 10.29.

FIGURE 10.29

Breakpoint management commands.

If you are stopped at a breakpoint, you can take off running again to the next breakpoint, if any, with continue. You may also use stepi to execute the next assembly instruction, or nexti to do the same while stepping over any function calls encountered. When stepping through a program, you can just hit the <enter> key, as this causes gdb to repeat the last command. The use of these stepping functions is demonstrated in Figure 10.30.

FIGURE 10.30

Using stepping functions in gdb.

As you are tracing through a program, you might want to examine different chunks of memory (especially the stack) and various registers. The x (examine) command is used for this purpose. The help for x and the first twenty giant (8-byte) values on the stack in hexadecimal (gdb command x/20xg $rsp) are shown in Figure 10.31. Note that because the stack grows downward toward lower addresses, examining memory upward from RSP displays the most recently pushed values first, which makes the stack easy to read in the debugger.

FIGURE 10.31

Using the examine command in gdb.
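To make the x/20xg format concrete, here is a minimal Python sketch of how a run of raw bytes is grouped into giant (8-byte) little-endian values, which is how they appear on x86-64. The function name giant_words, the sample bytes, and the base address are all hypothetical values chosen for illustration; they were not taken from the binaries examined in this chapter.

```python
import struct


def giant_words(raw: bytes, base_address: int):
    """Yield (address, value) for each complete 8-byte little-endian word in raw."""
    count = len(raw) // 8
    values = struct.unpack(f"<{count}Q", raw[:count * 8])
    for i, value in enumerate(values):
        yield base_address + i * 8, value


# Hypothetical stack contents: 16 bytes in the order they sit in memory.
sample = bytes.fromhex("1213400000000000" "e0dfffffff7f0000")

for address, value in giant_words(sample, 0x7fffffffdfd0):
    # Print each word the way x/xg would: address, then zero-padded hex value.
    print(f"{address:#x}: {value:#018x}")
```

Because the byte order is little-endian, the bytes 12 13 40 00 ... in memory read back as the value 0x401312, which is why addresses look reversed when memory is dumped one byte at a time instead.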

The command info registers displays all the registers, as shown in Figure 10.32. Note that if you are running a 32-bit executable, the registers will be named EXX rather than RXX (EAX rather than RAX, for example), as described earlier in this chapter. For reverse engineering, RBP, RSP, and RIP (the base, stack, and instruction pointers, respectively) are the most important.

FIGURE 10.32

Examining registers in gdb.

Let’s turn our focus to xingyi_rootshell, now that we have learned some of the basics of using gdb. First, we load the program with gdb xingyi_rootshell. Next, we set a breakpoint at the start of main by typing break main. If you prefer Intel syntax, issue the command set disassembly-flavor intel. To run the program with command line argument(s), append the argument(s) to the run command, e.g. run sw0rdm4n. This sequence of instructions is shown in Figure 10.33.

FIGURE 10.33

Running xingyi_rootshell in gdb.
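If you find yourself repeating this sequence, the same commands can be passed to gdb non-interactively. The Python sketch below only builds the command line; it relies on gdb's -batch option (exit after processing the commands) and -ex option (execute one gdb command). The helper name gdb_batch_command is made up for this example, and, as always, the resulting command should only be run inside your sandbox.

```python
import shlex


def gdb_batch_command(binary, gdb_commands):
    """Build an argument list that runs gdb_commands against binary, then exits."""
    argv = ["gdb", "-batch"]
    for command in gdb_commands:
        argv += ["-ex", command]  # one -ex per gdb command, executed in order
    argv.append(binary)
    return argv


session = gdb_batch_command(
    "xingyi_rootshell",
    ["set disassembly-flavor intel", "break main", "run sw0rdm4n", "disassemble"],
)

# Show the equivalent shell command line, with quoting applied where needed.
print(shlex.join(session))
```

The list form can be handed to subprocess.run directly, which avoids any shell quoting problems with arguments that contain spaces.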

Running disassemble results in the disassembly of the current function, complete with a pointer to the next instruction to be executed. The disassembly of main is shown in Figure 10.34. There are a few things that are readily apparent in this snippet. Thirty-two bytes (0x20) of space are allocated on the stack for local variables. Memory is then allocated on the heap with a call to n_malloc. Two addresses (one from the stack and one inside the program) are loaded into RAX and RDX, and then strcmp is used to compare the two strings stored at these locations. Of significance here is that this snippet makes it clear this embedded string is some sort of password.

FIGURE 10.34

Disassembly of the xingyi_rootshell main function.

If we did not yet realize that this embedded value was a password, we could use the command x/s *0x6010d0 to display this password as shown in Figure 10.35. Note that this is extra easy because the binary was not stripped and had a descriptive name for this variable. Even if it had been stripped, the fact that the address is referenced relative to RIP indicates a variable that is embedded in the program binary. We also see that the argument to the call to system is loaded from address 0x400B9C. If we examine this with x/s 0x400b9c, we see that a bash shell is being started.

FIGURE 10.35

Using gdb to determine the program password and target of system call.
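The arithmetic behind a RIP-relative reference is simple: the target is the address of the next instruction plus the displacement encoded in the current one. The sketch below works through one case in Python. The instruction address, length, and displacement are hypothetical values picked so that the result lands on 0x6010d0; they were not read from the actual disassembly.

```python
def rip_relative_target(instruction_address, instruction_length, displacement):
    """Return the absolute address referenced by a RIP-relative operand."""
    # RIP already points at the following instruction when the operand is resolved.
    next_instruction = instruction_address + instruction_length
    # Wrap to 64 bits so negative displacements behave as they do on the CPU.
    return (next_instruction + displacement) & 0xFFFFFFFFFFFFFFFF


# Hypothetical 7-byte instruction at 0x400a10: mov rax, QWORD PTR [rip+0x2006b9]
print(hex(rip_relative_target(0x400a10, 7, 0x2006b9)))  # 0x6010d0
```

gdb normally does this addition for you and prints the resolved address as a comment beside the instruction, but knowing the rule helps when working from raw bytes or another tool's output.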

What about the xingyi_reverse_shell? We can do a quick analysis by following the same procedure. First, we load it in gdb with gdb xingyi_reverse_shell. Next, we set a breakpoint in main with break main. Optionally, we set the disassembly flavor to Intel with set disassembly-flavor intel. We can run the program with a parameter using run 127.0.0.1. At this stage, the program should be stopped just inside of the main function and typing disassemble will produce the output shown in Figure 10.36.

FIGURE 10.36

Disassembling main from xingyi_reverse_shell in gdb.

Breaking down the disassembly in Figure 10.36, we can see that this function is fairly simple. Thirty-two (0x20) bytes of space are allocated on the stack. The number of command line parameters and a pointer to the list of parameters are stored in [RBP – 0x14] and [RBP – 0x20], respectively. If the number of command line parameters is greater than 1 (recall that the program name is counted here), then we jump over the line call <_print_usage>. The address of the second command line argument (located at the start of this list + 0x8 to skip the 8-byte pointer for the first argument) is loaded into RDI and validate_ipv4_octet is called. If we did not already know that this command line parameter was supposed to be an IP address, this code snippet would help us figure it out. Again, if the binary had been stripped, we would need to work a little harder and investigate the function at 0x400AC5 to figure out what it does. If this function doesn’t return success, _print_usage is called.
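The + 0x8 offset follows from the way the argument list is laid out: it is an array of 8-byte pointers, one per argument, with the program name first. The Python sketch below models that layout with a small byte buffer. The function name read_argv_entry and every address in it are hypothetical, chosen only to show the offset calculation.

```python
import struct

POINTER_SIZE = 8  # bytes per pointer on x86-64


def read_argv_entry(memory: bytes, memory_base: int, argv_address: int, index: int) -> int:
    """Return the pointer stored in slot index of the argument list."""
    offset = argv_address - memory_base + index * POINTER_SIZE
    (pointer,) = struct.unpack_from("<Q", memory, offset)
    return pointer


# Hypothetical argument list at 0x7ffc0000: program name, one argument, NULL.
base = 0x7ffc0000
memory = struct.pack("<3Q", 0x7ffc0100, 0x7ffc0120, 0)

print(hex(read_argv_entry(memory, base, base, 0)))  # 0x7ffc0100, the program name
print(hex(read_argv_entry(memory, base, base, 1)))  # 0x7ffc0120, list start + 0x8
```

In gdb, the equivalent check at a breakpoint would be to examine the pointer stored at the list address plus 8 and then display the string it points to with x/s.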

Assuming everything is still good, the daemonize function is called. Once again, if the binary had been stripped of this descriptive name, we would have to work a bit harder and delve into the daemonize function to determine its purpose. We see another address referenced relative to RIP. Here a parameter for _write_pid_to_file is being loaded from address 0x6020B8. Running the command x/s *0x6020b8 reveals this string to be “/tmp/xingyi_reverse_pid”. This and the remainder of the disassembly of main are shown in Figure 10.37.

FIGURE 10.37

Second half of disassembly of main in xingyi_reverse_shell.

We can see that fwrite is called a couple of times and that _log_file is also called. If we examine the values referenced in Figure 10.37, we will see that 0x6020B0 contains the value 0x1E61 (7777) and 0x6020C8 contains the string “/tmp/xingyi_reverse.port”. It was fairly easy to determine what these three binaries do because the author made no attempt to obfuscate the code. The filenames, function names, variable names, etc. made this process easy. What if a malware author is trying to make things hard to detect and/or understand?
