MBR-based primary partitions
We will start with the simplest case, primary partitions from MBR-based drives. I have broken up the mounting code into three separate scripts for simplicity. Feel free to combine them if that is what you prefer. It is open source after all. The following script will mount primary partitions from an MBR-based image file.
#!/usr/bin/python
#
# mount-image.py
# This is a simple Python script that will
# attempt to mount partitions from an image file.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com

import sys
import os.path
import subprocess
import struct

"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
    sector is the 512 byte or greater sector containing the MBR
    partno is the partition number 0-3 of interest
rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]

    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)

def usage():
    print("usage " + sys.argv[0] +
        " \nAttempts to mount partitions from an image file")
    exit(1)

def main():
    if len(sys.argv) < 2:
        usage()

    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)

    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))

    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        print("Looks like a MBR or VBR")
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
                (sector[462] == '\x80' or sector[462] == '\x00') and \
                (sector[478] == '\x80' or sector[478] == '\x00') and \
                (sector[494] == '\x80' or sector[494] == '\x00'):
            print("Must be a MBR")
            parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
                MbrRecord(sector, 2), MbrRecord(sector, 3)]
            for p in parts:
                p.printPart()
                if not p.empty:
                    notsupParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4, 0xee]
                    if p.type in notsupParts:
                        print("Sorry GPT and extended partitions are " +
                            "not supported by this script!")
                    else:
                        mountpath = '/media/part%s' % str(p.partno)
                        # if the appropriate directory doesn't exist create it
                        if not os.path.isdir(mountpath):
                            subprocess.call(['mkdir', mountpath])
                        mountopts = 'loop,ro,noatime,offset=%s' % \
                            str(p.start * 512)
                        subprocess.call(['mount', '-o', \
                            mountopts, sys.argv[1], mountpath])
        else:
            print("Appears to be a VBR\nAttempting to mount")
            if not os.path.isdir('/media/part1'):
                subprocess.call(['mkdir', '/media/part1'])
            subprocess.call(['mount', '-o', 'loop,ro,noatime', \
                sys.argv[1], '/media/part1'])

if __name__ == "__main__":
    main()
Let’s break down the preceding script. It begins with the usual she-bang; however, this time we are running the Python interpreter instead of the bash shell. Just as with shell scripts, all of the lines beginning with “#” are comments. We then import Python libraries sys, os.path, subprocess, and struct which are needed to get command line arguments, check for the existence of files, launch other processes or commands, and interpret values in the MBR, respectively.
Next we define a class MbrRecord which is used to decode the four partition entries in the MBR. The class definition is preceded by a Python multi-line comment known as a docstring. Three double quotes on a line start or stop the docstring. Like many object-oriented languages, Python uses classes to implement objects. Python is different from many other languages in that it uses indentation to group lines of code together and doesn't use a line termination character such as the semicolon used by numerous languages.
The line class MbrRecord(): tells the Python interpreter that a class definition for the MbrRecord class follows on indented lines. The empty parentheses indicate that there is no base class. In other words, the MbrRecord is not a more specific (or specialized) version of some other object. Base classes can be useful as they allow you to more easily and eloquently share common code, but they are not used extensively by people who use Python to write quick and dirty scripts to get things done.
The line def __init__(self, sector, partno): inside the MbrRecord class definition begins a function definition. Python allows classes to define functions (sometimes called methods) and values (also called variables, parameters, or data members) that are associated with the class. Every class implicitly defines a value called self that is used to refer to an object of the class type. With a few exceptions (not described in this book) every class function must have self as the first (possibly only) argument it accepts. This argument is implicitly passed by Python. We will talk more about this later as I explain this script.
Every class should define an __init__ function (that is a double underscore preceding and following init). This special function is called a constructor. It is used when an object of a certain type is created. The __init__ function in the MbrRecord class is used as follows:
partition = MbrRecord(sector, partitionNumber)
This creates a new object called partition of the MbrRecord type. If we want to print its contents we can call its printPart function like so:
partition.printPart()
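To see the constructor and self in isolation, here is a minimal sketch built around a hypothetical Point class. The class, its values, and its describe function are made up for illustration and are not part of the mounting script.

```python
class Point():
    def __init__(self, x, y):
        # self refers to the object being created; x and y become class values
        self.x = x
        self.y = y

    def describe(self):
        # self is passed implicitly when the function is called on an object
        return "Point at " + str(self.x) + "," + str(self.y)

origin = Point(0, 0)       # Python calls __init__ with self set to the new object
print(origin.describe())   # note that we never pass self ourselves
```

Calling origin.describe() is shorthand for Point.describe(origin), which is why self never appears in the argument list at the call site.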
Back to the constructor definition. We first store the passed in partition number in a class value on the line self.partno = partno. Then we calculate the offset into the MBR for the partition of interest with offset = 446 + partno * 16, as the first record is at offset 446 and each record is 16 bytes long.
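The offset arithmetic is easy to check by hand. The short sketch below computes the offset of each of the four entries; the names FIRST_ENTRY, ENTRY_SIZE, and entry_offset are introduced here for illustration only.

```python
FIRST_ENTRY = 446   # the first partition record starts at this offset
ENTRY_SIZE = 16     # each partition record is 16 bytes long

def entry_offset(partno):
    """Return the byte offset of partition entry partno (0-3) in the MBR."""
    return FIRST_ENTRY + partno * ENTRY_SIZE

for partno in range(4):
    print("Entry " + str(partno) + " starts at offset " + str(entry_offset(partno)))
```

The last entry ends at offset 510, which is exactly where the two-byte boot signature begins.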
Next we check to see if the first byte in the partition entry is 0x80, which indicates the partition is active (bootable). Python, like many other languages, can treat strings as arrays. Also, like most languages, the indexes are zero-based. The == operator is used to check equality and the = operator is used for assignment. A single byte hexadecimal value in Python can be represented by a packed string containing a "\x" prefix. For example, '\x80' in our script means 0x80. Putting all of this together we see that the following lines set a class value called active to False and then reset the value to True if the first byte in a partition entry is 0x80. Note that Python uses indentation to determine what is run if the if statement evaluates to True.
self.active = False
# first byte == 0x80 means active (bootable)
if sector[offset] == '\x80':
    self.active = True
After interpreting the active flag, the MbrRecord constructor retrieves the partition type and stores it as a numerical value (not a packed string) on the line self.type = ord(sector[offset+4]). The construct ord(c) converts a one-character string c into its numeric value.
Finally, the starting and total sectors are extracted from the MBR and stored in appropriate class values. There is a lot happening in these two lines. It is easier to understand it if you break it down. We will start with the statement sector[offset + 8: offset + 12]. In Python parlance this is known as an array slice. An array is nothing but a list of values that are indexed with zero-based integers. So myArray[0] is the first item in myArray, myArray[1] is the second, etc. To specify a subarray (slice) in Python the syntax is myArray[start:end], where start is the index of the first item in the slice and end is one more than the index of the last item. That is why sector[offset + 8: offset + 12] yields the four bytes at offsets offset + 8 through offset + 11.
The slices in these last two lines of the constructor contain 32-bit little endian integers in packed string format. If you are unfamiliar with the term little endian, it refers to how multi-byte values are stored in a computer. Nearly all computers you are likely to work with while doing forensics store values in little endian format which means bytes are stored from least to most significant. For example, the value 0xAABBCCDD would be stored as 0xDD 0xCC 0xBB 0xAA or ‘\xDD\xCC\xBB\xAA’ in packed string format. The unpack function from the struct library is used to convert a packed string into a numerical value.
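You can convince yourself of the byte order with the struct library. This is a small sketch written for Python 3, where packed data is a bytes object rather than a string; the variable names are mine.

```python
import struct

value = 0xAABBCCDD

# '<I' packs an unsigned 32-bit integer in little endian order
little = struct.pack('<I', value)
# '>I' packs the same value in big endian order for comparison
big = struct.pack('>I', value)

print(little)   # least significant byte (0xDD) comes first
print(big)      # most significant byte (0xAA) comes first

# unpack reverses the process and returns the original number
assert struct.unpack('<I', little)[0] == value
```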
Recall that the struct library was one of the imported libraries at the top of our script. In order for Python to find the functions from these imported libraries you must preface the function names with the library followed by a period. That is why the unpack function is called struct.unpack in our script. The unpack function takes a format string and a packed string as input. Our format string ‘<I’ specifies an unsigned integer in little endian format. The format string input to the unpack function can contain more than one specifier which allows unpack to convert more than one value at a time. As a result, the unpack function returns an array. That is why you will find “[0]” on the end of these two lines as we only want the first item in the returned array (which should be the only item!). When you break it down, it is easy to see that self.start = struct.unpack(‘<I’, sector[offset + 8: offset + 12])[0] gets a 4-byte packed string containing the starting sector in little endian format, converts it to a numeric value using unpack, and then stores the result in a class value named start.
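The pieces of the constructor can be tried out without a real image file. The sketch below is written for Python 3, which is an assumption on my part: there, indexing a bytes object already yields a number, so ord() is not needed. The decode_entry function and the fake sector contents are made up for illustration.

```python
import struct

def decode_entry(sector, partno):
    """Decode one MBR partition entry from a bytes object (Python 3)."""
    offset = 446 + partno * 16
    active = sector[offset] == 0x80       # indexing bytes gives an int
    ptype = sector[offset + 4]            # partition type as a number
    # each slice is 4 bytes holding a little endian unsigned integer
    start = struct.unpack('<I', sector[offset + 8: offset + 12])[0]
    sectors = struct.unpack('<I', sector[offset + 12: offset + 16])[0]
    return active, ptype, start, sectors

# Build a fake 512-byte sector holding a single invented entry:
# bootable, type 0x83, starting at sector 2048, 204800 sectors long
entry = bytes([0x80, 0, 0, 0, 0x83, 0, 0, 0]) + struct.pack('<II', 2048, 204800)
sector = bytes(446) + entry + bytes(48) + b'\x55\xaa'

print(decode_entry(sector, 0))
print(decode_entry(sector, 1))   # the second entry is all zeros (empty)
```

Returning a tuple instead of storing class values keeps the sketch short; the logic is otherwise the same as in MbrRecord.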
The printPart function in MbrRecord is a little easier to understand than the constructor. First this function checks to see if the partition entry is empty; if so, it just prints “<empty>”. If it is not empty, whether or not it is bootable, its type, starting sector, and total sectors are displayed.
The script creates a usage function similar to what we have done with our shell scripts in the past. Note that this function is not indented and, therefore, not part of the MbrRecord class. The function does make use of the sys library that was imported in order to retrieve the name of this script using sys.argv[0] which is equivalent to $0 in our shell scripts.
We then define a main function. As with our shell scripts, we first check that an appropriate number of command line arguments are passed in, and, if not, display a usage message and exit. Note that the test here is for less than two command line arguments. There will always be one command line argument, the name of the script being run. In other words, if len(sys.argv) < 2: will only be true if you passed in no arguments.
Once we have verified that you passed in at least one argument, we check to see if the file really exists and is readable, displaying an error and exiting if it isn’t, in the following lines of code:
if not os.path.isfile(sys.argv[1]):
    print("File " + sys.argv[1] + " cannot be opened for reading")
    exit(1)
The next two lines might seem a bit strange if you are not a Python programmer (yet). This construct is the preferred way of opening and reading files in Python as it is succinct and ensures that your files will be closed cleanly. Even some readers who use Python might not be familiar with this method as it has been available for less than a decade, and I have seen some recently published Python books in forensics and information security still teaching people the old, non-preferred way of handling files. The two lines in question follow.
with open(sys.argv[1], 'rb') as f:
    sector = str(f.read(512))
To fully understand why this is a beautiful thing, you need to first understand how Python handles errors. Like many other languages, Python uses exceptions for error handling. At a high level exceptions work as follows. Any risky code that might generate an error (which is called throwing an exception) is enclosed in a try block. This try block is followed by one or more exception catching blocks that will process different errors (exception types). There is also an optional block, called a finally block, that is called every time the program exits the try block whether or not there was an error. The two lines above are roughly equivalent to the following, with the difference that the with statement does not catch the exception itself; it only guarantees that the file is closed:
try:
    f = open(sys.argv[1], 'rb')
    sector = str(f.read(512))
except Exception as e:
    print('An exception occurred: ' + str(e))
finally:
    f.close()
The file passed in to the script is opened as a read-only binary file because the ‘rb’ argument passed to open specifies the file mode. When the file is opened, a new file object named f is created. The read function of f is then called and the first 512 bytes (containing the MBR) are read. The MBR is converted to a string by enclosing f.read(512) inside str() and this string is stored in a variable named sector. Regardless of any errors, the file is closed cleanly before execution of the script proceeds.
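If you would like to watch the with statement close a file after an error, the following sketch simulates a failure inside the block. It is written for Python 3 and uses a scratch file created with the tempfile library so that it does not depend on any image file; the simulated ValueError and the handle variable are there purely for the demonstration.

```python
import os
import tempfile

# Create a small scratch file so the example is self-contained
fd, path = tempfile.mkstemp()
os.write(fd, b'\x00' * 512)
os.close(fd)

handle = None
try:
    with open(path, 'rb') as f:
        handle = f                 # keep a reference so we can inspect it later
        sector = f.read(512)
        raise ValueError("simulated failure while processing")
except ValueError as e:
    print("An exception occurred: " + str(e))

# The with block closed the file even though an exception was thrown
print("File closed: " + str(handle.closed))
os.remove(path)
```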
Once the MBR has been read we do a sanity check. If the file is neither corrupted nor the wrong kind of file, the last two bytes should be 0x55 0xAA. This is the standard signature for an MBR or something called a Volume Boot Record (VBR). A VBR is a boot sector for a File Allocation Table (FAT) filesystem used by DOS and older versions of Windows. To distinguish between a VBR and an MBR we check the first byte of each MBR partition entry and verify that each is either 0x80 or 0x00. If all four entries check out, we proceed under the assumption that it is an MBR. Otherwise we assume it is a VBR and mount the only partition straightaway.
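The sanity check and the MBR versus VBR test can be pulled out into a small function and tried on made-up data. This is a sketch for Python 3 (bytes rather than packed strings); classify_sector and the two sample sectors are hypothetical and not part of the script.

```python
def classify_sector(sector):
    """Return 'MBR', 'VBR', or 'unknown' for a 512-byte bytes object."""
    # the last two bytes must hold the 0x55 0xAA signature
    if len(sector) < 512 or sector[510:512] != b'\x55\xaa':
        return 'unknown'
    # in an MBR the first byte of every partition entry is 0x80 or 0x00
    status_bytes = [sector[off] for off in (446, 462, 478, 494)]
    if all(b in (0x80, 0x00) for b in status_bytes):
        return 'MBR'
    return 'VBR'

# Invented samples: one bootable entry followed by zeros, and a sector
# filled with a non-zero byte so the entry check fails
mbr_like = bytes(446) + b'\x80' + bytes(63) + b'\x55\xaa'
vbr_like = b'\xeb' * 510 + b'\x55\xaa'

print(classify_sector(mbr_like))
print(classify_sector(vbr_like))
print(classify_sector(bytes(512)))
```

Keep in mind that this is a heuristic, just as it is in the script: a VBR whose bytes at those four offsets happen to be 0x80 or 0x00 would be mistaken for an MBR.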
The line
parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
    MbrRecord(sector, 2), MbrRecord(sector, 3)]
creates a list containing the four partition entries. Notice that I said line not lines. The "\" at the end of the first line is a line continuation character. This is used to make things more readable without violating Python's indentation rules.
At this point I must confess to a white lie I told earlier in this chapter. Python does not have arrays. Rather, Python has two things that look like arrays: lists and tuples. To create a list in Python simply enclose the list items in square brackets and separate them with commas. The list we have described here is mutable (its values can be changed). Enclosing items in parentheses creates a tuple which is used in the same way, but is immutable. Some readers may be familiar with arrays in other languages. Unlike arrays, items in a list or tuple can be of different types in Python.
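The difference between a list and a tuple is easiest to see by trying to change each one. The values in this short sketch are arbitrary.

```python
# A list is mutable and may hold items of different types
items = [5, 'seven', 9.0]
items[0] = 6                     # allowed: lists can be changed in place

# A tuple is used the same way but cannot be changed
fixed = (5, 'seven', 9.0)
try:
    fixed[0] = 6
except TypeError as e:
    print("Tuples are immutable: " + str(e))

print(items)
print(fixed[1])                  # reading from a tuple works like a list
```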
Once we have the list of partitions, we iterate over the list in the following for loop:
for p in parts:
    p.printPart()
    if not p.empty:
        notsupParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4, 0xee]
        if p.type in notsupParts:
            print("Sorry GPT and extended partitions " + \
                "are not supported by this script!")
        else:
            mountpath = '/media/part%s' % str(p.partno)
            # if the appropriate directory doesn't exist create it
            if not os.path.isdir(mountpath):
                subprocess.call(['mkdir', mountpath])
            mountopts = 'loop,ro,noatime,offset=%s' % str(p.start * 512)
            subprocess.call(['mount', '-o', mountopts, sys.argv[1], mountpath])
Let’s break down this for loop. The line for p in parts: starts a for loop block. This causes the Python interpreter to iterate over the parts list setting the variable p to point to the current item in parts with each iteration. We start by printing out the partition entry using p.printPart(). If the entry is not empty we proceed with our attempts to mount it.
We create another list, notsupParts, and fill it with partition types that are not supported by this script. Next, we check to see if the current partition’s type is in the list with if p.type in notsupParts: . If it is in the list, we print a sorry message. Otherwise (else:) we continue with our mounting process.
The line mountpath = '/media/part%s' % str(p.partno) uses a popular Python construct to build a string. The general format of this construct is "some string containing placeholders" % (values to substitute). For example, 'Hello %s, My name is %s' % ('Bob', 'Phil') would evaluate to the string 'Hello Bob, My name is Phil'. The line in our code causes mountpath to be assigned the value of '/media/part0', '/media/part1', '/media/part2', or '/media/part3'.
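Both uses of the % construct mentioned above can be tried directly in the interpreter. The variable names in this sketch are mine.

```python
# Several placeholders take a tuple of values
greeting = 'Hello %s, My name is %s' % ('Bob', 'Phil')
print(greeting)

# A single placeholder can take a single value, as in the script
mountpaths = ['/media/part%s' % str(partno) for partno in range(4)]
print(mountpaths)
```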
The line if not os.path.isdir(mountpath): checks for the existence of this mountpath directory. If it doesn’t exist it is created on the next line. The next line uses subprocess.call() to call an external program or command. This function expects a list containing the program to be run and any arguments.
On the next line the string substitution construct is used once again to create a string with options for the mount command complete with the appropriate offset. Note that str(p.start * 512) is used to first compute this offset and then convert it from a numeric value to a string as required by the % operator. Finally, we use subprocess.call() to run the mount command.
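When experimenting, it can be handy to build the argument list for subprocess.call() without actually running mount. The sketch below does just that; build_mount_command, its sector_size parameter, and the file name image.img are hypothetical and only serve to show how the offset ends up in the options string.

```python
def build_mount_command(image, partno, start_sector, sector_size=512):
    """Return the argument list the script would hand to subprocess.call()."""
    mountpath = '/media/part%s' % str(partno)
    # the offset passed to mount is in bytes, so multiply by the sector size
    mountopts = 'loop,ro,noatime,offset=%s' % str(start_sector * sector_size)
    return ['mount', '-o', mountopts, image, mountpath]

# A partition starting at sector 2048 begins 2048 * 512 bytes into the image
print(build_mount_command('image.img', 0, 2048))
```

Printing the list first is a cheap way to confirm the offset before mounting anything as root.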
Only one thing remains in the script that requires explanation, and that is the last two lines. The test if __name__ == "__main__": is a common trick used in Python scripting. If the script is executed directly, the variable __name__ is set to "__main__". If, however, the script is merely imported, this variable is set to the name of the module instead. This allows the creation of Python scripts that can both be run and imported into other scripts (thereby allowing code to be reused).
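A tiny sketch makes the behavior of __name__ visible. The main function here is a stand-in that simply returns a string; the struct library is imported only to show what __name__ looks like inside an imported module.

```python
import struct

def main():
    return "running as a script"

# Inside an imported module, __name__ holds the module's own name
print("struct sees itself as: " + struct.__name__)

# In the file being executed directly, __name__ holds "__main__"
if __name__ == "__main__":
    print(main())
```

If you save this in a file and import it from another script, the first message is printed but main() is not called.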
If you are new to Python you might want to take a break at this point after walking through our first script. You might want to reread this section if you are still a bit uncertain about how this script works. Rest assured that things will be a bit easier as we press on and develop new scripts.
The results of running our script against an image file from a Windows system are shown in Figure 5.12. Figure 5.13 depicts what happens when running the script against an image from an Ubuntu 14.04 system.
FIGURE 5.12
Running the Python mounting script against an image file from a Windows system.
FIGURE 5.13
Running the Python mounting script against an image file from an Ubuntu 14.04 system.