EXTENDED FILESYSTEM BASICS

Linux offers many choices, including your choice of filesystem. That said, some version of the Linux extended filesystem is found on the vast majority of Linux systems. These are commonly referred to as extN filesystems, where N is the version in use (normally 2, 3, or 4). The ext2 filesystem is popular for partitions that don’t change often, such as boot partitions. Most Linux distributions use ext4 by default on general-use filesystems such as /, /home, /usr, /opt, etc.

A natural question to ask is what is the extended filesystem extended from? The answer is the Unix File System (UFS). While the extN family is an extension of UFS, it is also a simplification. Some of the features in UFS were no longer relevant to modern media, so they were removed to simplify the code and improve performance. The extN family is meant to be robust with good performance.

There is a reason that ext2 is normally reserved for static filesystems. Both ext3 and ext4 are journaling filesystems, but ext2 is a non-journaling filesystem. What’s a journal? In this context it isn’t a chronicle of someone’s life. Rather, journaling is used to both improve performance and reduce the chances of data corruption.

Here is how a journaling filesystem works. Writes to the media are not done immediately; rather, the requested changes are first written to a journal. You can think of these updates like transactions in a database. When a command returns, one of two things has happened: either the entire transaction was completed (all of the data was written or updated), in which case the command returns success, or the command could not be completed and the filesystem was returned to its previous state. In the event that the computer was not shut down cleanly, the journal can be used to return things to a consistent state. Having a journaling filesystem significantly speeds up the filesystem check (fsck) process.
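The transaction idea can be illustrated with a toy model. This is only a sketch of the principle, not how the ext3 or ext4 journal is implemented; the class and method names are made up for illustration, and a real journal is considerably more involved (for example, ext3 and ext4 journal only metadata in their default mode).

```python
class ToyJournaledStore:
    """Toy model of journaling; hypothetical, not the real ext3/ext4 journal."""

    def __init__(self):
        self.blocks = {}    # stands in for the on-disk state
        self.journal = []   # transactions recorded but not yet applied

    def record(self, updates, commit=True):
        """Log a transaction; commit=False simulates a crash mid-transaction."""
        self.journal.append({"updates": dict(updates), "committed": commit})

    def replay(self):
        """Apply committed transactions in order and discard incomplete ones."""
        for txn in self.journal:
            if txn["committed"]:
                self.blocks.update(txn["updates"])
        self.journal.clear()


store = ToyJournaledStore()
store.record({10: b"inode table update", 11: b"bitmap update"})
store.record({12: b"half-finished write"}, commit=False)  # crash before commit
store.replay()
print(sorted(store.blocks))  # block 12 is absent; the first transaction is whole
```

After the replay, the store holds either all of a transaction or none of it, which is the consistency property described above.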

Extended filesystems store information in blocks which are organized into block groups. The blocks are normally 1024, 2048, or 4096 bytes in size. Most media you are likely to encounter use 512 byte sectors. As a result, blocks are 2, 4, or 8 sectors long. For readers familiar with the FAT and NTFS filesystems, a block in Unix or Linux is roughly equivalent to a cluster in DOS or Windows. The block is the smallest allocation unit for disk space.
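The relationship between sectors and blocks is simple arithmetic. The sketch below assumes 512-byte sectors, as described above; the function names are illustrative only. It also shows what "smallest allocation unit" means in practice: even a one-byte file occupies a whole block.

```python
SECTOR_SIZE = 512  # bytes; assumed, the common sector size mentioned above


def sectors_per_block(block_size, sector_size=SECTOR_SIZE):
    """Return how many sectors make up one filesystem block."""
    if block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    return block_size // sector_size


def blocks_needed(file_size, block_size):
    """Return the number of whole blocks needed to hold file_size bytes."""
    return -(-file_size // block_size)  # ceiling division


for size in (1024, 2048, 4096):
    print(size, "byte blocks span", sectors_per_block(size), "sectors")

print(blocks_needed(1, 4096), "block is allocated for a 1-byte file")
```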

A generic picture of the block groups is shown in Figure 7.1. Keep in mind that not every element shown will be present in each block group. We will see later in this chapter that the ext4 filesystem is highly customizable. Some elements may be moved or eliminated from certain groups to improve performance.

We will describe each of the elements in Figure 7.1 in detail later in this chapter. For now, I will provide some basic definitions of these items. The boot block is just what it sounds like: boot code for the operating system. This might be unused on a modern system, but it is still required to be there for backward compatibility. A superblock describes the filesystem and tells the operating system where to find various elements (inodes, etc.). Group descriptors describe the layout of each block group. Inodes (short for index nodes) contain all the metadata for a file except for its name. Data blocks are used to store files and directories. The bitmaps indicate which inodes and data blocks are in use.
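As a preview of how the superblock describes the filesystem, here is a minimal sketch that pulls a handful of fields out of a raw superblock. It relies on the standard on-disk layout (the superblock starts 1024 bytes into the filesystem, values are little-endian, and the magic number is 0xEF53). The function names, the dictionary keys, and the default 512-byte sector size are assumptions made for this illustration; this is not the script developed later in the chapter.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # bytes from the start of the filesystem
SUPERBLOCK_SIZE = 1024
EXT_MAGIC = 0xEF53


def parse_superblock(raw):
    """Extract a few basic fields from raw superblock bytes."""
    if len(raw) < SUPERBLOCK_SIZE:
        raise ValueError("superblock is truncated")
    inodes, blocks_lo = struct.unpack_from("<II", raw, 0)
    (log_block_size,) = struct.unpack_from("<I", raw, 24)
    (blocks_per_group,) = struct.unpack_from("<I", raw, 32)
    (inodes_per_group,) = struct.unpack_from("<I", raw, 40)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT_MAGIC:
        raise ValueError("not an extended filesystem (bad magic number)")
    return {
        "inodes": inodes,
        "blocks": blocks_lo,  # low 32 bits only; large ext4 volumes store more
        "block_size": 1024 << log_block_size,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
    }


def read_superblock(image_path, start_sector, sector_size=512):
    """Read the superblock of a filesystem that starts at start_sector."""
    with open(image_path, "rb") as image:
        image.seek(start_sector * sector_size + SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(SUPERBLOCK_SIZE))
```

Calling read_superblock with the path to an image file and the starting sector of the filesystem returns a small dictionary of values, or raises ValueError if the magic number does not match.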

FIGURE 7.1

Generic block group structure. Note that some components may be omitted from a block group depending on the filesystem version and features.

The extended filesystem allows for optional features. The features fall into three categories: compatible, incompatible, and read-only compatible. If an operating system does not support a compatible feature, the filesystem can still be safely mounted. Conversely, if an operating system lacks support for an incompatible feature, the filesystem should not be mounted. When an operating system doesn’t provide a feature on the read-only compatible list, it is still safe to mount the filesystem, but only if it is attached as read-only. Something to keep in mind if you ever find yourself examining a suspected attacker’s computer is that he or she might be using non-standard extended features.
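The rules for the three feature categories can be summarized as a small decision function. This is a sketch of the logic only, using sets of feature names; the function name and the example feature sets are illustrative, and a real implementation works with bit flags stored in the superblock.

```python
def mount_decision(supported, compat, incompat, ro_compat):
    """Decide how a filesystem may be mounted given the features it uses.

    supported holds the features the operating system understands; the other
    three arguments are the feature sets recorded for the filesystem.
    """
    if set(incompat) - set(supported):
        return "do not mount"        # unknown incompatible feature
    if set(ro_compat) - set(supported):
        return "read-only"           # unknown read-only compatible feature
    # Unknown compatible features (the compat set) never block mounting.
    return "read-write"


# Example feature names, used here purely for illustration.
old_os = {"filetype", "sparse_super"}
print(mount_decision(old_os, {"has_journal"}, {"filetype"}, {"sparse_super"}))
print(mount_decision(old_os, {"has_journal"}, {"filetype"}, {"huge_file"}))
print(mount_decision(old_os, {"has_journal"}, {"extent"}, {"sparse_super"}))
```

The order of the checks matters: an unsupported incompatible feature rules out mounting altogether, so it is tested before the read-only case.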

The Sleuth Kit (TSK) by Brian Carrier is a set of tools for filesystem analysis. One of these tools, fsstat, allows you to collect filesystem (fs) statistics (stat). By way of warning, this tool appears to be somewhat out of date and may not correctly display all of the features of a filesystem created with the latest version of ext4. Don’t worry, we will develop some up-to-date scripts later in this chapter that will properly handle the latest versions of ext4 as of this writing (plus you will have Python code that you could update yourself if required).

In order to use fsstat you must first know the offset to the filesystem inside your image file. Recall that we learned in Chapter 5 that the fdisk tool could be used to determine this offset. The syntax for this command is simply fdisk <image file>. As we can see in Figure 7.2, the filesystem in our PFE subject image begins at sector 2048.

FIGURE 7.2

Using fdisk to determine the offset to the start of a filesystem.

Once the offset is determined, the command to display filesystem statistics is just fsstat -o <offset> <image file>, for example, fsstat -o 2048 pfe1.img. Partial results from running this command against our PFE subject image are shown in Figure 7.3 and Figure 7.4. The results in Figure 7.3 reveal that we have a properly unmounted ext4 filesystem that was last mounted at / with 1,048,577 inodes and 4,194,048 4 KB blocks. Compatible, incompatible, and read-only compatible features are also shown in this screenshot. From Figure 7.4 we can see there are 128 block groups with 8,192 inodes and 32,768 blocks per group. We also see statistics for the first two block groups.

FIGURE 7.3

Results of running fsstat – part 1.

FIGURE 7.4

Results of running fsstat – part 2.
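The figures reported by fdisk and fsstat can be cross-checked with a little arithmetic. The sketch below uses the values quoted above for the PFE subject image; the function names and the 512-byte sector size are assumptions for illustration. The first_data_block parameter is 0 for filesystems with 4 KB blocks, which is the case here.

```python
def byte_offset(start_sector, sector_size=512):
    """Convert a starting sector, as reported by fdisk, to a byte offset."""
    return start_sector * sector_size


def block_group_count(total_blocks, blocks_per_group, first_data_block=0):
    """Return the number of block groups; the last group may be partial."""
    usable = total_blocks - first_data_block
    return -(-usable // blocks_per_group)  # ceiling division


print(byte_offset(2048))                   # filesystem start in bytes
print(block_group_count(4194048, 32768))   # compare with Figure 7.4
```

Because 4,194,048 is not an exact multiple of 32,768, the final block group is slightly shorter than the others, yet the count still comes to the 128 groups reported by fsstat.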
