Directory entries

We have learned that inodes contain all the metadata for a file. They also contain the location of the file’s data blocks. The only thing that remains to be known about a file is its name. This connection between an inode and a filename is made in directories. Not surprisingly, directories are stored in files in Linux, the operating system where everything is a file. In our discussion of inodes earlier in this chapter we did say that inode 2 was used to store the root directory.

The classic directory entry consists of a 4-byte inode number, followed by a 2-byte record length, then a 2-byte name length, and finally the name which may be up to 255 characters long. This is shown in Table 7.15. Notice that the name length is two bytes, yet the maximum name length can be stored in only one byte. This may have been done for byte alignment purposes originally.

Table 7.15. The classic directory entry structure.

Offset	Size	Name	Description
0x0	4	Inode	Inode
0x4	2	Rec len	Record length
0x6	2	Name len	Name length
0x8		Name	Name string (up to 255 characters)
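The layout in Table 7.15 maps directly onto a fixed 8-byte header followed by the name. The following is a minimal standalone sketch (Python 3, independent of extfs.py) that decodes one classic entry with the standard struct module. The function name parse_classic_entry and the sample bytes are made up for illustration, and little-endian byte order is assumed, as is usual for extended filesystems.

```python
import struct

def parse_classic_entry(data, offset=0):
    """Decode one classic (pre-File Type) directory entry starting at offset."""
    # "<IHH": little-endian 4-byte inode, 2-byte record length, 2-byte name length
    inode, rec_len, name_len = struct.unpack_from("<IHH", data, offset)
    name = data[offset + 8 : offset + 8 + name_len]
    return inode, rec_len, name_len, name

# Hypothetical entry: inode 11, 20-byte record, 10-character name, 2 bytes of padding
sample = struct.pack("<IHH", 11, 20, 10) + b"lost+found" + b"\x00\x00"
print(parse_classic_entry(sample))
```

Note that the record length (20) is larger than the header plus the name (18). The next entry starts at the record length, not at the end of the name.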

Realizing that the upper byte was unused, an (incompatible) filesystem feature, File Type, was created to re-purpose this byte to hold file type information. It should be fairly obvious why this is on the incompatible feature list, as interpreting this as part of the name length would make it seem like all the filenames had become garbled. This optimization speeds up any operations that only make sense for a certain type of file by eliminating the need to read lots of inodes merely to determine file type. The directory entry structure for systems with the File Type feature is shown in Table 7.16.

Table 7.16. Directory entry structure when File Type feature is enabled.

Offset	Size	Name	Description
0x0	4	Inode	Inode
0x4	2	Rec len	Record length
0x6	1	Name len	Name length
0x7	1	File type	0x00 Unknown 0x01 Regular 0x02 Directory 0x03 Char device 0x04 Block device 0x05 FIFO 0x06 Socket 0x07 Symbolic link
0x8		Name	Name string (up to 255 characters)
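To see why File Type has to be an incompatible feature, it helps to read the same bytes both ways. The sketch below is a standalone Python 3 illustration with a made-up entry. It decodes the header once with the layout from Table 7.16 and once with the classic 2-byte name length.

```python
import struct

FILE_TYPES = {0x1: "Regular", 0x2: "Directory", 0x3: "Char device",
              0x4: "Block device", 0x5: "FIFO", 0x6: "Socket",
              0x7: "Symbolic link"}

# Hypothetical entry for a directory named "home" (inode 12, 12-byte record)
entry = struct.pack("<IHBB", 12, 12, 4, 0x02) + b"home"

# Reading with the File Type layout: 1-byte name length, 1-byte file type
inode, rec_len, name_len, ftype = struct.unpack_from("<IHBB", entry)

# Reading with the classic layout: the file type byte becomes the high byte
# of a 2-byte name length
_, _, legacy_name_len = struct.unpack_from("<IHH", entry)

print(name_len, FILE_TYPES.get(ftype, "Unknown"))
print(legacy_name_len)
```

A system unaware of the feature would see a name length of 0x0204 (516) instead of 4, which is longer than the entire record.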

The original directories had no checksums or other tools for integrity checking. In order to add this functionality without breaking existing systems, a special type of directory entry known as a directory tail was developed. The directory tail has an inode value of zero which is invalid. Older systems see this and assume that the end of the directory (tail) has been reached. The record length is set correctly to 12. The directory tail structure is shown in Table 7.17.

Table 7.17. Directory tail structure.

Offset	Size	Name	Description
0x0	4	Inode	Set to zero (inode zero is invalid so it is ignored)
0x4	2	Rec len	Record length (set to 12)
0x6	1	Name len	Name length (set to zero so it is ignored)
0x7	1	File type	Set to 0xDE
0x8	4	Checksum	Directory leaf block CRC32 checksum
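Recognizing a directory tail comes down to checking the fixed values in Table 7.17. The following standalone Python 3 sketch assumes the tail occupies the final 12 bytes of a directory block. The function name parse_dir_tail and the sample block are hypothetical, and no attempt is made to verify the checksum itself.

```python
import struct

def parse_dir_tail(block):
    """Return the stored checksum if the last 12 bytes look like a directory
    tail, otherwise None."""
    if len(block) < 12:
        return None
    # "<IHBBI": inode, record length, name length, file type, checksum
    inode, rec_len, name_len, ftype, checksum = struct.unpack_from(
        "<IHBBI", block, len(block) - 12)
    if inode == 0 and rec_len == 12 and name_len == 0 and ftype == 0xDE:
        return checksum
    return None

# Hypothetical 1024-byte block that ends in a tail with a made-up checksum
block = b"\x00" * (1024 - 12) + struct.pack("<IHBBI", 0, 12, 0, 0xDE, 0x12345678)
print(hex(parse_dir_tail(block)))
```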

The linear directories presented thus far in this section are fine as long as the directories do not grow too large. When directories become large, implementing hash directories can improve performance. Just as is done with the checksum, hash entries are stored after the end of the directory block in order to fool old systems into ignoring them. Recall that there is an ext4_index flag in the inode that will alert compatible systems to the presence of the hash entries.

The directory nodes are stored in a hashed balanced tree which is often shortened to htree. We have seen trees in our discussion of extents earlier in this chapter. Those familiar with NTFS know that directories on NTFS filesystems are stored in trees. In the NTFS case, nodes are named based on their filename. With htrees on extended filesystems nodes are named by their hash values. Because the hash value is only four bytes long, collisions will occur. For this reason, once a record has been located based on the hash value a string comparison of filenames is performed, and if the strings do not match, the next record (which should have the same hash) is checked until a match is found.
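The collision handling described above can be sketched in a few lines. This is a toy Python 3 illustration only: toy_hash is a deliberately tiny stand-in chosen so that collisions are easy to produce, not one of the hash functions used by extended filesystems, and the records list is a made-up in-memory structure rather than an on-disk format.

```python
import bisect

def toy_hash(name):
    # Stand-in hash with only four possible values, so collisions are common
    return sum(name) & 0x3

def lookup(records, name):
    """records is a list of (hash, name, inode) tuples sorted by hash."""
    target = toy_hash(name)
    hashes = [h for h, _, _ in records]
    i = bisect.bisect_left(hashes, target)      # first record with this hash
    # Walk every record that shares the hash and compare the actual names
    while i < len(records) and records[i][0] == target:
        if records[i][1] == name:
            return records[i][2]
        i += 1
    return None

# b"a" and b"e" collide under toy_hash (both hash to 1)
records = sorted((toy_hash(n), n, ino)
                 for n, ino in [(b"a", 12), (b"e", 13), (b"b", 14)])
print(lookup(records, b"e"))
```

The hash narrows the search to a handful of candidates, and the string comparison makes the final decision.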

The root hash directory block starts with the traditional “.” and “..” directory entries for this directory and the parent directory, respectively. After these two entries (both of which are twelve bytes long), there is a 16-byte header, followed by 8-byte entries through the end of the block. The root hash directory block structure is shown in Table 7.18.

Table 7.18. Root hash directory block structure.

Offset	Size	Name	Description
0x0	12	Dot rec	“.” directory entry (12 bytes)
0xC	12	DotDot rec	“..” directory entry (12 bytes)
0x18	4	Inode no	Inode number set to 0 to make following be ignored
0x1C	1	Hash	0x00 Legacy 0x03 Legacy unsigned 0x01 Half MD4 0x04 Unsigned half MD4 0x02 Tea 0x05 Unsigned Tea
0x1D	1	Info length	Hash info length (0x8)
0x1E	1	Indir levels	Depth of tree
0x1F	1	Unused	Flags (unused)
0x20	2	Limit	Max number of entries that follow this header
0x22	2	Count	Actual number of entries after header
0x24	4	Block	Block w/i directory for hash=0
0x28		Entries	Remainder of block is 8-byte entries
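The header fields in Table 7.18 sit at fixed offsets, so they can be pulled out with a single unpack call. The following is a standalone Python 3 sketch, not part of extfs.py. The names DxRoot and parse_dx_root are made up, and the sample block contains hypothetical values.

```python
import struct
from collections import namedtuple

DxRoot = namedtuple(
    "DxRoot", "hash_version info_length indirect_levels flags limit count block")

HASH_VERSIONS = {0: "Legacy", 1: "Half MD4", 2: "Tea",
                 3: "Legacy unsigned", 4: "Unsigned half MD4", 5: "Unsigned Tea"}

def parse_dx_root(block):
    """Decode the hash header that follows "." and ".." in a root hash block."""
    # Skip the two 12-byte entries (0x18 bytes) and the 4-byte zero inode at
    # 0x18, then read four 1-byte fields, two 2-byte fields, one 4-byte field
    return DxRoot(*struct.unpack_from("<BBBBHHI", block, 0x1C))

# Hypothetical 4096-byte root block: half MD4, one level, 2 of 508 entries used
block = bytearray(4096)
struct.pack_into("<BBBBHHI", block, 0x1C, 1, 8, 0, 0, 508, 2, 1)
root = parse_dx_root(block)
print(HASH_VERSIONS[root.hash_version], root.limit, root.count, root.block)
```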

If there are any interior nodes, they have the structure shown in Table 7.19. Note that three of the fields are in italics. The reason for this is that I have found some code that refers to these fields and other places that seem to imply that these fields are not present.

Table 7.19. Interior node hash directory block structure. Entries in italics may not be present in all systems.

Offset	Size	Name	Description
0x0	4	Fake inode	Set to zero so this is ignored
0x4	2	Fake rec len	Set to block size (4k)
0x6	1	Name length	Set to zero
0x7	1	File type	Set to zero
0x8	2	Limit	Max entries that follow
0xA	2	Count	Actual entries that follow
0xC	4	Block	Block w/i directory for lowest hash value of block
0x10		Entries	Directory entries

The hash directory entries (leaf nodes) consist of two 4-byte values for the hash and block within the directory of the next node. The hash directory entries are terminated with a special entry with a hash of zero and the checksum in the second 4-byte value. These entries are shown in Table 7.20 and Table 7.21.

Table 7.20. Hash directory entry structure.

Offset	Size	Name	Description
0x0	4	Hash	Hash value
0x4	4	Block	Block w/i directory of next node

Table 7.21. Hash directory entry tail with checksum.

Offset	Size	Name	Description
0x0	4	Reserved	Set to zero
0x4	4	Checksum	Block checksum
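The 8-byte entries from Table 7.20 can be decoded in a loop and then searched. The sketch below is standalone Python 3 with hypothetical function names and sample values. The selection rule in find_block (take the entry with the largest hash that does not exceed the target, falling back to the header's block for smaller hashes) is my understanding of how the tree is searched and should be treated as an assumption.

```python
import bisect
import struct

def parse_hash_entries(data, count):
    """Decode count 8-byte (hash, block) pairs from data."""
    return [struct.unpack_from("<II", data, i * 8) for i in range(count)]

def find_block(entries, first_block, target_hash):
    """Pick the block whose hash range covers target_hash.

    entries must be sorted by hash. first_block is the block from the header,
    which covers hashes smaller than the first entry's hash.
    """
    hashes = [h for h, _ in entries]
    pos = bisect.bisect_right(hashes, target_hash)
    return first_block if pos == 0 else entries[pos - 1][1]

# Two hypothetical entries: hashes from 0x40000000 live in block 2,
# hashes from 0x80000000 live in block 3
raw = struct.pack("<IIII", 0x40000000, 2, 0x80000000, 3)
entries = parse_hash_entries(raw, 2)
print(find_block(entries, 1, 0x10),
      find_block(entries, 1, 0x40000000),
      find_block(entries, 1, 0x90000000))
```

Once the block is found, the linear entries inside it still have to be compared by name, because different filenames can share a hash.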

We can now add some code to our extfs.py file in order to interpret directories. To keep things simple, we won’t utilize the hash directories if they exist. For our purposes there is likely to be little if any speed penalty for doing so. The additions to our extfs.py file follow.

# translates the file type byte from a directory entry into a string
def printFileType(ftype):
    if ftype == 0x0 or ftype > 7:
        return "Unknown"
    elif ftype == 0x1:
        return "Regular"
    elif ftype == 0x2:
        return "Directory"
    elif ftype == 0x3:
        return "Character device"
    elif ftype == 0x4:
        return "Block device"
    elif ftype == 0x5:
        return "FIFO"
    elif ftype == 0x6:
        return "Socket"
    elif ftype == 0x7:
        return "Symbolic link"

class DirectoryEntry():
    def __init__(self, data):
        self.inode = getU32(data)
        self.recordLen = getU16(data, 0x4)
        self.nameLen = getU8(data, 0x6)
        self.fileType = getU8(data, 0x7)
        self.filename = data[0x8 : 0x8 + self.nameLen]

    def prettyPrint(self):
        print("Inode: %s File type: %s Filename: %s" % (str(self.inode), \
            printFileType(self.fileType), self.filename))

# parses directory entries in a data block that is passed in
def getDirectory(data):
    done = False
    retVal = []
    i = 0
    while not done:
        de = DirectoryEntry(data[i: ])
        # an inode of zero marks the end of the usable entries
        if de.inode == 0:
            done = True
        else:
            retVal.append(de)
            # the record length gives the offset of the next entry
            i += de.recordLen
        if i >= len(data):
            break
    return retVal

There are no new techniques in the code above. We can also create a new script, ils.py, which will create a directory listing based on an inode rather than a directory name. The code for this new script follows. You might notice that this script is very similar to icat.py with the primary difference being that the data is interpreted as a directory instead of being written to standard out.

#!/usr/bin/python
#
# ils.py
#
# This is a simple Python script that will
# print out a directory listing for an inode
# from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
    print("usage " + sys.argv[0] + " <image file> <offset> <inode number> \n"\
        "Displays directory for an inode from an image file")
    exit(1)

def main():
    # the image file, offset, and inode number are all required
    if len(sys.argv) < 4:
        usage()

    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable \
        * emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize
    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))
    inode = extfs.Inode(data, emd.superblock.inodeSize)
    datablock = extfs.getBlockList(inode, sys.argv[1], sys.argv[2], \
        emd.superblock.blockSize)
    # concatenate all of the data blocks, then interpret them as a directory
    data = ""
    for db in datablock:
        data += extfs.getDataBlock(sys.argv[1], long(sys.argv[2]), db, \
            emd.superblock.blockSize)
    dir = extfs.getDirectory(data)
    for fname in dir:
        fname.prettyPrint()

if __name__ == "__main__":
    main()
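The offset arithmetic in main() can be checked by hand. The following standalone Python 3 sketch assumes that extfs.getInodeLoc returns a (block group, index within group) pair derived from the inode number minus one, which is the usual convention because inode numbers start at 1. The function names and all of the sample values are hypothetical.

```python
def inode_location(inode_no, inodes_per_group):
    """Return (block group, index within that group's inode table)."""
    index = inode_no - 1            # inode numbers start at 1, not 0
    return index // inodes_per_group, index % inodes_per_group

def inode_byte_offset(inode_no, inodes_per_group, inode_tables,
                      block_size, inode_size, start_sector=0):
    """Byte offset of an inode within the image file.

    inode_tables holds the starting block of the inode table for each group.
    """
    group, index = inode_location(inode_no, inodes_per_group)
    return (start_sector * 512                  # start of the partition
            + inode_tables[group] * block_size  # start of the inode table
            + index * inode_size)               # inode within the table

# Hypothetical filesystem: partition at sector 2048, 4096-byte blocks,
# 256-byte inodes, group 0 inode table at block 1000
print(inode_location(2, 8192))
print(inode_byte_offset(2, 8192, [1000], 4096, 256, start_sector=2048))
```

With these made-up values the root directory inode (inode 2) is the second slot in group 0, at 2048 * 512 + 1000 * 4096 + 1 * 256 = 5144832 bytes into the image.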

The results from running the new script against the root directory (inode 2) and the /tmp directory from the PFE subject system are shown in Figure 7.24 and Figure 7.25, respectively. Notice that the “lost+found” directory is in inode 11 which is the expected place. In Figure 7.25 two files associated with a rootkit are highlighted.

FIGURE 7.24

Running ils.py against the root directory of the PFE subject system.

FIGURE 7.25

Running ils.py against the /tmp directory of the PFE subject system.
