Maintaining Integrity

It is of the utmost importance that evidence not be altered while it is being collected and examined. We are fortunate in digital forensics that we can normally make an unlimited number of identical copies of evidence. Those working with physical forensics are not so lucky. In fact, in many cases difficult choices must be made when quantities of physical evidence are limited as many tests consume evidence.

The primary method of ensuring the integrity of digital evidence is hashing. Hashing is widely used in computer science as a way of improving performance. A hash function, generally speaking, takes an input of variable size and outputs a number of fixed, known size. Hashing allows for faster searches because a computer can compare two numbers in about one clock cycle, whereas comparing two long strings means iterating over every character, which could require hundreds or thousands of clock cycles.

Using hash functions in your programs can add a little complication because more than one input value can produce the same hash output. When this happens we say that a collision has occurred. Collisions are merely a complication in ordinary programs, but when we are using hashes for cryptographic purposes or integrity checking, the possibility of many collisions is unacceptable. To make collisions as unlikely as possible we must use cryptographic hash functions.
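To make the idea of a collision concrete, here is a minimal sketch. The toy_hash function is hypothetical and deliberately weak; it is not taken from any library and exists only for illustration. It is compared against SHA256 from Python's standard hashlib module.

```python
import hashlib


def toy_hash(text):
    """Hypothetical weak hash: sum of the character codes, modulo 256."""
    return sum(ord(ch) for ch in text) % 256


def sha256_hex(text):
    """SHA256 digest of the UTF-8 encoded text, as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# "listen" and "silent" contain the same letters, so the sums are equal
# and the toy hash collides.
print(toy_hash("listen") == toy_hash("silent"))

# The cryptographic hash gives the two inputs different digests.
print(sha256_hex("listen") == sha256_hex("silent"))
```

The toy hash collides on any two inputs whose character codes add up to the same value, which is exactly the behavior we cannot tolerate when checking the integrity of evidence.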

There are several cryptographic hash functions available. Some people still use Message Digest 5 (MD5) to verify the integrity of images. The MD5 algorithm is no longer considered to be secure, and the Secure Hash Algorithm (SHA) family of functions is preferred. The original version is referred to as SHA1 (or just SHA). SHA2 is currently the most commonly used variant, and you may encounter references to SHA224 (224 bits), SHA256 (256 bits), SHA384 (384 bits), and SHA512 (512 bits). There is a SHA3 algorithm, but its use is not yet widespread. I normally use SHA256, which is a good middle ground offering good performance with a low chance of collisions.
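The following is a small sketch, assuming Python's standard hashlib module, that shows the output size staying the same regardless of input size for several members of the SHA family. The helper name digest_bits is my own and is used only for illustration.

```python
import hashlib


def digest_bits(algorithm, data):
    """Return the digest size, in bits, that the named algorithm produces."""
    return hashlib.new(algorithm, data).digest_size * 8


short_input = b"a"
long_input = b"a" * 1_000_000  # one million bytes

# Each algorithm reports the same digest size for both inputs.
for algorithm in ("sha1", "sha224", "sha256", "sha384", "sha512"):
    print(algorithm,
          digest_bits(algorithm, short_input),
          digest_bits(algorithm, long_input))
```

Whichever algorithm you choose, record its name alongside the hash value so that anyone repeating the calculation later knows which function to use.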

We will discuss the details of using hashing in future chapters. For now, the high-level process is as follows. First, calculate a hash of the original. Second, create an image which we will treat as a master copy. Third, calculate the hash of the copy and verify that it matches the hash of the original. Fourth, make working copies of your master copy. The master copy and original should never be used again. While it may seem strange, the hash of each working copy should be periodically recalculated as a double check that the investigator did not alter the image.
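The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and shutil.copyfile stands in for a proper imaging tool, which is what you would use on real evidence.

```python
import hashlib
import shutil


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_verified_copy(source, destination):
    """Copy source to destination and confirm that the two hashes match.

    shutil.copyfile is a stand-in for a real imaging tool (an assumption
    made for this sketch).
    """
    expected = sha256_of_file(source)
    shutil.copyfile(source, destination)
    actual = sha256_of_file(destination)
    if actual != expected:
        raise ValueError(f"hash mismatch for {destination}")
    return expected


def is_unaltered(path, recorded_hash):
    """Periodic double check: does the file still match its recorded hash?"""
    return sha256_of_file(path) == recorded_hash
```

A typical use would be to call make_verified_copy once to create the master copy, call it again on the master to create each working copy, and then call is_unaltered on the working copies from time to time with the hash that was recorded at the start.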
