09: istat FAT16

Estimated time to complete: Three to four hours

In this assignment we’ll continue the previous assignment’s parsing of the FAT16 filesystem. (I hope you wrote clear code!) In particular, we’re going to recreate (some of) the functionality of The Sleuth Kit’s istat.

istat examines a particular inode (in FAT16, a metadata address corresponding to a directory entry) and returns information about the file or directory represented by that directory entry:

> istat adams.dd 7
Directory Entry: 7
Allocated
File Attributes: File, Archive
Size: 2585088
Name: DESIGNS.DOC

Directory Entry Times:
Written:    2008-08-28 16:10:28 (EDT)
Accessed:   2008-08-28 00:00:00 (EDT)
Created:    2008-08-28 16:10:28 (EDT)

Sectors:
3743 3744 3745 3746 3747 3748 3749 3750 
...list of sectors elided...
8791 8792

You’re going to recover most of this same information directly in a Python program.
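
As a starting point, the "Allocated" and "File Attributes" lines can be derived from two bytes of the 32-byte directory entry. The following is a minimal sketch, assuming the standard FAT layout (first byte of the name at offset 0, attribute byte at offset 11). The function name and the flag labels are my own illustrations, not guaranteed to match istat's wording; note that the "File" label in the sample output has no bit of its own, so mapping flags to the exact strings is left for you to check against the tests.

```python
# Bit values of the attribute byte (offset 11) in a FAT directory entry.
ATTRIBUTE_FLAGS = [
    (0x01, 'Read Only'),
    (0x02, 'Hidden'),
    (0x04, 'System'),
    (0x08, 'Volume Label'),
    (0x10, 'Directory'),
    (0x20, 'Archive'),
]
LONG_NAME_MASK = 0x0F  # all four low bits set marks a long-file-name entry


def describe_entry(entry):
    """Return (allocated, flag_names) for a 32-byte directory entry.

    Hypothetical helper: the labels are illustrative only.
    """
    # A first byte of 0x00 or 0xE5 marks an unused or deleted entry.
    allocated = entry[0] not in (0x00, 0xE5)
    attributes = entry[11]
    if attributes & LONG_NAME_MASK == LONG_NAME_MASK:
        return allocated, ['Long File Name']
    names = [name for bit, name in ATTRIBUTE_FLAGS if attributes & bit]
    return allocated, names
```

For an entry like the DESIGNS.DOC one above, whose attribute byte has only the archive bit set, this would report the entry as allocated with the single flag 'Archive'.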

What to do

Implement an istat_fat16() function.

istat_fat16 takes two mandatory arguments. The first is an open file-like object containing a valid FAT16 filesystem. The second denotes the inode (metadata address) to examine. The next argument is optional, and denotes the sector size of the filesystem (default: 512). The last argument is optional, and denotes the offset (in sectors, default: 0) from the start of the file-like object to the start of the FAT16 filesystem.
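
To make the two optional arguments concrete, here is a small sketch of how a sector inside the filesystem might be read from the file-like object. The helper name read_sector is hypothetical; the only point is that both the filesystem-relative sector number and the offset are in units of sectors, so they are added before multiplying by the sector size.

```python
import io


def read_sector(f, sector, sector_size=512, offset=0):
    """Read one sector, numbered from the start of the filesystem.

    `offset` is the filesystem's starting sector within the file-like object.
    """
    f.seek((offset + sector) * sector_size)
    return f.read(sector_size)


# Usage: a fake image whose filesystem starts two sectors in.
image = io.BytesIO(b'\x00' * 1024 + b'\xAA' * 512 + b'\xBB' * 512)
second_sector = read_sector(image, 1, sector_size=512, offset=2)
```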

For ease of debugging and autograding, istat_fat16 will return a list of strings. The list should be composed exactly of the lines of output that istat would produce when run on the same filesystem and inode (ignoring leading and trailing whitespace) – this is the same format as for the previous assignment.

As before, most of the values in the output of istat should be self-explanatory given lecture material and Carrier’s text (though please feel free to ask if you don’t know what a value is or where it comes from!). Lecture 02 gives a brief overview of the output of fls. And the end of Lecture 11 / beginning of Lecture 12 describes how metadata addresses are assigned to directory entries on a FAT16 filesystem by TSK. (We didn’t actually talk about it in class until the start of lecture 12.)
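
As a rough illustration of that numbering, the sketch below converts a metadata address to a byte offset. It assumes (please check against the lecture notes) that address 3 is the first 32-byte slot of the root directory and that every following slot, continuing into the cluster area, gets the next address. The function name and the root_dir_sector parameter are hypothetical; in your own code the root directory's starting sector should come from the boot sector fields you parsed in the previous assignment.

```python
DIRECTORY_ENTRY_SIZE = 32
FIRST_ENTRY_ADDRESS = 3  # assumed: the first root-directory slot


def entry_byte_offset(address, root_dir_sector, sector_size=512, offset=0):
    """Byte offset, within the file-like object, of the entry at `address`.

    Sketch only: assumes slots are numbered consecutively from the start
    of the root directory.
    """
    first_slot = (offset + root_dir_sector) * sector_size
    return first_slot + (address - FIRST_ENTRY_ADDRESS) * DIRECTORY_ENTRY_SIZE
```

As a sanity check, the sample code's comment says that metadata address 5 in adams.dd starts at offset 0x5240. Under the assumption above, that is consistent with a root directory starting at sector 41 (a value I inferred from that offset, not one stated in the handout): entry_byte_offset(5, 41) gives 0x5240.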

From here, you could reproduce the icat functionality, as you’ve figured out where a file resides in the cluster area, but I won’t ask you to do that.
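
The "Sectors:" list comes from that same knowledge of where the file resides. Here is a minimal sketch of following a cluster chain through a FAT16 table and converting clusters to sectors. The function names and the cluster_area_sector parameter are hypothetical, and this only covers chains that are still intact: for deleted entries the FAT entries have typically been cleared, so check the provided tests to see what istat reports there.

```python
import struct


def cluster_chain(fat, first_cluster):
    """Follow a FAT16 chain; `fat` is the raw bytes of one FAT copy."""
    chain = []
    cluster = first_cluster
    # Stop at 0 (free), 1 (reserved), 0xFFF7 (bad) and 0xFFF8+ (end of file).
    # The `not in chain` test guards against loops in a corrupt FAT.
    while 2 <= cluster < 0xFFF7 and cluster not in chain:
        if cluster * 2 + 2 > len(fat):
            break  # entry would lie beyond the table we were given
        chain.append(cluster)
        cluster = struct.unpack_from('<H', fat, cluster * 2)[0]
    return chain


def cluster_sectors(cluster, cluster_area_sector, sectors_per_cluster):
    """Sectors occupied by `cluster`; cluster numbering starts at 2."""
    first = cluster_area_sector + (cluster - 2) * sectors_per_cluster
    return list(range(first, first + sectors_per_cluster))
```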

We will test your function on metadata addresses pointing to directory entries that represent undeleted directories and files, on metadata addresses corresponding to directory entries representing long file names, and on addresses corresponding to deleted files and/or directories.

We will not test your function on invalid values (invalid metadata addresses, or valid metadata addresses that point to areas of disk that are not directory entries). Nor will we test it on “virtual” addresses (like $MBR).

On filename case

DOS FAT16 only allows uppercase filenames. But a later version (Win9x? NT? Not sure) introduced a way to flag a filename as lowercase, separate from long filenames.

In particular, byte 12 (0xc) of the directory entries, listed as “reserved” in Carrier, encodes this information. See: https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#Directory_entry

The relevant part is “Windows NT and later versions uses bits 3 and 4 to encode case information (see VFAT); otherwise 0.”

And then, with some more digging, here: https://en.wikipedia.org/wiki/8.3_filename

If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions such as XP. Instead, two bits in byte 0x0c of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as example.TXT or HELLO.txt but not Mixed.txt.
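
The bit layout quoted above can be sketched as follows. The helper name short_name is hypothetical, and the sketch deliberately ignores deleted entries (whose first name byte has been overwritten) and long file names; whether and how istat's "Name:" line reflects these bits is something to confirm against the provided tests.

```python
LOWERCASE_BASENAME = 0x08   # bit 3 of byte 0x0c
LOWERCASE_EXTENSION = 0x10  # bit 4 of byte 0x0c


def short_name(entry):
    """Build the 8.3 name of a 32-byte directory entry, applying case bits."""
    # Names are space-padded: 8 bytes of basename, then 3 bytes of extension.
    base = entry[0:8].decode('ascii', errors='replace').rstrip()
    ext = entry[8:11].decode('ascii', errors='replace').rstrip()
    case_flags = entry[12]
    if case_flags & LOWERCASE_BASENAME:
        base = base.lower()
    if case_flags & LOWERCASE_EXTENSION:
        ext = ext.lower()
    return base + '.' + ext if ext else base
```

With the stored name HELLO.TXT and only bit 4 set, this yields HELLO.txt; with EXAMPLE.TXT and only bit 3 set, it yields example.TXT, matching the two combinations named in the quote.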

What to submit

Submit a single Python file named istat_fat16.py. This file must define the function described above. It should look like the following, but with an actual implementation rather than a placeholder. I’ve included code to parse FAT date/time stamps for you. Note that you have to pass in the tenths parameter to decode_fat_time separately to get the correct time – it’s stored in a different byte than the rest of the timestamp.

import struct


def as_unsigned(bs, endian='<'):
    unsigned_format = {1: 'B', 2: 'H', 4: 'L', 8: 'Q'}
    if len(bs) <= 0 or len(bs) > 8:
        raise ValueError()
    fill = b'\x00'  # bytes, not str: bs is a bytes object in Python 3
    while len(bs) not in unsigned_format:
        bs = bs + fill
    result = struct.unpack(endian + unsigned_format[len(bs)], bs)[0]
    return result


def decode_fat_time(time_bytes, tenths=0, tz='EDT'):
    v = as_unsigned(time_bytes)
    second = int(int(0x1F & v) * 2)
    if tenths > 100:
        second += 1
    minute = (0x7E0 & v) >> 5
    hour = (0xF800 & v) >> 11
    return '{:02}:{:02}:{:02} ({})'.format(hour, minute, second, tz)


def decode_fat_day(date_bytes):
    v = as_unsigned(date_bytes)
    day = 0x1F & v
    month = (0x1E0 & v) >> 5
    year = ((0xFE00 & v) >> 9) + 1980
    return '{}-{:02}-{:02}'.format(year, month, day)


def istat_fat16(f, address, sector_size=512, offset=0):
    pass


if __name__ == '__main__':
    # The print() below just exercises the time/date decoder and should not be included
    # in your final submission!
    #
    # the values below are from the directory entry in adams.dd that corresponds to the
    # creation date/time of the `IMAGES` directory in the root directory, at
    # metadata address 5; it starts at offset 0x5240 from the start of the image
    print(decode_fat_day(bytes.fromhex('E138')), decode_fat_time(bytes.fromhex('C479'), 0))
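
To show where the inputs to decode_fat_day and decode_fat_time come from, here is a sketch that slices the raw timestamp fields out of a 32-byte directory entry. The offsets follow the standard FAT directory-entry layout as I understand it from Carrier's tables (please double-check them there); the function name and dictionary keys are hypothetical.

```python
def timestamp_fields(entry):
    """Slice the raw timestamp bytes out of a 32-byte directory entry."""
    return {
        'created_tenths': entry[13],   # a single byte, so already an int
        'created_time': entry[14:16],
        'created_day': entry[16:18],
        'accessed_day': entry[18:20],  # FAT stores no access time of day
        'written_time': entry[22:24],
        'written_day': entry[24:26],
    }
```

Each two-byte slice can then be passed to the matching decoder above, with created_tenths supplied as the tenths argument for the creation time only. The missing access time of day is consistent with the 00:00:00 shown on the "Accessed:" line of the sample output.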

Tests

Here are some tests for the ever-present adams.dd: istat-adams-tests.zip.

You can also write your own on the basis of fls and istat output on other disk images we’ve provided so far in the course.

You probably should write these additional tests. If you wish to be memorialized as a hero forever in the hearts of your peers, you might even post ‘em to Piazza.
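
One way to structure such a test is to paste istat's output into a string and compare it line by line with what your function returns. This is only a sketch with a hypothetical helper name; it assumes that "ignoring leading and trailing whitespace" applies to each line individually, and it is not necessarily how the autograder compares results.

```python
def lines_match(actual, expected_text):
    """Compare a returned list of strings with pasted istat output."""
    expected = [line.strip() for line in expected_text.splitlines()]
    return [line.strip() for line in actual] == expected
```

For example, lines_match(istat_fat16(f, 7), pasted_output) would be true when every line agrees after stripping, where pasted_output is a string you captured from running istat yourself.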

fls

You may find it helpful to use fls to get metadata addresses to generate your own tests. fls lists the contents of directories stored in a volume, and includes information on deleted files. fls can take a FAT16 filesystem as input and return output as follows:

fls adams.dd
r/r 3:  ADAMS       (Volume Label Entry)
d/d 5:  images
r/r 7:  Designs.doc
v/v 163171: $MBR
v/v 163172: $FAT1
v/v 163173: $FAT2
d/d 163174: $OrphanFiles

representing the contents of the root directory of that filesystem, the last four of which have special meanings.

Three of the four are “virtual” entries, which TSK uses to provide access to the named entities. The last is for orphan files. The metadata addresses of these four entries immediately follow the last actual metadata address in the volume.

fls can also take an inode number (a metadata address representing a directory entry that corresponds to a directory) as an optional argument, and will display the contents of that directory instead of the root directory:

fls adams.dd 5
r/r * 549:  _MG_3027.JPG