13: Introduction to NTFS

DRAFT

Announcements

So proud: Look how many people finished A07 on time!

We’re still going over regrade requests for the exam. Please get them all in by tomorrow (Friday).

Regret clause: If you think maybe some of your homework code looks too much like a friend’s, please let me know before I do honesty policy checking next week.

Finishing up FAT

Let’s finish up the example from last class.

(We may not get through all of the NTFS material below today; that’s OK, we’ll review and continue after break.)

NTFS: An overview

NTFS is the default filesystem on all modern versions of Windows; thus, like FAT, it’s in wide use. Unlike FAT, it’s the main filesystem that most home and business computers run on.

In some ways NTFS is simpler than FAT: it was designed from the ground up to be extensible, and so its design is more principled (and free of the legacy cruft that encumbers, say, the FAT boot sector and its FAT12/16/32 nonsense).

That’s the good news. The bad news is that the extensibility is not theoretical. Generic on-disk data structures wrap more specific data structures, so that the internals can be updated over time. To understand NTFS, we’ll need to cover both the generic and specific data structures (though not all of them), and that means there are a lot of details to keep straight in your head. I’ll do my best in lecture, but you are almost certainly going to need to read and re-read Carrier as well to understand this material.

One core concept in NTFS is simple: Everything is (or is stored in) a file. Regular files, directories, the structures that control the filesystem’s layout on disk (like the FAT from FAT16) – all are either files or stored in files. There’s no separate plane of existence for filesystem metadata (like inodes in a UNIX-y filesystem or the FAT + directory entries in FAT16). Certain special files have special names (like what we think of as the “boot sector” from FAT is called $Boot in NTFS), but they’re all just considered files by the file system.

The next core concept in NTFS is that the above is a bit of a cheat. Most (but not all) of what we think of as filesystem metadata is stored in one particular data structure: the Master File Table (MFT), which is stored as a file (named $MFT of course) but contents-wise is analogous to the FATs and dirents, as we’ll see. We’re going to spend a lot of time today and next week talking about the MFT and how it relates to the files stored on disk.

The final high-level thing you need to know about NTFS is that it breaks a disk up into allocatable units called clusters, just like FAT. Just like FAT, clusters are sized as a power-of-two-multiple of the underlying disk sector size. Unlike FAT, though, cluster 0 starts at the beginning of the partition, so there’s none of the “first cluster is cluster number 2” nonsense to contend with.

Finding the MFT

The Master File Table (MFT) contains information about all files and directories in its NTFS volume. Each file/directory has an MFT entry; the table is just a linear array of MFT entries, each identified by a file number, starting from 0.

How do we find the MFT? Just like in FAT, the first sector of the volume contains a boot sector (which again, in NTFS is just a file, named $Boot). That boot sector encodes the minimal information necessary to understand and parse the volume, including the bytes per sector, sectors per cluster, cluster address of the MFT, and the MFT entry size. Once you can find the MFT, you need to go there to learn the rest of what you need to know about the volume.

Let’s look at an example. Download simple.ntfs and follow along with Table 13.18 on page 380 in Carrier.

00000000  eb 52 90 4e 54 46 53 20  20 20 20 00 02 08 00 00  |.R.NTFS    .....|
00000010  00 00 00 00 00 f8 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 80 00 80 00  ff 4f 00 00 00 00 00 00  |.........O......|
00000030  04 00 00 00 00 00 00 00  ff 04 00 00 00 00 00 00  |................|
00000040  f6 00 00 00 01 00 00 00  a4 a4 a1 72 46 d9 dc 42  |...........rF..B|
00000050  00 00 00 00 fa 33 c0 8e  d0 bc 00 7c fb 68 c0 07  |.....3.....|.h..|

The bytes per sector are stored in bytes 11–12. Here it’s 512.

The sectors per cluster are stored in byte 13. Here it’s 8, so each cluster is 8 * 512 B = 4,096 B (4 KB).

The cluster address of the MFT is stored in bytes 48–55. Here, it’s 4.

The size of the file record is stored in byte 64. It (and the size of the index record, in byte 68) uses a special format: interpret the byte as a signed 8-bit value. If it’s positive, it’s the number of clusters used for that record. If it’s negative, then 2^(abs(value)) bytes are used. Here it’s 0xf6, or -10, which means that file records are 2^10 = 1,024 bytes (1 KB) each (this is the default value).

Compare this with index records, whose size is stored in byte 68. Here, it’s 1, which means index records are one cluster (4 KB) long.
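Here is a minimal Python sketch of that size rule, in case it helps to see it spelled out. The function name record_size is made up for illustration, and rejecting a value of zero is my assumption (the rule above only covers positive and negative values).

```python
def record_size(raw_byte: int, cluster_size: int) -> int:
    """Decode the size byte at boot sector offset 64 or 68 into a byte count."""
    # Reinterpret the unsigned byte (0..255) as a signed 8-bit value.
    signed = raw_byte - 256 if raw_byte >= 128 else raw_byte
    if signed > 0:
        return signed * cluster_size  # positive: a count of clusters
    if signed < 0:
        return 2 ** abs(signed)       # negative: 2^|value| bytes
    raise ValueError("a size byte of 0 is not covered by the rule (assumption)")


cluster_size = 8 * 512                  # 4,096 bytes, from bytes 11-13
print(record_size(0xF6, cluster_size))  # file record:  1024
print(record_size(0x01, cluster_size))  # index record: 4096
```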

So now we can go and find the start of the MFT in the volume. It’s at cluster 4. Cluster 4 is 4 * 4,096 = 16,384 bytes into the image file, at offset 0x4000.
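The same walk through the boot sector can be sketched in Python. The bytes are copied from the dump above. The names parse_boot_sector and mft_entry_offset are hypothetical, and the entry-offset helper assumes the MFT is contiguous, which need not be true on a real volume.

```python
import struct

# First 80 bytes of simple.ntfs, copied from the hex dump above.
BOOT = bytes.fromhex(
    "eb 52 90 4e 54 46 53 20 20 20 20 00 02 08 00 00"
    "00 00 00 00 00 f8 00 00 00 00 00 00 00 00 00 00"
    "00 00 00 00 80 00 80 00 ff 4f 00 00 00 00 00 00"
    "04 00 00 00 00 00 00 00 ff 04 00 00 00 00 00 00"
    "f6 00 00 00 01 00 00 00 a4 a4 a1 72 46 d9 dc 42"
)


def parse_boot_sector(boot: bytes) -> dict:
    """Pull out the fields discussed above (all little-endian)."""
    bytes_per_sector = struct.unpack_from("<H", boot, 11)[0]  # bytes 11-12
    sectors_per_cluster = boot[13]                            # byte 13
    mft_cluster = struct.unpack_from("<Q", boot, 48)[0]       # bytes 48-55
    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "cluster_size": cluster_size,
        "mft_cluster": mft_cluster,
        "mft_offset": mft_cluster * cluster_size,
    }


def mft_entry_offset(info: dict, n: int, entry_size: int = 1024) -> int:
    """Byte offset of MFT entry n, ASSUMING the MFT is one contiguous run."""
    return info["mft_offset"] + n * entry_size


info = parse_boot_sector(BOOT)
print(info["bytes_per_sector"], info["cluster_size"])  # 512 4096
print(hex(info["mft_offset"]))                         # 0x4000
print(mft_entry_offset(info, 0))                       # 16384
```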

Let’s double check against the output of fsstat to see that we’re doing this correctly.

fsstat simple.ntfs
# ... output follows ...

What’s in the MFT?

The MFT is just a sequence of entries. Microsoft reserves the first 16 for filesystem metadata, but in practice it’s the first 24 that are reserved. Table 11.1 shows the contents of the reserved entries.

Entry 0 is an entry for the MFT itself. We need this, because although the boot sector tells us where the MFT starts, it (the MFT) might run across multiple clusters. This entry tells us where to find the rest of the MFT!

Entry 3 is the $Volume information; entry 6 is for the $Bitmap (similar to the FAT, but it only tracks allocation, not runs); entry 7 is for the $Boot sector, and so on.

We’re going to look at one shortly. But before we do, let’s talk a little about the general structure of an MFT entry.

It starts with an MFT entry header, described in detail in Table 13.1. Then there’s a sequence of attribute (header, content) pairs, with (usually) some unused space at the end of the entry.

The attribute header identifies the attribute type, size, and name, among other things.

The attribute contents can have any format and any size: one perhaps obvious use is to store the contents of a file corresponding to the entry. Small attribute contents can fit in the MFT entry (one systems consequence is that small enough – roughly, under 700 B – files don’t automatically waste tons of space, as they do in FAT, since they don’t live in a cluster). These are called resident attributes generally, whether they store files or just other attribute content.

Larger attributes (again, might be files, might be other things) might not fit in the entry; these are called non-resident. Non-resident attributes are stored in clusters. The clusters are identified by runlists. Runlists are just lists of runs of contiguous clusters that hold the attribute’s content. See Figure 11.6.
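To make the runlist idea concrete, here is a small Python sketch that maps a byte offset within a non-resident attribute to a byte offset in the volume. It treats a runlist as an already-decoded list of (start cluster, cluster count) pairs; the on-disk encoding is more compact and is not handled here. The function name locate and the example runs are made up for illustration.

```python
def locate(runs, attr_offset, cluster_size=4096):
    """Map an offset inside an attribute's content to a volume byte offset.

    `runs` is an ordered list of (start_cluster, cluster_count) tuples.
    """
    if attr_offset < 0:
        raise ValueError("offset must be non-negative")
    for start_cluster, count in runs:
        run_bytes = count * cluster_size
        if attr_offset < run_bytes:
            # The offset falls inside this run of contiguous clusters.
            return start_cluster * cluster_size + attr_offset
        attr_offset -= run_bytes  # skip past this run and keep looking
    raise ValueError("offset is past the end of the runlist")


# Hypothetical runlist: 19 clusters starting at cluster 4,
# then 2 more clusters starting at cluster 100.
runs = [(4, 19), (100, 2)]
print(locate(runs, 0))          # 16384  (start of cluster 4)
print(locate(runs, 19 * 4096))  # 409600 (start of cluster 100)
```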

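We know that the MFT is 16 KB into this volume; let’s use some UNIXy tools to pull out the first entry (entry 0) into its own file, so that the offsets we see are relative to the start of the entry. With 1,024-byte blocks, skipping 16 blocks lands us at byte 16,384: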

dd if=simple.ntfs of=zeroth-mft-entry bs=1024 count=1 skip=16

…and take a look at it.

00000000  46 49 4c 45 30 00 03 00  00 00 00 00 00 00 00 00  |FILE0...........|
00000010  01 00 01 00 38 00 01 00  98 01 00 00 00 04 00 00  |....8...........|
00000020  00 00 00 00 00 00 00 00  04 00 00 00 00 00 00 00  |................|
00000030  03 00 00 00 00 00 00 00  10 00 00 00 60 00 00 00  |............`...|
00000040  00 00 18 00 00 00 00 00  48 00 00 00 18 00 00 00  |........H.......|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000070  06 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  30 00 00 00 68 00 00 00  |........0...h...|
000000a0  00 00 18 00 00 00 02 00  4a 00 00 00 18 00 01 00  |........J.......|
000000b0  05 00 00 00 00 00 05 00  80 64 e1 e7 8b a1 d2 01  |.........d......|
000000c0  80 64 e1 e7 8b a1 d2 01  80 64 e1 e7 8b a1 d2 01  |.d.......d......|
000000d0  80 64 e1 e7 8b a1 d2 01  00 70 00 00 00 00 00 00  |.d.......p......|
000000e0  00 6c 00 00 00 00 00 00  06 00 00 00 00 00 00 00  |.l..............|
000000f0  04 03 24 00 4d 00 46 00  54 00 00 00 00 00 00 00  |..$.M.F.T.......|
00000100  80 00 00 00 48 00 00 00  01 00 40 00 00 00 01 00  |....H.....@.....|
00000110  00 00 00 00 00 00 00 00  12 00 00 00 00 00 00 00  |................|
00000120  40 00 00 00 00 00 00 00  00 30 01 00 00 00 00 00  |@........0......|
00000130  00 04 01 00 00 00 00 00  00 04 01 00 00 00 00 00  |................|
00000140  11 13 04 00 00 00 00 00  b0 00 00 00 48 00 00 00  |............H...|
00000150  01 00 40 00 00 00 03 00  00 00 00 00 00 00 00 00  |..@.............|
00000160  00 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000170  00 10 00 00 00 00 00 00  10 00 00 00 00 00 00 00  |................|
00000180  10 00 00 00 00 00 00 00  11 01 02 00 00 00 00 00  |................|
00000190  ff ff ff ff 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 00  |................|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000003f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 00  |................|
00000400

See Table 13.1 on page 353.

It starts with the four byte sequence corresponding to ASCII "FILE" (or "BAAD" if there’s an error on the disk in this entry). Let’s find the first attribute – remember, all entries in the MFT consist of an MFT header, followed by attributes (which themselves consist of headers and contents). This first attribute’s offset (from the start of the entry) is stored in bytes 20–21. Here, its value is 56 (0x38).
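A minimal Python sketch of those two checks follows, using the first 48 bytes of the entry copied from the dump above. The function name parse_entry_header is hypothetical, and it deliberately ignores every other header field in Table 13.1.

```python
import struct

# First 48 bytes of MFT entry 0, copied from the hex dump above.
ENTRY_HEAD = bytes.fromhex(
    "46 49 4c 45 30 00 03 00 00 00 00 00 00 00 00 00"
    "01 00 01 00 38 00 01 00 98 01 00 00 00 04 00 00"
    "00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00"
)


def parse_entry_header(entry: bytes):
    """Return (signature, offset of the first attribute) for an MFT entry."""
    signature = entry[0:4]
    if signature == b"BAAD":
        raise ValueError("entry is marked as damaged (BAAD)")
    if signature != b"FILE":
        raise ValueError(f"unexpected signature {signature!r}")
    # Bytes 20-21: little-endian offset, from the start of the entry,
    # to the first attribute.
    (first_attr_offset,) = struct.unpack_from("<H", entry, 20)
    return signature.decode("ascii"), first_attr_offset


print(parse_entry_header(ENTRY_HEAD))  # ('FILE', 56)
```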

Attributes: headers and contents

Let’s skip to the attribute, which starts with a header in a standard format. See Tables 13.2–13.4. The first 16 bytes are the same in resident and non-resident attribute headers; after that they diverge.

The header starts with a four-byte type tag. Here, it’s 16, which marks a $STANDARD_INFORMATION attribute. This and $FILE_NAME (48) are two attributes that nearly every entry will have.

The next four bytes (4–7) tell us the attribute’s total length, header included. Here it’s 96, so the next attribute starts 96 bytes past the start of this one, at offset 56 + 96 = 152 in the entry.

Byte 8 (offset from 56, remember, so byte 64 in the dump) tells us if the attribute’s content is non-resident. Here it’s zero, so this attribute’s content is resident – that is, it’s embedded in the MFT entry.

(Discussion names for standard attributes? E.g., as ADS?)

Let’s jump ahead to bytes 16–19 (again, offset from 56, so go to 72, or 0x48) to get the size of this attribute’s content (72), and to bytes 20–21 (offset 76, or 0x4c) to get the content’s offset from the start of the attribute (24). (Sanity check: 24 + 72 = 96, which is the total size of the attribute and also the offset to the next attribute.)
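The header fields we just read by hand can be pulled out with a short Python sketch. The 24 bytes are the ones at entry offsets 0x38 through 0x4f in the dump above. The function name parse_resident_header and the small TYPE_NAMES table are hypothetical, and the sketch only handles the resident case.

```python
import struct

# Bytes 0x38-0x4f of MFT entry 0: the header of the first attribute.
ATTR = bytes.fromhex(
    "10 00 00 00 60 00 00 00"
    "00 00 18 00 00 00 00 00 48 00 00 00 18 00 00 00"
)

# Only the two type tags mentioned in these notes.
TYPE_NAMES = {16: "$STANDARD_INFORMATION", 48: "$FILE_NAME"}


def parse_resident_header(attr: bytes) -> dict:
    """Parse the fields discussed above; offsets are from the attribute start."""
    attr_type, length = struct.unpack_from("<II", attr, 0)  # bytes 0-3, 4-7
    if attr[8] != 0:                                        # byte 8
        raise ValueError("non-resident attributes are not handled here")
    (content_size,) = struct.unpack_from("<I", attr, 16)    # bytes 16-19
    (content_offset,) = struct.unpack_from("<H", attr, 20)  # bytes 20-21
    return {
        "type": attr_type,
        "name": TYPE_NAMES.get(attr_type, "unknown"),
        "length": length,
        "content_size": content_size,
        "content_offset": content_offset,
    }


hdr = parse_resident_header(ATTR)
print(hdr["name"], hdr["length"])                   # $STANDARD_INFORMATION 96
print(hdr["content_offset"] + hdr["content_size"])  # 96, the sanity check
print(56 + hdr["length"])                           # 152, the next attribute
```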

So you get the idea; Tables 13.5 and 13.6 tell you how to parse the $STANDARD_INFORMATION attribute’s contents.

More next class.