14: More on NTFS

(Marc says: If you took careful notes and would like to share them with the class, send them to me and I'll post them here.

Summary and outline from our guest lecturer follows.)

  • a review of what you did previously
  • fixup array in detail
  • talked about stopping condition for finding all attributes
  • that attributes are resident or non-resident
  • and then a lot of time on run lists for non-residents.
  • we talked about two's complement, and padding.
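
(A rough sketch of the fixup step from the summary above, not code from the lecture. It assumes the usual NTFS layout: bytes 4-5 of the entry give the offset of the fixup array, bytes 6-7 give the number of 2-byte items in it, the first item is the signature, and the rest are the original last two bytes of each 512-byte sector. apply_fixup and make_demo_entry are made-up names, and the demo entry is synthetic.)

```python
import struct

SECTOR_SIZE = 512  # assumed sector size, matching the example values in these notes


def apply_fixup(raw_entry: bytes) -> bytearray:
    """Return a copy of an MFT entry with the original sector-end bytes restored."""
    entry = bytearray(raw_entry)  # bytes is immutable, so work on a bytearray
    # Bytes 4-5: offset of the fixup array; bytes 6-7: number of 2-byte items in it.
    array_offset, item_count = struct.unpack_from("<HH", entry, 4)
    signature = bytes(entry[array_offset:array_offset + 2])
    for i in range(1, item_count):
        sector_end = i * SECTOR_SIZE
        # On disk, the last two bytes of each sector hold the signature.
        if bytes(entry[sector_end - 2:sector_end]) != signature:
            raise ValueError(f"fixup signature mismatch in sector {i - 1}")
        # Put the saved original bytes back in place of the signature.
        original = entry[array_offset + 2 * i:array_offset + 2 * i + 2]
        entry[sector_end - 2:sector_end] = original
    return entry


def make_demo_entry() -> bytes:
    """Build a synthetic 1024-byte entry (made-up contents) for demonstration."""
    entry = bytearray(1024)
    entry[0:4] = b"FILE"
    struct.pack_into("<HH", entry, 4, 48, 3)  # fixup array at offset 48, 3 items
    entry[48:50] = b"\x05\x00"      # signature value
    entry[50:52] = b"\xaa\xbb"      # original last two bytes of sector 0
    entry[52:54] = b"\xcc\xdd"      # original last two bytes of sector 1
    entry[510:512] = b"\x05\x00"    # signature as stored at the end of sector 0
    entry[1022:1024] = b"\x05\x00"  # signature as stored at the end of sector 1
    return bytes(entry)


fixed = apply_fixup(make_demo_entry())
print(fixed[510:512].hex(), fixed[1022:1024].hex())
```

If a sector end does not hold the signature, the sketch raises an error rather than continuing, which is the "feeble error check" the fixup array provides.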

(here is the outline)

  • Review lecture from last time.
    • $BOOT is sector zero in the file system
    • We can parse it to find the byte offset to the start of the $MFT
      • from $BOOT we get bytes_per_sector, sectors_per_cluster, and MFT_cluster_start
      • Bytes per sector 512
        Sectors per cluster 8
        MFT start cluster 262144
        MFT starts at byte 262144 * 512 * 8 = 1073741824
      • we also get the size of an MFT entry; always 1024 bytes
    • We seek to the start of the MFT and read in the first 1024 bytes, which is the first entry
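
(A minimal sketch of the arithmetic above, not code from the lecture. The field offsets are an assumption taken from the usual NTFS boot sector layout: bytes per sector at offset 11, sectors per cluster at offset 13, MFT starting cluster at offset 48. mft_byte_offset and make_demo_boot are made-up names, and the boot sector here is synthetic, filled with the example values from the notes.)

```python
import struct


def mft_byte_offset(boot: bytes) -> int:
    """Compute the byte offset of the $MFT from a raw $BOOT sector."""
    bytes_per_sector = struct.unpack_from("<H", boot, 11)[0]   # 2 bytes, little-endian
    sectors_per_cluster = boot[13]                             # 1 byte
    mft_cluster_start = struct.unpack_from("<Q", boot, 48)[0]  # 8 bytes, little-endian
    return mft_cluster_start * sectors_per_cluster * bytes_per_sector


def make_demo_boot() -> bytes:
    """Build a synthetic 512-byte boot sector holding the example values."""
    boot = bytearray(512)
    struct.pack_into("<H", boot, 11, 512)     # bytes per sector
    boot[13] = 8                              # sectors per cluster
    struct.pack_into("<Q", boot, 48, 262144)  # MFT starting cluster
    return bytes(boot)


offset = mft_byte_offset(make_demo_boot())
print(offset)  # 1073741824
```

With a real image one would open the file in binary mode, seek() to this offset, and read(1024) to get the first entry.
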
  • Each MFT entry has a header, and then a series of attributes of variable length.
    • Carrier’s analogy is that each entry is a box with a set of items.
    • All attributes have a 16-byte header
      • some are resident, and all data is contained within the attribute (i.e., within the entry)
      • the rest are non-resident, and data appears in a list of clusters
    • $STD_INFO is always resident
      • parsing is straightforward given where we are in the semester
      • dates are a little hinky
    • $FILE_NAME is always resident
    • $DATA is resident only for files of about 700 bytes or fewer
      • what’s the data for the $MFT? it’s the clusters in which the entries are allocated
      • the MFT grows as the number of files grows (or the underlying disk is expanded in size)
        • it never shrinks
      • for other files, the $DATA attribute is the file content
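
(A rough sketch of walking the attributes described above, not code from the lecture. It assumes the usual NTFS layout: bytes 20-21 of the entry header give the offset of the first attribute; in each 16-byte attribute header, bytes 0-3 are the type, bytes 4-7 the attribute length, and byte 8 the non-resident flag; a type of 0xFFFFFFFF marks the end. list_attributes and make_demo_entry are made-up names, and the lengths in the demo entry are invented.)

```python
import struct

END_MARKER = 0xFFFFFFFF  # assumed end-of-attributes marker


def list_attributes(entry: bytes):
    """Return (type_id, length, is_resident) for each attribute in an MFT entry."""
    offset = struct.unpack_from("<H", entry, 20)[0]  # offset of the first attribute
    found = []
    while offset + 16 <= len(entry):
        type_id, length = struct.unpack_from("<II", entry, offset)
        # Stopping condition: end marker, or a zero length that would loop forever.
        if type_id == END_MARKER or length == 0:
            break
        non_resident_flag = entry[offset + 8]
        found.append((type_id, length, non_resident_flag == 0))
        offset += length  # attributes are packed one after another
    return found


def make_demo_entry() -> bytes:
    """Build a synthetic entry with three attribute headers (made-up lengths)."""
    entry = bytearray(1024)
    entry[0:4] = b"FILE"
    struct.pack_into("<H", entry, 20, 56)  # first attribute at offset 56
    offset = 56
    # (type, length, non-resident flag): 0x10 $STD_INFO, 0x30 $FILE_NAME, 0x80 $DATA
    for type_id, length, flag in [(0x10, 96, 0), (0x30, 104, 0), (0x80, 72, 1)]:
        struct.pack_into("<IIB", entry, offset, type_id, length, flag)
        offset += length
    struct.pack_into("<I", entry, offset, END_MARKER)
    return bytes(entry)


attributes = list_attributes(make_demo_entry())
print(attributes)  # [(16, 96, True), (48, 104, True), (128, 72, False)]
```

On a real entry the fixup values should be restored before parsing, since an attribute can span a sector boundary.
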
  • So, let’s look at an entry
    • MFT entry header
      • the fixup array is there as a feeble error check
        • show example
    • In Python, one would actually make the switch to a bytearray and continue
    • then parse an attribute; stopping condition for finding all attributes
    • parse a resident one
    • go through run lists
    • getSigned(), getUnsigned()
    • more run lists
    • do negative offset
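
(A minimal sketch of run list decoding with a negative offset, not code from the lecture. It assumes the usual NTFS run format: a header byte whose low nibble is the size of the length field and whose high nibble is the size of the offset field, an unsigned length, a signed two's complement offset relative to the start of the previous run, and a 0x00 byte ending the list. get_unsigned and get_signed are stand-ins for the getUnsigned() and getSigned() named above, whose actual signatures are not given in these notes.)

```python
def get_unsigned(data: bytes) -> int:
    """Interpret little-endian bytes as an unsigned integer."""
    return int.from_bytes(data, "little")


def get_signed(data: bytes) -> int:
    """Interpret little-endian bytes as a two's complement signed integer."""
    value = get_unsigned(data)
    bits = 8 * len(data)
    # If the top bit is set, the value is negative: subtract 2**bits.
    if bits and value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def parse_run_list(data: bytes):
    """Return a list of (start_cluster, length_in_clusters) runs.

    start_cluster is None for a sparse run (offset field of size zero).
    """
    runs = []
    pos = 0
    current_cluster = 0
    while pos < len(data) and data[pos] != 0:  # a 0x00 header ends the list
        header = data[pos]
        length_size = header & 0x0F  # low nibble
        offset_size = header >> 4    # high nibble
        pos += 1
        run_length = get_unsigned(data[pos:pos + length_size])
        pos += length_size
        if offset_size == 0:
            runs.append((None, run_length))
        else:
            # Offsets are relative to the previous run's start and may be negative.
            current_cluster += get_signed(data[pos:pos + offset_size])
            runs.append((current_cluster, run_length))
        pos += offset_size
    return runs


# Two runs: 24 clusters at 0x5634, then 16 clusters at an offset of -16 (0xF0).
demo = bytes.fromhex("21183456" "1110f0" "00")
print(parse_run_list(demo))  # [(22068, 24), (22052, 16)]
```

An alternative to the subtraction in get_signed is to pad the field out to 4 or 8 bytes (0xFF bytes for a negative value, 0x00 otherwise) and unpack it with struct.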