03: File System Lab

In this assignment, which is more lab-like than question-based, you will explore the implementation of several filesystems. In particular, you will manipulate them in specified ways (creating and deleting files), and then see which portions of those files are recoverable and which are not.

You’ll need access to a UNIX-like system to complete this assignment. I recommend a recent installation of Ubuntu Linux. You can install Ubuntu in a virtual environment such as one provided by VirtualBox. Keep in mind you’ll need to execute some of these commands as root (or via sudo, or by otherwise having the necessary privileges), in particular the mounting and unmounting of filesystems.

Overview

You are going to create an empty disk image, initialize it with a filesystem, mount it, execute a series of commands that create and remove files (and thus change the filesystem and disk image), unmount the disk image, and then examine the disk image to see what data remains.

You’ll repeat this process four times: once each for FAT, NTFS, Ext3, and ZFS. You’ll also perform a small experiment with redundancy in ZFS.

The basic steps

Create a new 10 MB disk image, for example:

dd if=/dev/zero of=fat-image.dd bs=1M count=10

This will create a 10M disk image, which is big enough for most of what you need to do for this assignment. But ZFS requires at least 64M, so you may want to make them all that big for consistency; up to you.
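If you do decide to make every image the same size, a small loop saves some typing. What follows is only a sketch: the make_image and create_all_images helpers, and every image name other than fat-image.dd, are placeholders of my own choosing rather than required names.

```shell
#!/bin/sh
# make_image FILE SIZE_MB: create a zero-filled disk image of SIZE_MB MiB.
# (Hypothetical helper; it simply wraps the dd invocation shown above.)
make_image() {
    dd if=/dev/zero of="$1" bs=1M count="$2" 2>/dev/null
}

# create_all_images DIR: one 64 MB image per filesystem under test.
# The names fat-image.dd, ntfs-image.dd, ... are illustrative only.
create_all_images() {
    for fs in fat ntfs ext3 zfs; do
        make_image "$1/$fs-image.dd" 64
    done
}
```

Calling create_all_images with the current directory as its argument would leave four 64 MB images there, ready to be initialized one at a time.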

Initialize it with a file system:

mkfs.fat fat-image.dd

Create a unique mount point:

mkdir /mnt/my-fat-fs

Mount the image:

mount fat-image.dd /mnt/my-fat-fs

Now, /mnt/my-fat-fs is a readable/writable part of your computer’s file system. There are a couple of caveats you need to be aware of while working in this directory. First, some programs (like, say, text editors) may create and then delete temporary files. Second, changes to the filesystem won’t immediately be reflected in the disk image; generally, invoking sync finalizes all pending writes, though that may vary depending upon the filesystem you’re working with (and its implementation).

Now, check how much free space is available on that mount point. df or the like will help here. It’s going to be less than the 10M, because of the filesystem metadata.

We are now going to create files on the disk such that, in a best-first allocation strategy, there will be fragmentation. I suggest you use a unix utility or write a (short!) program that will write files of arbitrary size and contents to a destination.

Suppose there are n bytes available on the filesystem. Create three files of roughly size n/4, n/4, and n/2 (a little smaller, to account for rounding to the nearest cluster in FAT, for example), call them F1, F2, and F3. Then delete F1 and F3. Write two files to disk of size about n/3 (call them F4 and F5).
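The arithmetic above can be sketched in a few lines of shell. This is one possible approach, not a required one: the helper names (avail_bytes, fit, plan_sizes) are made up for illustration, and the cluster size you pass in is an assumption you should replace with whatever your filesystem actually uses. Each size has one cluster of slack removed and is then rounded down to a whole number of clusters.

```shell
#!/bin/sh
# avail_bytes DIR: bytes available on the filesystem that holds DIR.
# df -P gives POSIX-format output; -k reports sizes in 1024-byte units.
avail_bytes() {
    kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    echo $(( kb * 1024 ))
}

# fit SIZE CLUSTER: shave one cluster of slack off SIZE, then round
# down to a whole number of clusters.
fit() {
    echo $(( ($1 - $2) / $2 * $2 ))
}

# plan_sizes N CLUSTER: print a size in bytes for each of F1..F5,
# following the n/4, n/4, n/2, n/3, n/3 scheme described above.
plan_sizes() {
    n=$1 cluster=$2
    echo "F1 $(fit $(( n / 4 )) "$cluster")"
    echo "F2 $(fit $(( n / 4 )) "$cluster")"
    echo "F3 $(fit $(( n / 2 )) "$cluster")"
    echo "F4 $(fit $(( n / 3 )) "$cluster")"
    echo "F5 $(fit $(( n / 3 )) "$cluster")"
}
```

For example, plan_sizes "$(avail_bytes /mnt/my-fat-fs)" 4096 prints five name and size pairs, where 4096 is only a guess at the cluster size.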

As you create each file, I suggest you give it a distinctive name (that is, one easily findable via grep or strings or some other utility) and distinctive contents (that is, easy to recognize when carving). You could just fill each file with bytes of the same value (same within a file, different between files), but I’d also suggest putting a counter or the like in the first 4 bytes of each 512-byte block of the file; this will let you see which parts got written where on the disk.
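One way to generate such files is sketched below. It is a minimal example under some assumptions of mine: the counter is written as 4 ASCII hex digits (so it is visible to grep and strings, and wraps after 65536 blocks, or 32 MiB), the rest of each 512-byte block is a single fill character, and the requested size is rounded down to a multiple of 512. The make_file name is hypothetical.

```shell
#!/bin/sh
# make_file PATH SIZE_BYTES FILL_CHAR
# Writes SIZE_BYTES (rounded down to a multiple of 512) to PATH.
# Each 512-byte block is a 4-digit hex block counter followed by
# 508 copies of FILL_CHAR.
make_file() {
    path=$1 size=$2 fill=$3
    blocks=$(( size / 512 ))
    # Build the 508-byte padding once: 508 spaces, each replaced by FILL_CHAR.
    pad=$(printf '%508s' '' | tr ' ' "$fill")
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # Counter wraps at 65536 so it always fits in 4 hex digits.
        printf '%04x%s' "$(( i % 65536 ))" "$pad"
        i=$(( i + 1 ))
    done > "$path"
}
```

A call such as make_file /mnt/my-fat-fs/F1_ALPHA.bin 245760 A would then write the first file; the name and size here are only examples. Choosing an uppercase letter as the fill character keeps it from being confused with the lowercase hex digits of the counter.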

Unmount the partition:

umount /mnt/my-fat-fs

Do it again

NTFS

Repeat the above steps for NTFS. You’ll use mkfs.ntfs, and should probably name your image file and mount point something different, but there’s nothing else tricky here. Well, except that you may need to pass the --force argument to mkfs.ntfs.

Ext3

Again, this time with mkfs.ext3, which will create a journal automatically.

ZFS

This time the disk image must be at least 64M. To create the pool and mount it in one command:

zpool create onediskpool /the/full/path/to/zfs.dd

Note you have to specify the full path to your disk image; zpool otherwise assumes it’s a Linux device in /dev and will fail.

To unmount it:

zfs unmount onediskpool

ZFS, one last time

This time, create three disk images, named, say, 01.dd 02.dd and 03.dd.

Create a pool out of them in RAID-Z mode:

zpool create raidpool raidz1 /path/to/01.dd /path/to/02.dd /path/to/03.dd

Remember to unmount when you’re done.

Analysis

You now have a pile of disk images available to you: one each for FAT, NTFS, and Ext3, and four for ZFS (one for the single-disk pool, and three that together make up the RAID-Z pool). Each contains file metadata and data in various locations. I want you to go looking for it, and write a brief report of what you find.

You can use The Sleuth Kit and / or plain old carving – you know what you’re looking for, so carving should be straightforward. You may need to carve for ZFS.

In particular, for each disk image:

  • What metadata do you expect to be able to find on disk? This includes filenames, the equivalent of directory entries, creation times, and so on.
  • What metadata could you actually locate for each of the files, both present and deleted? You DO NOT need to manually write a parser for the filesystem; use TSK or other tools, but please don’t just say “eh, I grepped and didn’t find anything.” Show a (short) representative hexdump of one such piece.
  • What data (file contents) do you expect to be able to find on disk, both present and deleted? Your answer will depend upon how large your files were, the order in which you wrote them, and your OS’s allocation strategy.
  • What contents could you actually locate? (Carving it should be straightforward.)
  • Did you note any oddities (for example, multiple copies of data, or data you expected to find but could not)? If so, explain their source on the basis of your knowledge of the file systems.
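For the plain-carving route, something like the following sketch may help you locate your data and produce the short hexdumps asked for above. It assumes GNU grep (for the -a, -b, and -o options) and assumes your file contents follow a layout of 4 lowercase hex counter digits followed by a non-hex fill character at the start of each 512-byte block; that layout, the marker string, and the helper names are all hypothetical, so adapt them to whatever you actually wrote.

```shell
#!/bin/sh
# find_marker IMAGE STRING: print the byte offset of every occurrence
# of STRING in IMAGE (GNU grep: -a binary as text, -b offsets, -o matches).
find_marker() {
    LC_ALL=C grep -a -b -o -F -- "$2" "$1" | cut -d: -f1
}

# map_blocks IMAGE FILL_CHAR: for every 512-byte-aligned block that
# starts with 4 hex digits followed by FILL_CHAR x4, print
# "<sector number> <counter>".
map_blocks() {
    LC_ALL=C grep -a -b -o -E "[0-9a-f]{4}$2$2$2$2" "$1" |
        awk -F: '$1 % 512 == 0 { print $1 / 512, substr($2, 1, 4) }'
}

# show_hex IMAGE OFFSET [LENGTH]: hexdump LENGTH bytes (default 64)
# starting at byte OFFSET, with decimal addresses.
show_hex() {
    od -A d -t x1 -j "$2" -N "${3:-64}" "$1"
}
```

Running map_blocks against an image shows which sector each numbered block of a file landed in, which makes fragmentation (and leftover blocks from deleted files) easy to spot; show_hex at an offset reported by find_marker gives a hexdump suitable for pasting into your report.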

(125 points in total, distributed across the five scenarios.)

What to submit

Write a short report that includes a transcript of your terminal session creating disk images and manipulating them. Include the source of any programs or scripts you wrote. Finally, include your answers to the above questions. Wrap all this into a zip file and submit it using Moodle.

As usual, group work is permissible. You might choose to make the assignment more “interesting” if you work in groups. For example, you might also deliberately corrupt some of the data (on, say, Ext3 and ZFS), then see what’s recoverable before and after asking the OS to fix the filesystems (fsck for Ext3, and zpool scrub for ZFS). Or other scenarios – up to you.