01: Introduction

Welcome

Hello and welcome!

I'm Marc Liberatore (liberato@cs.umass.edu) and I'm your instructor for this course, COMPSCI 365 / 590F.

The most important thing to know today: the course web site is at https://people.cs.umass.edu/~liberato/courses/2017-spring-compsci365/. It is the syllabus for this class and you are expected to read it in its entirety.

Who am I?

Not a "professor." "Doctor" if you must, though I prefer just "Marc." I am a member of the teaching faculty here at UMass.

Stuff I do:

  • Privacy research: my dissertation was about attacks on Tor, and ways to improve it
  • Forensics research: my research group builds tools and technologies to help law enforcement lawfully collect evidence of digital crime
  • Other research: location privacy in cell phones, bitcoin mixing, etc.
  • Teaching: I have taught various classes in this department: this course, the dread 187 (which some of you might be having flashbacks to right now), crime law, AI, networking, etc.

Come by office hours (TBA; probably going to be late Wednesday mornings) if you want to chat; my office is now in room 318 of the CS building.

Who is our TA?

Hadi Zolfaghari, office hours W/Th 1–2pm, CS 207.

We will also (hopefully) have a grader or two to help, but they've not been hired yet.

What is this course? / who is it for?

The goal of forensics is to gather artifacts for refinement into evidence that supports or refutes a hypothesis about an alleged crime or policy violation. Done correctly, forensics represents the application of science to law. The techniques can also be abused to thwart privacy.

This course is a broad introduction to forensic investigation of digital information and devices. We will cover the acquisition, analysis, and courtroom presentation of information from file systems, operating systems, networks, cell phones, and the like. Students do not need experience with these systems.

We will review the use of some professional tools that automate data harvesting; however, the primary goal of the class is to understand why and from where artifacts are recoverable in these systems. Several assignments involve coding forensic tools from scratch.

For a small portion of the class, we will cover some relevant issues from the law, privacy, and current events. Thus, the class serves the well-rounded student who is eager to participate in class discussion on a variety of technical and social issues.

Content Overview

Preliminaries:

  • Basics of Forensics
    • A Motivating Example
    • Data Representation
  • Brief Introduction to Python for Forensics
  • Metadata in Data: EXIF as a case study
  • Carving Data from Files

The legal basis for forensics:

  • Forensics is science applied to law (Daubert v. Merrell Dow Pharmaceuticals; General Electric Co. v. Joiner; Kumho Tire Co. v. Carmichael)
  • Contraband and knowing possession (G. Marin, Possession of Child Pornography: Should You be Convicted When the Computer Cache Does the Saving for You?)
  • Indicia of intent (T. Howard, Don’t Cache Out Your Case: Prosecuting Child Pornography Possession Laws Based on Images Located in Temporary Internet Files)

Network Investigations I:

  • Network Investigations I: Remote, Durable Proof of Possession (B. Levine et al., Efficient Tagging of Remote Peers During Child Pornography Investigations.); and/or
  • NITs: Network Investigative Techniques

Filesystem Forensics:

  • Filesystem Forensics: Master Boot Records (MBRs), partitions, volumes (Carrier, chapters 2–5)
  • FAT Filesystems (Carrier, chapters 9 and 10)
  • NTFS Filesystems (Carrier, chapters 11 and 12)

Network Investigations II:

  • Wiretapping Technology and Policy
    • S. Bellovin et al., Going Bright: Wiretapping without Weakening Communications Infrastructure
    • S. Bellovin et al., Lawful Hacking: Using Existing Vulnerabilities for Wiretapping on the Internet
  • Email Investigations

Windows Artifacts:

  • H. Carvey, Windows Forensic Analysis, available through UMass Library online
  • J. Barbara, Windows 7 Registry Forensics (seven-part series)
  • additional tools installed on EdLab machines (lnkinfo, msiecfinfo, msiecfexport, liblnk, libmsiecf)

Storage Technology: Spinning platters, solid state, and carving files:

  • https://belkasoft.com/en/ssd-2014
  • http://www.toolwar.com/2014/04/scalpel-data-carving-tools.html

Malware and Related Legal Issues (The Trojan Horse defense)

Cell Phones:

  • S. Garfinkel et al., Using purpose-built functions and block hashes to enable small block and sub-file forensics
  • R. Walls et al., Forensic Triage for Mobile Phones with DEC0DE.
  • S. Varma et al., Efficient Smart Phone Forensics Based on Relevance Feedback

Being an Expert Witness:

  • Chapter 5 from Smith, F.C., & Bace, R.G. (2002). A Guide to Forensic Testimony: The Art and Practice of Presenting Testimony as an Expert Technical Witness. Boston, MA: Addison-Wesley.
  • Affidavit from Jayson Street (an example of an expert witness's output)

You'll notice I'm writing on the board, not using powerpoint. That's how I roll. I will occasionally (depending upon topic, frequently) bust out the laptop for some livecoding and demos, and once in a while for illustrations, but there are no slides or powerpoints available for this course. Come to class and take notes!

Prerequisites

For undergraduates: COMPSCI 220 or COMPSCI 230. CS majors only (others will need to request an override). I expect a reasonable level of programming maturity. You're going to be writing your assignments in Python, which (to my knowledge) is not required in a previous course, so you're also going to be learning it as you go. The level of Python you'll need should not be a significant challenge for you. That said, you should start going through a Python (3.5) tutorial soon.

For CS graduate students: no formal prerequisites (but informally: you should know at least as much as the undergrads). Non-CS grad students should check with me about their background before enrolling.

Who are you?

Attendance.

If I didn't call your name, you're not enrolled. This probably means you either are on the wait list, or you want to be. Make sure you talk to me after class if this is the case!

A motivating example

As a newly hired forensic investigator for Locard Forensics, Inc., you have been assigned to a team led by Mr. Locard, who tells you the following information about a case already in progress.

Anne Adams worked as a designer of toys at the Acme Toy Company for over ten years, eventually becoming a senior designer. One year ago, Nadir Toy Corp. offered Adams a position as vice-president of toy design, including a large pay raise, and she took the offer. This week, Acme learned of Nadir’s newest toy, which in their view shared too much in common with a project Adams was seen working on before she left Acme: a toy rabbit. Mr. Locard has assigned you the task of verifying his hypothesis that Adams illegally copied documents describing the projects she worked on at Acme (documents owned by Acme) from her computer before she left.

Mr. Locard worked with lawyers to create a court-ordered subpoena that, under penalty of law, requires Adams to produce all her computers and storage devices. Your task as part of Locard’s team is to focus on her USB storage device. Mr. Locard has made an exact copy of the original USB device, a process called imaging. One of the advantages of digital evidence over traditional evidence is that exact copies can be made and analyzed without disturbing the original.

Later on, we’ll explain the details of how data is imaged from a storage device, including internal hard drives, USB storage, CDROMs, and more. A copy of the acquired image from Ms. Adams’ USB storage device is on the course Web site. All of the data on Adams’ device is now contained in a file called adams.dd; we don’t need another USB storage device to examine hers. In this case, her USB key is only a container for the digital evidence and not the evidence itself. The evidence file contains all the data that was previously on the USB key.
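How would you convince yourself (or a court) that a working copy of an image really is an exact copy? One common approach is to compare cryptographic hashes of the two files. The sketch below is only an illustration of that idea: I'm assuming SHA-256 as the hash, and the function names are made up for this example. Nothing here says that this is how Mr. Locard verified his image.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that a large disk image
    does not have to fit in memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if the two files have identical SHA-256 digests."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

For example, `same_contents("adams.dd", "working-copy.dd")` would compare the course image against a copy you made of it (the second file name is hypothetical). Matching digests are strong evidence the two files hold the same bytes; a single changed bit gives a different digest.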

A forensic investigation has several goals, depending on the context. Typically, the primary goals are to

  1. Determine if there is evidence that a crime, tort, or policy violation has been committed;
  2. Identify the related events and actions that occurred;
  3. And identify who might be responsible.

In many criminal investigations, the goal of the investigator may additionally include determining the motive and intent of the perpetrator, corroborating alibis of the innocent, and verifying statements of witnesses. Moreover, criminal investigators need to preserve a demonstrable link between the artifacts we find at a crime scene and our later presentation of the evidence in court.

Our focus is on digital evidence, and so we will not detail procedures for gathering other types of evidence. Note, however, that it’s rare that only digital evidence is collected from a scene. Crime scene investigation can involve gathering chemical, ballistic, and biological remains of a crime. If you are interested in these topics, Saferstein has written an excellent introductory book.

In our particular case, our goal is to locate evidence from the USB key data that demonstrates the toy rabbit was first designed by Adams while she was still employed by Acme. We'll see this next class.

Some administrivia

Let's pause the course material to discuss some administrative stuff.

First, some words about assignments and grading.

Assignments (65%)

The majority of the workload in this course will consist of take-home assignments. These assignments will involve writing, programming, or both. Written assignments will have a series of questions, and will require that you understand basic legal and technical concepts to answer them correctly. Some written assignments will require detailed analyses (for example, reasoning about a particular technology in the context of a law).

Programming assignments will typically involve implementing a forensic tool from scratch using Python. Typically these tools involve parsing data. We are going to essentially autopsy computers (disks, files, etc.). There will be bits everywhere and it can be overwhelming.
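To give a flavor of what "parsing data" means, here is a minimal sketch that pulls the partition table out of a classic Master Boot Record using Python's struct module. It is not one of the actual assignments; the function name and the dictionary keys are my own choices for illustration. The layout it assumes is the standard one: four 16-byte entries starting at byte 446 of the first 512-byte sector, followed by the two signature bytes 0x55 0xAA.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446  # the four 16-byte entries start here
ENTRY_SIZE = 16


def parse_mbr(sector):
    """Return the in-use partition entries from the first sector of a disk image."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")

    entries = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # "<" = little-endian, no padding. Fields, in order:
        # status (B), CHS start (3s), type (B), CHS end (3s),
        # first sector as an LBA (I), number of sectors (I).
        status, _chs_first, ptype, _chs_last, first_lba, num_sectors = \
            struct.unpack("<B3sB3sII", raw)
        if ptype == 0x00:  # type 0 marks an unused slot
            continue
        entries.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "num_sectors": num_sectors,
        })
    return entries
```

You would call it on the first 512 bytes of an image, e.g. `parse_mbr(open("disk.dd", "rb").read(512))`, where disk.dd is a hypothetical image that happens to start with an MBR. Not every image does, which is why the signature check comes first.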

You can buy a cookbook and learn to be a cook; I'm going to teach you to be a chef (or at least, the skills you'll need to be one).

In this course, there's not a focus on the why (though there will be some!), mostly on the how. Other courses are for why (e.g., Computer Crime Law, police academy, law school).

Assignments are generally not collaborative: you must complete them on your own. Exceptions to this rule will be clearly noted.

We plan to give between 12 and 18 assignments (depending upon how written and programming assignments are broken up or combined into single assignments).

Each assignment will contribute a stated number of points toward the “Assignments” portion of your course grade. Assignments may be worth different numbers of points.

Assignments have a due date, clearly marked on the course web site. Late assignments will not be accepted. Requests for extensions need to be made at least a day in advance. If you want to request an extension after a due date, I will expect a reasonable and well-documented excuse.

Midterms and exams (10/10/15%)

There will be two equally-weighted in-class midterms, dates TBA.

There will also be a cumulative final exam. You must achieve a passing grade on the final exam to pass the class.

You may not bring supplemental material to the midterms or final exam; that is, they are closed-book, and the use of notes, calculators, computers, phones, etc., is forbidden, unless otherwise explicitly stated.

Exams must be completed on your own: they are not collaborative!

Other things to note:

590F students will complete most of the same work as 365 students. On some assignments and exams, they'll have additional work / questions, and there may be 365 work/questions they won't be graded on.

Lecture attendance is not optional (except inasmuch as it's always optional; you're adults so miss class if you need to). Get notes from a friend if you miss class. More to the point, "it wasn't in the book" isn't a reasonable complaint if it was in lecture. Similarly, the book is required. You might be able to get away without it, but "it wasn't in lecture" isn't a reasonable complaint if it was in the book.

We're using Piazza for discussion.

At the start of the semester, I will permit laptops and the like in the classroom. If it becomes clear that they are being used for purposes not directly related to the class, I will ban them. It is unfair to distract other students with Facebook feeds, animated ads, and the like.

Regardless, I recommend taking notes by hand. Research suggests that students who take written notes in class significantly outperform students who use electronic devices to take notes.

Finally, note we might talk about topics never discussed in other CS contexts: murder, adult pornography, contraband (cases and images of child exploitation), etc. We will keep discussions at a high level. No slang, no denigration. Pretend you are at work. If you need to sit out one of these discussions, please do so, no questions asked; they're generally only to contextualize some of our work, and we won't usually go into graphic detail. But some frank talk is unavoidable when discussing the motivation behind digital forensics.

End-of-class reminders

Read the pages on the course web site titled Overview, Policies, and Schedule; together, these constitute the course syllabus. Especially read the bit about late adds if you added after last Friday!

Assignments and their due dates will go up on the web site as they become available. This includes the first few labs, the first programming assignment, and the first written assignment, all of which are now available!

Suggested reading and lecture notes will also be posted to the course web site.

If you aren't enrolled and want to be, make sure you talk to me before you leave.