Project Info

Overview

The CS 335 course project is your chance to apply state-of-the-art machine learning methods to an application that you care about or to explore some area of machine learning more deeply than we covered in class. An ambitious project may try to innovate in some area of ML.

Important Dates

Proposal: due Friday, March 22 at ~~noon~~ midnight (Piazza)
Weekly reports: due weekly thereafter by Friday at noon (Piazza)
Final presentations: (poster or oral TBD) April 25 and April 30 in class
Final report: due Tuesday, May 7 at noon

You may work by yourself or in groups of up to three students for the project. (But more is expected from larger project groups.)

What does a project look like?

Here are some possible types of projects:

The first and most common is an application project. Pick an application you are interested in (ideally, one you are passionate about!) and try to solve it with ML—either methods we learned in class or ones that you discover elsewhere.
The second is an exploration project. Pick an area of machine learning or a machine learning algorithm that we did not cover in class and explore it in depth. This should be a significant expedition into uncharted territory for all team members (i.e., don’t “explore” something you already know). It will likely involve:
- Reading one or more research papers or chapters from an advanced ML book
- Gaining a deep understanding of the method (enough to present it clearly in your own words in the project report)
- Running experiments to evaluate the method
A main difference between this and an application project is that the experiments would be designed to understand and test the method and not solve a particular application. For example, imagine that you just invented the method and need to demonstrate its effectiveness. This might involve reproducing results from a research paper.
An ambitious project may try to innovate in some area of ML. A good way to approach this is like an exploration project—examine some existing state-of-the-art method and then try to develop a novel extension or variant.

Your project can also cross these boundaries. If you have any doubt about what you want to do or would like suggestions, I highly encourage you to talk to me about your project!

The most fun and exciting projects are likely to come from applications you are passionate about. For example, if you are involved in another research project or a hobby that could somehow benefit from machine learning, these could make great projects.

You may choose to explore a purely theoretical area of machine learning (we did not touch on these areas in class, but they exist).

Inspiration

One way to gain inspiration is to look at recent papers from the main machine learning conferences (ICML and NIPS).

You can see a long list of projects from the Stanford CS 229 class here:

http://cs229.stanford.edu/projects.html

(That class is somewhat more advanced than ours and the project has a longer timeline, so you will want to calibrate your expectations based on those two facts.)

Resources

Here are some resources that may come in handy or inspire some ideas for your project:

Kaggle: machine learning competitions
UCI Machine Learning Repository: data sets for testing new and existing methods
PyTorch: deep learning in Python
Deep Learning Book
Google TensorFlow: https://www.tensorflow.org/
Deep learning tutorial

There are many other sources of free implementations of machine learning algorithms. If there is something in particular you are interested in, please look around and ask me.

Project Components

Proposal

The proposal is due Friday, March 22 at midnight (see Important Dates above) and should be submitted as a post on Piazza. I encourage you to post the entire proposal publicly. I will also accept proposals posted privately to me, but these should be accompanied by a shorter public post describing your project plan at a high level. (Rationale: you can learn and get ideas from others; this will make all projects better!)

It should be the equivalent of 1-2 pages and include the following:

Project title
List of team members
Description of your project

In the description, please address topics that will allow me (and classmates) to judge the feasibility, such as:

What data do you plan to use?
What algorithms will you implement?
What programming environment do you plan to use? (Python, TensorFlow, etc.)
What background reading will you do?

You may not know the answers to all of these questions, and some may change. That’s OK. This is not a contract, just an initial plan that should be as detailed as possible to help guide your work and so I can give you feedback.

Milestones: Weekly Status Reports

After the proposal, you should submit weekly status reports on Piazza by Friday at noon. These will be public posts. If for some reason a public post absolutely does not work for you, let me know. The weekly status reports should chronicle the progress you’ve made and the issues you have encountered.

A recommended format is a bulleted list addressing each of the following:

Activities for the week
List of source materials you’ve read
Preliminary results
Issues you’ve encountered
Questions
Concrete next steps (for next week)
Longer-term TODO items

You could have up to a few sub-bullets for each category.

By around early April, these should start including concrete results.

The purposes of these reports are:

To make sure you are on track with your project and to uncover any major problems, or any need to change course, as soon as possible.
To share information and learn from your classmates.

Presentations

Final presentations will take place during the last one or two class sessions or possibly during a special presentation session around the last day of classes. The format is still to be determined. These will either be very short oral presentations (with slides) or poster presentations.

Final report

The final report is due at the end of finals period: Tuesday, May 7 at noon. It can, of course, be turned in earlier.

The report should be 5-8 pages long and describe your project in detail. It should be roughly organized like most research papers:

Introduction and motivation
Background
Hypotheses or questions you are trying to answer
Experiments
Discussion and conclusion

Submission instructions will be provided here closer to the final due date.

Acknowledgments

Large sections of this document are based on an older version of Stanford’s CS 229 project guidelines.