Assignment 03: Dealing with Data

  1. Summarize the reading from Coderre (Chapter 4). Your summary can be brief (about a page / 500 words maximum; go over the limit if you need to but please don’t write me a book!). Use a style you are comfortable with (text, outline, etc.).

  2. The following archive contains a SQLite3 database file: [assignment-03.zip]

    Using what you’ve learned this week, I’d like you to prepare the data for analysis by your tool of choice. Your analysis tasks are as follows:

    • Graph the number of students per course using a reasonable visualization.
    • Determine the mean and median number of students per course.
    • Determine which student(s) are enrolled in the most courses. What are the name(s) of the student(s), and which courses are they enrolled in?
    • Bonus: Are any instructors enrolled in courses? If so, name the students who are also instructors.

    In order to perform these analyses, you will need to do some manipulation of the data using SQL (mainly a few SELECT / JOINs). You’ll also need to aggregate some data, either in SQL or using your analysis tool of choice. And of course you’ll need to be able to create a graph.
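
    As a rough illustration of the SELECT / JOIN and aggregation steps described above, here is a minimal sketch in Python using the standard sqlite3 and statistics modules. The table and column names (students, courses, enrollments, and so on) and the sample rows are made up for this example; the database in the archive may well be organized differently, so inspect its schema first and adapt the queries.

```python
import sqlite3
import statistics

# Hypothetical schema and rows for illustration only; the real database
# in the assignment archive may use different table and column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (student_id INTEGER, course_id INTEGER);
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
    INSERT INTO courses VALUES (10, 'Databases'), (20, 'Statistics'), (30, 'Ethics');
    INSERT INTO enrollments VALUES (1, 10), (2, 10), (3, 10), (1, 20), (2, 20), (1, 30);
""")


def students_per_course(conn):
    """Return (course title, student count) rows.

    The LEFT JOIN keeps courses that have no enrollments (count 0).
    """
    query = """
        SELECT c.title, COUNT(e.student_id) AS n_students
        FROM courses AS c
        LEFT JOIN enrollments AS e ON e.course_id = c.id
        GROUP BY c.id, c.title
        ORDER BY c.title
    """
    return conn.execute(query).fetchall()


def most_enrolled(conn):
    """Return (student name, course count) for every student tied for the maximum."""
    query = """
        SELECT s.name, COUNT(*) AS n_courses
        FROM students AS s
        JOIN enrollments AS e ON e.student_id = s.id
        GROUP BY s.id, s.name
        HAVING COUNT(*) = (
            SELECT MAX(n) FROM (
                SELECT COUNT(*) AS n FROM enrollments GROUP BY student_id
            )
        )
        ORDER BY s.name
    """
    return conn.execute(query).fetchall()


per_course = students_per_course(conn)
counts = [n for _, n in per_course]
mean_students = statistics.mean(counts)
median_students = statistics.median(counts)
top_students = most_enrolled(conn)
```

    The per_course rows are already in a shape that most plotting tools accept for a bar chart (one bar per course). The aggregation is done in SQL here, but doing it in your analysis tool instead is equally acceptable.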

    In addition to answering the questions above, I want you to verify that the data looks sensible. Tell me what you checked (for example, that no two different people have the same name), what discrepancies, if any, you found, and how you resolved them in your analyses. You don’t need to go overboard here; three to five sanity checks and their results are what I’m looking for.
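
    To give a sense of what such sanity checks might look like, here is a small sketch that runs three of them as SQL queries from Python. As before, the schema, the sample rows, and the check names are assumptions made for illustration, not the layout of the actual database; the deliberately messy sample data is there so that each check has something to report.

```python
import sqlite3

# Hypothetical schema with intentionally flawed rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (student_id INTEGER, course_id INTEGER);
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Ada');
    INSERT INTO courses VALUES (10, 'Databases'), (20, 'Statistics');
    INSERT INTO enrollments VALUES (1, 10), (1, 10), (2, 20), (9, 10);
""")

# Each check is a query that returns the offending rows; an empty result
# means the check passed.
CHECKS = {
    # Two different student records sharing one name.
    "duplicate_names": """
        SELECT name, COUNT(*) FROM students
        GROUP BY name HAVING COUNT(*) > 1
    """,
    # Enrollments that point at a student or course that does not exist.
    "orphan_enrollments": """
        SELECT e.student_id, e.course_id
        FROM enrollments AS e
        LEFT JOIN students AS s ON s.id = e.student_id
        LEFT JOIN courses AS c ON c.id = e.course_id
        WHERE s.id IS NULL OR c.id IS NULL
    """,
    # The same student enrolled in the same course more than once.
    "repeated_enrollments": """
        SELECT student_id, course_id, COUNT(*) FROM enrollments
        GROUP BY student_id, course_id HAVING COUNT(*) > 1
    """,
}


def run_checks(conn):
    """Run every check and return {check name: list of offending rows}."""
    return {name: conn.execute(sql).fetchall() for name, sql in CHECKS.items()}


results = run_checks(conn)
```

    Writing each check as a query that returns the problem rows makes the write-up straightforward: report the check, the rows it returned, and what you did about them (for example, dropping repeated enrollments before counting).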