Special Topics – Programming in Python for Data Science


INFO197P – UMass Amherst – Spring 2020


Course Information



Instructor: Emma Anderson, emmaanderson@cs.umass.edu


Time: W/F 9:05-9:55am, 3/20 – 4/26


Location: Engineering Lab Room 325




Textbook: None required

Office Hours

Fridays 10-11am, CS building room 228





In this course, each voice in the classroom has something of value to contribute. Please take care to respect the different experiences, beliefs and values expressed by students and staff involved in this course. My colleagues and I support UMass’s commitment to diversity, and welcome individuals regardless of age, background, citizenship, disability, sex, education, ethnicity, family status, gender, gender identity, geographical origin, language, military experience, political views, race, religion, sexual orientation, socioeconomic status, and work experience.



Course Description


A brief introduction to the Python programming language for students with a working knowledge of basic programming concepts.  This course is geared towards introductory data science and analytics tasks, and is intended for Informatics majors.  Prerequisite: COMPSCI 121.  Runs for 6 weeks beginning 3/20.



Course Goals and Objectives


The goal of this course is to provide hands-on experience with Python programming, with an eye towards performing basic data analytics tasks.  At the end of the course, you should be able to use the NumPy and pandas packages for Python to perform quantitative analysis on various datasets.
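

For a sense of what this goal looks like in practice, here is a minimal sketch of the kind of analysis you should be comfortable writing by the end of the course. The data and the column names (student, section, score) are invented for illustration and are not taken from any course dataset or assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical data: invented scores, not a course dataset.
scores = pd.DataFrame(
    {
        "student": ["A", "B", "C", "D"],
        "section": ["morning", "morning", "afternoon", "afternoon"],
        "score": [82.0, 91.0, 77.0, 88.0],
    }
)

# NumPy: summary statistics on the raw values.
values = scores["score"].to_numpy()
overall_mean = np.mean(values)
overall_std = np.std(values)  # population standard deviation

# pandas: average score within each section.
section_means = scores.groupby("section")["score"].mean()

print(f"Overall mean: {overall_mean:.1f}")
print(f"Overall standard deviation: {overall_std:.1f}")
print(section_means)
```

The split here is only one reasonable choice: NumPy handles the array-level arithmetic, while pandas handles labeled, grouped data.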




The University of Massachusetts Amherst is committed to providing an equal educational opportunity for all students. If you have a documented physical, psychological, or learning disability on file with Disability Services (DS), you may be eligible for reasonable academic accommodations to help you succeed in this course. If you have a documented disability that requires an accommodation, please notify me within the first two weeks of the semester so that we may make appropriate arrangements.





This is a 1-credit P/F class.  In order to pass, you must meet the following requirements:


-Attend at least 8 of the 11 class meetings, roughly 75% (exceptions can be made for extenuating circumstances on a case-by-case basis)

-Complete all 4 short assignments, assigned in weeks 1-4

-Complete a final project


No letter grades will be assigned in this course; instead, assignments will be checked for completion and given written feedback.


Because this is a skills-based course, class meetings will consist of short lectures followed by hands-on work with code, so your attendance in class is critical.


What To Expect


Classes will combine lecture with a hands-on code workshop, so please bring your laptop to every class.  Since this is a 1-credit class, you should expect to spend an average of 2-3 hours per week on this course outside of class.  The workload is weighted towards the latter half of the course: expect 1-2 hours per week at first, and 3-4 hours per week towards the end as you work on your final project.




3/20: Course Introduction, print statements, types, variables (Lecture)


3/22: Python Lists (Lecture)


3/27: String & List methods (Lecture)


3/29: Importing packages, NumPy, introduce final project (Lecture, Code, Video Lecture)


4/3: Dictionaries, introduction to pandas and DataFrames (Pandas_Lecture.pdf)


4/5: Charts and Graphs (Pandas Visualization Guide)


4/10: Data Cleaning


4/12: Machine Learning Techniques


4/17: Machine Learning Techniques 2


4/19: Getting data from the web


4/24: Project sharing


Resources and other fun stuff


March Madness Prediction Analysis

March Madness Raw Data

Emma's GitHub – data files used in class

Anaconda Download