Description

Computational social science is an emerging field at the intersection of computer science, statistics, and the social sciences, in which quantitative methods and computational tools are used to identify and answer social science questions. The field is driven by new sources of data from the Internet, government databases, university libraries, crowdsourcing systems, and more, as well as by recent advances in computational modeling, machine learning, statistics, and social network analysis.

This course will provide an overview of computational social science, with particular emphasis on problems involving text analysis. The course will primarily consist of reading, presenting, and discussing recent papers. In addition, students taking the class for 3 credits will be expected to define and undertake a semester-long research project that relates to computational social science. Students entering the course should have a good knowledge of probability, statistics, and machine learning, as well as a strong interest in political science, public policy, and sociology. Students with diverse academic backgrounds and research interests are especially encouraged to participate.

General Information

  • Instructor: Hanna M. Wallach (wallach at cs umass edu)
  • Instructor Office Hours: by appointment only
  • Lectures: Wednesdays 12pm to 2pm, LGRC A311

Schedule

  • Wed. Jan. 19: Course overview [pdf]
  • Wed. Jan. 26: Computational social science, social network analysis. Readings:
    • Lazer et al. "Computational social science" [pdf]
    • Watts "A twenty-first century science" [pdf]
    • Watts "The 'new' science of networks" [pdf]
    • Newman "The structure and function of complex networks" [pdf]
  • Wed. Feb. 02: No class (snow day)
  • Wed. Feb. 09: Bayesian data analysis. Readings:
    • Heckerman "A tutorial on learning with Bayesian networks" [pdf]
    • MacKay "Information theory, inference, and learning algorithms" (§ 23.1, 23.2, 23.3) [pdf]
  • Wed. Feb. 16: Topic modeling. (Hanna away.) Readings:
    • Steyvers & Griffiths "Probabilistic topic models" [pdf]
    • Griffiths "Gibbs sampling in the generative model of latent Dirichlet allocation" [pdf]
    • Grimmer "An introduction to Bayesian inference via variational approximations" [pdf]
  • Wed. Feb. 23: Guest lecture by Sean Gerrish. Readings:
    • Gerrish & Blei "The ideal point topic model" [pdf]
    • Poole & Rosenthal "Congress" (ch. 2)
  • Wed. Mar. 02: Quantitative political analysis. Reading:
    • Schrodt "Seven deadly sins of contemporary quantitative political analysis" [pdf]
  • Wed. Mar. 09: Text analysis for political science. Readings:
    • Quinn et al. "How to analyze political attention with minimal assumptions and costs" [pdf]
    • Grimmer "A Bayesian hierarchical topic model for political texts" [pdf]
  • Wed. Mar. 16: No class (spring break)
  • Wed. Mar. 23: Guest discussion led by Chris Smith. Readings:
    • Papachristos "Murder by Structure" [pdf]
    • Papachristos "The Small World of Murder" [html]
    • Papachristos "Rethinking Crime Epidemics" [mov]
    • Papachristos "Six Degrees of Criminal Justice" [mov]
    • Papachristos "The Evolution of Organized Crime" [mov]
  • Wed. Mar. 30: Reproducible and open research. Readings:
    • Yale Law School Roundtable on Data and Code Sharing "Reproducible Research" [pdf]
    • Stodden "Data Sharing in Social Science Repositories" [pdf]
    • Stodden "Open Science" [pdf]
  • Wed. Apr. 06: Guest lecture by Bruce Desmarais. (Hanna away.) Readings:
    • Desmarais & Cranmer "Statistical Inference for Valued-Edge Networks" [pdf]
    • Cranmer & Desmarais "Inferential Network Analysis with ERGMs" [pdf]
  • Wed. Apr. 13: Guest lecture by Ryan Acton. Readings:
    • Cerulo "Reframing Sociological Concepts for a Brave New (Virtual?) World" [pdf]
    • Gjoka et al. "Unbiased Sampling of Facebook" [pdf]
  • Wed. Apr. 20: No class (Monday schedule)
  • Wed. Apr. 27: Project presentations; modeling language variation. Reading:
    • Eisenstein et al. "A Latent Variable Model for Geographic Lexical Variation" [pdf]

Grade Breakdown

  • Paper reviews (40% if you are taking the course for 3 credits, 80% otherwise): You must review the assigned readings every week (i.e., 1 review per reading). A paper review should consist of a one- to two-paragraph summary of the key ideas, followed by detailed comments on the pros and cons of the approach (with justifications) and any questions, comments, or thoughts about the work. You should concentrate on the content (i.e., the problem, ideas, evaluation methodology, etc.). You should not comment on grammar or typos. There are instructions on how to write a review here and there is an example review here. Paper reviews are due (via email in plain text format) each week by 11:59pm on Tuesday.
  • Participation (10% if you are taking the course for 3 credits, 20% otherwise): You will be required to present the assigned readings and lead the discussion during one class. You will also be graded on your general participation in discussions.
  • Semester-long project (50% if you are taking the course for 3 credits, not applicable otherwise): If you are taking the course for 3 credits, you must undertake a semester-long, student-proposed (but instructor-approved) project. Your project should consist of tackling an existing social science problem using novel computational methods/tools, comparing existing methods/tools for a new social science problem, or something similar. Project proposals (1 page) are due by 11:59pm on Feb. 08; status updates (1 paragraph) are due by 11:59pm on Mar. 15; project writeups (maximum 10 pages) are due by 11:59pm on Apr. 19; in-class project presentations (around 10 minutes) will take place on Apr. 27.
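
The percentages above can be read as two weighting schemes: 40/10/50 for students taking the course for 3 credits (with a project) and 80/20 for students without a project. That reading is inferred from the listed numbers rather than stated as an official formula. Under that assumption, a minimal Python sketch of the arithmetic looks like this. The names `WEIGHTS` and `final_grade` and the sample scores are hypothetical, chosen only for illustration.

```python
# Weighting schemes inferred from the grade breakdown above.
# Assumption: the first percentage in each pair applies to 3-credit
# students, and the second applies to students without a project.
WEIGHTS = {
    True: {"reviews": 0.40, "participation": 0.10, "project": 0.50},
    False: {"reviews": 0.80, "participation": 0.20},
}


def final_grade(scores, three_credits):
    """Return the weighted grade on the same scale as the component scores."""
    weights = WEIGHTS[three_credits]
    return sum(weight * scores[name] for name, weight in weights.items())


# Hypothetical component scores on a 0-100 scale.
scores = {"reviews": 90.0, "participation": 80.0, "project": 85.0}

print(final_grade(scores, three_credits=True))
print(final_grade(scores, three_credits=False))
```

In both schemes the weights sum to 100%, so the result stays on the same scale as the component scores. The project score is ignored when `three_credits` is `False`.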