Computational Biology and Bioinformatics
Course Description
This course is designed to provide computer scientists with a comprehensive introduction to the field of computational biology. The course will cover the application of computational techniques to modern research challenges in biology, discussing both foundational algorithms and newly introduced methods. The biological background needed to understand these methods will be provided. The primary focus will be the analysis of genomic data, including genome assembly, genome annotation, sequence alignment, phylogeny construction, mutation effect prediction, population genetics, and genotype-phenotype association studies. We will also cover gene expression analysis (RNA-seq and single-cell RNA-seq) and protein structure analysis and prediction. Throughout the course, we will emphasize the unique challenges of working with biological data. Through lectures and hands-on programming problem sets, students will develop the skills needed to tackle computational challenges in biology.
Syllabus
Organization
- Instructor: Anna G. Green
- Course number: COMPSCI 690U
- E-mail: annagreen@umass.edu (Note: Please include COMPSCI 690U in your email subject line.)
Schedule
This schedule is subject to change. Please check back frequently.
Week | Topics |
---|---|
Week 0 | Syllabus Discussion, Introduction, Lecture 0: Biological sequences as information |
Week 1 | Sequence alignment: basics and modern solutions. Needleman-Wunsch and Smith-Waterman algorithms, BLAST, evolutionary interpretation of sequence alignment |
Week 2 | DNA sequencing technology, read mapping and variant calling, Burrows-Wheeler transform |
Week 3 | De novo genome assembly, overlap graphs, de Bruijn graphs, long read sequencing technology |
Week 4 | Genome annotation, Markov chains, hidden Markov models for genome annotation, Viterbi algorithm |
Week 5 | Phylogenetics, continuous-time Markov models, Jukes-Cantor substitution model, gene trees versus species trees, outlook on molecular phylogenetics |
Week 6 | Population genetics, mutation and selection, genetic drift, tests for selection, dN/dS ratio, linkage disequilibrium |
Week 7 | Association Studies, controlling for population structure, multiple comparisons problem, heritability, interpretation of GWAS, ethical considerations |
Week 8 | Mutation effect prediction in proteins, deep mutational scan experiments, clinical variant data, classic models for mutation effect prediction, modern ML solutions for mutation effect prediction |
Week 9 | Mutation effect prediction in non-coding regions, experimental analysis of the function of non-coding regions, modern ML solutions for annotating non-coding regions |
Week 10 | Gene Expression Analysis, RNA sequencing experiments, inferring transcript abundance, single-cell RNA sequencing, correcting for sparsity, cell type inference |
Week 11 | Protein Structure Prediction, Levinthal’s paradox, homology modeling, evolution-based inference, AlphaFold |
Week 12 | Special topics |