Exam 3 Review
The third exam will be given on Wednesday, December 03 in AEBN 119 at 1900. The exam will cover the material discussed in lectures from November 04 (naive Bayes classifiers, bias and variance) through December 02 (HMMs, the Viterbi algorithm, and, if we get to it, approximate inference on HMMs), along with the corresponding chapters of Russell & Norvig listed on the schedule, with some reference to relevant topics from the prior exam (the introductory machine learning material, probability, and the like).
The questions below cover the topics that may appear on the exam. You should be able to answer these questions. You should also be able to use the topics they cover in combination and to apply them to specific problems.
Bias and variance
- What is bias, what is variance, and how do they differ?
- Given that bias and variance are properties of distributions rather than single estimates, where do the different values in a sample distribution come from?
- Under what conditions is the bias of a learned model virtually guaranteed to be large? Under what conditions is the variance of a learned model guaranteed to be large?
- What causes bias and variance to be “traded off” against each other? (See the decomposition sketched after this list.)
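For the last question above: the standard way to make the tradeoff precise is the bias-variance decomposition of expected squared error, where the data satisfy y = f(x) + noise with noise variance sigma squared and the expectation is taken over random training sets D. This is the usual textbook identity, not anything specific to the lecture slides:

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\bigl(y - \hat{y}_D(x)\bigr)^2\right]
  = \underbrace{\bigl(f(x) - \mathbb{E}_D[\hat{y}_D(x)]\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\bigl(\hat{y}_D(x) - \mathbb{E}_D[\hat{y}_D(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Making a model more flexible typically shrinks the bias term while inflating the variance term, which is why the two are traded off rather than minimized independently.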
Multiple comparisons
- What are the three necessary and sufficient conditions for a multiple comparison procedure?
- What are two examples of multiple comparison procedures from everyday life?
- What is the effect of overfitting in the context of an algorithm for learning classification trees?
- Why can overfitting be thought of as the result of a multiple comparison procedure? (See the simulation sketched after this list.)
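For the last two questions above, here is a minimal simulation (not from the lecture notes; the sample size and number of candidates are arbitrary) of a multiple comparison procedure: many candidate rules with no real predictive power are scored on the same small sample, and the best-scoring one is kept. The selected winner looks much better than chance, which is the same mechanism by which greedy split selection lets tree learners overfit.

```python
import random

random.seed(0)
n_examples = 30      # small labeled sample
n_candidates = 50    # number of candidate "rules" compared on that sample

# True labels are coin flips; every candidate rule also guesses at random,
# so each candidate's true accuracy is exactly 0.5.
labels = [random.randint(0, 1) for _ in range(n_examples)]

def candidate_accuracy():
    guesses = [random.randint(0, 1) for _ in range(n_examples)]
    return sum(g == y for g, y in zip(guesses, labels)) / n_examples

accuracies = [candidate_accuracy() for _ in range(n_candidates)]

# Multiple comparison procedure: score many candidates on the same data
# and keep the maximum.  The winner looks better than chance even though
# no candidate has any real predictive power.
print(f"mean accuracy of a single candidate: {sum(accuracies) / len(accuracies):.2f}")
print(f"accuracy of the selected 'best' candidate: {max(accuracies):.2f}")
```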
Naive Bayes classifiers
- What are the two types of components of a naive Bayes classifier?
- How do naive Bayes classifiers differ from full Bayes nets? Why are they considered “naive”?
- What are the advantages and disadvantages of naive Bayes classifiers in comparison to other classifiers (for example, classification trees)?
- What is the simplest way to learn the conditional probability tables in naive Bayes classifiers (or in fact, any Bayes net)? What common problem is encountered in practice and how is this problem dealt with? (See the sketch after this list.)
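For the last question above, a minimal sketch of the simplest approach: estimate the prior and the conditional probability tables by counting, and apply add-one (Laplace) smoothing so that a feature value never observed with a class does not get probability zero. The toy spam/ham data and variable names here are invented for illustration, not the example used in class.

```python
from collections import Counter, defaultdict

# Toy training data: (class label, dict of feature values) -- invented for illustration.
data = [
    ("spam", {"has_link": 1, "all_caps": 1}),
    ("spam", {"has_link": 1, "all_caps": 0}),
    ("ham",  {"has_link": 0, "all_caps": 0}),
    ("ham",  {"has_link": 1, "all_caps": 0}),
]

classes = sorted({c for c, _ in data})
features = sorted({f for _, x in data for f in x})
values = {f: sorted({x[f] for _, x in data}) for f in features}

# Prior P(C): relative frequency of each class.
class_counts = Counter(c for c, _ in data)
prior = {c: class_counts[c] / len(data) for c in classes}

# CPTs P(X_i = v | C): counts with add-one (Laplace) smoothing, so a value
# never observed with a class still gets a small nonzero probability.
cpt = defaultdict(dict)
for c in classes:
    rows = [x for cls, x in data if cls == c]
    for f in features:
        for v in values[f]:
            count = sum(1 for x in rows if x[f] == v)
            cpt[(c, f)][v] = (count + 1) / (len(rows) + len(values[f]))

def posterior(x):
    """Unnormalized P(C | x) under the naive (conditional independence) assumption."""
    scores = {}
    for c in classes:
        p = prior[c]
        for f, v in x.items():
            p *= cpt[(c, f)][v]
        scores[c] = p
    return scores

print(posterior({"has_link": 1, "all_caps": 1}))
```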
Classification trees
- What are the advantages and disadvantages of classification trees in comparison to other classifiers (e.g., naive Bayes classifiers)?
- What is the approach used in most algorithms for learning classification trees? (See the sketch after this list.)
- What is the effect of pruning on tree complexity? On bias? On variance?
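Regarding the learning-approach question above: most tree learners grow the tree greedily, top down, at each node choosing the split that most reduces impurity. A minimal sketch of the entropy and information-gain computation used to score candidate splits at one node follows; the toy examples and attribute names are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(examples, attribute):
    """Reduction in entropy from splitting the examples on one attribute."""
    labels = [label for _, label in examples]
    before = entropy(labels)
    after = 0.0
    # Partition the examples by the attribute's value and weight each branch's
    # entropy by the fraction of examples that fall into it.
    for v in {x[attribute] for x, _ in examples}:
        branch = [label for x, label in examples if x[attribute] == v]
        after += len(branch) / len(examples) * entropy(branch)
    return before - after

# Toy examples (feature dict, class label) -- invented for illustration.
examples = [
    ({"outlook": "sunny",    "windy": False}, "no"),
    ({"outlook": "sunny",    "windy": True},  "no"),
    ({"outlook": "rain",     "windy": False}, "yes"),
    ({"outlook": "rain",     "windy": True},  "no"),
    ({"outlook": "overcast", "windy": False}, "yes"),
]

# Greedy step: the learner splits on whichever attribute scores highest,
# then recurses on each branch.
for attr in ("outlook", "windy"):
    print(attr, round(information_gain(examples, attr), 3))
```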
Artificial neural networks
- What are the types of artificial neural networks, and how do they relate to one another?
- What are the advantages and disadvantages of ANNs in comparison to other classifiers?
- What is the approach used for learning ANN structure? For learning ANN weights? (See the sketch after this list.)
- Why have large ANNs and so-called “deep learning” become more useful lately?
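For the weight-learning question above: weights are typically learned by gradient descent on a loss, with backpropagation (the chain rule applied layer by layer) supplying the gradients. Below is a minimal sketch with a single sigmoid unit trained by gradient descent to fit the OR function; the learning rate, epoch count, and initialization are arbitrary choices for illustration.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training data for the OR function: (inputs, target output).
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

# A single sigmoid unit: two weights plus a bias, initialized near zero.
w = [random.uniform(-0.1, 0.1) for _ in range(2)]
b = 0.0
rate = 0.5  # learning rate (arbitrary)

for epoch in range(2000):
    for (x1, x2), target in data:
        out = sigmoid(w[0] * x1 + w[1] * x2 + b)
        # Gradient of squared error with respect to the weights, via the
        # chain rule (backpropagation for a one-unit "network").
        delta = (out - target) * out * (1 - out)
        w[0] -= rate * delta * x1
        w[1] -= rate * delta * x2
        b    -= rate * delta

for (x1, x2), target in data:
    print((x1, x2), round(sigmoid(w[0] * x1 + w[1] * x2 + b), 2), "target", target)
```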
Continuous data and non-parametric modeling
- What is the difference between parametric and non-parametric modeling?
- What is kernel density estimation? Why might it be used rather than, say, fitting to a normal distribution? (See the sketch after this list.)
- What is the k-nearest-neighbor classifier? On what types of data sets can it be used? How does it compare to other types of classifiers in terms of model complexity?
- What approach is used to fit models for linear regression?
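For the kernel density estimation question above, a minimal one-dimensional sketch with a Gaussian kernel: the estimated density at a query point is the average of kernels centered on the sample points, so multimodal data keep their shape instead of being forced into a single parametric family. The sample values and bandwidth below are invented for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(query, sample, bandwidth):
    """Kernel density estimate at `query`: average of kernels centered on the data."""
    n = len(sample)
    return sum(gaussian_kernel((query - x) / bandwidth) for x in sample) / (n * bandwidth)

# Bimodal sample -- a single fitted normal would miss the two modes,
# which is the usual motivation for the non-parametric estimate.
sample = [1.0, 1.2, 0.8, 1.1, 4.0, 4.2, 3.9, 4.1]
for q in (1.0, 2.5, 4.0):
    print(q, round(kde(q, sample, bandwidth=0.5), 3))
```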
Unsupervised learning and clustering
- What is clustering, and why (in what contexts) is it useful?
- How does the k-means algorithm work? What types of clusters can it (and can it not) detect? (See the sketch after this list.)
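A minimal sketch of k-means (Lloyd's algorithm) for the question above: alternate between assigning each point to its nearest center and moving each center to the mean of its assigned points. The 2-D points and the choice k = 2 are invented for illustration; note that the squared-distance assignment step is also why k-means favors roughly spherical, similarly sized clusters.

```python
import random

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    centers = random.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centers[i] = tuple(sum(coords) / len(cluster)
                                   for coords in zip(*cluster))
    return centers, clusters

random.seed(1)
# Two loose blobs of 2-D points, invented for illustration.
points = [(1.0, 1.1), (0.9, 1.3), (1.2, 0.8), (5.0, 5.2), (5.1, 4.8), (4.9, 5.1)]
centers, clusters = kmeans(points, k=2)
print(centers)
```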
Reasoning over time and Markov models
- What is a Markov process? What is the Markov assumption? What do the nodes in a Markov model represent?
- What is the stationary distribution of a Markov process, and how do you solve for it?
- What do the nodes in a hidden Markov model represent?
- What is the forward algorithm, and what problem does it solve?
- What is the Viterbi algorithm, and what problem does it solve? (See the sketch after this list.)
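A minimal sketch of the Viterbi algorithm for the last question above: dynamic programming over the trellis that keeps, for each state at each time step, the probability of the single best path ending there together with a back pointer, then follows the back pointers to recover the most likely hidden-state sequence. The toy weather/umbrella model and all of its probabilities are invented for illustration, not the example used in lecture.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for the observations (max-product DP)."""
    # best[t][s] = probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Best predecessor: maximize over the previous state.
            prev, p = max(((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                          key=lambda pair: pair[1])
            best[t][s] = p * emit_p[s][observations[t]]
            back[t][s] = prev
    # Recover the path by following back pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path)), best[-1][last]

# Toy HMM: hidden weather, observed umbrella use (all numbers invented).
states = ("rain", "sun")
start_p = {"rain": 0.5, "sun": 0.5}
trans_p = {"rain": {"rain": 0.7, "sun": 0.3}, "sun": {"rain": 0.3, "sun": 0.7}}
emit_p = {"rain": {"umbrella": 0.9, "none": 0.1}, "sun": {"umbrella": 0.2, "none": 0.8}}

print(viterbi(["umbrella", "umbrella", "none"], states, start_p, trans_p, emit_p))
```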