10701 Introduction to Machine Learning

Syllabus and (tentative) Course Schedule

Date	Lecture	Topics	Readings and useful links	Announcements
Module 1: Supervised Learning
Thu 9/3	Lecture 1 (Eric, Ziv): Intro to probability, MLE -Slides	Introduction to the course; Basic probability; Maximum likelihood estimate
Tue 9/8	No class
Thu 9/10	Lecture 2 (Ziv): Classification, kNN -Slides,Video	Optimal decision using Bayes rule; Types of classifiers; Effect of values of k on kNN classifiers; Probabilistic interpretation of kNN
Tue 9/15	No class
Thu 9/17	Lecture 3 (Ziv): Decision trees -Slides (updated),Video	Discriminative classifiers; Entropy; Information gain; Building decision trees		PS1 out
Tue 9/22	Lecture 4 (Ziv): Naïve Bayes -Slides,Quiz,Video	Problems with estimating full joints; Advantages of Naïve Bayes assumptions; Applications to discrete and continuous cases; Problems with Naïve Bayes classifiers	Mitchell, 6.1-6.10
Thu 9/24	Lecture 5 (Ziv): Linear regression -Slides,Video	Basic model; Solving linear regression; Error in linear regression; Advanced regression models	Bishop, 3.1
Tue 9/29	Lecture 6 (Ziv): Logistic regression -Slides,Video	Logistic regression vs. linear regression; Sigmoid function; MLE via gradient ascent; Regularization; Logistic regression for multiple classes	Bishop, 4.2-4.3
Thu 10/1	Lecture 7 (Eric): Perceptron, Neural networks -Slides,Video	Perceptron; Multilayer Perceptron; Backpropagation; "Deep" Learning; Convolutional Neural Networks; Layer-wise Pre-training	Bishop, Ch. 3; Bengio, Learning Deep Architectures for AI; Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks	PS2 out, PS1 due (10/2)
Tue 10/6	Lecture 8 (Eric): Deep learning, SVM -Slides1,Slides2,Video	"Deep" Learning; Convolutional Neural Networks; Support Vector Machines	Deep Learning: Salakhutdinov, Learning Deep Generative Models; Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift; Bahdanau et al., Neural machine translation by jointly learning to align and translate. SVM: Burges, A Tutorial on Support Vector Machines for Pattern Recognition; Schölkopf and Smola, Learning with Kernels
Thu 10/8	Lecture 9 (Eric): SVM -Slides,Annotated,Video	Duality, KKT condition; Kernel trick; Sequential Minimal Optimization (SMO)	Platt, Fast Training of Support Vector Machines using Sequential Minimal Optimization; Chang and Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology; Fan et al., LIBLINEAR: A Library for Large Linear Classification	Proposal due
Tue 10/13	Lecture 10 (Eric): Evaluating classifiers, Bias-variance decomposition -Slides,Video	Bias-variance decomposition; Structural risk minimization; Ways to avoid overfitting	Vapnik, V. (1999) An Overview of Statistical Learning Theory, IEEE Transactions on Neural Networks, Vol. 10, pp. 988-99; E.P. Xing, M.I. Jordan and R.M. Karp (2001), Feature selection for high-dimensional genomic microarray data, Proceedings of the Eighteenth International Conference on Machine Learning; Andrew Y. Ng (1998), On Feature Selection: Learning with Exponentially many Irrelevant Features as Training Examples, Proceedings of the Fifteenth International Conference on Machine Learning
Thu 10/15	Lecture 11 (Eric): Ensemble learning - Boosting, Random Forests -Slides,Video	Combining weak learners; Bagging and random forest; AdaBoost, algorithm and generalization bounds; Gradient boosting	Freund and Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting; Schapire et al., Boosting the margin: A new explanation for the effectiveness of voting methods; Breiman, Random Forests; Friedman, Greedy Function Approximation: A Gradient Boosting Machine; Chen, XGBoost, an efficient implementation of gradient boosting	PS3 out, PS2 due (10/16)
Module 2: Unsupervised Learning
Tue 10/20	Lecture 12, 13 (Ziv): Unsupervised learning - clustering -Slides,Video	Hierarchical clustering; K-means and Gaussian mixture models; Number of clusters	Bishop, Ch. 9; Optional: Mitchell, 6.12
Thu 10/22
Tue 10/27	Lecture 14 (Ziv): Semi-supervised learning -Slides,Video	Re-weighting; EM, data augmentation; Co-training; Detect overfitting
Thu 10/29	Lecture 15 (Eric): Learning theory -Slides,Annotated,Video	Realizable vs. agnostic; PAC learning in finite concept class; Sample complexity	Mitchell, Ch. 7; Vapnik, V. (1999) An Overview of Statistical Learning Theory, IEEE Transactions on Neural Networks, Vol. 10, pp. 988-99	PS4 out, PS3 due (10/30)
Tue 11/3	Lecture 16 (Eric): VC dimension -Slides,Video	Sample complexity for infinite concept classes; VC dimension as a complexity measure; Structural risk minimization	Ch. 3, An Introduction to Computational Learning Theory, M. Kearns and U. Vazirani
Module 3: Probabilistic Representation and Modeling
Thu 11/5	Lecture 17 (Eric): Graphical models, Bayes nets -Slides,Video		An introduction to graphical models, M. I. Jordan	Midway report due
Tue 11/10	Lecture 18 (Eric): Bayes nets -Slides,Video		Bishop, 8.4
Thu 11/12	Lecture 19 (Eric): Undirected graphical models -Video			PS4 due (11/13)
Sun 11/15	Midterm review at Doherty 2315, 4-6pm
Tue 11/17	No class, Midterm at DH 2315, 5pm
Thu 11/19	Lecture 20 (Ziv): HMM -Slides,Video			PS5 out
Tue 11/24	Lecture 21 (Ziv): HMM inference -Slides
Thu 11/26	No class
Tue 12/1	Lecture 22 (Eric): MDPs, Reinforcement learning -Slides		Sutton, Richard and Barto, Andrew: Reinforcement Learning: an introduction, MIT Press, Cambridge, MA, 1998	PS5 due
Thu 12/3	Lecture 23 (Eric): Topic models -Slides		D. Blei et al., Latent Dirichlet Allocation; T. Griffiths and M. Steyvers, Finding Scientific Topics; Eric's ACL 2012 Tutorial, Topic Models, Latent Space Models, Sparse Coding, and All That: A systematic understanding of probabilistic semantic extraction in large corpus
Fri 12/4	Poster session, 2:30pm at NSH Atrium
Module 4: Applications of ML
Tue 12/8	Lecture 24 (Ziv): Computational biology -Slides			Final report due (12/11)