Machine learning has become a critical mathematical tool for a variety of fields that involve big data, such as computer vision, natural language processing, and bioinformatics. Machine learning research aims to build computer systems that learn from experience. Learning systems are not directly programmed by a person to solve a problem; instead, they develop their own program based on examples of how they should behave, or from trial-and-error experience trying to solve the problem. These systems require learning algorithms that specify how the system should change its behavior as a result of experience. Researchers in machine learning develop new algorithms and try to understand which algorithms should be applied in which circumstances. This course will focus on the machine learning methods that have proven valuable and successful in practical applications, and will contrast the various methods with the aim of explaining the circumstances under which each is most appropriate. We will also discuss basic issues that confront any machine learning method.
Credit: This course is adapted from Rich Zemel's and Raquel Urtasun's CSC411 course.
Prerequisites: You should understand basic probability and statistics (STA 107, 250) and college-level algebra and calculus. For example, it is expected that you know about standard probability distributions (Gaussian, Poisson) and how to calculate derivatives. Knowledge of linear algebra is also expected, and knowledge of the mathematics underlying probability models (STA 255, 261) will be useful. For the programming assignments, you should have some background in programming (CSC 270), and it would be helpful if you know Matlab or Python. Some introductory material for Matlab will be available on the course website as well as in the first tutorial.
The information sheet for the class is available here.
This class uses Piazza. On this webpage, we will post announcements and assignments. Students will also be able to post questions and discussions in a forum-style manner, either to their instructors or to their peers.
Please sign up here at the beginning of class.
There is no required textbook for this course. There are several recommended books. We will recommend specific chapters from two books: Introduction to Machine Learning by Ethem Alpaydin, and Pattern Recognition and Machine Learning by Chris Bishop. Additional readings and material will be posted in the schedule table as well as in the resources section.
Each student is expected to complete three assignments. There will also be a mid-term and a final exam. Participation in class discussions is highly encouraged.
Component | Weight |
---|---|
Assignments | 40% (3 assignments; the first two worth 12.5% each, the last worth 15%) |
Mid-Term Exam | 25% |
Final Exam | 35% |
The best way to learn about a machine learning method is to program it yourself and experiment with it. The assignments will therefore generally involve implementing machine learning algorithms and experimenting with them on some data. You will be asked to summarize your work and analyze the results in brief (3-4 page) write-ups. The implementations may be done in any language, but Matlab and Python are recommended. A brief tutorial on Matlab is included here. You may also use Octave.
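To give a flavor of the implement-then-experiment loop, here is a minimal, hypothetical sketch in Python (NumPy only, not tied to any actual assignment): it fits linear regression, one of the first topics in the course, by gradient descent on synthetic data.

```python
import numpy as np

# Synthetic 1-D regression data: y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2 * x + 1 + rng.normal(scale=0.1, size=100)

# Design matrix with a bias column, so the model is y = X @ w.
X = np.column_stack([x, np.ones_like(x)])
w = np.zeros(2)

# Batch gradient descent on the mean squared error.
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad

print("learned weights:", w)  # should end up close to [2, 1]
```

A write-up for an experiment like this would report the learned weights and, for example, how the fit changes as the noise level or learning rate varies.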
Collaboration on the assignments is not allowed. Each student is responsible for his or her own work. Discussion of assignments and programs should be limited to clarification of the handout itself, and should not involve any sharing of pseudocode, code, or simulation results. Violation of this policy is grounds for a semester grade of F, in accordance with university regulations.
The schedule of assignments is included in the syllabus. Assignments are due at the beginning of class/tutorial on the due date. Because they may be discussed in class that day, it is important that you have completed them by then. Assignments handed in late but before 5 pm on the due date will be penalized by 5% (i.e., total points multiplied by 0.95); a late penalty of 10% per day will be assessed thereafter. Extensions will be granted only in special situations, and you will need a Student Medical Certificate or a written request approved by the instructor at least one week before the due date.
Submission: Solutions to the assignments should be submitted through MarkUs.
Plagiarism: We take plagiarism very seriously. Everything you hand in to be marked, namely assignments and exams, must represent your own work. Read How not to plagiarize.
There will be a mid-term exam in class on TBA; it will be a closed-book exam on all material covered up to that point in the lectures, tutorials, required readings, and assignments.
The final will not be cumulative, except insofar as concepts from the first half of the semester are essential for understanding the later material.
We expect students to attend all classes and all tutorials. This is especially important because we will cover material in class that is not included in the textbook. The tutorials will not only be for review and answering questions; new material will also be covered.
The course will cover classification, regression, neural networks, mixture models, and reinforcement learning.
- Linear classification and regression
- Multi-class classification
- Decision trees
- Neural networks
- Clustering
- Mixture of Gaussians
- Principal component analysis
- Support vector machines
- Ensemble methods
- Reinforcement learning
Date | Topic | Reading | Slides | Assignments |
---|---|---|---|---|
Jan 11 | Course Introduction | | lecture1 | |
Jan 13 | Linear Regression | Bishop 1.0-1.1, 3.1 | lecture2 | |
Jan 15 | Tutorial: Review of Probability | | tutorial1 | |
Jan 18 | Linear Classification | Bishop pp. 179-195 | lecture3 | |
Jan 20 | Logistic Regression | Bishop pp. 203-207 | lecture4 | |
Jan 25 | Non-parametric Methods | Bishop pp. 120-127 | lecture5 | |
Jan 27 | Decision Trees | | lecture6 | |
Jan 29 | Tutorial: Gradient Descent, KNN | | tutorial2 | Assignment 1: out Jan 31, due Feb 12 at noon, 2016 |
Feb 1 | Multi-class Classification | Bishop 4.1.2, 4.3.4 | lecture7 | |
Feb 3 | Tutorial: K-NN and Decision Trees | | tutorial3 | |
Feb 5 | Probabilistic Classifiers I | Bishop 4.2.2 | lecture8 | |
Feb 8 | Probabilistic Classifiers II | Bishop pp. 380-381 | lecture9 | |
Feb 10 | Neural Networks I | Bishop 5.1-5.3 | lecture10 | |
Feb 22 | Tutorial: Naive Bayes | | tutorial4 | |
Feb 24 | Tutorial: Midterm Review | | tutorial5 | |
Feb 26 | Tutorial: Neural Networks | | tutorial6 | |
Mar 2 | Neural Networks II | | lecture11 | |
Mar 7 | Clustering | Bishop 9.1 | lecture12 | Assignment 2: out Mar 5, due Mar 18 at 6pm, 2016 |
Mar 9 | Mixture of Gaussians | Bishop 9.2-9.3 | lecture13 | |
Mar 11 | Tutorial: Clustering | | tutorial7 | |
Mar 14 | PCA & Autoencoders | Bishop 12.1 | lecture14 | |
Mar 16 | Tutorial: PCA | | tutorial8 | |
Mar 18 | Support Vector Machines | Bishop Ch. 7, pp. 325-337 | lecture15 | |
Mar 21 | Kernels | | lecture16 | Assignment 3: out Mar 22, due Apr 9 at 6pm, 2016 |
Mar 23 | Ensemble Methods I | Bishop 14.2-14.3 | lecture17 | |
Mar 28 | Ensemble Methods II | | lecture18 | |
Apr 1 | Tutorial: SVM | | tutorial9 | |
Apr 4 | Reinforcement Learning | | lecture19 | |
The challenges are not part of the assignments. They are meant to make your learning experience faster and more fun. Your participation is absolutely optional.
You are given data for 1000 movies, 700 of which are in the training set and 300 in the test set. The information for each movie is: cast, director(s), writer(s), storyline (short description), synopsis (long description), keywords, and budget. Not all data is available for every movie. You also have ground-truth ratings and genres; these are to be used only for training and for evaluation.
Kung Fu Panda 3 is released on Jan 27, 2016, and no rating info is currently available. Can you predict its rating automatically (and not by going to the cinema and watching it)? Submit your solutions by March 1. Winner(s) will be announced at the beginning of March (when enough votes have been collected on IMDb)!
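If you want a starting point, below is a minimal baseline sketch in Python. It is only an illustration: the movie strings and ratings are made up, loading the actual challenge data is up to you, and a competitive entry would use richer features than a plain bag of words over the text fields.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# Toy stand-ins for the real data (keywords + storyline per movie);
# these strings and ratings are invented for illustration only.
train_texts = [
    "animated panda kung fu comedy family",
    "dark gritty crime drama heist",
    "romantic comedy wedding new york",
]
train_ratings = [7.9, 8.3, 6.1]  # hypothetical IMDb-style ratings
test_texts = ["animated panda sequel kung fu family"]

# Bag-of-words (TF-IDF) features over each movie's text fields.
vec = TfidfVectorizer()
X_train = vec.fit_transform(train_texts)
X_test = vec.transform(test_texts)

# Regularized linear regression as a simple baseline predictor.
model = Ridge(alpha=1.0)
model.fit(X_train, train_ratings)
print(model.predict(X_test))
```

From there you could fold in the other fields (cast, director, budget) as additional features and measure your error on the 300 held-out movies before submitting a prediction.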