Everyone has a large photo collection these days. How can you intelligently find all pictures in which your dog appears? How can you find all pictures in which you are frowning? Can we make cars smart, e.g., can a car drive you to school while you finish your homework? How can a home robot understand its environment, e.g., switch on the TV when asked, or serve you dinner? If you take a few pictures of your living room, can you reconstruct it in 3D (allowing you to render it from any new viewpoint and thus create a "virtual tour" of the room)? Can you reconstruct it from a single image? How can you efficiently browse your home movie collection, e.g., find all shots in which Tom Cruise is chasing a bad guy?
Prerequisites: A second-year course in data structures (e.g., CSC263H), first-year calculus (e.g., MAT135Y), and linear algebra (e.g., MAT223H) are required. Students who have not taken CSC320H will be expected to do some extra reading (e.g., on image gradients). Matlab will be used extensively in the programming exercises, so any prior exposure to it is a plus (but not a requirement).
The information sheet for the class is available here.
You are expected to do some programming assignments for the class. You can code in Python, Matlab, or C. The tutorials will be given in Python.
We will use Quercus for announcements, posting of assignments, Q&A, and discussions.
We will not directly follow any textbook; however, some readings will be required from the textbook below. Additional readings and material will be posted in the schedule table as well as in the resources section.
The textbook is freely available online and provides a great introduction to computer vision.
We will be reading the Sept 3, 2010 version.
Each student is expected to complete four assignments, in the form of problem sets and programming problems, as well as a project.
Assignments will be given every two weeks. They will consist of problem sets and programming problems with the goal of deepening your understanding of the material covered in class. All solutions and programming should be done individually. There will be four assignments altogether, each worth 15% of the final grade.
Deadline: The solutions to the assignments should be submitted by 11.59pm on the date they are due. Anything from one minute to 24 hours late counts as one late day.
Lateness policy: Each student will be given a total of 3 free late days. This means that you can hand in three of your assignments one day late, or one assignment three days late. It is up to you to plan your work accordingly. After you have used up your 3-day budget, your late assignments will not be accepted.
Plagiarism: We take plagiarism very seriously. Everything you hand in to be marked, namely assignments and projects, must represent your own work. Read How not to plagiarize.
Each student will complete a project. You will be able to choose a topic from a list of projects, or propose your own project, which will need to be discussed with and approved by your instructor. You will need to hand in a report and give a presentation. During the presentation the instructor will ask questions about the class material as well as the individual assignments. The grade will depend heavily on how well the project is defended and how well the student understands the class material.
Component | Weight |
---|---|
Assignments | 60% (4 assignments, each worth 15%) |
Project and Oral Exam | 40% (report and presentation: 30%; oral exam: 10%) |
The course will cover image formation, feature representation and detection, object and scene recognition and learning, multi-view geometry, and video processing. Since the Kinect is popular these days, we will also try to squeeze recognition with RGB-D data into the schedule. The topics are outlined below; a short illustrative code sketch follows the outline.
Image Processing
- Linear filters
- Edge detection

Features and matching
- Keypoint detection
- Local descriptors
- Matching

Low-level and mid-level grouping
- Segmentation
- Region proposals
- Hough voting

Recognition
- Face detection and recognition
- Object recognition
- Object detection
- Part-based models
- Image labeling

Geometry
- Image formation
- Stereo
- Multi-view reconstruction
- Kinect

Video processing
- Motion
- Action recognition
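As a taste of the very first topic above, here is a minimal Python sketch of a linear filter: Gaussian smoothing implemented as an explicit 2D convolution. This is an illustrative sketch only, not course-provided code; the function names, parameters, and the toy random image are our own assumptions.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized 2D Gaussian kernel (a classic linear filter)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def convolve2d(image, kernel):
    """Naive 2D convolution with zero padding (written for clarity, not speed)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.empty_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Weighted sum of the neighborhood under the (flipped) kernel
            out[i, j] = (padded[i:i + kh, j:j + kw] * flipped).sum()
    return out

# Smooth a random grayscale "image"; a real assignment would load a photo.
image = np.random.rand(64, 64)
smoothed = convolve2d(image, gaussian_kernel(size=5, sigma=1.0))
```

In practice you would reach for a library routine such as scipy.ndimage.gaussian_filter or OpenCV's cv2.GaussianBlur; the explicit loops above just make the definition of a linear filter concrete.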
Date | Topic | Reading | Slides | Additional material | Assignments |
---|---|---|---|---|---|
Jan 10 | Course Introduction | | lecture1.pdf tutorial1.zip | | |
Image Processing | |||||
Jan 10 | Linear Filters | Szeliski book, Ch 3.2 | lecture2.pdf | code: finding Waldo, smoothing, convolution | |
Jan 17 | Edge Detection | Szeliski book, Ch 4.2 | lecture3.pdf | code: edges with Gaussian derivatives | Assignment 1: due Jan 31, 2022, 11.59pm. Submit on MarkUs |
Jan 23 | Edge Detection | Szeliski book, Ch 4.2 | lecture4.pdf | ||
Jan 23 | Image Pyramids | Szeliski book, Ch 3.5 | lecture5.pdf | ||
Features and Matching | |||||
Jan 31 | Neural Networks (lecturer: Jun Gao) | | lecture6.pdf | | |
Feb 07 | Keypoint Detection: Harris Corner Detector | Szeliski book, Ch 4.1.1, pages 209-215 | lecture7.pdf | | Assignment 2: due Feb 26, 2022, 11.59pm |
Feb 14 | Keypoint Detection: Scale Invariant Keypoints | Szeliski book, Ch 4.1.1, pages 216-222 | lecture8.pdf | | |
Feb 28 | Local Descriptors: SIFT, Matching | Szeliski book, Ch 4.1.2; Lowe's SIFT paper | lecture9.pdf | | Assignment 3: due March 14, 2022, 11.59pm |
Mar 7 | Robust Matching, Homographies | Szeliski book, Ch 6.1 | lecture10.pdf | ||
Geometry | |||||
Mar 7 | Camera Models | Szeliski, Ch 2.1.5, pp. 46-54; Zisserman & Hartley, pp. 153-158 | lecture11.pdf | | |
Mar 14 | Camera Models contd. | ||||
Mar 21 | Stereo: Parallel Optics | | lecture12.pdf tutorial_depthmap.zip | | Assignment 4: due March 31, 2022, 11.59pm. Projects: due April 17, 2022, 11.59pm |
Mar 28 | Stereo: General Case | Szeliski book, Ch 11.1; Zisserman & Hartley, pp. 239-261 | lecture13.pdf epipolar_geometry.zip | | |
Mar 28 | Fast Retrieval | Sivic & Zisserman, Video Google | lecture14.pdf | | |