Everyone has a large photo collection these days. How can you intelligently find all the pictures in which your dog appears? How can you find all the pictures in which you are frowning? Can we make cars smart, e.g., can a car drive you to school while you finish your last homework? How can a home robot understand its environment, e.g., switch on the TV when told to and serve you dinner? If you take a few pictures of your living room, can you reconstruct it in 3D (which lets you render it from any new viewpoint and thus create a "virtual tour" of your room)? Can you reconstruct it from one image alone? How can you efficiently browse your home movie collection, e.g., find all shots in which Tom Cruise is chasing a bad guy?

This class is an introduction to fundamental concepts in image understanding, the subdiscipline of artificial intelligence that tries to make computers "see". It will survey a variety of interesting vision problems and techniques. Specifically, the course will cover image formation, features, object and scene recognition and learning, multi-view geometry, and video processing. Since Kinect is popular these days, we will also try to squeeze recognition with RGB-D data into the schedule. The goal of the class is to grasp a number of computer vision problems and understand basic approaches to tackling them in real-world applications.

Prerequisites: A second-year course in data structures (e.g., CSC263H), first-year calculus (e.g., MAT135Y), and linear algebra (e.g., MAT223H) are required. Students who have not taken CSC320H will be expected to do some extra reading (e.g., on image gradients). Matlab will be used extensively in the programming exercises, so any prior exposure to it is a plus (but not a requirement).


When emailing us, please put CSC420 in the subject line.

Information Sheet

The information sheet for the class is available here.

Programming Language(s)

You are expected to complete some programming assignments for the class. You can code in Python, Matlab, or C. The tutorials will be given in Python.



Forum

We will use Quercus for announcements, posting of assignments, Q&A, and discussions.


We will not directly follow any textbook; however, we will require some reading in the textbook below. Additional readings and material will be posted in the schedule table as well as the resources section.


Each student is expected to complete four assignments which will be in the form of problem sets and programming problems, and complete a project.

Assignments

Assignments will be given every two weeks. They will consist of problem sets and programming problems with the goal of deepening your understanding of the material covered in class. All solutions and programming should be done individually. There will be four assignments altogether, each worth 15% of the final grade.

Deadline: The solutions to the assignments should be submitted by 11.59pm on the date they are due. Anything from 1 minute late to 24 hours will count as one late day.

Lateness policy: Each student will be given a total of 3 free late days. This means that you can hand in three of your assignments one day late, or one assignment three days late. It is up to you to plan your work well. After you have used your 3-day budget, late assignments will not be accepted.
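
As a rough illustration of how the deadline and lateness rules combine, here is a minimal Python sketch. The function names `late_days_used` and `is_accepted` are hypothetical, and the code assumes that every started 24-hour period after the deadline counts as one more late day (the policy states this explicitly only for the first 24 hours). It is not an official grading tool.

```python
import math
from datetime import datetime, timedelta

FREE_LATE_DAYS = 3  # total budget stated in the lateness policy


def late_days_used(deadline: datetime, submitted: datetime) -> int:
    """Return how many late days a submission consumes (0 if on time)."""
    lateness = submitted - deadline
    if lateness <= timedelta(0):
        return 0
    # Assumption: each started 24-hour period counts as one late day,
    # so 1 minute late and exactly 24 hours late both cost one day.
    return math.ceil(lateness / timedelta(hours=24))


def is_accepted(days_needed: int, days_already_used: int) -> bool:
    """A late submission is accepted only while the 3-day budget lasts."""
    return days_already_used + days_needed <= FREE_LATE_DAYS


# Example: Assignment 1 is due Jan 31, 2021 at 11.59pm.
deadline = datetime(2021, 1, 31, 23, 59)
print(late_days_used(deadline, datetime(2021, 2, 1, 0, 0)))  # 1 minute late -> 1
```

Under these assumptions, a student who has already used two late days can still submit one day late, but not two.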

Plagiarism: We take plagiarism very seriously. Everything you hand in to be marked, namely assignments and projects, must represent your own work. Read How not to plagiarize.

Project and Oral Exam

Each student will be given a topic for the project. You will be able to choose from a list of projects, or propose your own project which will need to be discussed and approved by your instructor. You will need to hand in a report and give a presentation. During the presentation the instructor will ask questions about class material as well as individual assignments. The grade will heavily depend on how well the material is defended and how well the class material is understood by the student.

The final grade will be computed as follows:

Assignments: 60% (4 assignments, each worth 15%)
Project and Oral Exam: 40% (report, presentation: 30%, oral exam: 10%)
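
The weights listed above can be turned into a small calculation. The following Python sketch is only illustrative: the function name `final_grade` is hypothetical, and it assumes each component is scored as a percentage from 0 to 100 and that the report and presentation receive a single combined score.

```python
# Weights in percentage points, taken from the grading scheme.
ASSIGNMENT_WEIGHT = 15           # each of the 4 assignments
REPORT_PRESENTATION_WEIGHT = 30  # project report and presentation
ORAL_EXAM_WEIGHT = 10            # oral exam


def final_grade(assignments, report_presentation, oral_exam):
    """Weighted final grade in percent; all inputs are scores in [0, 100]."""
    if len(assignments) != 4:
        raise ValueError("expected exactly four assignment scores")
    total = ASSIGNMENT_WEIGHT * sum(assignments)
    total += REPORT_PRESENTATION_WEIGHT * report_presentation
    total += ORAL_EXAM_WEIGHT * oral_exam
    return total / 100  # weights sum to 100 percentage points


print(final_grade([80, 90, 70, 100], 85, 90))  # 85.5
```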


The course will cover image formation, feature representation and detection, object and scene recognition and learning, multi-view geometry and video processing. Since Kinect is popular these days, we will also try to squeeze recognition with RGB-D data into the schedule.

Image Processing
  Linear filters
  Edge detection
Features and matching
  Keypoint detection
  Local descriptors
  Matching
Low-level and Mid-level grouping
  Segmentation
  Region proposals
  Hough voting
Recognition
  Face detection and recognition
  Object recognition
  Object detection
  Part-based models
  Image labeling
Geometry
  Image formation
  Stereo
  Multi-view reconstruction
  Kinect
Video processing
  Motion
  Action recognition
Tentative Schedule


| Date | Topic | Reading | Slides | Additional material | Assignments |
|---|---|---|---|---|---|
| Jan 11 | Course Introduction | | lecture1.pdf | tutorial1.zip | |
| Image Processing | | | | | |
| Jan 11 | Linear Filters | Szeliski book, Ch 3.2 | lecture2.pdf | code: finding Waldo, smoothing, convolution | |
| Jan 18 | Edge Detection | Szeliski book, Ch 4.2 | lecture3.pdf | code: edges with Gaussian derivatives | Assignment 1: due Jan 31, 11.59pm, 2021. Submit on MarkUs |
| Jan 25 | Edge Detection | Szeliski book, Ch 4.2 | lecture4.pdf | | |
| Feb 01 | Image Pyramids | Szeliski book, Ch 3.5 | lecture5.pdf | | |
| Features and Matching | | | | | |
| Feb 01 | Keypoint Detection: Harris Corner Detector | Szeliski book, Ch 4.1.1, pages 209-215 | lecture6.pdf | | Assignment 2: due Feb 16, 11.59pm, 2021 |
| Feb 08 | Keypoint Detection: Scale Invariant Keypoints | Szeliski book, Ch 4.1.1, pages 216-222 | lecture7.pdf | tutorial3.ipynb | |
| Feb 22 | Local Descriptors: SIFT, Matching | Szeliski book, Ch 4.1.2; Lowe's SIFT paper | lecture8.pdf | tutorial4.zip; code: SIFT code | |
| Mar 1 | Research presentation by Amlan Kar, Seung Kim and Hirotaka Ishihara | | | | Assignment 3: due March 16, 11.59pm, 2021 |
| Mar 8 | Robust Matching, Homographies | Szeliski book, Ch 6.1 | lecture9.pdf | tutorial5.zip | |
| Geometry | | | | | |
| Mar 8 | Camera Models | Szeliski, 2.1.5, pp. 46-54; Zisserman & Hartley, 153-158 | lecture10.pdf | | |
| Mar 15 | Homography revisited | | lecture11.pdf (hi-res version) | | |
| Mar 15 | Stereo: Parallel Optics | | lecture12.pdf | tutorial_depthmap.zip | |
| Mar 22 | Stereo: General Case | Szeliski book, Ch. 11.1; Zisserman & Hartley, 239-261 | lecture13.pdf (hi-res version) | epipolar_geometry.zip | Assignment 4: due March 30, 11.59pm, 2021 |
| Recognition | | | | | |
| Mar 29 | Recognition: Overview | Grauman & Leibe, Visual Object Recognition | lecture14.pdf | | Projects: due April 15, 11.59pm, 2021 |
| Mar 29 | Fast Retrieval | Sivic & Zisserman, Video Google | lecture16.pdf | | |
| Apr 05 | Implicit Shape Model | B. Leibe et al., Robust Object Detection with Interleaved Categorization and Segmentation | lecture17.pdf | | |


