CSC 2541: Visual Perception for Autonomous Driving

This class is a graduate course in visual perception for autonomous driving. The class will briefly cover topics in localization, ego-motion estimation, free-space estimation, visual recognition (classification, detection, segmentation), etc.

Prerequisites: A good knowledge of statistics, linear algebra, and calculus is necessary, as well as good programming skills. A good knowledge of computer vision and machine learning is strongly recommended.

August 12th: Course webpage has been created

Time and Location

Winter 2016

Day: Tuesday
Time: 1:00-3:00pm
Room: BA2145

Instructors

Raquel Urtasun

Email: urtasun@cs dot toronto dot edu
Homepage: http://www.cs.toronto.edu/~urtasun
Office hours: TBA

When emailing me, please put CSC2541 in the subject line.

Each student will need to write two paper reviews each week, present once or twice in class (depending on enrollment), participate in class discussions, and complete a project (done individually or in pairs).

Grading

The final grade will consist of the following:
`Participation` (attendance, participation in discussions, reviews)	20%
`Presentation` (presentation of papers in class)	20%
`Project` (proposal, final report, presentation)	60%

Detailed Requirements

Paper reviewing

Every week (except for the first two) we will read 2 to 3 papers. The success of the class discussion depends on how well prepared students are when they come to class. Each student is expected to read all the papers that will be discussed and write a detailed review of each of the two selected papers. Depending on enrollment, each student will also need to present a paper in class. When you present, you do not need to hand in a review.

Deadline: The reviews will be due one day before the class.

Structure of the review
`Short summary of the paper`
`Main contributions`
`Positive and negative points`
`How strong is the evaluation?`
`Possible directions for future work`

Presentation

Depending on enrollment, each student will need to present a few papers in class. The presentation should be clear and practiced and the student should read the assigned paper and related work in enough detail to be able to lead a discussion and answer questions. Extra credit will be given to students who also prepare a simple experimental demo highlighting how the method works in practice.

A presentation should be roughly 45 minutes long (please time it beforehand so that you do not go overtime). Typically this is about 30 slides. You are allowed to take some material from presentations on the web as long as you cite the source fairly. In the presentation, also provide the citation to the papers you present and to any other related work you reference.

Deadline: The presentation should be handed in one day before the class (or earlier if you want feedback).

Structure of presentation:
`High-level overview with contributions`
`Main motivation`
`Clear statement of the problem`
`Overview of the technical approach`
`Strengths/weaknesses of the approach`
`Overview of the experimental evaluation`
`Strengths/weaknesses of evaluation`
`Discussion: future direction, links to other work`

Project

Each student will need to write a short project proposal at the beginning of the class (in January). The projects will be research oriented. In the middle of the semester you will need to hand in a progress report. The final project report is due one week before the end of the class and will be presented in the last lecture of the class (April). This will be a short presentation of roughly 15-20 minutes.

Students can work on projects individually or in pairs. The project can be on an interesting topic that the student comes up with himself/herself or with the help of the instructor. The grade will depend on the ideas, how well you present them in the report, how well you position your work in the related literature, how thorough your experiments are, and how thoughtful your conclusions are.

Coming soon

Schedule (tentative)

Date	Topic	Readings	Presenters	Slides
Jan 12, 19	Introduction		Raquel Urtasun	intro
Feb 2	Stereo	Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches, CVPR 2015 [PDF] [code] J. Zbontar and Y. LeCun Stereo Processing by Semi-Global Matching and Mutual Information, PAMI 2008 [PDF] H. Hirschmueller Efficient Joint Segmentation, Occlusion Labeling, Stereo and Flow Estimation, ECCV 2014 [PDF] [code] K. Yamaguchi, D. McAllester and R. Urtasun	Wenjie Luo	stereo
Feb 2, 9	Optical Flow	Non-Local Total Generalized Variation for Optical Flow Estimation, ECCV 2014 [PDF] R. Ranftl, K. Bredies and T. Pock Large displacement optical flow: Descriptor matching in variational motion estimation, PAMI 2011 [PDF] T. Brox and J. Malik FlowNet: Learning Optical Flow with Convolutional Networks, ICCV 2015 [PDF] A. Dosovitskiy, P. Fischer, E. Ilg, P. Häusser, C. Hazırbaş, V. Golkov, P. Smagt, D. Cremers, T. Brox A Quantitative Analysis of Current Practices in Optical Flow Estimation and The Principles Behind Them, IJCV 2011 [PDF] D. Sun, S. Roth and M. Black	Shenlong Wang	motion
Feb 9, 16	Optical Flow / Scene Flow	EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow, CVPR 2015 [PDF] [code] J. Revaud, P. Weinzaepfel, Z. Harchaoui and C. Schmid DeepFlow: Large displacement optical flow with deep matching, ICCV 2013 [PDF] P. Weinzaepfel, J. Revaud, Z. Harchaoui and C. Schmid Robust Monocular Epipolar Flow Estimation, CVPR 2013 [PDF] K. Yamaguchi, D. McAllester and R. Urtasun 3D Scene Flow Estimation with a Piecewise Rigid Scene Model, CVPR 2015 [PDF] C. Vogel, K. Schindler and S. Roth	Min Bai	scene_flow
Feb 16	Visual Odometry	Visual-lidar Odometry and Mapping: Low-drift, Robust, and Fast, ICRA 2015 [PDF] J. Zhang and S. Singh StereoScan: Dense 3d Reconstruction in Real-time, IV 2011 [PDF] A. Geiger, J. Ziegler and C. Stiller Real-time stereo visual odometry for autonomous ground vehicles, IROS 2008 [PDF] A. Howard	Patric McGarey	visual odometry
Feb 23	SLAM	DTAM: Dense tracking and mapping in real-time, ICCV 2011 [PDF] R. Newcombe, S. Lovegrove, and A. Davison Large-scale direct SLAM with stereo cameras, IROS 2015 [PDF] J. Engel, J. Stuckler, and D. Cremers LSD-SLAM: Large-scale direct monocular SLAM, ECCV 2014 [PDF] J. Engel, T. Schöps, and D. Cremers Relative continuous-time SLAM, International Journal of Robotics Research 2015 [PDF] S. Anderson, K. MacTavish, and T. Barfoot Full STEAM ahead: Exactly sparse gaussian process regression for batch continuous-time trajectory estimation on SE(3), IROS 2015 [PDF] S. Anderson, T. Barfoot, C. H. Tong, and S. Särkkä Long-term 3D map maintenance in dynamic environments, ICRA 2014 [PDF] F. Pomerleau, P. Krusi, F. Colas, P. Furgale and R. Siegwart	Kirk MacTavish, Lingzhu Xiang	SLAM
March 1	Free-Space Estimation	Chapter 9 of Probabilistic Robotics Book [PDF] S. Thrun, W. Burgard, D. Fox Free Space Computation Using Stochastic Occupancy Grids and Dynamic Programming In Workshop Dynamical Vision ICCV 2007 [PDF] H. Badino, U. Franke and R. Mester The Stixel World - A Compact Medium Level Representation of the 3D-World DAGM 2009 [PDF] H. Badino, U. Franke and D. Pfeiffer	Hao Wu	Free-Space
March 8	2D Object Detection	Rich feature hierarchies for accurate object detection and semantic segmentation CVPR 2014 [PDF] R. Girshick, J. Donahue, T. Darrell, J. Malik Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks NIPS 2015 [PDF] S. Ren, K. He, R. Girshick, J. Sun Deep Residual Learning for Image Recognition ArXiv Dec 2015 [PDF] K. He, X. Zhang, S. Ren, J. Sun	Renjie Liao	2D Detection
March 8	3D Object Detection	Data-Driven 3D Voxel Patterns for Object Category Recognition CVPR 2015 [PDF] Y. Xiang, W. Choi, Y. Lin and S. Savarese 3D Object Proposals for Accurate Object Class Detection NIPS 2015 [PDF] X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun	Zhen Li	3D Detection
March 15	Semantic Segmentation	Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs ICLR 2015 [PDF] [Code] L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation ArXiv 2015 [PDF] [Project] V. Badrinarayanan, A. Kendall, R. Cipolla Joint Semantic Segmentation and 3D Reconstruction from Monocular Video ECCV 2014 [PDF] [Project] A. Kundu, Y. Li, F. Dellaert, F. Li, and J. M. Rehg	Stefania Raimondo	Semantic Segmentation
March 15	Instance-level Segmentation	Instance Segmentation of Indoor Scenes using a Coverage Loss ECCV 2014 [PDF] N. Silberman, D. Sontag, R. Fergus Instance-Level Segmentation with Deep Densely Connected MRFs Arxiv Dec 2015 [PDF] Z. Zhang, S. Fidler, and R. Urtasun	Mengye Ren	Instance-level Segmentation
March 22	Tracking	Global Data Association for Multi-Object Tracking Using Network Flows CVPR 2008 [PDF] L. Zhang and R. Nevatia Multiple Object Tracking using K-Shortest Paths Optimization PAMI 2011 [PDF] J. Berclaz, F. Fleuret, E. Turetken, and P. Fua Multi-target tracking by discrete-continuous energy minimization PAMI 2016 [PDF] A. Milan, K. Schindler, S. Roth	Wenjie Luo	Tracking
March 29	Place Recognition	FAB-MAP: Probabilistic localization and mapping in the space of appearance IJRR 2008 [PDF] M. Cummins and P. Newman Place recognition with ConvNet landmarks: Viewpoint-robust, condition-robust, training-free RSS 2015 [PDF] N. Sünderhauf, S. Shirazi, and A. Jacobson Convolutional networks for real-time 6-DOF camera relocalization ArXiv 2015 [PDF] A. Kendall, M. Grimes, and R. Cipolla	Valentin Peretroukhin	Place Recognition
