CSC466-2305 Numerical Methods for Optimization Problems
Course information for current students:
- 30-4-2024: A3 marks are posted on MarkUs.
Course marks and individual assignment marks are posted
on the
CDF (teach.cs) student secure site.
Course marks are unofficial and await approval by the department/faculty.
Have a nice summer!
- 28-3-2024: A2 marks are posted on MarkUs and on the
CDF (teach.cs) student secure site.
Please note that the latter has all marks (incl. the term test).
Please check for any inconsistencies.
Since we did not have the chance to discuss A2,
some notes on A2 are posted here.
- 23-3-2024: Assignment 3 has been posted.
Updated on 28-3-2024. Please re-load.
Updated on 29-3-2024. Please re-load.
There will be no classes in the last week of term,
as I will be away at a conference.
In-person office hours will be
Wednesday 27 March 2024, 1-2 PM,
Friday 29 March 2024, 3-4 PM.
I will be available through Zoom during the last week
of classes, but you need to send me a note ahead of time,
to arrange remote office hours at specific
mutually agreeable times, on an individual basis.
Please take a look at A3 so that you can ask
the most important questions during the coming week, in-person.
- 11-3-2024: Clarifications on A2
have been posted on the bulletin board.
- 4-3-2024: Term tests have been marked and marks
are available at the
CDF (teach.cs) student secure site.
Note that midterm marks are NOT available on MarkUs.
In class (5-3-2024), I will discuss some of the solutions.
- 4-3-2024:
Extra and regular office hours:
Wednesday, March 6, 1-2 PM,
Friday, March 8, 3-4 PM,
Monday, March 11, 4:30-5:30 PM,
Wednesday, March 13, 1-2 PM.
- 16-2-2024: Information on term test.
- 14-2-2024: Assignment 2 has been posted.
Due Fri 15 Mar 2024, 6 PM.
A1 has been marked and marks are available on MarkUs.
On Thursday, 15 Feb 2024, I will discuss some subtle points of A1.
For the term test, I will soon post some past questions.
Office hours before the term test:
Wednesday 21 February 2024, 1-2 PM.
Friday 23 February 2024, 3-4 PM.
Monday 26 February 2024, 4:30-5:30 PM.
I will post more extra office hours when A2 is close to due.
- 31-1-2024: An example plot in matlab.
This lets the user control the properties of the plot
through matlab code.
While you can do the same through matlab's graphical interface,
when the results/data change and the plot is redone,
the graphical interface does not let you quickly
re-apply the same settings for the various plot properties.
The above code sets the properties with the commands just after the plot.
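The posted matlab file is linked above and is not reproduced here. As a rough, hypothetical sketch of the same idea (in Python/matplotlib rather than matlab, for those using free alternatives; all names below are illustrative, not from the course code), the properties can be set in code right after the plot command:

```python
# Hypothetical sketch (not the posted matlab example): set plot
# properties in code, just after creating the plot, so the same
# settings are re-applied automatically whenever the data change
# and the plot is redone.
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots()
line, = ax.plot(x, y)

# Properties set in code, right after the plot command:
line.set_linewidth(2)
ax.set_xlabel("x", fontsize=12)
ax.set_ylabel("sin(x)", fontsize=12)
ax.set_title("Example plot", fontsize=12)
ax.grid(True)
```

Because the settings live in the script, re-running it after the data change reproduces the plot with identical properties, which is exactly what redoing them through the graphical interface makes tedious.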
- 29-1-2024: MarkUs is now open.
- 18-1-2024: Assignment 1 has been posted.
Due Fri 2 Feb 2024, 6 PM.
Extra and regular office hours:
Monday, January 22, 4:30-5:30 PM,
Wednesday, January 24, 1-2 PM,
Friday, January 26, 3-4 PM,
Monday, January 29, 4:30-5:30 PM,
Wednesday, January 31, 1-2 PM.
-
The course starts on Tuesday, January 9, 2024, 3-5 PM, in MP 134.
Note that we start at 3:10 PM.
-
No classes
the week 19-23 February 2024 (Winter break / Reading week -- A&S calendar).
-
No classes
the last week (2 and 4 April 2024) (I have to be away for a conference).
-
There will be no formal tutorials.
Tutorial slots will be used for lectures, so that
we can go at a slower pace.
-
Bulletin board for csc466-2305 Winter 2024
-
Important note on the use of bulletin boards:
No parts of or whole answers to the assignment/exam problems
should be posted to the boards (or anywhere else),
even after the assignment is due.
Questions and answers (even ones written by you) should never
be shared with anyone, anywhere.
Any violation of this rule will bring trouble to the poster.
Please use judgement before posting.
Any questions posted on the bulletin boards should be general enough
and should not reveal intermediate or final results (correct or wrong).
If unsure, ask the instructor by e-mail.
-
All assignments and exams are to be done individually by each student.
See the course outline
about academic integrity and additional information.
- Here is a latex example file,
with the associated files
spyalt.eps, spyblock.eps,
trochoid1.m,
and assign.bib needed for it to compile correctly,
producing the output 466-2305a1.pdf.
You can use it for assignments, but you may also use your own latex template.
Please always use font size 12 and linespread 1.1, as shown in the sample file.
Do NOT use dark background in any page or figure.
See the course outline for more details on presentation.
-
CDF/Teaching Labs:
Please see http://www.cdf.toronto.edu
or http://www.teach.cs.toronto.edu
(they are the same), especially the
Resources, Intro to new students tab, and the
Using Labs, Remote Access Server tab.
-
MATLAB etc. remotely on CDF:
If you want to run matlab and other applications remotely
on teach.cs (CDF), and you are using Windows or Mac OS
on your computer/laptop, you will need so-called
X forwarding on your computer/laptop.
X forwarding lets you run any application, including matlab,
remotely on teach.cs (CDF) with the output/display/plots
shown on your computer/laptop.
See
https://www.teach.cs.toronto.edu/using_cdf/x2go.html
for a way to have X forwarding on your computer/laptop.
Basic Matlab is free for students.
See https://www.mathworks.com/academia/tah-portal/university-of-toronto-676468.html
So you can also install it on your computer/laptop
and run it locally.
You can also consider perpetually free alternatives to Matlab,
such as Octave.
-
To use matlab on CDF through the Linux shell, you need to log in via ssh
(from an xterm or other terminal on Unix, Linux, Mac, or Cygwin;
from PuTTY on Windows;
or through a free X-forwarding application such as MobaXterm or X2Go)
with a command such as
ssh -X user@cdf.toronto.edu
or
ssh -l user -X -f wolf.cdf.toronto.edu xterm
where "user" is your cdf username; then, once on cdf, run
/usr/local/bin/matlab
or
/usr/local/bin/matlab -nodesktop
Within matlab, you may want to go to a certain directory, say ~/matlab,
and for this you can use the unix shell command
cd ~/matlab
within matlab. You may also want to have a startup.m file
in that directory, to always run some standard commands
(e.g. format compact) every time you start matlab.
-
The textbook for this course is
Jorge Nocedal and Stephen Wright,
Numerical Optimization,
Springer NY, 2006
This book is available from the UT library as an e-book.
-
Another book we will use in the first few lectures is
Michael Heath,
Scientific Computing: An Introductory Survey,
SIAM 2018 or McGraw-Hill Inc. 2002+.
This book is available from
https://my.siam.org/Store/Product/viewproduct/?ProductId=30156464
The list price is approximately $94.00 USD, but
there is also a 30% discount for SIAM members
(students can take advantage of this by signing up
for a free SIAM Student membership).
https://www.siam.org/Membership/Join-SIAM/Individual-Members/Student
Textbook web page (see educational modules)
Material to be covered in the course
(with textbook sections in parentheses)
2024-01-09 (2 hours)
1. Introduction
1.1 Scope of the course
1.2 Vectors and matrices [N+W A.1]
1.3 Eigenvalues and eigenvectors [N+W A.1]
2024-01-11 (1 hour)
1.4 Norms and inner products [N+W A.1]
Condition number of a matrix [N+W A.1]
2024-01-16 (2 hours)
1.5 Rate of convergence of sequences [N+W A.2]
1.6 Convexity [N+W Ch 1]
1.7 Gradient, Hessian, Jacobian, directional derivative, steepest descent direction [N+W A.2]
1.8 Mean Value Theorem, Taylor's Theorem [N+W A.2]
1.9 Global, local, etc, minimizers, conditions for optimality [N+W 2.1]
2024-01-18 (1 hour)
1.10 Finite difference approximations to derivatives [~N+W 8.1]
1.11 O(h) and O(n) notations, convergence of a discretization method [N+W A.2]
2024-01-23 (2 hours)
2.1 Optimization problems and methods (general) [N+W Ch 1]
2.2 1D opt: Golden section search method [Heath 6.4]
2.3 1D opt: Successive parabolic interpolation [Heath 6.4]
2.4 1D opt: Newton's [Heath 6.4]
2024-01-25 (1 hour)
2.5 Optimization methods in multiple dimensions (general) [N+W 2.2]
2.6 multi-dim opt: Steepest descent direction [N+W 2.2]
2.7 multi-dim opt: Descent directions [N+W 2.2]
2.8 multi-dim opt: Newton's direction [N+W 2.2]
2.9 multi-dim opt: Quasi-Newton's directions, and approximations to Hessian inverse [N+W 2.2]
2024-01-30 (2 hours)
2.9 multi-dim opt: Quasi-Newton's directions, and approximations to Hessian inverse [N+W 2.2] (end)
2.10 multi-dim opt: Conjugate gradient direction [N+W 2.2]
2.11 Quadratic function, gradient, Hessian and linear system
3.1 Line-search methods - picking a stepsize [N+W 3.1]
3.2 The Wolfe conditions, sufficient decrease and curvature [N+W 3.1]
3.3 The strong Wolfe conditions [N+W 3.1]
Existence of intervals of stepsizes satisfying the (strong) Wolfe conditions
2024-02-01 (1 hour)
3.4 The Goldstein conditions [N+W 3.1]
3.5 Simple backtracking algorithm for stepsize [N+W 3.1]
2024-02-06 (2 hours)
3.6 Convergence of line-search methods [N+W 3.2]
Zoutendijk theorem
3.7 Steepest descent with exact line search [N+W 3.3]
Rate of convergence
3.8 Newton's convergence and rate [N+W 3.3]
3.9 Quasi-Newton methods' convergence and rate [N+W 3.3]
2024-02-08 (1 hour)
3.10 Newton's: modifications to Hessian [N+W 3.4]
Eigenvalue modification
Diagonal shift
Modified Cholesky factorization
Modified symmetric indefinite factorization
2024-02-13 (2 hours)
3.11 More on backtracking algorithms for calculating the stepsize [N+W 3.5]
Initial condition
Algo to satisfy the Wolfe conditions
4.1 Trust region methods [N+W 4.0]
Validity of model and region, generic algorithm
4.2 Constrained minimization of quadratic - complementarity problem [N+W 4.0]
4.3 Constrained minimization of quadratic - Cauchy point [N+W 4.1]
4.4 Constrained minimization of quadratic - Dogleg method [N+W 4.1]
4.5 Constrained minimization of quadratic - two-dim subspace minimization [N+W 4.1]
4.6 Convergence of trust region - reduction by Cauchy point and other [N+W 4.2]
2024-02-15 (1 hour)
A1 discussion
reading week
2024-02-27 (2 hours)
term test
2024-02-29 (1 hour)
4.6 Convergence of trust region - reduction by Cauchy point and other [N+W 4.2]
4.7 Convergence of trust region to stationary points [N+W 4.2]
4.8 Constrained minimization of quadratic - complementarity problem [N+W 4.3]
4.9 Convergence of trust region (rate) [N+W 4.4]
4.10 Other (scaling, norms) [N+W 4.5]
2024-03-05 (2 hours)
5. Conjugate gradient method [N+W 5.]
5.1 Intro and motivation
5.2 Conjugate direction method
5.3 Gram-Schmidt orthogonalization
5.4 Properties of conjugate direction methods
2024-03-07 (1 hour)
5.5 Conjugate gradient method
first version of CG algorithm
efficient/practical version of CG algorithm
convergence of CG method (rate)
comparison with steepest descent
convergence of CG method: eigenvalues
2024-03-12 (2 hours)
5.6 Preconditioning
ILU, IC, SSOR
5.7 Preconditioned conjugate gradient (PCG) algorithm
5.8 Nonlinear conjugate gradient methods
Fletcher-Reeves, Polak-Ribière, PR+, FR-PR,
Hestenes-Stiefel, Y. Dai - Y. Yuan, W. W. Hager - H. Zhang
Restarts and quality control
5.9 FR convergence, PR+, FR-PR, DY, HZ
PR convergence and non-convergence
2024-03-14 (1 hour)
6. (more on) Quasi-Newton methods
6.1 The DFP method -- derivation of update
6.2 The BFGS method -- update, algorithm, initialization, pitfalls
2024-03-19 (2 hours)
6.2 The BFGS method -- update, algorithm, initialization, pitfalls (cont)
6.3 The damped BFGS update
6.4 The SR1 method -- update, properties
6.5 The Broyden class methods -- update, restricted class, properties
6.6 Convergence of BFGS -- restricted Broyden, rate
2024-03-21 (1 hour)
7. Large-scale optimization
7.1 Inexact Newton methods
7.2 Convergence of inexact Newton, rate
7.3 Line search Newton-CG (truncated Newton) method
7.4 Hessian-free computation
2024-03-26 (2 hours)
7.5 Trust region Newton-CG (CG-Steihaug) method
7.6 Limited memory quasi-Newton methods: L-BFGS, two-loops recursion
7.7 Relation between L-BFGS and nonlinear CG (Hestenes-Stiefel)
7.8 Sparse quasi-Newton updates
7.9 Partially separable functions
2024-03-28 (1 hour)
9. Derivative-free optimization
9.1 Model-based methods
9.2 Simplex-based methods: Nelder-Mead method
no more classes!
Notes and handouts:
Note on use of notes:
Notes will be available when the course starts.
While it may be convenient to study for the course by reading the notes,
it should be made clear that the notes are not a substitute for
the textbook or any relevant book. Notes are always more condensed
and give less overall information than books.
Notes with math notation, etc, are difficult to read online.
It may be preferable for some of you to print out the 4-page style notes
on paper (preferably double-sided).
Access to the data below requires that
you type in your CDF (teaching labs) username
(same as your UTORid)
and last 5 digits of your student number as password.
This password (for accessing the website) cannot be reset.
Lecture notes
- Matrices, vectors, spaces,
one page,
[photo:orthogonal, monotone, M- matrices]
- Eigenvalues, eigenvectors, decompositions,
one page,
[photo: eigval/vct of matrix polynomial,
eigval/vct of matrix polynomial],
[photo: Jordan block],
[photo: SVD, QR, Gram-Schmidt]
- Inner products, norms, condition numbers,
one page,
[photo: A^(1/2), cond_num of trid(-1, 2, -1)],
[photo: machine epsilon, RCOND warning],
[photo: pd, spd, psd, spsd matrices],
[photo: orthogonal matrices],
- Preliminary: convergence rate, convexity, gradient, Hessian, directional derivative, Taylor's, minimizer, optimality conditions,
finite differences, O(h) and O(n),
one page,
[photo: directional derivative],
[photo: steepest descent direction],
[photo: Taylor's, isolated local min],
[photo: first-order necessary],
[photo: second-order sufficient],
[photo: convex, local min, global],
[photo: FD approx to u''],
[photo: FD approx to u''],
[photo: FD stencils],
[photo: O(h), O(n), order of discretization],
- Intro to optimization, 1D optimization,
one page,
[photo: residual and error],
[photo: golden section search method],
[photo: convergence, number of iterations in GSS],
[photo: quadratic interpolation],
[photo: orders of convergence of SPI and secant],
[photo: methods for solving nonlinear scalar equs],
[photo: (1D) bisection for solving f(x) = 0],
[photo: 1D secant for solving f(x) = 0],
[photo: 1D Newton for solving f(x) = 0],
[photo: simple root, root multiplicity],
- Multi-dimensional optimization -- directions,
one page,
[photo: Hessian updates, secant equ],
- Line search methods,
one page,
[photo: descent direction, secant equation],
[photo: gradient of quadratic],
[photo: gradient and Hessian of quadratic],
[photo: Wolfe, Goldstein],
[photo: proof of Lemma],
[photo: proof of Lemma],
[photo: proof of Lemma],
[photo: backtracking alg],
[photo: convergence of line search (Thm)],
[photo: convergence of line search (Thm)],
[photo: convergence of line search (Thm)],
[photo: convergence of quasi-Newton (Thm, cond num)],
[photo: convergence of quasi-Newton (Thm, misc)],
[photo: convergence rate of SD (orthogonality)],
[photo: convergence rate of SD],
[photo: diagonal shift, Hermite cubic interp],
[photo: unique polynomial interp]
- Trust region methods,
one page,
[photo: generic trust region algo],
[photo: constrained minimizer of linear],
[photo: dogleg step],
[photo: Cauchy point reduction],
[photo: TT, q1],
[photo: TT, q4]
- Conjugate gradient methods,
one page,
[photo: A-conjugate, A^{1/2} norm, Gram-Schmidt],
[photo: A-conjugate implies lin.indep., pk^T r0 = pk^T rk],
[photo: rk+1 = rk + alphak A pk],
[photo: rk^T pi = 0],
[photo: derive betak],
[photo: rk^T ri = 0],
[photo: alternative betak, distribution of eigenvalues],
- Quasi-Newton methods,
one page,
[photo: curvature condition for qNewton, damped BFGS],
[photo: damped BFGS],
[photo: damped BFGS],
- Large-scale optimization,
one page,
[photo: derivative-free application of Hessian to vector, h_min],
- Derivative-free optimization,
one page,
[photo: DFO matlab, constrained to unconstrained via penalty],
Assignments
Other