CSC 121 - Computer Science for Statistics (Jan-Apr 2017)

Solutions to all assignments are now posted below.

Grades for your marked assignments, as they become available, will be viewable on MarkUs.

Office hours will continue until the final exam, at the usual time of Wednesdays 10:30-11:30, in SS6026A. There are no longer TA office hours.

CSC121 is an introduction to programming aimed at students studying to be statisticians, or who otherwise expect to use statistics extensively in their work. It uses the R language, which is widely used for statistical research and applications. It is meant to be accessible to students with no programming experience. Although it is motivated by statistical applications, no specific knowledge of statistics is required to take this course.

Other options for introductory programming are CSC120, which is meant for general science students, and CSC108, which is meant for students planning on taking further computer science courses, starting with CSC148. Both of these other courses use the Python language. It is possible to take CSC148 and later CS courses after taking CSC121, without taking CSC108, but to do so you will have to pick up some material on your own, so it's better to take CSC108 if you know now that you will be taking later CS courses. You should consult the CS department for advice if you take CSC121 and then discover that you want to take more CS courses.

Instructor: Radford Neal, Office: SS6026A / PT390, Email: radford@cs.utoronto.ca

Office hours: Wednesdays 10:30-11:30, in SS6026A.

Lectures and Tutorials/Labs:

Lectures: Tuesdays and Fridays, 1:10-2:00pm, MB128.
Tutorials/Labs: Wednesdays, 1:10-3:00pm, BA3185 (and nearby rooms).

Labs will start January 11. The lab in the first week will make sure you know how to log into the computers, start up R using RStudio, and do simple things in R. If you have a laptop computer, you should bring it, so you can also learn how to install R and RStudio on it. Later labs will involve non-credit exercises, which will prepare you for assignments, the mid-term test, and the final exam.

Evaluation:

15% - Three small assignments (5% each), due at start of lectures on January 31, February 14, and February 28.
20% - Two large assignments (10% each), due at start of lectures on March 17 and April 4.
18% - Mid-term test (50 minutes), written during lecture time on March 3.
47% - Final exam (scheduled by the Faculty).

You must get at least 40% on the final exam to pass the course. If you don't, your final grade will be less than 50 regardless of your other grades in this course.
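
As an illustration of how the weights above combine, here is a minimal sketch in R. The function name course_grade and the example marks are made up. The page says only that the final grade will be "less than 50" if the exam mark is below 40%, so the cap at 49 used here is an assumption, not the stated rule.

```r
# Weights from the evaluation scheme above (they sum to 100).
weights <- c(small = 15, large = 20, midterm = 18, final = 47)

# Hypothetical helper: each argument is a percentage for that component.
course_grade <- function(small, large, midterm, final) {
    marks <- c(small = small, large = large, midterm = midterm, final = final)
    grade <- sum(weights * marks) / 100
    # Assumption: the page only says the grade will be "less than 50";
    # capping at 49 is one way this might be done.
    if (final < 40) grade <- min(grade, 49)
    grade
}

course_grade(small = 80, large = 75, midterm = 70, final = 60)   # 67.8
course_grade(small = 95, large = 95, midterm = 95, final = 35)   # 49
```

In the second call the weighted average would be 66.8, but the exam mark is below 40%, so the sketch holds the grade below 50.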

Assignments will be done by students individually.

Assignments may be handed in late, with a 20% penalty, up to the late date and time specified on the handout. Assignments handed in after that will not be accepted.

Students with a valid medical or other excuse will of course not be penalized for late assignments, or for missing the mid-term test. You must, however, contact the instructor (e.g., by email) as soon as possible (preferably beforehand) if you are not able to hand in an assignment or write the mid-term test. If you miss the final exam for a valid reason, you should petition the Faculty to write a deferred final exam.

Getting and Using R:

You will use R via RStudio on the CS teaching computer system, on which you should automatically have an account if you are registered in this course. To start RStudio, open a terminal window and type "rstudio".

You can also install R and RStudio (for free) on your own laptop or desktop computer. You can download R from r-project.org, and RStudio from rstudio.com.

The lab in the first week (January 11, 1:10-3:00pm) will help you get started on these things.

Documentation on R:

There is no textbook for this course (lecture slides are below). However, here are links to some on-line documentation on R:

Introduction to R (R Core Team).
R Language Definition (R Core Team).
Introduction to some aspects of R (Chi Yau).
Advanced R (Hadley Wickham).

Lecture Slides:

Slides for the lectures will be posted here as they become available.

Week 1: Introduction, numeric and character data, arithmetic, mathematical functions, string functions (paste and substring), variables and assignment, vectors, c, length, vector arithmetic and subscripting, plotting vectors. The iris demo program is here.

Week 2: R scripts done with source, functions scan, mean, sd, abline, defining functions, multiple steps with curly brackets, input and output with readLines and cat.

Week 3: Using if, for, and while, logical data and comparison operations, sequences with :.

Week 4: Lists, more on logical values, AND, OR, and NOT.

Week 5: Making vectors with rep and seq, matrices, matrix indexing, cbind and rbind, perspective and contour plots, named and optional arguments.

Week 6: Random number generation, runif, sample, set.seed, random walks. Environments, local and global variables and assignments.

Week 7: Names, attributes of objects, classes, data frames, NA and NaN.

Week 8: Vectors as subscripts, some design flaws in R.

Week 9: Functions of vectors, creating a plot in stages.

Week 10: Loops with repeat and break, return, recursion, apply and lapply.

Week 11: Factors, dates, object-oriented programming, table, modeling with lm, useful functions for vectors. For the drawing example, here are the functions and script.

Week 12: Program speed.

Week 13: R packages, more on testing, source code control, other programming languages.
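
For a flavour of what a few of these topics look like in practice, here is a minimal sketch in R. It is not taken from the lecture slides. The data values and the names temps_c, standardize, steps, position, and totals are made up for illustration.

```r
# Week 1: vectors, vector arithmetic, and subscripting.
temps_c <- c(12.5, 15.0, 9.5, 20.0)
temps_f <- temps_c * 9/5 + 32
temps_f[2]                      # 59

# Week 2: defining a function, using curly brackets for the body.
standardize <- function(x) {
    (x - mean(x)) / sd(x)
}

# Weeks 3 and 6: a for loop that builds a simple random walk.
set.seed(1)                     # makes the random steps reproducible
steps <- sample(c(-1, 1), 100, replace = TRUE)
position <- numeric(length(steps))
current <- 0
for (i in 1:length(steps)) {
    current <- current + steps[i]
    position[i] <- current
}

# Week 10: applying the same function to each element of a list.
totals <- lapply(list(a = 1:3, b = 4:6), sum)
```

Typing plot(position, type = "l") afterwards would draw the walk, along the lines of the plotting topics in Weeks 1 and 9.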

Lab Exercises:

January 11: Logging in to the CS teaching computers, and then starting up R. Installing R and RStudio on your laptop (if you bring one). Simple operations in R.
January 18: handout. Solutions: simple exercises, first temp/deaths script, second temp/deaths script.
January 25: handout, solutions.
February 1: handout. Solutions: function definitions, test script.
February 8: handout. Solutions: function definitions, test script.
February 15: handout.
March 1: handout, solutions.
March 8: handout. Explains how to use knitr, which you'll need for Large Assignment 1.
March 15: handout, solutions.
March 22: handout, solutions.
March 29: In this lab, you can catch up on lab exercises you haven't finished yet.
April 5: handout.

Assignments:

Small Assignment 1: handout, tests (you must add your own tests to these). Solution: function definitions, test script, output of test script.

Small Assignment 2: handout, tests (you must add your own tests to these). Solution: function definitions, test script, output of test script.

Small Assignment 3: handout, data file. Solution: function definitions, test script, output of test script, script looking at weather data, output of weather script.

Large Assignment 1: handout. Solution: function definitions, knitr script, knitr output.

Large Assignment 2: handout.
Part 1 Solution: function definitions, knitr script, knitr output.
Part 2 Solution: function definitions, knitr script, knitr output.

Midterm test:

Here are the test paper and answers. Of course the programming questions have more than one possible correct answer.

The six questions on the midterm were each marked out of 18, for a total of 108 possible marks. However, the mark received will be treated as if it were out of 100. (Marks over 100 will count as over 100%, but you can't get a final course grade over 100%.)
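
The arithmetic can be sketched in R as follows. The function name midterm_percent and the example marks are made up for illustration.

```r
# Hypothetical helper: takes the six question marks (each out of 18)
# and returns the midterm mark as it will be counted.
midterm_percent <- function(question_marks) {
    stopifnot(length(question_marks) == 6,
              all(question_marks >= 0),
              all(question_marks <= 18))
    # 108 marks are possible, but the total is treated as if out of 100,
    # so the raw total is used directly as the percentage.
    sum(question_marks)
}

midterm_percent(c(18, 15, 12, 18, 16, 17))   # 96
midterm_percent(rep(18, 6))                  # 108, which counts as 108%
```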

Final exam:

The Faculty of Arts & Science final exam schedule is here.

Past exams for the R section of CSC120 are here and here.

Web pages for previous versions of this course:

This course was previously taught as a special section of CSC120. Here are the web pages for past versions.

Spring 2016
Spring 2015