CSC 120 - Computer Science for the Sciences (R section, L0201, Jan-Apr 2016)

You should now have received emails with all the marked assignments you handed in. All marks (except the final course mark) are now on Portal. The Quiz 2 mark shown is the unadjusted mark. The assignment marks do not include any late penalty that was applied. All marks are out of 100, except for the final exam, which is out of 110.

After reviewing the marking for Assignment 3, I have changed some of the marks assigned by the TAs. In particular, the TAs sometimes took off marks for not setting the random seed to your ID for every simulation, which was not required, and for not dividing the simulation function into smaller functions, which is a stylistic judgement with no clear right or wrong choice. I will be submitting mark change requests for students whose final mark was increased by this remarking. Note that the remark did not always increase the assignment grade, even when marks were added, because the remark also changed how I adjusted for differences between the TAs. (I kept the old marks for students whose new mark was actually lower.)

CSC 120 L0201 is an introduction to programming using the R language, which is widely used for statistical applications. It is designed for students with no programming experience, who are either planning to specialize in statistics or who expect to use statistics extensively while studying other fields.

Students in other areas of science may wish to take section L0101 of CSC 120, which uses the Python language. Students who plan on taking further courses in CS may wish to take CSC 108 instead of CSC 120, and then take CSC 148. It is possible to take CSC 148 after CSC 120 without taking CSC 108, but to do so you will have to pick up some material on your own.

Instructor: Radford Neal, Office: SS6026A / PT384, Email: radford@cs.utoronto.ca

Office hours: Thursdays, 2:10 to 3:00, in SS6026A.

Lectures and Tutorials/Labs:

Lectures: Tuesdays and Fridays, 3:10-4:00pm, MP202.
Tutorials/Labs: Wednesdays, 1:10-3:00, CDF lab, room BA3185 (& others).

Labs will start January 13. The lab in the first week will make sure you know how to log into the CDF computers, start up R using RStudio, and do simple things in R. If you have a laptop computer, you should bring it, so you can also learn how to install R and RStudio on it. Later labs will involve non-credit exercises, which will prepare you for assignments, quizzes, and the exam.

Evaluation:

21% - Three 30-minute quizzes (7% each), written during lectures on January 29, February 26, and March 18.
30% - Three assignments (10% each), due at start of lectures on February 26, March 22, and April 8.
49% - Final exam (scheduled by the Faculty).

You must get at least 40% on the final exam to pass the course. If you don't, your final grade will be less than 50 regardless of your other grades in this course.
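
As a rough illustration of how the weights above combine, here is a minimal sketch in R. The function name course_mark and its arguments are hypothetical, all inputs are assumed to be percentages (0 to 100), and the cap of 49 for an exam mark below 40% is an assumption: the rule above says only that the final grade will be less than 50. The sketch does not model late penalties or quiz mark adjustments.

```r
# Hypothetical helper: combine component marks (all given as percentages)
# using the weights listed under Evaluation.
course_mark <- function (quizzes, assignments, exam) {
    stopifnot(length(quizzes) == 3, length(assignments) == 3)

    # 7% per quiz, 10% per assignment, 49% for the final exam.
    weighted <- sum(0.07 * quizzes) + sum(0.10 * assignments) + 0.49 * exam

    # Assumption: an exam mark below 40% caps the result at 49.
    # The course rule only states that the grade will be below 50.
    if (exam < 40) min(weighted, 49) else weighted
}

print(course_mark(c(80, 70, 90), c(85, 90, 75), 60))
print(course_mark(c(100, 100, 100), c(100, 100, 100), 35))
```

The second call shows the effect of the exam threshold: perfect term work does not lift the result to 50 when the exam mark is below 40%.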

Assignments will be done by students individually.

Assignments may be handed in late, up to the date and time specified on the handout, with a 20% penalty. Assignments handed in after that will not be accepted.

Students with a valid medical or other excuse will of course not be penalized for late assignments, or for missing a quiz. You must, however, contact the instructor (e.g., by email) as soon as possible (preferably beforehand) if you are not able to hand in an assignment or write a quiz.

Getting and Using R:

You will use R via RStudio on the CDF computer system, on which you should automatically have an account if you are registered in this course. To start RStudio, open a terminal window and type "rstudio".

You can also install R and RStudio (for free) on your own laptop or desktop computer. You can download R from r-project.org, and RStudio from rstudio.com.

The lab in the first week (January 13, 1:10-3:00pm) will help you get started on these things.

Documentation on R:

There is no textbook for this course (lecture slides are below). However, here are links to some on-line documentation on R:

Introduction to R (R Core Team).
R Language Definition (R Core Team).
Introduction to some aspects of R (Chi Yau).
Advanced R (Hadley Wickham).

Lecture Slides:

Slides for the lectures will be posted here as they become available.

Week 1: Introduction, numeric and character data, arithmetic, mathematical functions, string functions (paste and substring), variables and assignment, vectors, c, length, vector arithmetic and subscripting, plotting vectors. The iris demo program is here.

Week 2: R scripts done with source, functions scan, mean, sd, abline, defining functions, multiple steps with curly brackets, input and output with readLines and cat.

Week 3: Using if, for, and while, logical data and comparison operations, sequences with :.

Week 4: Making vectors with rep and seq, matrices, matrix indexing, cbind and rbind, perspective and contour plots, named and optional arguments, more on plots, random number generation, with runif, sample, and set.seed.

Week 5: Lists, using lists to return multiple values, more on random number generation, random walks.

Week 6: Environments, local and global variables and assignments.

Week 7: Names, attributes of objects, classes, data frames, NA and NaN, numeric and logical vectors as subscripts.

Week 8: More on vector operations, lapply and sapply, some design flaws in R.

Week 9: Loops with repeat and break, return, recursion.

Week 10: Object-oriented programming.

Week 11: Functions for standard distributions, fitting linear models with lm, R packages.

Week 12: Testing, source code control, other languages and implementations. Also, here is the Tuesday lecture's example program, for solving last year's assignment 2: function definitions, testing script, knitr script for data set 1, knitr script for data set 2.

Lab Exercises:

January 13: Logging in to CDF, and then starting up R using RStudio. Installing R and RStudio on your laptop (if you bring one). Simple operations in R.
January 20: handout. Solution scripts: Part A, Part B.
January 27: handout, solutions.
February 3: handout, function definitions, script to try them.
February 10: handout, random walk function, function definitions, script to try them.
February 24: handout, Part I solution, Part II solution.
March 2: handout.
March 9: handout, solutions.
March 16: handout, solutions.
March 23: handout.
March 30: handout, solutions.
April 5: handout.

Assignments:

Assignment 1: handout. Clarification: Note that the find_pairings function should, for each random set of initial pairings, call improve_pairings repeatedly (not just once), until improve_pairings is not able to reduce the total distance.
Solution: function definitions, test script, text output, plot1, plot2.
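
The control flow described in the clarification above (keep calling an improvement step until it no longer reduces the total) can be sketched generically in R. This is not the assignment solution: the function improve_until_stuck and its arguments improve and cost are hypothetical stand-ins, and the real improve_pairings may have a different interface. The example uses a toy numeric state only to show the loop.

```r
# Hypothetical sketch: apply `improve` repeatedly (not just once),
# stopping when it fails to reduce the value of `cost`.
improve_until_stuck <- function (state, improve, cost) {
    repeat {
        new_state <- improve(state)
        # Stop as soon as the step does not reduce the cost.
        if (cost(new_state) >= cost(state)) break
        state <- new_state
    }
    state
}

# Toy stand-ins: the "state" is a single number, and one step
# lowers it by 1 until it would drop to 2 or below.
toy_improve <- function (x) if (x > 2) x - 1 else x
toy_cost <- function (x) x

result <- improve_until_stuck(5, toy_improve, toy_cost)
print(result)
```

A single call to the improvement step would stop at 4 in this toy case. The loop continues until a call makes no further reduction.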

Assignment 2: handout.
Solution: function definitions, script, plots (with average errors).

Assignment 3: handout.
Solution: function definitions, script, output of script. Note that this is just an example solution. There are many other good ways of organizing a program for this problem.

Quizzes:

Quiz 1: questions, plus answers.

Quiz 2: questions, plus answers. Mark adjustment: The mark written on your quiz paper will be adjusted by adding 8% of the mark if it is less than 75, 6% of the mark if it is between 75 and 79, 3% of the mark if it is between 80 and 84, and nothing if it is 85 or above.
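
As a worked illustration of the adjustment rule above, here is a minimal sketch in R. The function name adjust_quiz2 is hypothetical, and the handling of fractional marks that fall between the stated bands (for example 79.5) is an assumption, since the rule lists only whole-number ranges.

```r
# Hypothetical helper: apply the Quiz 2 adjustment to a mark out of 100.
adjust_quiz2 <- function (mark) {
    if (mark < 75) {
        bonus <- 0.08      # add 8% of the mark
    } else if (mark < 80) {
        bonus <- 0.06      # add 6% of the mark (75 to 79)
    } else if (mark < 85) {
        bonus <- 0.03      # add 3% of the mark (80 to 84)
    } else {
        bonus <- 0         # no adjustment at 85 or above
    }
    # The bonus is a fraction of the mark itself, not a fixed number of points.
    mark + bonus * mark
}

for (m in c(70, 78, 82, 90)) {
    cat("written mark", m, "-> adjusted mark", adjust_quiz2(m), "\n")
}
```

For example, a written mark of 70 gains 8% of 70, which is 5.6, for an adjusted mark of 75.6.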

Quiz 3: questions, plus answers.

Web page for a previous version of the course:

Spring 2015