CSC 120 - Computer Science for the Sciences (R section, L0201, Jan-Apr 2015)

CSC 120 L0201 is an introduction to programming using the R language, which is widely used for statistical applications. It is designed for students with no programming experience, who are either planning to specialize in statistics or who expect to use statistics extensively while studying other fields.

Students in other areas of science may wish to take section L0101 of CSC 120, which uses the Python language. Students who plan on taking further courses in CS may wish to take CSC 108 instead of CSC 120, and then take CSC 148. It is possible to take CSC 148 after CSC 120 without taking CSC 108, but to do so you will have to pick up some material on your own.

Instructor: Radford Neal, Office: SS6026A / PT384, Email: radford@cs.utoronto.ca

Office hours: Mondays, 11:10 to 12:30, in SS6026A.

Lectures and Tutorials/Labs:

Lectures (L0201): Tuesdays and Fridays, 3:10-4:00pm, MP202.
Tutorials/Labs (T0201): Wednesdays, 1:10-3:00pm, CDF lab, room BA3185.

If you want to be in the R section of CSC 120, be sure to register in L0201 and in T0201.

The lab in the first week will make sure you know how to log into the CDF computers, start up R using Rstudio, and do simple things in R. If you have a laptop computer, you should bring it, so you can also learn how to install R and Rstudio on it. Later labs will involve non-credit exercises, which will prepare you for assignments, quizzes, and tests.

Evaluation:

12% - Three 25-minute quizzes (4% each), written during lectures on January 23, February 13, and March 13.
20% - One 50-minute test, written during lecture time (in the usual lecture room) on Tuesday, February 24.
30% - Three assignments (10% each), due on February 10, March 17, and April 2 (see handouts for exact times).
38% - Final exam (scheduled by the Faculty).

You must get at least 40% on the final exam to pass the course. If you don't, your final grade will be less than 50 regardless of your other grades in this course.
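As a rough illustration of how the weights above combine, here is a minimal R sketch. The function name course_mark and the sample marks are made up for this example, and the cap of 49 is only an assumption: the rule above says the final grade "will be less than 50", not what it will be exactly.

```r
# Hypothetical helper, not part of the course materials.
# All arguments are percentages (0 to 100).
course_mark <- function(quizzes, test, assignments, final_exam) {
    stopifnot(length(quizzes) == 3, length(assignments) == 3)

    # Weights from the evaluation scheme: 3 quizzes at 4% each,
    # the test at 20%, 3 assignments at 10% each, the final exam at 38%.
    mark <- 0.04 * sum(quizzes) + 0.20 * test +
            0.10 * sum(assignments) + 0.38 * final_exam

    # The 40% rule. Capping at 49 is an assumption made for this sketch.
    if (final_exam < 40) {
        mark <- min(mark, 49)
    }

    mark
}

# Made-up marks: the final exam is above 40%, so the weighted sum is used.
course_mark(c(80, 70, 90), 75, c(85, 90, 80), 60)

# Same marks, but the final exam is below 40%, so the cap applies.
course_mark(c(80, 70, 90), 75, c(85, 90, 80), 35)
```

With these made-up marks, the first call gives a weighted sum of about 72.9, and the second is held to 49 under the assumed cap.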

Assignments will be done by students individually.

Assignments may be handed in late, up to the late deadline (date and time) specified on the handout, with a 20% penalty. Assignments handed in after that deadline will not be accepted.

Students with a valid medical or other excuse will of course not be penalized for late assignments, or for missing a quiz or test. You must, however, contact the instructor (e.g., by email) as soon as possible (preferably beforehand) if you are not able to hand in an assignment or write a test or quiz.

Getting and Using R:

You will use R via Rstudio on the CDF computer system, on which you should automatically have an account if you are registered in this course. (At the moment, to start Rstudio, you need to open a terminal window, and then type "rstudio"; an item in the menus may be created soon.)

You can also install R and Rstudio (for free) on your own laptop or desktop computer. You can download R from r-project.org, and Rstudio from rstudio.com.

The lab in the first week (Jan 7, 1:10-3:00pm) will help you get started on these things.
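Once R is running (on CDF or on your own computer), one quick way to confirm that it works is to type a few commands at the console. This is only a sketch of a sanity check, not an official part of the lab; the variable name x is arbitrary.

```r
# Show which version of R is running.
R.version.string

# Make a small vector and compute its mean.
x <- c(3, 1, 4)
mean(x)
```

If both commands print a result without an error message, the installation is usable for the first labs.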

Documentation on R:

There is no textbook for this course (lecture slides are below). However, here are links to some on-line documentation on R:

Introduction to R (R Core Team).
R Language Definition (R Core Team).
Introduction to some aspects of R (Chi Yau).
Advanced R (Hadley Wickham).

Lecture Slides:

Slides for the lectures will be posted here as they become available.

Week 1: Introduction, numeric and character data, arithmetic, mathematical functions, string functions (paste and substring), variables and assignment, vectors, c, length, vector arithmetic and subscripting, plotting vectors.

Week 2: R scripts done with source, functions scan, mean, sd, abline, defining functions, multiple steps with curly brackets, input and output with readLines and cat.

Week 3: Using if, for, and while, logical data and comparison operations, sequences with :.

Week 4: Lists, using lists to return multiple values, specifying function arguments by name, default values for arguments, more on plot, plus points, lines, and text.

Week 5: Environments, local and global variables and assignments.

Week 6: Making vectors with rep and seq, matrices, matrix indexing, perspective and contour plots.

Week 7: Attributes of objects, classes, data frames.

Week 8: NA and NaN, names for elements/rows/columns.

Week 9: Random number generation.

Week 10: Operations on vectors, using vectors as indexes, some design flaws in R.

Week 11: Object-oriented programming, loops with repeat and break, recursion, R features for Assignment 3.

Week 12: Functions for standard distributions, fitting linear models with lm, R packages.

Lab Exercises:

January 14: handout, R script for 1st part, R script for 2nd part.
January 21: handout, solutions.
January 28: handout, R function, R script, output of script with knitr::spin.
February 4: handout (also the output of solution script), function definitions, script.
February 11: handout, functions, script, output of script.
February 25: handout.
March 4: handout, functions, script, output of script. Plus a finished peak function, and output.
March 11: handout, random walk function from lecture. Solution: functions, script, output of script.
March 18: handout. Solution: R code, output.
March 25: handout, functions, test script.
April 1: handout, solutions.

Assignments:

Assignment 1: handout.
Solution: function definitions, script, knitr::spin output of script.

Assignment 2: handout.
Solution: function definitions, script for data set 1, output of script 1, script for data set 2, output of script 2.

Assignment 3: handout.
Solution: function definitions, test script, output of test script, script for supplied data set, output of script.

Mid-term test:

The test and the answers. Each of the eight questions was marked out of 13, for a total of 104 possible marks, but I have counted it as if it were out of 100.
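To make the arithmetic concrete, here is a small R sketch with made-up question marks. It assumes that "counted as if it were out of 100" means the raw total is used directly as the percentage rather than being rescaled by 100/104. How a raw total above 100 would be treated is not stated here.

```r
# Made-up marks for the eight questions, each out of 13.
question_marks <- c(13, 10, 8, 13, 11, 9, 12, 7)
stopifnot(length(question_marks) == 8, all(question_marks <= 13))

# Raw total, out of a possible 8 * 13 = 104 marks.
raw_total <- sum(question_marks)

# Counted as if out of 100: the raw total is used as the percentage
# (an assumption about how the sentence above should be read).
test_percent <- raw_total

# For comparison only: what a strict rescaling to 104 would give.
rescaled_percent <- 100 * raw_total / 104

cat("Raw total:", raw_total, "\n")
cat("Counted as:", test_percent, "percent\n")
cat("Strictly rescaled:", round(rescaled_percent, 1), "percent\n")
```

Under this reading, counting the total out of 100 can only help a student relative to a strict rescaling.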

Quizzes:

Quiz 1: questions, plus answers.
Quiz 2: questions, plus answers.
Quiz 3: questions, plus answers.