The first lecture will be on September 10!
Overview:
Bioinformatics is an exciting, novel area that seeks to address biological problems with computational
methods. One of the main challenges faced by those working in bioinformatics is the sheer size of the datasets being
analyzed. This course will explore the various kinds of biological data being generated by novel
methodologies and the computational methods that can be used to analyze them. Unlike the typical computer science
course, ours will focus not on general techniques or theoretical algorithms, but on the best way to parse,
process, and analyze a particular type of data. The types of data covered in the class will include genome
sequence data, protein structure data, microarray data, and biological networks. The methods we will draw
upon to analyze these data include efficient data representation, parallelization of independent processes,
the use of UNIX and other standard tools to simplify data analysis, clustering and classification
techniques and other standard algorithms for analyzing the datasets, and rudimentary statistical techniques for
evaluating the results.
Expected Background:
Students should be familiar with algorithms (at least CSC 373 level), basic probability theory, and basic
molecular biology.
Organization:
The course will be organized into four three-week "chapters", each of which will concentrate on one of four data
types: sequence, structure, microarrays, and networks. For each chapter we will present a number of
methods, tools, and techniques.
Grading:
Each of the first three "chapters" will be associated with a homework assignment worth 20% of the grade.
Participation throughout the semester will constitute 10% of the grade. At the end of the course
there will be a final project (worth 30% of the grade). More details about the project will be available at a
later date.
Administrative details:
None for now