CSC2427: Algorithms in Molecular Biology

Winter 2006

(Irrelevant official name: Topics in Graph Theory)
Lectures: WF 10-11 in Sydney Smith (SS) 1084

Instructor: Michael Brudno
Office: Pratt (PT) 286C
Office Hours: Th 10-11am and by appointment


Presentation order for the projects. 20 minutes for groups of 2 or more, 15 minutes for groups of one. The presentation will be from 9am to 11am on April 5,7, and 12. Note that we will start at 9am sharp (and not at 9:10). Coffee and cookies will be served to all who come!

9-11am April 5
Hertel, Hertel & MirmohammadiNew Algorithms for Genome Assembly
Hyndman & GermannMotif Identification with Variable-length Random Projections
Marcini & SamulowtizApproximate Alignment of Protein-Protein Interaction Networks
Chung & MussoDynamic Markov Clustering
HuThree Strategeies for multi-compartamental prediction of protein subcellular localizations
TaylorDetecting Interspecific REcombination in DNA Sequence Alignment with Phylogenetic Products of Experts
9-11am April 7
Jiang & YanAnalysis of Single Nucleotide Patterns in Disease Genes
Lan & LeeThe Relationship Between Sequence Similarity and Gene Function
El Ayoubi & JumaMaximum Parsimony Phylogeny using Integer Programming
Diamantis & SarkasFast Homology Search
SalaProtein Structural Alignment
SnoekUsing Conditional Random Fields for Gene Prediction
9-11am April 12
NatarjanA Discriminative Approach for Predicting Protein Backbone Structures using Dynamic Condiotional Random Fields
RayProbabilistic Methods for Fast Integration of Multiple Sources of Biological Information
LangProtein Constellations
Abramson & TiloSubstitution Models in Protein-protein Interaction prediction
Dimitromanolakis & PapagelisDijkstra's Algorithm for Sequence Alignment
Babak & HuangBayesian Learning of microRNA Regulatory Networks through prediction of Transcriptome Changes due to microRNA Regulation.

General information

This graduate course will cover some exciting algorithms that have been developed to analyze biological data, including sequences (comparison and assembly), structures (RNA and protein), and networks (e.g. gene interaction networks). While the emphasis of the class will be on discrete algorithms, we occasionally will talk about probabilistic models (such as HMMs and Bayesian nets), and the interplay between discrete and probabilistic models. The course is intended for computer science graduate students, and all of the required biology will be explained in the class. Students in biological and related sciences with a strong computational background are encouraged to participate.

Expected Background:
Students should be familiar with algorithms (at least CSC 373 level), basic probability theory, and some machine learning.

The basic requirements for the class will be a course project (50% of the grade), four homework assignments (45% total), and class participation (5%).

Administrative details:
This course is offered under the name "Topics in Graph Theory" due to a lack of a more appropriate name/number. It will NOT concentrate on graph theory (though we certainly will use graphs). Once a course with an appropriate name/number is approved it will be possible to retroactively change the transcripts of students to indicate the new course. The class will satisfy the 2c breadth.