Michael Brudno - Research
The Group
Current Projects:

Members of the group work on a diverse set of topics, ranging from theory to machine learning to systems (on the computer science side), and from genome analysis to PPI networks and protein structure (on the biology side). The projects below are some of the most active research areas within the group.

Algorithms for Genome Assembly

In a recent paper with Paul Medvedev, Konstantinos Georgiou, and Gene Myers, we analyzed the complexity of several popular assembly paradigms, as well as the problem of assembling double-stranded DNA molecules rather than single-stranded strings. Following up on this work, Paul Medvedev and I developed an algorithm for genome assembly with short, mated reads that uses convex optimization and the all-pairs shortest path algorithm. It was published at RECOMB 2008. I am now working on expanding our assembly framework in several directions, including algorithms for assembly of a diploid organism (with Nilgun Donmez) and assembly of color-space (AB SOLiD dibase sequencing) data (with Taya Santare).

Alignment & Mapping of Short Reads to a Genome

Together with Stephen Rumble (with contributions from Adrian Dalca and Marc Fiume, and in collaboration with Arend Sidow and his group), I have been working on SHRiMP, the SHort Read Mapping Program. SHRiMP aligns short reads to a reference genome quickly and accurately, while allowing for insertions and deletions. It also comes with special color-space options to handle reads produced by the AB SOLiD technology. Adrian Dalca and I have been working to generalize sequence alignment scoring schemes into a common framework we call "Rectangle Scoring". Adrian implemented an alignment program that works with any rectangle scoring scheme in the FRESCO Package. Vlad Yanovsky is exploring more efficient algorithms for genome indexing and sequence alignment.
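For readers unfamiliar with the color-space reads mentioned above, the following is a minimal illustrative sketch of di-base encoding, in which each "color" describes a pair of adjacent bases rather than a single base. It is not code from SHRiMP; the function names `encode_colors` and `decode_colors` are hypothetical, and the sketch assumes a non-empty sequence over A, C, G, T with the standard four-color scheme, where a color is the XOR of the 2-bit codes of its two bases.

```python
# Illustrative sketch of di-base ("color-space") encoding.
# Not taken from SHRiMP; names and structure are assumptions for illustration.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_colors(seq):
    """Return the first base plus one color (0-3) per adjacent base pair."""
    colors = [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode_colors(first_base, colors):
    """Rebuild the base sequence by walking the colors from the first base."""
    bases = [first_base]
    for color in colors:
        # Each color, combined with the previous base, determines the next base.
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)


first, colors = encode_colors("ATGGCA")
print(first, colors)                  # A [3, 1, 0, 3, 1]
print(decode_colors(first, colors))   # ATGGCA

# A single miscalled color changes every base decoded after it.
print(decode_colors("A", [3, 0, 0, 3, 1]))  # ATTTAC
```

Because each base is decoded from the one before it, a single color error corrupts the rest of a naively translated read. This is one reason a mapper may prefer to handle such reads in color space rather than translating them to bases first.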
I am also still maintaining the LAGAN alignment toolkit (see the Past projects section below).

Genome Variation

Several members of our group are exploring the variation present among the individuals of a given species. In collaboration with Alexey Kondrashov and Yegor Bazykin, Nilgun Donmez explored the genome of Ciona savignyi for evidence of positive selection. Elango Cheran and Seunghak Lee are exploring algorithms to detect large-scale (structural) variation in the human genome.

Snowflock: Parallelization with Virtual Machines

In collaboration with Andres Lagar Cavilla and Eyal de Lara, Joe Whitney and Stephen Rumble have been working to enable the use of virtual machines for parallelizable applications. You can read about it in this tech report.

Past projects:

Ciona genome co-assembly

With Arend Sidow and Kerrin Small at Stanford, we assembled the Ciona savignyi genome. Assembling the Ciona genome was especially difficult because of its high polymorphism rate: 5%, or 50-fold higher than in humans. Hence, when the genome is given to a regular assembly algorithm, the result is two genomes, as different as human and macaque, and enriched for misassemblies from being sequenced together.

DNA Alignment

I led the development of the LAGAN toolkit, which consists of several algorithms for sequence alignment. LAGAN was developed in Serafim Batzoglou's lab at Stanford; Chuong Do, Sanket Malde, Michael F. Kim, and Mukund Sundararajan have contributed to various programs in the package. LAGAN has been cited in over 50 publications in the year and a half since it appeared, and has been incorporated into several packages for biological sequence alignment. Seven hundred users from more than thirty countries have used LAGAN over 7,000 times through its website, and 130 users have subscribed to receive updates about the program.
LAGAN proper consists of three main parts:
Whole Genome Alignments

Working within the Rat Genome Consortium, we developed some of the first methods for multiple alignment of whole genomes, and applied them to the comparison and analysis of the rat genome. More recently I worked on developing methodologies for whole genome synteny mapping using the Shuffle-LAGAN algorithm.
Protein Sequence Alignment

I participated in the development of the ProbCons protein aligner, which was written by Chuong (Tom) Do. This aligner combines the ideas of consistency introduced in previous programs such as DIALIGN and T-COFFEE with a maximum expected accuracy parse of the alignment pair-HMM, which leads to results more accurate than those of other alignment tools, but with no heuristics.

Alternative Splicing Regulation

Alternative splicing is an important regulatory mechanism known to be used in about half of all mammalian genes. During this process an exon present in the DNA may be left out of the mature mRNA, and hence will not be translated into protein. This mechanism can be used to tailor the protein to the current needs of the cell, and many of the known alternatively spliced exons are either tissue-specific or development-specific. With John Conboy, Inna Dubchak, and Mikhail Gelfand, we worked on the identification of enhancers of alternative splicing.

Alignment Visualization

My work in sequence alignment has led me to think extensively about methods for interpreting the resulting alignments for the biologist. This interest has led to my participation in both the VISTA and Phylo-VISTA projects with Inna Dubchak, Nameeta Shah, Kelly Frazer, and many others.