How do we infer which genes orchestrate various processes in the cell? How did humans migrate out of Africa and spread around the world? In this class, we will see that these two seemingly different questions can be addressed using similar algorithmic and machine learning techniques arising from the general problem of dividing data points into distinct clusters.
One of the first organisms to be domesticated by humans was yeast. Saccharomyces yeast is remarkable because it can not only convert the glucose in grapes into ethanol (which we then consume as wine), but it can also invert its own metabolism, consuming the ethanol it just produced in a process called the diauxic shift. To find genes implicated in the diauxic shift, we will learn about clustering algorithms that will divide yeast genes into distinct groups based on their patterns of regulatory behavior. A similar method can be applied to distinguish normal and tumor cells, an approach that led to diagnostic tests like MammaPrint for predicting the return of cancer after chemotherapy.
We can also apply clustering algorithms to identify the genetic foundation of human population structure and discover which populations have contributed to your own genome. To do so, we will need to augment clustering algorithms with a powerful computational approach called principal component analysis.
At the end of the course, a Bioinformatics Application Challenge will let you apply real bioinformatics software to cluster a large biological data set.
Introduction to Clustering Algorithms
Welcome to class! We begin by seeing how algorithms for clustering a set of data points will help us determine how yeast became such good winemakers. At the bottom of this email is the Bioinformatics Cartoon for this chapter. How did the monkey lose a wine-drinking contest to a tiny mammal? Why have Pavel and Phillip become cavemen? And will flipping a coin help them escape their eternal boredom until they can return to the present? Start learning to find out!
Graded: Week 1 Quiz
Graded: Open in order to Sync Your Progress: Stepik Interactive Text for Week 1
Advanced Clustering Techniques
Welcome to week 2 of class! This week, we will see how we can move from a “hard” assignment of points to clusters, in which every point belongs to exactly one cluster, toward a “soft” assignment that allows the boundaries of the clusters to blend. We will adapt the Lloyd algorithm that we encountered in the first week to produce an algorithm for soft clustering. We will also meet another clustering algorithm called “hierarchical clustering” that groups objects into larger and larger clusters.
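To make the move from hard to soft assignment concrete, here is a minimal sketch of one way to soften the Lloyd algorithm. It is an illustration under stated assumptions, not necessarily the exact formulation used in the course: the names `soft_assign`, `update_centers`, and `soft_k_means`, the stiffness parameter `beta`, and the toy data are all hypothetical. Instead of giving each point to its single nearest center, every center receives a fractional "responsibility" for every point, and each center then moves to the responsibility-weighted mean of all points.

```python
import math

def soft_assign(points, centers, beta):
    """Return, for each point, one responsibility per center (each row sums to 1)."""
    resp = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        d_min = min(d)
        # Subtract the smallest distance before exponentiating for numerical stability.
        w = [math.exp(-beta * (di - d_min)) for di in d]
        total = sum(w)
        resp.append([wi / total for wi in w])
    return resp

def update_centers(points, resp, k):
    """Move each center to the responsibility-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for j in range(k):
        weight = sum(r[j] for r in resp)
        centers.append(tuple(
            sum(r[j] * p[i] for p, r in zip(points, resp)) / weight
            for i in range(dim)
        ))
    return centers

def soft_k_means(points, centers, beta=2.0, steps=50):
    """Alternate soft assignment and center updates for a fixed number of steps."""
    for _ in range(steps):
        resp = soft_assign(points, centers, beta)
        centers = update_centers(points, resp, len(centers))
    return centers, soft_assign(points, centers, beta)

# Toy data (assumed for illustration): two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
centers, resp = soft_k_means(points, centers=[(0.0, 0.0), (6.0, 5.0)])
```

A larger `beta` makes the responsibilities sharper, so the procedure behaves more like the hard Lloyd algorithm; a smaller `beta` lets clusters overlap more.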
Graded: Week 2 Quiz
Graded: Open in order to Sync Your Progress: Stepik Interactive Text for Week 2
Introductory Algorithms in Population Genetics
Graded: Week 3 Quiz