
CMSE 410/890: Bioinformatics and Computational Biology

Description

This course is an introduction to the inner workings of methods in bioinformatics and computational biology: analytical techniques, algorithms, and statistical/machine-learning approaches developed to address key questions in biology and medicine.

In this course, students will also learn how to formulate problems for quantitative inquiry, design computational projects, think critically about data & methods, do reproducible research, and communicate findings.

Note
Open to both undergraduate and graduate students. Counts toward the CMSE minor, graduate certificates, and dual PhD. Please email Heather Johnson at john1451@msu.edu for an override.

Prerequisites
CMSE 201 or CMSE 301-304 or equivalent with programming experience and two semesters of introductory biology (LB 144 and 145 OR BS 161 and 162 OR BS 181H and 182H, or equivalent). Statistics at the level of STT 231 is strongly recommended.

Basically, it would be assumed that you:

Instructor Contact Information

| Arjun Krishnan | … |
| :--- | :--- |
| Affiliation | Dept. Computational Mathematics, Science, and Engineering<br>Dept. Biochemistry and Molecular Biology |
| Office | 2507H Engineering Building |
| Contact | Email: arjun@msu.edu<br>Twitter: @compbiologist<br>Website: https://thekrishnanlab.org |


Course Outline and Materials

Major Topics

Biological topics

  1. Genome assembly & annotation
  2. Sequence alignment & pattern finding
  3. Comparative genomics; Phylogenomics
  4. Genetic variation & quantitative genetics
  5. Regulatory genomics
  6. Functional genomics
  7. Single-cell genomics
  8. Molecular dynamics; Structure prediction
  9. Modeling cellular pathways
  10. Whole-cell models; Digital evolution
  11. Biological networks

Computational / Analytical topics

Primers

Conducting a Bioinfo / CompBio Project: A Practical Primer in 3-parts:

This document contains links to a set of excellent resources for brushing up on your Unix, Python/R, Statistics, and Biology.


Schedule, Location, Calendar, and Office Hours

| S/L | Info |
| :--- | :--- |
| Schedule | Mon, Wed, and Fri<br>11:00 am - 12:10 pm |
| Location | 351 Natural Sciences Bldg |

Calendar

This calendar contains the class schedule and the links to the lecture slides and reading materials. Download the detailed schedule as a PDF.

| Day | Date | Module | Topic | Learning Materials |
| :--- | :--- | :--- | :--- | :--- |
| Day 01 | Mon, Jan 06 | Introduction, Overview, and Refreshers | Course overview | [Lecture] |
| Day 02 | Wed, Jan 08 | Introduction, Overview, and Refreshers | Refresher 1: Concepts in statistics & probability | |
| Day 03 | Fri, Jan 10 | Introduction, Overview, and Refreshers | Refresher 2: Concepts in model building + Intro to Bioinformatics & Computational Biology | [Lecture] |
| Day 04 | Mon, Jan 13 | Genome assembly & annotation | Assembly with de Bruijn graphs | [Lecture] |
| Day 05 | Wed, Jan 15 | Genome assembly & annotation | Gene prediction with Hidden Markov models | [Lecture] |
| Day 06 | Fri, Jan 17 | Genome assembly & annotation | Paper discussion; HMM continued | Velvet: Algorithms for de novo short read assembly using de Bruijn graphs [Journal] [PDF] |
| | Mon, Jan 20 | No class | Need an extra hour (or two 30-minute slots) to compensate | |
| Day 07 | Wed, Jan 22 | Sequence alignment & pattern finding | Dynamic programming; Substitution matrices | [Lecture] |
| Day 08 | Fri, Jan 24 | Sequence alignment & pattern finding | Paper discussion; Basic Local Alignment Search Tool | Basic local alignment search tool [PDF]<br>Steps used by the BLAST algorithm [PDF] |
| Day 09 | Mon, Jan 27 | Comparative genomics; Phylogenomics | Whole genome alignment; Suffix trees | [Lecture] |
| Day 10 | Wed, Jan 29 | Comparative genomics; Phylogenomics | Molecular evolution; Tree construction | [Lecture] |
| Day 11 | Fri, Jan 31 | Comparative genomics; Phylogenomics | Paper discussion | Whole-genome alignment:<br>- MUMmer1: Alignment of whole genomes [Journal] [PDF]<br>- MUMmer2: Fast algorithms for large-scale genome alignment and comparison [Journal] [PDF]<br>- MUMmer3: Versatile and open software for comparing large genomes [Journal] [PDF]<br>- MUMmer4: A fast and versatile genome alignment system [Journal] [PDF] |
| Day 12 | Mon, Feb 03 | Genetic variation & quantitative genetics | GWAS, Regularized linear regression | [Lecture] |
| Day 13 | Wed, Feb 05 | Genetic variation & quantitative genetics | Statistical inference, Multiple testing | [Lecture] |
| Day 14 | Fri, Feb 07 | Genetic variation & quantitative genetics | Polygenic risk score; Paper discussion | |
| Day 15 | Mon, Feb 10 | Regulatory genomics | ChIP-seq & Expectation-Maximization | [Lecture] |
| Day 16 | Wed, Feb 12 | Regulatory genomics | Expectation-Maximization & Gibbs Sampling | [Lecture] |
| Day 17 | Fri, Feb 14 | Regulatory genomics | Paper discussion | - What are DNA sequence motifs? [Journal] [PDF]<br>- How does DNA sequence motif discovery work? [Journal] [PDF]<br>- What is the Expectation Maximization algorithm? [Journal] [PDF]<br>- Practical Strategies for Discovering Regulatory DNA Sequence Motifs [Journal] [PDF] |
| Day 18 | Mon, Feb 17 | Functional genomics | Distance measures; Clustering | [Lecture] |
| Day 19 | Wed, Feb 19 | Functional genomics | Differential expression; Functional enrichment analysis | |
| Day 20 | Fri, Feb 21 | Functional genomics | Paper discussion | A module map showing conditional activity of expression modules in cancer [Journal] [PDF] |
| Day 21 | Mon, Feb 24 | Conducting a Bioinfo / CompBio Project: A Practical Primer in 3-parts | Organizing and managing a CompBio project | [Lecture] |
| Day 22 | Wed, Feb 26 | Conducting a Bioinfo / CompBio Project: A Practical Primer in 3-parts | Kickstarting and getting help in a CompBio project | [Lecture] |
| Day 23 | Fri, Feb 28 | Conducting a Bioinfo / CompBio Project: A Practical Primer in 3-parts | Kickstarting and getting help in a CompBio project | [Lecture] |
| | Mon, Mar 02 | No class | Spring break | |
| | Wed, Mar 04 | No class | Spring break | |
| | Fri, Mar 06 | No class | Spring break | |
| Day 24 | Mon, Mar 09 | Bioinformatics & Computational Biology Co-work Sessions | Diff. time: 10a–12:15p + Diff. location: Holmes Hall, Classroom W5 | |
| Day 25 | Wed, Mar 11 | Bioinformatics & Computational Biology Co-work Sessions | Diff. time: 10a–12:15p + Diff. location: Holmes Hall, Classroom W5 | |
| Day 26 | Fri, Mar 13 | Bioinformatics & Computational Biology Co-work Sessions | Diff. time: 10a–12:15p + Diff. location: Holmes Hall, Classroom W5 | |
| Day 27 | Mon, Mar 16 | Mid-course project presentations | Lightning talks | |
| Day 28 | Wed, Mar 18 | Mid-course project presentations | Lightning talks | |
| Day 29 | Fri, Mar 20 | Mid-course project presentations | Lightning talks | |
| Day 30 | Mon, Mar 23 | Single-cell genomics | Dimensionality reduction | [Lecture] |
| Day 31 | Wed, Mar 25 | Single-cell genomics | Supervised machine learning | [Lecture] |
| Day 32 | Fri, Mar 27 | Single-cell genomics | Paper discussion | The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells [Journal] [PDF] |
| Day 33 | Mon, Mar 30 | Protein structure prediction & dynamics | Maximum entropy modeling | [Lecture] |
| Day 34 | Wed, Apr 01 | Protein structure prediction & dynamics | Molecular dynamics | [Lecture] |
| Day 35 | Fri, Apr 03 | Protein structure prediction & dynamics | Paper discussion | Evolutionarily conserved networks of residues mediate allosteric communication in proteins [Journal] [PDF] |
| Day 36 | Mon, Apr 06 | Modeling cellular pathways | Dynamical modeling | [Lecture] |
| Day 37 | Wed, Apr 08 | Modeling cellular pathways | State Space, Bifurcation | [Lecture] |
| Day 38 | Fri, Apr 10 | Modeling cellular pathways | Paper discussion | [Lecture]<br><br>Construction of a genetic toggle switch in E. coli [Journal] [PDF] |
| Day 39 | Mon, Apr 13 | Whole-cell models; Digital evolution | Genome-scale metabolic models; Constraint-based modeling | [Lecture] |
| Day 40 | Wed, Apr 15 | Whole-cell models; Digital evolution | Constraint-based modeling | [Lecture] |
| Day 41 | Fri, Apr 17 | Whole-cell models; Digital evolution | Paper discussion; Artificial life | - Network-based prediction of human tissue-specific metabolism [Journal] [PDF]<br>- Integration of expression data in genome-scale metabolic network reconstructions (Mini supplementary reading to help with the main paper) [Journal] [PDF]<br><br>[Lecture] |
| Day 42 | Mon, Apr 20 | Biological networks | Network representation and topology | [Lecture] |
| Day 43 | Wed, Apr 22 | Biological networks | Condition-specificity of networks; Network reconstruction | [Lecture] |
| Day 44 | Fri, Apr 24 | Biological networks | Paper discussion; Network propagation | Genomic analysis of regulatory network dynamics reveals large topological changes [Journal] [PDF]<br><br>[Lecture] |
| Day 45 | Thu, Apr 30 | Final project poster presentations | Poster presentations; Diff. time: 12:45pm - 2:45pm | |

Project deadlines

| Task | Due date |
|:---|---:|
| Project profile | Wed, Jan 15 |
| Project topic | Fri, Jan 31 |
| Project pre-proposal | Fri, Feb 07 |
| Project proposal | Wed, Feb 19 |
| Proposal reviews | Fri, Feb 28 |
| Mid-term project presentations | Mon, Mar 16; Wed, Mar 18; Fri, Mar 20 |
| Mid-course project report | Fri, Mar 27 |
| Final project report | Fri, Apr 24 |
| Final project poster presentations | Thu, Apr 30 |

Office Hours

Tue and Wed: 5–6 pm.

I will block these times on my schedule and be present in my office.

A couple of things to note:

  1. While I’m happy to chat with you in person, many times, just sending me a message on Slack with your questions/concerns might work as well. So, if you have specific Qs in mind, just shoot me a message and let’s see if we can resolve it then and there.
  2. If you would indeed like to meet in person, please try to meet me during this time. But, don’t worry if you can’t make it during this window for some reason. Again, just send me a message on Slack and we’ll find a time that works for both of us.


Website and Communication

Course website

This GitHub repo will serve as the course website.

Communication

The primary mode of communication in this course, including major announcements, will be the course Slack account. All of you should have received an invitation to join this account at your MSU email address.

Emails
Although the bulk of the communication will take place via Slack, at times (rarely), we will send out important course information via email. This email is sent to your MSU email address (the one that ends in “@msu.edu”). You are responsible for all information sent out to your University email account, and for checking this account on a regular basis.


Course Activities

Assignments

For each topic, you will be given an assignment after the topic’s first “Lecture” class on Monday. You are required to work on it and submit it before the beginning of the “Paper discussion” class on Friday of the same week. Links to the assignment will be posted on this page next to the topic on the Calendar, and specific instructions will be posted on Slack.

Class Participation

In general:

Paper discussion

You will also take turns presenting the assigned paper during each topic’s “Paper discussion” class. Make sure you sign up.

Semester Project and Presentation

A major goal of this course is to prepare you for performing original research in computational biology, and for effectively presenting your ideas and research. The semester project will serve as the most practical way to do exactly that.

Projects can take any one of the following flavors:

The outcomes of this semester-long project should include:

  1. Well-documented code to:
    • download and process the data
    • perform the computational analysis and generate all the results
    • visualize the results as various plots
  2. Detailed final report containing the following sections:
    • Abstract
    • Introduction
    • Data and Methods
    • Results and Discussion
    • Limitations and Future Directions
    • References
    • Glossary
  3. A poster that describes your project: motivation, exact problem, approach, results, discussion & conclusions, limitations & future directions, acknowledgements.

There are several project deadlines throughout the course that will help you stay on track, enabling you to complete a substantial project.

  1. Describe your previous research, areas of research interest in bioinformatics / computational-biology, type of project that best fits your interests. Post this description in a profile that lets your classmates know you. Project profile due Wed, Jan 15.

  2. Discuss with Arjun (and any other PI) and read recent papers. Briefly describe project ideas. Project topic due Fri, Jan 31.

  3. Prepare a two-page pre-proposal (Page1: text; Page2: figures & references). Project pre-proposal due Fri, Feb 07.

  4. Write full proposal. Project proposal due Wed, Feb 19.
    • Length: 5 pages (incl. figures & references; sections listed below)
    • Sections:
      • Background, goals, & significance (what is the problem you are hoping to address; what is the current approach & its limitations; what will you do & why is it likely to succeed; if successful, what is the broader impact)
      • Datasets (what datasets will you use; where are they from; what exactly do they contain; how are they formatted)
      • Computational methods/approach (what are the analytical methods; what are the specific software implementations you’ll use; include your flowchart here)
      • Evaluation plan (how will you evaluate the results that you get; think in terms of how to test if a) your approach is working correctly without errors and b) your results make quantitative/biological sense and are meaningful)
      • Potential challenges & alternative approaches (what are some assumptions you are making that can fail; what are some potential limitations of your dataset or approach that might prevent you from achieving your aforementioned goals; what will you do as alternatives if you hit those limitations)
      • Specific milestones (what is the list of specific results/outcomes you will work on getting)
  5. Review proposals. Discuss proposal with Arjun. Reviews due Fri, Feb 28.

  6. Mid-course project presentations on Mon, Mar 16, Wed, Mar 18, and Fri, Mar 20.
    • Address peer evaluations, revise aims, scope, and list of final goals & deliverables. Meet with Arjun about reviews, revised plan, and progress.
    • In addition to the usual things – background, problem, approach, etc. – I would like you to also present the following:
      • Clear flowchart of approach.
        • Raw data → Preprocessing & quality control → Preliminary/exploratory analysis → Analysis/Model-building steps → Expected outcomes.
      • Thorough exploration and sanity checks of data.
        • Tables & plots to showcase various aspects of your datasets/problem.
      • Method/software.
        • Usage & I/O format for each.
      • Preliminary analysis
        • With simple baselines, sample datasets, and toy examples.
  7. Continue making substantial progress on proposed milestones. Write mid-course project report. Mid-course project report due Fri, Mar 27.

  8. Complete milestones, finalize results and figures, and prepare the write-up in conference publication format. As part of the report, comment on your overall project experience. Final project report due Fri, Apr 24.

  9. Final project presentations will take place on Thu, Apr 30 12:45pm – 2:45pm in 351 Natural Sciences Bldg.


Grading Information

| Activity | Percentage |
| :--- | ---: |
| Assignments | ~35% |
| Class participation | ~15% |
| Project | ~50% |

Grading scale

| Point | Percentage |
| ---: | ---: |
| 4.0 | ≥ 90% |
| 3.5 | ≥ 85% |
| 3.0 | ≥ 80% |
| 2.5 | ≥ 75% |
| 2.0 | ≥ 70% |
| 1.5 | ≥ 65% |
| 1.0 | ≥ 60% |
| 0.0 | < 60% |

Note: Grades will not be curved. Your grade is based on your own effort and progress, not based on competition with your classmates.
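
To make the arithmetic concrete, here is a minimal Python sketch of how an overall percentage could be mapped onto the grading scale. It is only an illustration, not the official calculation: it assumes the approximate weights (~35%, ~15%, ~50%) are applied exactly, that no rounding is done before the cutoffs are applied, and it ignores attendance-related adjustments. The function names and the example scores are hypothetical.

```python
# Weights from the grading table. The "~" in the syllabus means they are
# approximate, so treating them as exact is an assumption of this sketch.
WEIGHTS = {"assignments": 0.35, "participation": 0.15, "project": 0.50}

# (minimum percentage, grade point) pairs from the grading scale, highest first.
SCALE = [(90, 4.0), (85, 3.5), (80, 3.0), (75, 2.5), (70, 2.0), (65, 1.5), (60, 1.0)]


def overall_percentage(scores):
    """Weighted average of per-activity percentages (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def grade_point(percentage):
    """Map an overall percentage to the 4.0 scale; below 60% maps to 0.0."""
    for cutoff, point in SCALE:
        if percentage >= cutoff:
            return point
    return 0.0


# Hypothetical scores, for illustration only.
example = {"assignments": 88.0, "participation": 100.0, "project": 80.0}
total = overall_percentage(example)
print(f"{total:.1f}% -> {grade_point(total)}")
```

With these made-up scores, the weighted total is 0.35 × 88 + 0.15 × 100 + 0.50 × 80 = 85.8%, which falls in the "≥ 85%" band and corresponds to a 3.5.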


Attendance, Conduct, Honesty, and Accommodations

Class Attendance & Presence

This class is heavily based on material presented and worked on in class, and it is critical that you attend and participate fully every week! Therefore, class attendance is absolutely required. An unexcused absence will result in zero points for the day. Arriving late or leaving early without prior arrangement with the instructor of your session counts as an unexcused absence. Note that if you have a legitimate reason to miss class (such as job, graduate school, or medical school interviews), you must arrange this ahead of time to be excused from class. Three unexcused absences will result in the reduction of your grade by one step (e.g., from 4.0 to 3.5), with additional absences reducing your grade further at the discretion of the course instructor.

Code of Conduct

All conduct should serve the singular goal of sustaining a friendly, supportive, and fun environment where we can do our best work and have a great time doing it.

Respectful and responsible behavior is expected at all times, which includes not interrupting other students, turning your cell phone off, refraining from non-course-related use of electronic devices, and not using offensive or demeaning language in our discussions. Flagrant or repeated violations of this expectation may result in ejection from the classroom, grade-related penalties, and/or involvement of the university Ombudsperson.

I am unequivocally dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, or religion (or lack thereof). We will not tolerate harassment of colleagues in any form. Behaviors that could be considered discriminatory or harassing, or unwanted sexual attention, will not be tolerated and will be immediately reported to the appropriate MSU office (which may include the MSU Police Department).

Academic honesty

Intellectual integrity is the foundation of the scientific enterprise. In all instances, you must do your own work and give proper credit to all sources that you use in your papers and oral presentations – any instance of submitting another person’s work, ideas, or wording as your own counts as plagiarism. This includes failing to cite any direct quotations in your essays, research paper, class debate, or written presentation. The MSU College of Natural Science adheres to the policies of academic honesty as specified in the General Student Regulations 1.0, Protection of Scholarship and Grades, and in the all-University statement on Integrity of Scholarship and Grades, which are included in Spartan Life: Student Handbook and Resource Guide. Students who plagiarize will receive a 0.0 in the course. In addition, University policy requires that any cheating offense, regardless of the magnitude of the infraction or punishment decided upon by the professor, be reported immediately to the dean of the student’s college.

It is important to note that plagiarism in the context of this course includes, but is not limited to, directly copying another student’s solutions to in-class or homework problems; copying materials from online sources, textbooks, or other reference materials without citing those references in your source code or documentation, or having somebody else do your pre-class work, in-class work, or homework on your behalf. Any work that is done in collaboration with other students should state this explicitly, and have their names as well as yours listed clearly.

More broadly, we ask that students adhere to the Spartan Code of Honor academic pledge, as written by the Associated Students of Michigan State University (ASMSU): “As a Spartan, I will strive to uphold values of the highest ethical standard. I will practice honesty in my work, foster honesty in my peers, and take pride in knowing that honor is worth more than grades. I will carry these values beyond my time as a student at Michigan State University, continuing the endeavor to build personal integrity in all that I do.”

Accommodations

If you have a university-documented learning difficulty or require other accommodations, please provide me with your VISA as soon as possible and speak with me about how I can assist you in your learning. If you do not have a VISA but have been documented with a learning difficulty or other problems for which you may still require accommodation, please contact MSU’s Resource Center for People with Disabilities (355-9642) in order to acquire current documentation.


Going Online

Folks, hope all of you are doing ok in these surely difficult circumstances. I hope you are taking all the precautions to keep you and everyone around you safe and unaffected:

Since we are going completely online for the rest of the semester, including the finals, I am making a number of adjustments to deliver this class as effectively as possible, and I will work on helping you master the remaining course material as best I (and you) can in these circumstances. There is a lot in here, so please read through it carefully.

Permanent Zoom Meeting ID

The link is available in the #GoingOnline post on our class Slack.

All Schedules

The google sheet shared on Slack contains the schedule/assignments for:

2. Zoom Best Practices

3. Midterm Presentations – March 16, 18, 20

We are going to use the Zoom meeting room.

4. Lectures

We are going to use the Zoom meeting room.

5. Student Paper Presentations on Fridays

We are going to use the Zoom meeting room and the style will be similar to how you did your mid-term presentations.

6. Class Participation

I hope you understand that I’m looking for ways to ensure you’re present and paying attention.

7. Office Hours

We are going to use Slack video calls to do all our office hours.

8. Final Presentations – April 30

Some preliminary thoughts here. This will be solidified soon.

9. Class Timeline

At the outset:

However, I want to qualify this wish:

Finally, all the physical distancing we are doing to mitigate the spread of infection can lead to a lot of stress. The isolation is not great for mental health and the confinement is not great for physical health. To help you a little in mitigating these effects, here are some additional things we will do as a class.

10. Virtual Meetings with Your Learning Group

Take advantage of the in-class learning groups you are part of. I am asking you to meet remotely with your learning group once every week, perhaps on Thursday or before class on Friday, over a video call. You can coordinate a time that works for all of you and decide how you will do the video call. When you meet online, you can:

11. Scrum

I have created a #scrum channel in our class Slack. Let’s use this channel to do a daily scrum to help each other keep their work – any school/professional work, not just what’s related to this class – on-track. Include studying, preparation, reading, etc.

12. Health & Workout

I have created a #health-workout channel on our class Slack. Let’s use this channel to share our ideas and post our daily plans/goals for keeping up our mental and physical health.

13. Fun

I have created a #fun channel on our class Slack. Let’s use this channel to share your ideas/plans for engaging in a hobby or just disengaging with a Netflix show.

14. Frequently Asked Questions

Q. Can I connect to Zoom by dialing a phone number?
A. Yes, additional details are in the #GoingOnline post on Slack.
