BMB 961-301: Gaps, Missteps, and Errors in Statistical Data Analysis
- Description
- Prerequisites
- Instructor Contact Information
- Course Outline and Materials
- Schedule, Location, Calendar, and Office Hours
- Website and Communication
- Course Activities
- Grading Information
- Attendance, Conduct, Honesty, and Accommodations
Description
This is an advanced short (1-credit) course designed to:
- Discuss common misunderstandings & typical errors in the practice of statistical data analysis.
- Provide a mental toolkit for critical thinking about, and inquiry into, analytical methods and results.
Classes will involve lectures, discussions, hands-on exercises, and homework about concepts critical to the day-to-day use and consumption of quantitative/computational techniques. Please use this course flyer to help spread the word.
Prerequisites
This is not an introductory course in statistics or programming. We will assume: 1) Familiarity with basic statistics & probability. 2) Ability to do basic data wrangling, analyses, & visualization using R or Python.
- Strongly recommended MSU courses: CMSE 201 and CMSE 890 Sec 301-or-304 and Sec 302.
- Check out the recommended online preparatory materials listed below, which you can use to refresh all these concepts.
Registration
If you are not a declared Biochemistry graduate student, please contact Arjun (arjun@msu.edu) and then submit the online override request form found here. For more specific information, please visit this page.
Course Survey
Please fill out the course survey to help me learn more about your background, motivation, etc.
Instructor Contact Information
Arjun Krishnan | …
:--- | :---
Affiliation | Dept. Computational Mathematics, Science, and Engineering</br>Dept. Biochemistry and Molecular Biology
Office | 2507H Engineering Building
Contact | Email: arjun@msu.edu</br>Twitter: @compbiologist</br>Website: https://cmse.msu.edu/directory/faculty/arjun-krishnan/
Course Outline and Materials
Major Topics
(subject to change)
- P-value & P-hacking
- Multiple hypothesis correction
- Estimation of error & uncertainty
- Statistical power / Underpowered statistics / Sample size calculation
- Pseudoreplication
- Confounding variables & batch effects
- Circular analysis
- Regression to the mean & stopping rules
- Confirmation & survivorship bias
- Base rates & Permutation test
- Describing different distributions
- Continuity errors & model abuse
- Visualization challenges
- Researcher degrees of freedom
- Data sharing / Hiding data
- Reproducible research
- Difference in significance & significant differences
Recommended Preparatory Materials
I’m recommending the open & free versions of all the materials below.
Python
- Learning Python the Hard Way
  - If you are new to programming and want to learn Python…
- MIT OpenCourseWare (video lectures + notes + assignments)
  - If you already know programming in some other language and want to learn Python, computational and data science …
  - Introduction to Computer Science and Programming in Python
  - Introduction to Computational Thinking and Data Science
- Data Visualization with Python
R
- Programming with R
  - If you are new to programming and want to learn R…
- R for Data Science
  - An awesome book on using R for data analysis and visualization!
- R Cheatsheets
  - A great set of visually appealing cheatsheets.
Probability and statistics
- Khan Academy – Statistics
  - Great short videos on probability and statistics.
- Think Stats (book + code + solutions)
  - Introduction to Probability and Statistics (designed for Python programmers but can very much be used with other languages).
Schedule, Location, Calendar, and Office Hours
Nov 5 – Dec 5, 2018
S/L | Info |
---|---|
Schedule | Mon and Wed</br>12:40-2:00 pm |
Location | 202 Biochemistry Building |
Calendar
Date | Topic | Content | Learning Materials
:---: | :--- | :--- | :---
Nov 05 (M) | Introduction & Overview | Course overview | Lecture 1 [PDF]
Nov 07 (W), Nov 12 (M) | Topic 1: Statistical hypothesis testing | P-value & P-hacking; Multiple hypothesis correction; Estimation of error & uncertainty | Lectures 2 and 3 [PDF]
Nov 14 (W), Nov 19 (M) | Topic 2: Experimental design | Statistical power / underpowered statistics; Sample size calculation; Pseudoreplication; Confounding variables & batch effects | Lectures 4 and 5 [PDF]
Nov 21 (W), Nov 26 (M) | Topic 3: Unknown variables, Cognitive biases, & Base rate | Circular analysis; Regression to the mean & stopping rules; Confirmation & survivorship bias; Permutation test | Lectures 6 and 7 [PDF]
Nov 28 (W), Dec 03 (M) | Topic 4: Descriptive statistics, Modeling, Visualization | Describing different distributions; Continuity errors & model abuse; Visualization challenges | Lectures 8 and 9 [PDF]
Dec 05 (W) | Topic 5: Reproducibility | Researcher degrees of freedom; Data sharing / Hiding data; Reproducible research; Roundup | Lecture 10 [PDF]
Final exam
Item | Due date
:--- | ---:
Final exams | Mon, Dec 10
Office Hours
- Wednesday 8–9am
- Thursday 5–6pm
I will block these times from my schedule and be present in my office (2507H Engineering).
A couple of things to note:
- While I’m happy to chat with you in person, sending me a message on Slack with your questions/concerns will often work just as well. So, if you have specific Qs in mind, just shoot me a message and let’s see if we can resolve it then and there.
- If you would indeed like to meet in person, please try to meet me during these office hours. But don’t worry if you can’t make it during these windows for some reason. Again, just send me a message on Slack and we’ll find a time that works for both of us.
Website and Communication
Course website
This GitHub repo will serve as the course website.
Communication
The primary mode of communication in this course (including major announcements) will be the course Slack account https://bmb961-statgaps-nov18.slack.com. All of you should have invitations to join this account in your MSU email.
Emails
Although the bulk of the communication will take place via Slack, at times (rarely) we will send out important course information via email. These emails will be sent to your MSU email address (the one that ends in “@msu.edu”). You are responsible for all information sent to your University email account, and for checking this account on a regular basis.
Course Activities
Assignments
For each topic, you will be assigned reading material after the topic’s first class (Wed) that you are required to read. Links to these materials will be posted on this page next to the topic on the Calendar, along with instructions on a specific analysis in the paper that you should pay special attention to. Along with the reading, you might be given a data analysis assignment that you have to complete.
Each completed assignment is due before the topic’s second class (the following Mon, per the Calendar).
Class Participation
In general:
- Do the assignments and additional readings.
- Show up to class.
- Work in groups during in-class discussion sessions.
- No one will have the perfect background: Ask questions about computational or biological concepts.
- Correct me when I am wrong.
Final Exam
A major goal of this course is to prepare you for performing statistical data analysis with care, and for presenting your ideas and findings effectively. The final exam will serve as a practical way to do exactly that.
Grading Information
Activity | Percentage
:--- | ---:
Assignments | ~50%
Class participation | ~25%
Final Exam/Project | ~25%
Attendance, Conduct, Honesty, and Accommodations
Class Attendance
This class is heavily based on material presented and worked on in class, and it is critical that you attend and participate fully every week! Therefore, class attendance is absolutely required. Arriving late, leaving early, or not showing up for a whole class without prior arrangement with the instructor counts as an unexcused absence. Note that if you have a legitimate reason to miss class (such as job, graduate school, or medical school interviews), you must arrange this ahead of time to be excused from class. More than two unexcused absences will impact your grade at the discretion of the course instructor.
Code of Conduct
All conduct should serve the singular goal of sustaining a friendly, supportive, and fun environment where we can do our best work and have a great time doing it.
- Do work that you’re proud of, from the smallest piece of writing/code to the final exams.
- Be supportive of your classmates; respect each other’s strengths, weaknesses, differences, and beliefs.
- Communicate openly & respectfully with everyone in the class.
- Ask for help; at the same time, respect and appreciate others’ time and effort.
Respectful and responsible behavior is expected at all times, which includes not interrupting other students, turning your cell phone off, refraining from non-course-related use of electronic devices, and not using offensive or demeaning language in our discussions. Flagrant or repeated violations of this expectation may result in ejection from the classroom, grade-related penalties, and/or involvement of the university Ombudsperson.
I am unequivocally dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, or religion (or lack thereof). We will not tolerate harassment of colleagues in any form. Behaviors that could be considered discriminatory or harassing, or unwanted sexual attention, will not be tolerated and will be immediately reported to the appropriate MSU office (which may include the MSU Police Department).
Academic honesty
Intellectual integrity is the foundation of the scientific enterprise. In all instances, you must do your own work and give proper credit to all sources that you use in your papers and oral presentations – any instance of submitting another person’s work, ideas, or wording as your own counts as plagiarism. This includes failing to cite any direct quotations in your essays, research paper, class debate, or written presentation. The MSU College of Natural Science adheres to the policies of academic honesty as specified in the General Student Regulations 1.0, Protection of Scholarship and Grades, and in the all-University statement on Integrity of Scholarship and Grades, which are included in Spartan Life: Student Handbook and Resource Guide. Students who plagiarize will receive a 0.0 in the course. In addition, University policy requires that any cheating offense, regardless of the magnitude of the infraction or punishment decided upon by the professor, be reported immediately to the dean of the student’s college.
It is important to note that plagiarism in the context of this course includes, but is not limited to: directly copying another student’s solutions to in-class or homework problems; copying materials from online sources, textbooks, or other reference materials without citing those references in your source code or documentation; or having somebody else do your pre-class work, in-class work, or homework on your behalf. Any work that is done in collaboration with other students should state this explicitly, and have their names as well as yours listed clearly.
More broadly, we ask that students adhere to the Spartan Code of Honor academic pledge, as written by the Associated Students of Michigan State University (ASMSU): “As a Spartan, I will strive to uphold values of the highest ethical standard. I will practice honesty in my work, foster honesty in my peers, and take pride in knowing that honor is worth more than grades. I will carry these values beyond my time as a student at Michigan State University, continuing the endeavor to build personal integrity in all that I do.”
Accommodations
If you have a university-documented learning difficulty or require other accommodations, please provide me with your VISA as soon as possible and speak with me about how I can assist you in your learning. If you do not have a VISA but have been documented with a learning difficulty or other problems for which you may still require accommodation, please contact MSU’s Resource Center for People with Disabilities (355-9642) in order to acquire current documentation.