Course syllabus
Graduate Seminar in Statistics
The graduate seminar provides a venue for MS students to explore research trends in statistics through reading and discussion of recent papers; readings are selected based on relevance, influence, and student and faculty research interests. In Winter 2025, the selected readings explore methodology and statistical thinking in the era of large and complex datasets and prediction algorithms.
Instructor: Trevor Ruiz (he/him/his) [email] [website]
Class meetings: 10:10am — 11:00am R 180-331
Office hours: [by appointment]
Catalog Description: Topics in advanced statistics selected by the faculty. Discussion of current research papers in statistics and implementation of methods.
Learning outcomes
The goal of the graduate seminar is to enable successful students to:
[L1] Investigate and discuss current research in the statistics field
[L2] Implement current statistical methods in a modern computing language
[L3] Solve statistical problems in current research
[L4] Communicate statistical ideas related to current research
Assessments
As this is a discussion-oriented class, you will be assessed based on your preparation, participation, and attendance as outlined below.
Synopses. In advance of each class meeting you’ll be expected to prepare and submit a short synopsis of the reading(s) — no more than one page — summarizing at least one major contribution and/or idea and articulating at least one discussion question. You can skip at most one synopsis without penalty; additional exceptions will be considered on a case-by-case basis. Synopses need not be comprehensive or detailed in nature, but should aim to convey central ideas in plain(ish) language.
Minimum effort: one paragraph and one question related to the reading(s).
Satisfactory effort: clear summary of one or more ideas or contributions in the reading(s); two or more thoughtful questions.
Strong effort: clear summary that effectively identifies the central ideas/contributions in the reading(s); two or more thoughtful questions that probe the content of the paper(s), its application or extension to other contexts, or its relation to other work.
Participation. You are expected to contribute to discussions by (a) asking/answering questions and sharing thoughts to an appropriate extent in class and (b) commenting on shared copies of readings. In addition, you are expected to lead discussion on at least one occasion. Responsibility for leading discussions will rotate among students in the class and will consist of presenting a short introduction/overview (no more than 10 minutes) of the reading(s) summarizing the main ideas. In this leadership role you may instead choose to provide an alternative discussion stimulus as appropriate, such as a demonstration, activity, or presentation of related work or specialized background helpful in understanding the paper. Like synopses, introductory presentations need not be comprehensive, but should aim to convey central ideas clearly.
Minimum effort: participate in discussion occasionally; comment on papers occasionally; when serving as discussion leader, prepare and present an introduction to the assigned reading(s).
Satisfactory effort: participate in discussion often and comment on papers often; when serving as discussion leader, prepare and present a clear introduction that provides a set of starting points for discussion by identifying a few main ideas or contributions in the assigned reading(s).
Strong effort: participate in discussion often and comment on most papers with thoughtful and helpful contributions; when serving as discussion leader, prepare and present a clear introduction that effectively identifies a few central ideas in the assigned reading(s), contextualizes them by drawing connections with prior/related/familiar work/applications/methods, and poses interesting discussion prompts.
Attendance. Attendance will be recorded at each meeting. You are expected to attend every meeting and may not miss more than two. Exceptions to this all-but-two policy will be granted for excusable absences only.
Minimum effort: attend at least 7 meetings.
Satisfactory effort: attend at least 7 meetings.
Strong effort: attend at least 7 meetings.
Your grade will be based on the extent to which you meet the expectations above. Consistently minimum efforts will earn you a C or better; consistently satisfactory efforts will earn you a B or better; consistently strong efforts will earn you an A.
Tentative schedule
Subject to change.
Week 1 (1/6/25)
Introductions & logistics; no reading.
Week 2 (1/13/25)
Interpretable ML.
Allen, G. I., Gan, L., & Zheng, L. (2023). Interpretable machine learning for discovery: Statistical challenges and opportunities. Annual Review of Statistics and Its Application.
Koh, P. W., & Liang, P. (2017). Understanding black-box predictions via influence functions. Proceedings of the 34th International Conference on Machine Learning.
Week 3 (1/21/25)
TBD.
Week 4 (1/28/25)
Clustering methods for high-dimensional data.
Bouveyron, C., Girard, S., & Schmid, C. (2007). High-dimensional data clustering. Computational Statistics & Data Analysis.
Soltanolkotabi, M., Elhamifar, E., & Candès, E. J. (2014). Robust subspace clustering. The Annals of Statistics.
Witten, D. M., & Tibshirani, R. (2010). A framework for feature selection in clustering. Journal of the American Statistical Association.
Week 5 (2/3/25)
Statistics and society.
Mitchell, S., Potash, E., Barocas, S., D’Amour, A., & Lum, K. (2021). Algorithmic fairness: Choices, assumptions, and definitions. Annual Review of Statistics and Its Application.
Schneider, C. R., Kerr, J. R., Dryhurst, S., & Aston, J. A. (2023). Communication of statistics and evidence in times of crisis. Annual Review of Statistics and Its Application.
Week 6 (2/10/25)
Estimation from nonrandom samples with big data.
Meng, X.-L. (2018). Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics.
Week 7 (2/17/25)
Causal inference.
D’Agostino McGowan, L., Gerke, T., & Barrett, M. (2024). Causal inference is not just a statistics problem. Journal of Statistics and Data Science Education.
Ding, P., & Li, F. (2018). Causal inference. Statistical Science.
Imbens, G. W. (2024). Causal inference in the social sciences. Annual Review of Statistics and Its Application.
Week 8 (2/24/25)
Conformal prediction: distribution-free predictive inference.
Lei, J., Robins, J., & Wasserman, L. (2013). Distribution-free prediction sets. Journal of the American Statistical Association.
Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., & Wasserman, L. (2018). Distribution-free predictive inference for regression. Journal of the American Statistical Association.
Week 9 (3/3/25)
Variable selection with knockoffs.
Barber, R. F., & Candès, E. J. (2015). Controlling the false discovery rate via knockoffs. The Annals of Statistics.
Barber, R. F., Candès, E. J., & Samworth, R. J. (2020). Robust inference with knockoffs. The Annals of Statistics.
Week 10 (3/10/25)
TBD.
Policies
Time commitment
STAT590 is a one-credit course, which corresponds to a minimum time commitment of 3 hours per week, including class meetings, reading, assignments, and other preparation.
Because you can expect to take leadership roles with respect to class discussions at certain scheduled times during the quarter, you should also expect the distribution of workload to be a little uneven. Please take this into consideration when planning ahead.
Attendance and absences
Regular attendance is essential for success in the course and required per University policy. Absences should be excusable, but you do not need to notify me unless you will miss a meeting for which you are in a leadership role; if so, please email me with as much notice as possible.
In general, you may not miss more than two class meetings for the quarter; please get in touch with me if extenuating circumstances arise that require an exception to this policy.
Communication and email
I encourage you to use class meetings and office hours to ask questions or discuss matters related to the course, since those are the only certain means of obtaining a response within a guaranteed time frame.
I respond to most emails within 24 weekday hours, but I cannot guarantee this response time and I occasionally miss messages altogether (though I try not to). I don’t answer emails at night or on weekends, so while you are welcome to write to me outside of business hours, please don’t expect a reply until the following business day. I also sometimes get behind on answering emails, so please wait a few days (preferably one week if it’s not pressing) before sending a follow-up or reminder.
Grades and assessments
Per University policy, faculty have final responsibility for grading criteria and grading judgment and have the right to alter student assessment or other parts of the syllabus during the term. It is not appropriate to attempt to negotiate scores or final grades. Once the term has concluded, final grades will only be changed in the case of clerical errors, without exception. If you feel your grade is unfairly assigned at the end of the course, you have the right to appeal it according to the procedure outlined here.
Accommodations
It is University policy to provide, on a flexible and individualized basis, reasonable accommodations to students who have disabilities that may affect their ability to participate in course activities or to meet course requirements. Accommodation requests should be made through the Disability Resource Center (DRC).
Conduct and Academic Integrity
You are expected to be aware of and adhere to University policy regarding academic integrity and conduct. Detailed information on these policies, and potential repercussions of policy violations, can be found via the Office of Student Rights & Responsibilities (OSRR).