Graduate Seminar in Statistics

Updated

February 5, 2025

Announcements

Submit your reading synopses [here] one day in advance of class; we’ll use the same form for the entire quarter.

Graduate seminar provides a venue for MS students to explore research trends in statistics through reading and discussion of recent papers; readings are selected based on relevance, influence, and student and faculty research interests. In Winter 2025, selected readings explore methodology and statistical thinking in the era of large and complex datasets and prediction algorithms.

Read the [course syllabus].

Instructor: Trevor Ruiz (he/him/his) [email]

Class meetings: 10:10am–11:00am R 180-331

Office hours: by appointment

Copies of readings can be found in the class [drive folder]. Please note that readings and topics are subject to change throughout the quarter.

Week 1 (1/6/25)

Introductions & logistics; no reading.

Week 2 (1/13/25)

Interpretable ML. Discussion leaders: Brandon Kim, Kyle Bistrain.

  • Allen, G. I., Gan, L., & Zheng, L. (2023). Interpretable machine learning for discovery: Statistical challenges and opportunities. Annual Review of Statistics and Its Application.

  • Koh, P. W., & Liang, P. (2017). Understanding black-box predictions via influence functions. Proceedings of the 34th International Conference on Machine Learning.

Suggested further reading:

  • Alfeo, A. L., Zippo, A. G., Catrambone, V., Cimino, M. G., Toschi, N., & Valenza, G. (2023). From local counterfactuals to global feature importance: efficient, robust, and model-agnostic explanations for brain connectivity networks. Computer Methods and Programs in Biomedicine.

  • Wang, H., Fu, T., et al. (2023). Scientific discovery in the age of artificial intelligence. Nature.

Week 3 (1/21/25)

No meeting.

Week 4 (1/28/25)

Clustering methods for high-dimensional data. Discussion leaders: Andrew Kerr, Brendan Callender.

  • Bouveyron, C., Girard, S., & Schmid, C. (2007). High-dimensional data clustering. Computational Statistics & Data Analysis.

  • Soltanolkotabi, M., Elhamifar, E., & Candès, E. J. (2014). Robust subspace clustering. The Annals of Statistics.

Week 5 (2/3/25)

Statistics and society. Discussion leaders: Lily Cook, Dylan Le.

  • Mitchell, S., Potash, E., Barocas, S., D’Amour, A., & Lum, K. (2021). Algorithmic fairness: Choices, assumptions, and definitions. Annual Review of Statistics and Its Application.

  • Schneider, C. R., Kerr, J. R., Dryhurst, S., & Aston, J. A. (2023). Communication of statistics and evidence in times of crisis. Annual Review of Statistics and Its Application.

Week 6 (2/10/25)

Estimation from nonrandom samples with big data; more clustering. Discussion leaders: Zach Felix, Jacob Perez.

  • Meng, X.-L. (2018). Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics.

  • Witten, D. M., & Tibshirani, R. (2010). A framework for feature selection in clustering. Journal of the American Statistical Association.

Week 7 (2/17/25)

Conformal prediction. Discussion leaders: Rachel Roggenkemper, Liam Quach.

  • Lei, J., Robins, J., & Wasserman, L. (2013). Distribution-free prediction sets. Journal of the American Statistical Association.

  • Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., & Wasserman, L. (2018). Distribution-free predictive inference for regression. Journal of the American Statistical Association.

Week 8 (2/24/25)

Causal inference. Discussion leaders: Kendall Hipes, Lana Huynh.

  • D’Agostino McGowan, L., Gerke, T., & Barrett, M. (2024). Causal inference is not just a statistics problem. Journal of Statistics and Data Science Education.

  • Imai, K., & Jiang, Z. (2023). Principal fairness for human and algorithmic decision-making. Statistical Science.

Suggested further reading:

  • Ding, P., & Li, F. (2018). Causal inference. Statistical Science.

  • Imbens, G. W. (2024). Causal inference in the social sciences. Annual Review of Statistics and Its Application.

Week 9 (3/3/25)

Variable selection with knockoffs. Discussion leaders: Daniel Erro, Nathan Greenfield.

  • Barber, R. F., & Candès, E. J. (2015). Controlling the false discovery rate via knockoffs. The Annals of Statistics.

  • Barber, R. F., Candès, E. J., & Samworth, R. J. (2020). Robust inference with knockoffs. The Annals of Statistics.

Week 10 (3/10/25)

TBD