Machine Learning Theory
Date:
Seminar at Online meeting & Business School 312, Southern University of Science and Technology, Shenzhen, China
When do machine learning algorithms work and why? How do we formalize what it means for an algorithm to learn from data? How do we use mathematical thinking to design better machine learning methods? This course focuses on developing a theoretical understanding of the statistical properties of learning algorithms.
Overview
- Time: Thursdays at 7:00 PM
- Location: Business School 312 with Online Meeting
- Hosts: Hao Zeng, Huajun Xi
- Course website: STATS214 / CS229M: Machine Learning Theory
- Course videos: Stanford CS229M: Machine Learning Theory - Fall 2021 - YouTube
- Reference notes: CS229M
Dial-in
- Tencent Meeting: 479-9733-5495
- Meeting room: Business School 314
Schedule
Jul 18, Hao Zeng, Lecture 1
Overview; supervised learning; empirical risk minimization (Section 1 of the notes); asymptotic analysis (subset of Section 2); concentration inequalities (Sec 3.1).
Jul 25, Zicheng Xie, Lecture 2
Uniform convergence; finite hypothesis class; discretizing infinite hypothesis spaces (Sections 4.1–4.3); Hoeffding inequality (Sections 3.2–3.4).
- Cao, Y., Chen, Z., Belkin, M., & Gu, Q. (2022). Benign Overfitting in Two-layer Convolutional Neural Networks. NeurIPS 2022. https://openreview.net/forum?id=pF8btdPVTL_
- Chaudhuri, K., & Hsu, D. (2011). Sample Complexity Bounds for Differentially Private Learning. COLT 2011. https://proceedings.mlr.press/v19/chaudhuri11a.html
- Hopkins, M., Kane, D. M., Lovett, S., & Mahajan, G. (2022). Realizable Learning is All You Need. COLT 2022. https://proceedings.mlr.press/v178/hopkins22a.html
Aug 1, Huajun Xi, Lecture 3
Rademacher complexity; empirical Rademacher complexity (Sections 4.4 and 4.5); McDiarmid’s inequality (Section 3.5).
- Bartlett, P. L., & Mendelson, S. (2002). Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. JMLR 3:463–482.
- Mohri, M., & Rostamizadeh, A. (2008). Rademacher Complexity Bounds for Non-I.I.D. Processes. NeurIPS 21. Sachs, S., van Erven, T., Hodgkinson, L., Khanna, R., & Şimşekli, U. (2023). Generalization Guarantees via Algorithm-Dependent Rademacher Complexity. COLT 2023.
- Zhu, J., Gibson, B., & Rogers, T. T. (2009). Human Rademacher Complexity. NeurIPS 22.
- Li, S., & Liu, Y. (2023). Distribution-Dependent McDiarmid-Type Inequalities for Functions of Unbounded Interaction. ICML 2023.
- Maurer, A., & Pontil, M. (2021). Concentration Inequalities under Sub-Gaussian and Sub-Exponential Conditions. NeurIPS 34.
Aug 8, Hao Zeng, Lecture 4
Generalization bounds for neural networks; refined generalization bounds; connections to kernel methods (Sections 5.3–5.4).
Aug 15, Zicheng Xie, Lecture 5
Covering number approach; Dudley theorem (and implications) (Section 4.6).
Aug 29, Huajun Xi, Lecture 6
Generalization bounds for deep nets (Section 5.5).
