Machine Learning Theory

Date: July 18, 2024

Seminar at Online meeting & Business School 312, Southern University of Science and Technology, Shenzhen, China

When do machine learning algorithms work and why? How do we formalize what it means for an algorithm to learn from data? How do we use mathematical thinking to design better machine learning methods? This course focuses on developing a theoretical understanding of the statistical properties of learning algorithms.

Overview

Time: Thursdays at 7:00 PM
Location: Business School 312 with Online Meeting
Hosts: Hao Zeng, Huajun Xi
Course website: STATS214 / CS229M: Machine Learning Theory
Course videos: Stanford CS229M: Machine Learning Theory - Fall 2021 - YouTube
Reference notes: CS229M

Dial-in

Tencent Meeting: 479-9733-5495
Meeting room: Business School 314

Schedule

Jul 18, Hao Zeng, Lecture 1

Overview; supervised learning; empirical risk minimization (Section 1 of the notes); asymptotic analysis (subset of Section 2); concentration inequalities (Sec 3.1).

Jul 25, Zicheng Xie, Lecture 2

Uniform convergence; finite hypothesis class; discretizing infinite hypothesis spaces (Sections 4.1–4.3); Hoeffding inequality (Sections 3.2–3.4).

Cao, Y., Chen, Z., Belkin, M., & Gu, Q. (2022). Benign Overfitting in Two-layer Convolutional Neural Networks. NeurIPS 2022. https://openreview.net/forum?id=pF8btdPVTL_
Chaudhuri, K., & Hsu, D. (2011). Sample Complexity Bounds for Differentially Private Learning. COLT 2011. https://proceedings.mlr.press/v19/chaudhuri11a.html
Hopkins, M., Kane, D. M., Lovett, S., & Mahajan, G. (2022). Realizable Learning is All You Need. COLT 2022. https://proceedings.mlr.press/v178/hopkins22a.html

Aug 1, Huajun Xi, Lecture 3

Rademacher complexity; empirical Rademacher complexity (Sections 4.4 and 4.5); McDiarmid’s inequality (Section 3.5).

Bartlett, P. L., & Mendelson, S. (2002). Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. JMLR 3:463–482.
Mohri, M., & Rostamizadeh, A. (2008). Rademacher Complexity Bounds for Non-I.I.D. Processes. NeurIPS 21. Sachs, S., van Erven, T., Hodgkinson, L., Khanna, R., & Şimşekli, U. (2023). Generalization Guarantees via Algorithm-Dependent Rademacher Complexity. COLT 2023.
Zhu, J., Gibson, B., & Rogers, T. T. (2009). Human Rademacher Complexity. NeurIPS 22.
Li, S., & Liu, Y. (2023). Distribution-Dependent McDiarmid-Type Inequalities for Functions of Unbounded Interaction. ICML 2023.
Maurer, A., & Pontil, M. (2021). Concentration Inequalities under Sub-Gaussian and Sub-Exponential Conditions. NeurIPS 34.

Aug 8, Hao Zeng, Lecture 4

Generalization bounds for neural networks; refined generalization bounds; connections to kernel methods (Sections 5.3–5.4).

Aug 15, Zicheng Xie, Lecture 5

Covering number approach; Dudley theorem (and implications) (Section 4.6).

Aug 29, Huajun Xi, Lecture 6

Generalization bounds for deep nets (Section 5.5).

Share on

Facebook LinkedIn X (formerly Twitter)