2026 05 Talk

M. gave a talk at the Department of Industrial Engineering and Management Sciences (IEMS) at Northwestern University, titled “When Classical Optimization Meets Modern Foundation Models: New Algorithms, Theory, and Insights”. The slides can be found here. The talk covered several of our new results, among them an interpretation of SGD and a way to make Nesterov’s lookahead momentum work as a “harness” that accelerates pretraining algorithms (see here).