Deep Learning and Optimization Seminar

Organizing Committee:

Contact & Discussion: You are welcome to join our Google group and our Slack.

Coming Up

Date (yyyy-mm-dd hh:mm)	Presenter	Topic or Paper	Materials
2024-11-15 10:00-11:00 (BJT)	Chongxuan Li, Renmin University of China	Advances in Diffusion Models	abstract.

Past Events

Date	Presenter	Topic or Paper	Materials
2024-07-12 19:00-20:00	Yichen WU, CityU HK	Meta-Continual Learning Revisited: A Second-Order Optimization Perspective	abstract, paper.
2024-06-11 09:30-10:30	Jiawei ZHAO, Caltech	GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection	abstract, paper, video.
2024-05-29 19:00-20:00	Sidak Pal Singh, ETH	Hallmarks of Optimization Trajectories in Neural Networks and LLMs: The Lengths, Bends, and Dead Ends	abstract, paper, video.
2024-01-04 16:00-17:00	Frederik Kunstner, UBC	Why does Adam work so well (especially on Transformers): it's not because of stochasticity	abstract, paper, video.
2023-12-07 10:00-11:00	Xinran GU, Tsinghua University	Understanding and Improving Local Gradient Methods for Distributed Deep Learning	abstract, video.
2023-11-22 15:00-16:00	Jiayuan YE, NUS	Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks	abstract, paper, video.
2023-11-14 19:00-20:00	Maksym ANDRIUSHCHENKO, EPFL	Why Do We Need Weight Decay in Modern Deep Learning?	abstract, paper, video.
2023-11-10 13:00-14:00	Yao FU, University of Edinburgh	On Compression Theory for LLMs	abstract, video.
2023-11-09 10:00-11:00	Hao LIU, UC Berkeley	Large context window with blockwise parallel transformers and ring attention	abstract, paper1, paper2, video.
2023-11-01 15:00-16:00	Tianyu PANG, Sea AI Lab	On Evaluating Adversarial Robustness of Large Vision-Language Models	abstract, slides, paper1, paper2, paper3, video.
2023-10-24 11:00-12:00	Xiaoqiang LIN, NUS and Zhaoxuan WU, NUS	Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers	abstract, paper, website, video.
2023-10-18 13:00-14:00	Dacheng LI, UC Berkeley	Sequence level partition for long-context LLMs training and automatic parallelism	abstract, paper1, paper2, video.
2023-07-20 10:00-11:00	Libin ZHU, UCSD	Spikes in the training loss of SGD, catapults and feature learning	abstract, slides, video.
2023-07-17 16:00-17:00	Fanghui LIU, EPFL	The role of over-parameterization in machine learning - a function space perspective	abstract, slides, video.
2023-07-11 19:00-20:00	Tongtian ZHU, ZJU	Decentralize to Generalize? On the Asymptotic Equivalence of Decentralized SGD and Average-direction SAM	abstract, slides, video, paper.
2023-06-28 16:00-17:00	Francesco CROCE, University of Tübingen	How to Quickly Obtain Models Robust to Multiple and Different Threats, and Their Advantages	abstract, video.
2023-06-19 15:00-16:00	Binhang YUAN, HKUST	Accommodating LLM Training over Decentralized Computational Resources	abstract, slides, video.
2023-06-07 19:00-20:00	Bohang ZHANG, Peking University	Rethinking the expressive power of GNNs via graph biconnectivity	abstract, slides, video, paper.
2023-05-31 15:30-16:30	Jiaxin SHI, Google DeepMind	Sequence Modeling with Multiresolution Convolutional Memory	abstract, paper.
2023-04-28 09:00-10:00	Ziming LIU, MIT	Physics of deep learning: Understanding grokking via the lens of physics	abstract, slides.
2023-04-24 19:30-20:30	Dingfan CHEN, CISPA	Privacy-preserving Generative Modeling	abstract, slides.
2023-04-21 16:00-17:00	Jialin LIU, DAMO Academy (US)	Towards Constituting Mathematical Structures for Learning to Optimize	abstract, slides, paper.
2023-04-12 17:00-18:00	Maksym ANDRIUSHCHENKO, EPFL	SGD with large step sizes learns sparse features	abstract, video, paper.
2022-12-01 16:00-17:30	Ligeng ZHU, MIT	Algorithm-System Co-Design for TinyML	abstract, paper, slides, website, demo, code.