Deep Learning and Optimization Seminar
Organizing Committee:
- Dr. Tao LIN, Assistant Professor @ Westlake University
- Dr. Chen LIU, Assistant Professor @ City University of Hong Kong
- Dr. Kun YUAN, Assistant Professor @ Peking University
Contact & Discussion: You are welcome to join our Google group and our Slack.
Coming Up
Date (yyyy-mm-dd hh:mm) | Presenter | Topic or Paper | Materials |
---|---|---|---|
2024-11-15 10:00-11:00 (BJT) | Chongxuan Li, Renmin University of China | Advances in Diffusion Models | abstract. |
Past Events
Date | Presenter | Topic or Paper | Materials |
---|---|---|---|
2024-07-12 19:00-20:00 | Yichen WU, CityU HK | Meta-Continual Learning Revisited: A Second-Order Optimization Perspective | abstract, paper. |
2024-06-11 09:30-10:30 | Jiawei ZHAO, Caltech | GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection | abstract, paper, video. |
2024-05-29 19:00-20:00 | Sidak Pal Singh, ETH | Hallmarks of Optimization Trajectories in Neural Networks and LLMs: The Lengths, Bends, and Dead Ends | abstract, paper, video. |
2024-01-04 16:00-17:00 | Frederik Kunstner, UBC | Why does Adam work so well (especially on Transformers): it's not because of stochasticity | abstract, paper, video. |
2023-12-07 10:00-11:00 | Xinran GU, Tsinghua University | Understanding and Improving Local Gradient Methods for Distributed Deep Learning | abstract, video. |
2023-11-22 15:00-16:00 | Jiayuan YE, NUS | Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks | abstract, paper, video. |
2023-11-14 19:00-20:00 | Maksym ANDRIUSHCHENKO, EPFL | Why Do We Need Weight Decay in Modern Deep Learning | abstract, paper, video. |
2023-11-10 13:00-14:00 | Yao FU, University of Edinburgh | On Compression Theory for LLMs | abstract, video. |
2023-11-09 10:00-11:00 | Hao LIU, UC Berkeley | Large context window with blockwise parallel transformers and ring attention | abstract, paper1, paper2, video. |
2023-11-01 15:00-16:00 | Tianyu PANG, Sea AI Lab | On Evaluating Adversarial Robustness of Large Vision-Language Models | abstract, slides, paper1, paper2, paper3, video. |
2023-10-24 11:00-12:00 | Xiaoqiang LIN, NUS and Zhaoxuan WU, NUS | Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers | abstract, paper, website, video. |
2023-10-18 13:00-14:00 | Dacheng LI, UC Berkeley | Sequence level partition for long-context LLMs training and automatic parallelism | abstract, paper1, paper2, video. |
2023-07-20 10:00-11:00 | Libin ZHU, UCSD | Spikes in the training loss of SGD, catapults and feature learning | abstract, slides, video. |
2023-07-17 16:00-17:00 | Fanghui LIU, EPFL | The role of over-parameterization in machine learning - a function space perspective | abstract, slides, video. |
2023-07-11 19:00-20:00 | Tongtian ZHU, ZJU | Decentralize to Generalize? On the Asymptotic Equivalence of Decentralized SGD and Average-direction SAM | abstract, slides, video, paper. |
2023-06-28 16:00-17:00 | Francesco CROCE, University of Tübingen | How to Quickly Obtain Models Robust to Multiple and Different Threats, and Their Advantages | abstract, video. |
2023-06-19 15:00-16:00 | Binhang YUAN, HKUST | Accommodating LLM Training over Decentralized Computational Resources | abstract, slides, video. |
2023-06-07 19:00-20:00 | Bohang ZHANG, Peking University | Rethinking the expressive power of GNNs via graph biconnectivity | abstract, slides, video, paper. |
2023-05-31 15:30-16:30 | Jiaxin SHI, Google DeepMind | Sequence Modeling with Multiresolution Convolutional Memory | abstract, paper. |
2023-04-28 09:00-10:00 | Ziming LIU, MIT | Physics of deep learning: Understanding grokking via the lens of physics | abstract, slides. |
2023-04-24 19:30-20:30 | Dingfan CHEN, CISPA | Privacy-preserving Generative Modeling | abstract, slides. |
2023-04-21 16:00-17:00 | Jialin LIU, DAMO Academy (US) | Towards Constituting Mathematical Structures for Learning to Optimize | abstract, slides, paper. |
2023-04-12 17:00-18:00 | Maksym ANDRIUSHCHENKO, EPFL | SGD with large step sizes learns sparse features | abstract, video, paper. |
2022-12-01 16:00-17:30 | Ligeng ZHU, MIT | Algorithm-System Co-Design for TinyML | abstract, paper, slides, website, demo, code. |