On the afternoon of July 18, Assistant Professor Huang Tianjin from the University of Exeter, UK, delivered an academic lecture to faculty and students at our college on improving the stability of large language model (LLM) training. The session was chaired by Dean Mao Qirong and attended by faculty members and graduate students working in related research fields.
Titled “Optimizers for Robust Large Language Model Training,” the talk provided a systematic analysis of the gradient spike phenomenon that commonly arises during LLM training and explained its impact on both the training process and the final model’s performance. Professor Huang then introduced SPAM (Spike-Aware Adam with Momentum Reset), a method developed by his team that detects abnormal gradient spikes and clips them through a spike-aware mechanism, thereby mitigating the training oscillations caused by imbalanced momentum accumulation and markedly improving training stability. He further presented Stable-SPAM, an extended method that incorporates adaptive thresholding and dynamic scaling strategies, offering a more robust solution for low-precision LLM training.
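To illustrate the spike-aware clipping idea described in the talk, the following is a minimal sketch, not the authors’ implementation: the function name, the threshold factor `theta`, and the use of a per-coordinate second-moment estimate as the reference scale are assumptions made here for clarity.

```python
import torch

def spike_aware_clip(grad: torch.Tensor,
                     second_moment: torch.Tensor,
                     theta: float = 50.0,
                     eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical sketch of spike-aware gradient clipping.

    A coordinate is treated as a spike when its magnitude is far outside the
    range suggested by its own running second-moment estimate; such entries
    are rescaled down to the threshold while keeping their sign.
    """
    threshold = torch.sqrt(theta * second_moment + eps)   # per-coordinate spike threshold (assumed form)
    spikes = grad.abs() > threshold                        # boolean mask of spiked coordinates
    return torch.where(spikes, torch.sign(grad) * threshold, grad)


# Toy usage: a few ordinary gradient entries plus one artificial spike.
if __name__ == "__main__":
    v = torch.full((4,), 0.01)                  # running second-moment estimate
    g = torch.tensor([0.1, -0.05, 8.0, 0.02])   # third entry is an abnormal spike
    print(spike_aware_clip(g, v))               # the spike is pulled back to the threshold
```

In this sketch, clipping against a per-coordinate history rather than a single global norm is what makes the rule “spike-aware”: an entry is only suppressed when it is anomalous relative to its own past behavior.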

Following the presentation, Professor Huang engaged in an in-depth discussion with the audience, thoughtfully answering questions on optimizer design, training dynamics analysis, and potential extensions of the proposed methods. His well-structured and insightful presentation reflected a rigorous and innovative research approach, offering valuable inspiration for faculty and students exploring optimization and stability in large-scale model training.