About me

I am a second-year Ph.D. student in the Machine Learning Department, School of Computer Science, Carnegie Mellon University, advised by Prof. Yiming Yang. I received my Bachelor’s degree in computer science, with a minor in mathematics, from Peking University, where I was advised by Prof. Liwei Wang and Prof. Di He. I was a student researcher at Google Research in 2023, where I enjoyed working with Dr. Srinadh Bhojanapalli.

My research goal is to develop machine intelligence methods that better augment human intelligence. Towards this goal, I study the behavior and limitations of existing machine learning methods through both theoretical and empirical lenses, to guide the development of new methods and principled model scaling. On the application side, I work on deep learning methods for mathematical reasoning, code generation, and partial differential equation solving.

My detailed CV can be found here.

Selected Publications (one per year)

2024: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models (in submission) [PDF]
Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang

2023: Functional Interpolation for Relative Positions Improves Long Context Transformers (ICLR 2024) [PDF]
Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

2022: Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network? (NeurIPS 2022) [PDF]
Chuwei Wang*, Shanda Li*, Di He, Liwei Wang

2021: Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding (NeurIPS 2021) [PDF]
Shengjie Luo*, Shanda Li*, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

Blogs

July 29, 2024: CMU-MATH Team’s Innovative Approach Secures 2nd Place at the AIMO Prize (Published on CMU ML Blog)