About me

I am a third-year Ph.D. student in the Machine Learning Department, School of Computer Science, Carnegie Mellon University, co-advised by Prof. Yiming Yang and Prof. Ameet Talwalkar. I received my Bachelor’s degree in computer science (summa cum laude) from the Turing Class, CFCS, Peking University, with a minor in mathematics. I was a student researcher at Google Research in 2023, where I enjoyed working with Srinadh Bhojanapalli.

My research goal is to develop machine intelligence methods that better augment human intelligence. Towards this goal, I work on deep learning methods that address fundamental challenges in mathematical discovery and code generation. In particular, I’m interested in inference scaling and the long-context capabilities of language models.

My detailed CV can be found here.

Selected Publications (one per year)

2024: Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models (in submission) [PDF]
Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang

2023: Functional Interpolation for Relative Positions Improves Long Context Transformers (ICLR 2024) [PDF]
Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

2022: Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network? (NeurIPS 2022) [PDF]
Chuwei Wang*, Shanda Li*, Di He, Liwei Wang

Blogs

July 29, 2024: CMU-MATH Team’s Innovative Approach Secures 2nd Place at the AIMO Prize (Published on CMU ML Blog)