About me

I am a second-year Ph.D. student in the Machine Learning Department, School of Computer Science, Carnegie Mellon University, advised by Prof. Yiming Yang. I received my Bachelor’s degree in computer science from Peking University, where I was advised by Prof. Liwei Wang and Prof. Di He. I also worked as a remote intern with Prof. Cho-Jui Hsieh’s group at UCLA in 2021, and as an intern with Dr. Srinadh Bhojanapalli at Google in 2023.

My research area is deep learning. Recently, I have been working on algorithmic reasoning with Transformers and GNNs. I am also interested in the principled understanding and efficient scaling of large language models.

My detailed CV can be found here.

Selected Publications

[1] Functional Interpolation for Relative Positions Improves Long Context Transformers (in submission) [PDF]
Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

[2] Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding (NeurIPS 2021) [PDF]
Shengjie Luo*, Shanda Li*, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

[3] Your Transformer May Not be as Powerful as You Expect (NeurIPS 2022) [PDF]
Shengjie Luo*, Shanda Li*, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He

[4] Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network? (NeurIPS 2022) [PDF]
Chuwei Wang*, Shanda Li*, Di He, Liwei Wang

[5] Learning Physics-Informed Neural Networks without Stacked Back-propagation (AISTATS 2023) [PDF]
Di He, Shanda Li, Wenlei Shi, Xiaotian Gao, Jia Zhang, Jiang Bian, Liwei Wang, Tie-Yan Liu