About me

I am a second-year Ph.D. student in the Machine Learning Department, School of Computer Science, Carnegie Mellon University, advised by Prof. Yiming Yang. I received my Bachelor’s degree in computer science from Peking University, where I was advised by Prof. Liwei Wang and Prof. Di He. I also worked as a remote intern with Prof. Cho-Jui Hsieh’s group at UCLA in 2021, and as an intern with Dr. Srinadh Bhojanapalli at Google in 2023.

My research area is deep learning. Recently, I have been working on algorithmic reasoning with Transformers and GNNs. I am also interested in the principled understanding and efficient scaling of large language models.

My detailed CV can be found here.

Selected Publications

[1] Functional Interpolation for Relative Positions Improves Long Context Transformers (in submission) [PDF]
Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

[2] Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding (NeurIPS 2021) [PDF]
Shengjie Luo*, Shanda Li*, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

[3] Your Transformer May Not be as Powerful as You Expect (NeurIPS 2022) [PDF]
Shengjie Luo*, Shanda Li*, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He

[4] Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network? (NeurIPS 2022) [PDF]
Chuwei Wang*, Shanda Li*, Di He, Liwei Wang

[5] Learning Physics-Informed Neural Networks without Stacked Back-propagation (AISTATS 2023) [PDF]
Di He, Shanda Li, Wenlei Shi, Xiaotian Gao, Jia Zhang, Jiang Bian, Liwei Wang, Tie-Yan Liu