About me

I am a first-year Ph.D. student in the Machine Learning Department, School of Computer Science, Carnegie Mellon University, advised by Prof. Yiming Yang. I received my Bachelor's degree in computer science from Peking University, where I was fortunate to work with Prof. Di He and Prof. Liwei Wang. I also worked as a remote research intern in Prof. Cho-Jui Hsieh's group at UCLA in 2021.

My research area is machine learning. My recent work focuses on machine learning for science. I am also interested in attention-based models and the Transformer architecture.

My detailed CV can be found here.

Selected Publications

[1] Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding (NeurIPS 2021) [PDF]
Shengjie Luo*, Shanda Li*, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

[2] Your Transformer May Not be as Powerful as You Expect (NeurIPS 2022) [PDF]
Shengjie Luo*, Shanda Li*, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He

[3] Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network? (NeurIPS 2022) [PDF]
Chuwei Wang*, Shanda Li*, Di He, Liwei Wang