Biography

Education

University of Illinois Urbana-Champaign (UIUC) Aug. 2024 – Present
Siebel School of Computing and Data Science Urbana, IL

Peking University (PKU) Sep. 2021 – Jul. 2024
Center for Data Science Beijing, China

M.S. in Data Science (Statistics)
Advisors: Prof. Liwei Wang and Prof. Mohan Chen

University of Science and Technology of China (USTC) Sep. 2017 – Jul. 2021
School of the Gifted Young (SGY) Hefei, China

B.S. in Statistics
Overall GPA: 3.99/4.3 (91.95) | Rank: 2/75 in Statistics
B.E. in Computer Science (Dual)
Overall GPA: 3.90/4.3 (91.24)

University of Washington (UW) Jul. 2018 – Aug. 2018
Department of Electrical Engineering Seattle, WA

University of California, Los Angeles (UCLA) Mar. 2023 – Dec. 2023
Research Intern, advised by Prof. Lin F. Yang Remote

Worked on reinforcement learning with heavy-tailed rewards.
- We proposed two computationally efficient algorithms for heavy-tailed linear bandits and linear MDPs, based on a novel concentration inequality for adaptive Huber regression.
- These algorithms achieve regret bounds that are both minimax-optimal and instance-dependent.
- We provided a lower bound to demonstrate the optimality of these algorithms.
- We also conducted numerical experiments to corroborate the computational efficiency.
Worked on reinforcement learning with general function approximation.
- We proposed an algorithm for model-based reinforcement learning with general function approximation, featuring a novel combination of weighted value-targeted regression and a high-order moment estimator.
- Our proposed algorithm achieves a regret bound that is both horizon-free and instance-dependent.
- It is both statistically and computationally efficient.
- We also conducted numerical experiments to validate the theoretical findings.

Peking University
Teaching Assistant Beijing, China

University of California, Los Angeles Apr. 2021 – Sep. 2021
Research Intern, advised by Prof. Lin F. Yang Remote

Worked on linear bandits with super heavy-tailed rewards.
- We proposed a generic algorithmic framework for super heavy-tailed linear bandits, which adopts a novel mean-of-medians estimator to handle the heavy-tailed rewards.
- We showed that our algorithmic framework is provably efficient for regret minimization.
- We also conducted numerical experiments to validate the effectiveness of our framework.