Biography

Education

Peking University (PKU) Sep. 2021 – Jul. 2024 (Expected)
Center for Data Science Beijing, China

University of Science and Technology of China (USTC) Sep. 2017 – Jul. 2021
School of the Gifted Young (SGY) Hefei, China

  • B.S. in Statistics
    Overall GPA: 3.99/4.3 (91.95) | Rank: 2/75 in Statistics

  • B.E. in Computer Science (Dual)
    Overall GPA: 3.90/4.3 (91.24)

University of Washington (UW) Jul. 2018 – Aug. 2018
Department of Electrical Engineering Seattle, WA

  • Summer School of Global Electrical Engineering Program

Experience

University of California, Los Angeles (UCLA) Mar. 2023 – Dec. 2023
Research Intern, advised by Prof. Lin F. Yang Remote

  • Worked on reinforcement learning with heavy-tailed rewards.

    • We proposed two computationally efficient algorithms for heavy-tailed linear bandits and linear MDPs, based on a novel concentration inequality for adaptive Huber regression.

    • These algorithms achieve both minimax optimal and instance-dependent regret bounds.

    • We provided a lower bound to demonstrate their optimality.

    • We also conducted numerical experiments to corroborate the computational efficiency.

  • Worked on reinforcement learning with general function approximation.

    • We proposed an algorithm for model-based reinforcement learning with general function approximation, which features a novel combination of weighted value-targeted regression and a high-order moment estimator.

    • Our proposed algorithm achieves a regret bound that is both horizon-free and instance-dependent.

    • It is both statistically and computationally efficient.

    • We also conducted numerical experiments to validate the theoretical findings.

Peking University
Teaching Assistant Beijing, China

  • Machine Learning (Turing Class) Spring 2022

University of California, Los Angeles Apr. 2021 – Sep. 2021
Research Intern, advised by Prof. Lin F. Yang Remote

  • Worked on linear bandits with super heavy-tailed rewards.

    • We proposed a generic algorithmic framework for super heavy-tailed linear bandits, which adopts a novel mean-of-medians estimator to handle super heavy-tailed rewards.

    • We showed that our algorithmic framework is provably efficient for regret minimization.

    • We also conducted numerical experiments to validate the effectiveness of our framework.