Hi! I’m Zhi.

I am a PhD student at UC San Diego advised by Tianhao Wang.

I work on reinforcement learning and on how an agent improves from its own experience. This is a loop: the agent explores and searches for good behavior in human-crafted or self-generated environments, the results are judged and credit is assigned, and what survives is learned, sometimes in context, ultimately in the weights.

I study each stage of this loop and how it fails. The stages map cleanly onto topics I care about: synthetic environment generation, learning with search, exploration in RL, credit assignment over long horizons, and plasticity in online, non-stationary RL, with continual learning as the end state the whole loop is trying to reach.

Publications

  • Eon Lee, Andrés R. Vindas-Meléndez, and Zhi Wang. Generalized Snake Posets, Order Polytopes, and Lattice-Point Enumeration. Discrete Mathematics (2026): 115072. We derived the Ehrhart and h-star polynomials of the generalized snake posets, with closed-form formulae for the two extreme cases. We also obtained a monotonicity result using a novel technique that bridges combinatorics and lattice polytopes, extending several previous results.

People I’ve worked with

Tianhao Wang, Difan Zou, Andrés R. Vindas-Meléndez, Eon Lee, Quan Wen, Misha Belkin.

Huge thanks to all my mentors and collaborators! Here is my CV. (last updated: Apr 2026)