
🛬 Attending
🇰🇷 ICML 2026, Seoul, KR
Thanks to support grants from:
<aside> <img src="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-806f-bf66-007a2758f01d" alt="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-806f-bf66-007a2758f01d" width="40px" /> Google Scholar
</aside>
<aside> <img src="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8076-afbe-007af223fbba" alt="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8076-afbe-007af223fbba" width="40px" /> Research Gate
</aside>
<aside> <img src="attachment:5a1e77b6-a69a-4a9c-8282-b0e70b4de887:Less_Wrong_LOGO.png" alt="attachment:5a1e77b6-a69a-4a9c-8282-b0e70b4de887:Less_Wrong_LOGO.png" width="40px" /> LessWrong
</aside>
<aside> <img src="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8012-8d55-007a1a6ff476" alt="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8012-8d55-007a1a6ff476" width="40px" />
</aside>
<aside> <img src="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8059-bfa4-007a8ad3da3f" alt="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8059-bfa4-007a8ad3da3f" width="40px" /> BlueSky
</aside>
<aside> <img src="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8059-811e-007a23ecdf0a" alt="notion://custom_emoji/0d39d0ab-438c-4f29-be70-03aa9d912057/2c928fcd-40c2-8059-811e-007a23ecdf0a" width="40px" /> X/Twitter
</aside>
The rise of reasoning agents with greater autonomy is a double-edged sword that enables unique failure modes: agentic reasoning can be logically or factually invalid (reasoning not done right), and even when it is valid, agents can work against you in their decision-making and interactions with each other (reasoning done right, but for the wrong purpose).
My research focuses on making AI smarter and making smarter AI safer, especially in multi-agent settings (which I believe is an inevitable trend for future deployment at scale):
Line A: Reasoning Done Right (How to Make AI Smarter?): My vertical focus is on the science of evaluation to better inform post-training, often with the help of actionable interpretability methods. Ultimately, this line of work contributes to AI4Science that accelerates scientific discovery; I’m particularly intrigued by AI applications in fields where physics and chemistry collide, such as (exo-)planetary and environmental science.
I try to follow several principles in developing evaluation benchmarks:
Line B: Reasoning For Good (How to Make Smarter Agents Safer?): Reasoning and agency also enable novel threat models such as deception, scheming, and collusion. My vertical focus is on identifying the key triggers/minimum conditions that suppress/encourage such agentic misalignment. Recently, I’ve been trying to probe
Eventually, I believe the best way to ensure scalable agentic safety is consequence-aware reasoning (CoI, Chain-of-Implication), similar to how legal deterrence works for humans.