news

May 07, 2025 My work on AgentDojo, a benchmark for AI agent safety, won first prize in the SafeBench competition (50,000 USD).
May 02, 2025 MathConstruct: Challenging LLM Reasoning with Constructive Proofs has been accepted to ICML 2025!
Apr 15, 2025 I am presenting two papers at ICLR 2025 in Singapore: Language Models are Advanced Anonymizers and MathConstruct: Challenging LLM Reasoning with Constructive Proofs.
Feb 07, 2025 We released matharena.ai, a website for evaluating LLMs on the latest math competitions.