May 07, 2025 | My work on AgentDojo, a benchmark for AI agent safety, recently won first prize in the SafeBench competition (50 000 USD) |
May 02, 2025 | MathConstruct: Challenging LLM Reasoning with Constructive Proofs has been accepted to ICML 2025! |
Apr 15, 2025 | I am presenting two papers at ICLR 2025 in Singapore: Language Models are Advanced Anonymizers and MathConstruct: Challenging LLM Reasoning with Constructive Proofs. |
Feb 07, 2025 | We released matharena.ai, a website for evaluating LLMs on the latest math competitions. |