Mislav Balunović

Senior Research Scientist at Google DeepMind


I am a Senior Research Scientist at Google DeepMind. Previously, I created MathArena, a benchmark for evaluating LLMs on the latest math competitions, which is now used by leading labs such as Google DeepMind, xAI and Microsoft. I received my PhD from ETH Zurich, and during my PhD my work was featured in media outlets such as Ars Technica, Forbes and Wired. During high school, I won a gold medal at the IMO and silver medals at the IOI. I also spent time in industry, working on securing AI agents at a startup and doing internships at Twitter, Facebook and SigOpt.

news

Sep 01, 2025 I joined Google DeepMind as a Senior Research Scientist.
May 07, 2025 My work on AgentDojo, a benchmark for AI agent safety, won first prize in the SafeBench competition (50,000 USD).
May 02, 2025 MathConstruct: Challenging LLM Reasoning with Constructive Proofs has been accepted to ICML 2025!
Apr 15, 2025 I am presenting two papers at ICLR 2025 in Singapore: Language Models are Advanced Anonymizers and MathConstruct: Challenging LLM Reasoning with Constructive Proofs.
Feb 07, 2025 We released matharena.ai, a website for evaluating LLMs on the latest math competitions.

selected publications

  1. arXiv
    MathArena: Evaluating LLMs on Uncontaminated Math Competitions
    Mislav Balunović, Jasper Dekoninck, Ivo Petrov, Nikola Jovanović, and Martin Vechev
    arXiv, 2025
  2. arXiv
    Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
    Ivo Petrov, Jasper Dekoninck, Lyuben Baltadzhiev, Maria Drencheva, Kristian Minchev, Mislav Balunović, Nikola Jovanović, and Martin Vechev
    arXiv, 2025
  3. MathConstruct: Challenging LLM Reasoning with Constructive Proofs
    Mislav Balunović, Jasper Dekoninck, Nikola Jovanović, Ivo Petrov, and Martin T. Vechev
    In ICML, 2025