Emily de Oliveira Santos

I work in project operations at Surge AI. My background is in pure mathematics, specifically category theory. I co-authored Humanity's Last Exam, published in Nature, and contributed problems to FrontierMath.

Press

I occasionally give interviews about my work.

FrontierMath

FrontierMath is a benchmark of exceptionally challenging mathematics problems covering most major branches of modern mathematics. Problems in this dataset require expertise at or beyond the PhD level and take mathematicians several hours, or even days, to solve.

A few months after FrontierMath's initial release, Epoch AI announced an extension of the project called FrontierMath (Tier 4). Problems in this set were created by mathematics professors and postdocs through research projects lasting several weeks, and they represent the most extreme difficulty level in the benchmark.

I was one of the major contributors of problems across all tiers of FrontierMath, including Tier 4. I am also a co-author of the paper.

Humanity's Last Exam

I contributed questions to Humanity's Last Exam, a benchmark of 2,500 expert-level questions designed to test the limits of large language models across dozens of academic subjects. In particular:

  • I received two Top 550 prizes and one Top 50 prize.
  • One of my questions is among the six example problems included in the Nature paper (as Figure 3) and among the eight shown on the project website.

Research

Projects

Contact