Maxproof

(arxiv.org)

64 points | by ilreb 2 hours ago

3 comments

daquisu 52 minutes ago
"I thought it was interesting and a bit underappreciated that the fraction of gold medalists at the 2025 IMO (72/630 = 11.4%) is the highest it’s been since 1981.
Crudely, IMO gold medals are awarded to the highest-scoring 1/12 of contestants. However, because scores are integers up to 42 and there’s no provision for tiebreaking, it’s possible for a lot of contestants to be tied around the threshold. In that case, either all of them get a gold medal or none do, and the fraction of gold medalists might deviate substantially from 1/12. That’s what happened this year: 46 contestants all won a gold medal by scoring exactly 35 points.
In fact, bizarrely, 35 is the mode of the scores this year; the last time the modal score was a gold medal score was in 1994. And, of course, 35 is the same score claimed by AI systems from Google, OpenAI, and others."
From https://blog.vero.site/post/imo-2025
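For anyone who wants to check the arithmetic in the quote, here is a rough sketch in Python. The three input numbers come from the quoted post; the "include or exclude the whole tied group" comparison is only an illustration of the all-or-nothing effect described there, not the official IMO procedure for setting medal cutoffs, and the variable names are my own.

```python
from fractions import Fraction

# Figures taken from the quoted post.
contestants = 630
gold_medalists = 72
tied_at_cutoff = 46  # contestants who scored exactly 35

# Nominal target: the top 1/12 of contestants.
target = Fraction(1, 12) * contestants

# Everyone tied at 35 either gets gold or does not, so there are
# only two possible gold counts around this threshold.
above_cutoff = gold_medalists - tied_at_cutoff
include = Fraction(gold_medalists, contestants)
exclude = Fraction(above_cutoff, contestants)

print(f"target gold count: {float(target):.1f}")
print(f"cutoff at 35: {gold_medalists} golds ({float(include):.1%})")
print(f"cutoff at 36: {above_cutoff} golds ({float(exclude):.1%})")
```

With these inputs the nominal target is 52.5 golds, and the only choices are 72 (cutoff at 35) or 26 (cutoff at 36). Neither is close to 1/12, which is why the fraction ends up at 11.4%.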
- quibono 44 minutes ago
  I was under the impression that the IMO is conducted in an official "exam" capacity, on site and in a very formal setting. So I find it hard to believe _direct_ LLM usage would be a factor. Then again, it very well could be a factor in the training and preparation? I imagine "Write me a prep document for the IMO" will surface all kinds of interesting things from the training set.
thierrydamiba 9 minutes ago
Is the harness more valuable than the weights?
pfannl 46 minutes ago
The real AGI test is apparently not solving the IMO, but getting caught in the same scoring traffic jam as 46 teenagers.