Maxproof

(arxiv.org)

64 points | by ilreb 2 hours ago

3 comments

  • daquisu 52 minutes ago
    "I thought it was interesting and a bit underappreciated that the fraction of gold medalists at the 2025 IMO (72/630 = 11.4%) is the highest it’s been since 1981.

    Crudely, IMO gold medals are awarded to the highest-scoring 1/12 of contestants. However, because scores are integers up to 42 and there’s no provision for tiebreaking, it’s possible for a lot of contestants to be tied around the threshold. In that case, either all of them get a gold medal or none do, and the fraction of gold medalists might deviate substantially from 1/12. That’s what happened this year: 46 contestants all won a gold medal by scoring exactly 35 points.

    In fact, bizarrely, 35 is the mode of the scores this year; the last time the modal score was a gold medal score was in 1994. And, of course, 35 is the same score claimed by AI systems from Google, OpenAI, and others."

    From https://blog.vero.site/post/imo-2025
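
    To make the tie arithmetic in the quote concrete, here is a rough sketch using only the numbers quoted above (630 contestants, 72 golds, 46 tied at 35). The "pick whichever count is closest to 1/12" rule is an assumption for illustration, not the official IMO procedure, and the names `total`, `tied_at_35`, and `closest` are made up here.

```python
from fractions import Fraction

# Figures taken from the quoted post.
total = 630        # contestants
gold = 72          # gold medalists
tied_at_35 = 46    # contestants who scored exactly 35

# Everyone tied at 35 got gold, so the rest of the golds scored above 35.
above_35 = gold - tied_at_35          # 26

# With no tiebreak, the cutoff can only fall on one side of the tie:
# either the tied group is excluded or it is included.
options = (above_35, above_35 + tied_at_35)   # (26, 72)

# Assumption: the chosen count is the one closest to total / 12.
target = Fraction(total, 12)          # 52.5
closest = min(options, key=lambda n: abs(n - target))

for n in options:
    print(f"{n} golds = {100 * n / total:.1f}% of contestants")
print("closest to 1/12:", closest)
```

    Under that assumption the only choices are 26 golds (about 4.1%) or 72 golds (about 11.4%), and 72 is the nearer of the two to the nominal 52.5.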

    • quibono 44 minutes ago
      I was under the impression that IMO is conducted in an official "exam" capacity, on site and in a very formal setting. So I find it hard to believe _direct_ LLM usage would be a factor. Then again - it very well could be a factor in the training and preparation? I imagine "Write me a prep document for the IMO" will surface all kinds of interesting things from the training set.
  • thierrydamiba 9 minutes ago
    Is the harness more valuable than the weights?
  • pfannl 46 minutes ago
    The real AGI test is apparently not solving the IMO, but getting caught in the same scoring traffic jam as 46 teenagers.