Brier score and log loss: scoring World Cup 2026 predictions

The Brier score and log loss are the two standard ways to grade probabilistic predictions after the outcomes are known. The Brier score for a yes/no prediction is (forecast − outcome)², where the outcome is 1 if the event happened and 0 if it didn’t — lower is better, 0 is perfect, and an always-50% forecaster scores 0.25. Log loss is −ln(probability assigned to what actually happened) — it also rewards calibrated confidence, but punishes confident misses far more brutally. Use the Brier score for an intuitive report card and log loss when you want overconfidence to hurt.
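As a minimal sketch of the two definitions above, the following Python helpers compute both scores for a single yes/no question. The function names brier_score and log_loss are illustrative choices, not part of any particular library, and the 99% example is invented to show the difference in how a confident miss is penalized.

```python
import math

def brier_score(p: float, outcome: int) -> float:
    """Squared gap between the forecast p and the 0/1 outcome."""
    return (p - outcome) ** 2

def log_loss(p: float, outcome: int) -> float:
    """Negative natural log of the probability assigned to what happened."""
    p_actual = p if outcome == 1 else 1.0 - p
    # math.log raises ValueError if p_actual is 0, i.e. the forecaster
    # gave zero probability to the thing that occurred.
    return -math.log(p_actual)

# The always-50% forecaster gets the same score whatever happens.
baseline_brier = brier_score(0.5, 1)   # 0.25
baseline_log = log_loss(0.5, 1)        # ln 2, about 0.693

# A confident miss (illustrative): 99% on an event that does not happen.
confident_miss_brier = brier_score(0.99, 0)  # bounded: stays below 1
confident_miss_log = log_loss(0.99, 0)       # unbounded: already above 4.6

print(f"50% baseline:   Brier {baseline_brier:.3f}, log loss {baseline_log:.3f}")
print(f"99% and wrong:  Brier {confident_miss_brier:.3f}, log loss {confident_miss_log:.3f}")
```

The Brier penalty for the confident miss is under four times the 50% baseline, while the log loss penalty is more than six times its baseline and keeps growing as the forecast approaches 100%.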

The two rules side by side

Rule	Formula (binary)	Range	Best for
Brier score	(p − outcome)²	0 (perfect) to 1 (worst)	Intuitive comparisons; bounded penalty
Log loss	−ln(p of actual outcome)	0 (perfect) to ∞	Punishing overconfidence; model training

Standard definitions, stated here as of June 2026 for scoring the 2026 tournament.

Worked example with the 2026 favorites

Take the tournament favorite. As of June 2026, Polymarket prices Spain to win the 2026 World Cup at roughly 17%, and the Opta supercomputer’s 25,000-simulation run says 16.1%. Suppose Spain wins on July 19, 2026. The market’s Brier score on that single question is (0.17 − 1)² ≈ 0.689; its log loss is −ln(0.17) ≈ 1.77. Now suppose Spain doesn’t win, which both sources rate as the far likelier result (about 83% by the market’s own price). The market scores (0.17 − 0)² ≈ 0.029 and −ln(0.83) ≈ 0.19. The lesson built into both rules: a 17% forecast scores well when the event doesn’t happen and poorly when it does, so one result cannot tell you whether 17% was the right number, and a single tournament can never separate a 17% claim from a 16.1% claim. Scoring only becomes decisive across many questions.
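The arithmetic above can be reproduced with a short script. This is a sketch that uses only the two probabilities quoted in this section (17% and 16.1%); the labels "market" and "simulation" and the helper names are illustrative.

```python
import math

def brier_score(p, outcome):
    return (p - outcome) ** 2

def log_loss(p, outcome):
    # Probability that was assigned to the outcome that actually occurred.
    return -math.log(p if outcome == 1 else 1.0 - p)

# Probabilities for "Spain wins the 2026 World Cup" quoted in the text.
forecasts = {"market": 0.17, "simulation": 0.161}

# Score each forecast under both possible outcomes (1 = Spain wins).
rows = []
for outcome in (1, 0):
    for name, p in forecasts.items():
        rows.append((outcome, name, brier_score(p, outcome), log_loss(p, outcome)))

for outcome, name, b, ll in rows:
    label = "Spain wins" if outcome == 1 else "Spain does not win"
    print(f"{label:>18} | {name:<10} | Brier {b:.3f} | log loss {ll:.2f}")
```

Whichever outcome occurs, the two forecasts land within a few hundredths of each other on both rules, which is why one question cannot rank them.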

Scoring a whole tournament, not one match

The 2026 format is generous to scorers: 48 teams, 12 groups, and a first-ever Round of 32 produce 104 matches plus hundreds of stage questions, such as whether Brazil wins its group or whether the USA reaches the quarterfinals. The right procedure is to log a probability for every question you genuinely had a view on, before kickoff, then average the per-question Brier scores after July 19, 2026. Averaged over enough questions, the comparison between you, the market, and a simulator becomes meaningful. A useful benchmark pair: an always-50% forecaster averages 0.25 Brier on yes/no questions, and beating the market average is the real bar, since market prices are the live consensus you could have had for free.
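The averaging procedure might look like the sketch below. Every question, probability, and result in the log is invented for illustration (the team names are placeholders), and four questions is far too few to conclude anything; a real log would hold every question you recorded before kickoff.

```python
from statistics import mean

# Hypothetical log: (question, your forecast, market price, outcome 0/1).
# All values are made up for illustration.
log = [
    ("Team A wins its group", 0.60, 0.55, 1),
    ("Team B reaches the quarterfinals", 0.30, 0.35, 0),
    ("Team C wins its opening match", 0.70, 0.65, 1),
    ("Team D reaches the Round of 16", 0.80, 0.75, 1),
]

def mean_brier(forecasts, outcomes):
    """Average of per-question Brier scores."""
    return mean((p - y) ** 2 for p, y in zip(forecasts, outcomes))

outcomes = [row[3] for row in log]
scores = {
    "you": mean_brier([row[1] for row in log], outcomes),
    "market": mean_brier([row[2] for row in log], outcomes),
    "always-50%": mean_brier([0.5] * len(log), outcomes),
}

# Lower is better, so print the leaderboard in ascending order.
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:<11} {score:.4f}")
```

Scoring all forecasters on the same set of questions matters: an average taken over easier questions is not comparable to one taken over harder ones.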

Calibration: the habit the scores enforce

Both rules are proper scoring rules: your expected score is best when you report what you actually believe. You cannot game the Brier score by hedging to 50% or by overstating confidence — over time, miscalibration shows up as a worse average. This is why we recommend writing probabilities down before each stage. History can inform those probabilities — from the 7,503-match head-to-head record, Brazil is 13–1 lifetime against its 2026 group opponents and Spain is 8–0 against its group, while 27 of 72 group fixtures are first-ever meetings with no record to lean on — but the score you earn comes from the number you committed to, not the story behind it. The angles on AI Picks are deliberately descriptive for exactly this reason.
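The claim that honest reporting gives the best expected score can be checked numerically. The sketch below assumes a forecaster whose true belief is 17% and searches a grid of possible reported probabilities for the one with the lowest expected score; the function names and the grid resolution are illustrative choices.

```python
import math

def expected_brier(report, belief):
    # The event happens with probability `belief` (outcome 1), else outcome 0.
    return belief * (report - 1) ** 2 + (1 - belief) * report ** 2

def expected_log_loss(report, belief):
    return -(belief * math.log(report) + (1 - belief) * math.log(1 - report))

belief = 0.17                            # what the forecaster actually thinks
grid = [i / 100 for i in range(1, 100)]  # candidate reports from 0.01 to 0.99

best_brier = min(grid, key=lambda r: expected_brier(r, belief))
best_log = min(grid, key=lambda r: expected_log_loss(r, belief))

print(f"Best report under Brier:    {best_brier:.2f}")
print(f"Best report under log loss: {best_log:.2f}")
print(f"Expected Brier if honest:        {expected_brier(belief, belief):.4f}")
print(f"Expected Brier if hedged to 50%: {expected_brier(0.5, belief):.4f}")
```

Under both rules the minimizing report is the belief itself. Hedging to 50% raises the expected Brier score from about 0.141 to 0.25 for this forecaster.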

Scoring Prever-style allocations

A Road to Glory position is a portfolio of seven stage predictions, so it can be scored stage by stage: each stage market (Groups through Champion) resolves yes or no, and the live price you accepted was your effective forecast. The engine’s inputs and weighting are documented in our methodology docs; converting the prices you see into probabilities is covered in implied probability and overround and, more gently, in how to read World Cup probabilities. If you want positions to score this summer, the step-by-step picks guide and the Road to Glory builder are the places to start.
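Stage-by-stage scoring of one position could be sketched as follows. Everything in the example is an assumption made for illustration: the intermediate stage labels between Groups and Champion, the accepted prices, and the results are invented, and the prices are treated directly as probabilities without any overround adjustment.

```python
# Hypothetical seven-stage position for one team:
# (stage label, price accepted as a probability, resolved 1 = yes / 0 = no).
# Labels other than "Groups" and "Champion", and all numbers, are invented.
position = [
    ("Groups", 0.85, 1),
    ("Round of 32", 0.60, 1),
    ("Round of 16", 0.40, 1),
    ("Quarterfinal", 0.25, 0),
    ("Semifinal", 0.15, 0),
    ("Final", 0.08, 0),
    ("Champion", 0.04, 0),
]

# Brier score per stage market, then the average for the whole position.
stage_scores = [(stage, (price - resolved) ** 2) for stage, price, resolved in position]
position_brier = sum(score for _, score in stage_scores) / len(stage_scores)

for stage, score in stage_scores:
    print(f"{stage:<13} {score:.4f}")
print(f"{'Average':<13} {position_brier:.4f}")
```

Because the seven stages are nested (a team cannot reach a later stage without clearing the earlier ones), the per-stage scores are strongly correlated, so one position carries less evidence than seven independent questions would.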