Research articles and in-depth analysis from the AI March Madness 2026 team. Topics include AI prediction methodology, source citation analysis, confidence calibration insights, prediction drift patterns, upset detection, prompt sensitivity testing, and tournament strategy.
This section contains 37 research articles covering how GPT-4o, Gemini 2.5, and Perplexity Sonar Pro approach NCAA Tournament predictions. Each article examines a specific aspect of AI forecasting with data from our automated collection pipeline.
UPSET DETECTION: WHICH AI IS MOST LIKELY TO CALL THE CHAOS EARLY
In a typical 64-team field, 6–8 games in the first two rounds end in a classic upset: a team seeded 10th or worse beats a team seeded 7th or better. Correctly calling even 2–3 of those gives a bracket a massive edge.
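The upset threshold above can be written as a simple predicate. This is a minimal sketch of that definition (thresholds for what counts as a "classic" upset vary; this mirrors the 10-vs-7 cutoff used here):

```python
def is_classic_upset(winner_seed: int, loser_seed: int) -> bool:
    """A 'classic' first-weekend upset: a team seeded 10th or worse
    beats a team seeded 7th or better."""
    return winner_seed >= 10 and loser_seed <= 7

# A 12-over-5 counts as a classic upset; a 9-over-8 does not.
```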
Data Team
Data Analysis
Mar 14, 2026 · 5 min read · Updated May 1, 2026
Upsets · Seedings · Analysis
THE UPSET HYPOTHESIS
Our hypothesis: Perplexity's heavy use of analytics domains (KenPom, Barttorvik) makes it more likely to surface statistical mismatches between a team's seed and its actual quality. A 12-seed with elite defensive efficiency, seeded low because it plays in a weak conference, often looks identical to a 5-seed by the numbers.
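One way to make the seed-vs-quality mismatch concrete is to compare where a team is seeded against where an efficiency ranking would place it. The sketch below is an illustrative heuristic, not our production model; the 4-teams-per-seed-line mapping is an assumption based on a 64-team field having four teams on each seed line:

```python
def seed_quality_gap(seed: int, efficiency_rank: int) -> int:
    """Gap between a team's seed-implied national rank and its rank by
    an efficiency metric (e.g. a KenPom-style adjusted efficiency).
    Each seed line holds four teams, so a seed implies a rank band of
    four; we take the midpoint of that band. A large positive gap flags
    a team under-seeded relative to the numbers (an upset candidate).
    Illustrative heuristic only."""
    implied_rank = (seed - 1) * 4 + 2  # midpoint of the seed line's band
    return implied_rank - efficiency_rank

# A 12-seed (implied rank ~46) ranked 20th by efficiency has a gap of 26,
# the profile of the statistical mismatch described above.
```

In this framing, Perplexity's hypothesized edge is that analytics-heavy sources surface exactly the teams with large positive gaps.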
MEDIA NARRATIVE VS. ANALYTICS SIGNAL
GPT-4o, which leans on national media coverage, may be more consensus-following, and therefore worse at identifying under-covered upsets. Gemini, with its broader data blend, likely sits between the two.
We track upset accuracy separately on the Upsets page. The Round of 64 will be the first real test; play-in games don't produce classic upsets.
LIVE DATA
See this tracked in real-time as the tournament plays out.