AI MARCH MADNESS 2026

ABOUT THIS PROJECT

WHAT IS AI MARCH MADNESS?

AI March Madness 2026 is a tournament tracker that measures how well different AI models predict NCAA basketball games - not just whether they pick winners, but how they reason, which sources they cite, how their confidence shifts, and what that reveals about their reliability.

We run the same queries across three AI models (GPT-4o, Gemini 2.5 Pro, and Perplexity Sonar Pro) at three time points before each game: 24 hours out, 6 hours out, and 1 hour before tip-off.

We then track the results, score each model like a bracket pool, and analyze the source patterns, confidence language, and prediction stability that separate accurate models from overconfident ones.

METHODOLOGY

QUERY PROTOCOL

Each model receives the same structured prompt: "Who will win [Team A] vs [Team B] in the [round] of the 2026 NCAA Tournament, and why?" Prompts are sent at T-24h, T-6h, and T-1h before tip-off.
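
A rough sketch of how the query protocol could look in code (the type and function names below are illustrative, not the project's actual schema):

```typescript
// Illustrative types for the query protocol.
type CollectionWindow = "T-24h" | "T-6h" | "T-1h";

interface GameQuery {
  teamA: string;
  teamB: string;
  round: string; // e.g. "Sweet 16"
}

// The structured prompt, sent identically to every model.
function buildPrompt(q: GameQuery): string {
  return `Who will win ${q.teamA} vs ${q.teamB} in the ${q.round} of the 2026 NCAA Tournament, and why?`;
}

// Hours before tip-off at which each collection window fires.
const WINDOW_OFFSETS: Record<CollectionWindow, number> = { "T-24h": 24, "T-6h": 6, "T-1h": 1 };
```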

SOURCE TRACKING

We record every source cited by each model in its response. Sources are categorized by type (major media, analytics, social, team official) and tracked across rounds to identify citation trends and source fingerprints per model.
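
A minimal sketch of domain-based categorization, assuming a hand-curated lookup table (the example domains here are illustrative):

```typescript
// Source categories used for citation tracking.
type SourceType = "major_media" | "analytics" | "social" | "team_official" | "other";

// Hand-curated domain-to-category table (example entries only).
const DOMAIN_CATEGORIES: Record<string, SourceType> = {
  "espn.com": "major_media",
  "cbssports.com": "major_media",
  "kenpom.com": "analytics",
  "barttorvik.com": "analytics",
  "x.com": "social",
  "reddit.com": "social",
};

// Map a cited URL to its category by hostname.
function categorizeSource(url: string): SourceType {
  const host = new URL(url).hostname.replace(/^www\./, "");
  return DOMAIN_CATEGORIES[host] ?? "other";
}
```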

ACCURACY SCORING

Correct picks score 1 point. Bracket scoring weights later rounds more heavily (Sweet 16 = 2×, Elite 8 = 4×, Final Four = 8×, Championship = 16×). Flip tracking records any T-1h pick that differs from the T-24h pick.
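
In code, the scoring and flip rules could look like this (a sketch assuming the final T-1h pick is the one that gets scored):

```typescript
// Round weights, mirroring standard bracket pool scoring.
const ROUND_WEIGHTS: Record<string, number> = {
  "Round of 64": 1,
  "Round of 32": 1,
  "Sweet 16": 2,
  "Elite 8": 4,
  "Final Four": 8,
  "Championship": 16,
};

interface ScoredPrediction {
  round: string;
  pickT24h: string;    // pick at the first collection window
  pickT1h: string;     // final pick (assumed here to be the one scored)
  actualWinner: string;
}

// Points earned: the round weight if the final pick is correct, else 0.
function bracketPoints(p: ScoredPrediction): number {
  return p.pickT1h === p.actualWinner ? (ROUND_WEIGHTS[p.round] ?? 1) : 0;
}

// A "flip" is any T-1h pick that differs from the T-24h pick.
function isFlip(p: ScoredPrediction): boolean {
  return p.pickT1h !== p.pickT24h;
}
```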

CALIBRATION ANALYSIS

We categorize each model's confidence language into 5 tiers and track actual accuracy at each tier. A well-calibrated model's actual win rate should track closely with its stated confidence level.
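
One way to compute the per-tier comparison, assuming confidence language has already been mapped to the five numeric buckets listed in the glossary:

```typescript
// One prediction outcome: stated confidence bucket plus whether the pick won.
interface Outcome {
  confidence: number; // stated confidence bucket, in percent (55..95)
  correct: boolean;
}

interface BucketStats { stated: number; actual: number; n: number; }

// Group outcomes by confidence bucket and compute the actual win rate
// in each; a well-calibrated model has actual ≈ stated at every bucket.
function calibrationCurve(outcomes: Outcome[]): BucketStats[] {
  const buckets = new Map<number, { wins: number; total: number }>();
  for (const o of outcomes) {
    const b = buckets.get(o.confidence) ?? { wins: 0, total: 0 };
    b.total += 1;
    if (o.correct) b.wins += 1;
    buckets.set(o.confidence, b);
  }
  return [...buckets.entries()].map(([stated, { wins, total }]) => ({
    stated,
    actual: (100 * wins) / total,
    n: total,
  }));
}
```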

BIAS REGISTER

Analyst citations are manually reviewed against known affiliation databases. We flag cases where an analyst has a documented connection (played, coached, family, hometown) to a team they picked. This data is human-verified.
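
A hypothetical shape for a bias-register entry (field names are illustrative; the flagging itself stays manual, as described above):

```typescript
// The documented connection types tracked in the register.
type AffiliationType = "played" | "coached" | "family" | "hometown";

// One human-verified flag linking an analyst to a team they picked.
interface BiasFlag {
  analyst: string;
  team: string;             // the team the flagged analyst picked
  affiliation: AffiliationType;
  evidenceUrl: string;      // link documenting the connection
  verifiedBy: string;       // the human reviewer who confirmed it
}
```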

PROMPT SENSITIVITY

For selected games, we run 5 prompt variations per model to measure how much phrasing changes the predicted outcome. Consistency score = % of variations that produce the same pick as the baseline prompt.
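
The consistency score is a direct percentage; a sketch:

```typescript
// Consistency score: the share of prompt variants that agree with the
// baseline pick, as a percentage.
function consistencyScore(baselinePick: string, variantPicks: string[]): number {
  const matches = variantPicks.filter((pick) => pick === baselinePick).length;
  return (100 * matches) / variantPicks.length;
}

// e.g. consistencyScore("Duke", ["Duke", "Duke", "UNC", "Duke", "Duke"]) → 80
```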

DATA COLLECTION

All predictions are collected automatically via scheduled cron jobs that query each AI model through the OpenRouter API. Sources, citations, and confidence levels are extracted from each response and stored in a Supabase database. Data updates continuously throughout the tournament.
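
A simplified sketch of one collection call, assuming an OpenAI-compatible OpenRouter request and a Supabase table named "predictions" (the table and column names are guesses, not the project's actual schema):

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

// Query one model for one game at one collection window, then store the result.
async function collectPrediction(model: string, prompt: string, window: string, gameId: string): Promise<void> {
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  });
  const data = await res.json();

  // Store the raw response; the pick, confidence language, and citations
  // are extracted downstream.
  await supabase.from("predictions").insert({
    game_id: gameId,
    model,
    collection_window: window,
    raw_response: data.choices[0].message.content,
    collected_at: new Date().toISOString(),
  });
}
```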

FREQUENTLY ASKED QUESTIONS

What is AI March Madness 2026?
AI March Madness 2026 is a live experiment that tracks how three AI models - GPT-4o, Gemini 2.5, and Perplexity Sonar Pro - predict every game of the 2026 NCAA Men's Basketball Tournament. We measure accuracy, source transparency, confidence calibration, and prediction drift across all 67 tournament games.
Which AI models are tracked?
We track three search-enabled AI models: OpenAI's GPT-4o (via gpt-4o-search-preview), Google's Gemini 2.5 Pro (with Google grounding), and Perplexity's Sonar Pro. All three have real-time web search capabilities, which lets us analyze both their predictions and the sources they cite.
How are predictions collected?
Each model receives the same structured prompt at three time windows before every game: 24 hours out (T-24h), 6 hours out (T-6h), and 1 hour before tip-off (T-1h). This triple-snapshot approach lets us measure prediction drift and flip rates as new information becomes available.
What is prediction drift?
Prediction drift measures how an AI model's pick changes between collection windows. A "flip" occurs when the model switches its predicted winner between T-24h and T-1h. High flip rates without corresponding news events suggest unstable reasoning priors.
What is confidence calibration?
Calibration measures whether an AI model's stated confidence matches its actual accuracy. A well-calibrated model that says "80% confident" should be correct roughly 80% of the time. Most AI models show overconfidence - they state high confidence yet miss more often than that confidence implies.
How is accuracy scored?
Correct picks earn 1 point. Bracket scoring weights later rounds more heavily: Sweet 16 picks are worth 2 points, Elite 8 worth 4 points, Final Four worth 8 points, and the Championship pick is worth 16 points. This mirrors standard bracket pool scoring.
What is source intelligence?
Source intelligence tracks every URL each AI model cites in its prediction responses. We categorize sources by type (major media, analytics, social media, team official sites) and rank domains by citation frequency. This reveals each model's evidence base and potential biases.
What is prompt sensitivity testing?
Prompt sensitivity tests whether rephrasing a prediction question changes the AI's answer. We run five prompt variants per game per model. A model that changes its pick based on synonymous rephrasing has unstable priors - its prediction is more a function of phrasing than genuine analysis.

GLOSSARY

T-24h / T-6h / T-1h
Collection windows before each game. T-24h means 24 hours before tip-off, T-6h means 6 hours out, T-1h means 1 hour out.
Flip
When an AI model switches its predicted winner between collection windows. Tracked from T-24h to T-1h.
Flip Rate
The percentage of games where a model changes its pick between collection windows. Lower flip rates generally correlate with higher accuracy.
Calibration Score
A measure of how well a model's stated confidence matches actual outcomes. Perfect calibration = 0 (confidence equals accuracy at every level).
Confidence Bucket
Groupings of predictions by stated confidence level (55%, 65%, 75%, 85%, 95%) used to calculate calibration curves.
Bracket Score
Weighted accuracy score where later-round correct picks are worth more: Round of 64 = 1pt, Round of 32 = 1pt, Sweet 16 = 2pt, Elite 8 = 4pt, Final Four = 8pt, Championship = 16pt.
Source Fingerprint
The unique pattern of citation domains each AI model relies on. GPT-4o tends toward major sports media; Perplexity favors analytics domains.
Upset
A game where a lower-seeded team (higher seed number) defeats a higher-seeded team. Classic upsets include 12-over-5 and 11-over-6 matchups.
Seed Bias
The tendency of AI models to systematically favor higher-seeded (lower number) teams regardless of underlying team quality metrics.
Consistency Score
In prompt sensitivity testing, the percentage of prompt variants that produce the same pick as the baseline prompt. Higher = more stable reasoning.
Ensemble Pick
A consensus prediction derived from combining multiple AI models. When models with uncorrelated errors agree, the ensemble pick historically outperforms any single model.
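
A simple majority vote is one way to form an ensemble pick; a sketch (the project may combine models differently):

```typescript
// Majority vote over the models' picks for one game; returns null when
// there is no strict majority. Illustrative only.
function ensemblePick(picks: string[]): string | null {
  const counts = new Map<string, number>();
  for (const pick of picks) counts.set(pick, (counts.get(pick) ?? 0) + 1);
  let best: string | null = null;
  let bestCount = 0;
  for (const [team, count] of counts) {
    if (count > bestCount) {
      best = team;
      bestCount = count;
    }
  }
  return bestCount > picks.length / 2 ? best : null;
}
```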

Get the weekly AI accuracy report.

Model rankings, source shifts, confidence gaps, and upset analysis. Every Sunday during the tournament.

No spam. Unsubscribe any time. Data-only, no hot takes.

LIVE PICKS
Predictions will appear here once collection begins · Tournament starts March 19