🧠 AI Model Performance
Transparent insights into how our AI models perform. See accuracy, calibration, and reliability metrics.
Best Accuracy: Gemini, 69.7% correct
Best Calibrated: Gemini, 8.0% calibration error
Total Predictions: 230 across all models
Avg Brier Score: 0.223 (lower is better)
Model Comparison
Gemini
89 predictions
Accuracy 69.7%
Brier Score 0.210
Calibration Error 8.0%
Correct Calls 62/89
[Chart: Calibration by Confidence, Predicted vs. Actual]
Claude
76 predictions
Accuracy 65.8%
Brier Score 0.240
Calibration Error 11.0%
Correct Calls 50/76
[Chart: Calibration by Confidence, Predicted vs. Actual]
OpenAI
65 predictions
Accuracy 67.7%
Brier Score 0.220
Calibration Error 9.0%
Correct Calls 44/65
[Chart: Calibration by Confidence, Predicted vs. Actual]
📚 Understanding the Metrics
🎯 Accuracy
The percentage of predictions that turned out correct. Higher is better.
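As a quick check, the accuracy figures on this page can be reproduced from the "Correct Calls" counts. The sketch below uses only the counts shown above; the dictionary name and layout are illustrative, not part of the site.

```python
# Correct calls and total predictions per model, as listed on this page.
correct_calls = {
    "Gemini": (62, 89),
    "Claude": (50, 76),
    "OpenAI": (44, 65),
}

# Accuracy = correct predictions / total predictions.
accuracy = {model: correct / total for model, (correct, total) in correct_calls.items()}

for model, acc in accuracy.items():
    print(f"{model}: {acc:.1%}")

# The per-model totals should add up to the "Total Predictions" figure.
total_predictions = sum(total for _, total in correct_calls.values())
print("Total predictions:", total_predictions)
```

The totals sum to 230, matching the Total Predictions figure.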
📊 Brier Score
The mean squared difference between the predicted probability and the actual outcome (1 if the event happened, 0 if not). Range: 0 (perfect) to 1 (worst). Lower is better.
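A minimal sketch of the standard Brier score for yes/no events follows. The function name and sample forecasts are hypothetical, and the page does not say whether its scores are computed exactly this way.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical forecasts: the predicted probability that each event happens,
# and whether it did (1) or did not (0).
probs = [0.9, 0.7, 0.6, 0.2]
outcomes = [1, 1, 0, 0]

print(round(brier_score(probs, outcomes), 3))

# Reference point: always predicting 50% scores 0.25 on any set of outcomes.
print(brier_score([0.5, 0.5], [1, 0]))
```

A forecaster that always says 50% scores 0.25, which gives a useful baseline for reading the scores above.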
⚖️ Calibration Error
The gap between predicted confidence and actual accuracy. A well-calibrated model's 70% predictions are correct 70% of the time. Lower is better.
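One common way to turn this idea into a single number is expected calibration error (ECE): group predictions into confidence bins, then take the weighted average of the gap between mean confidence and accuracy in each bin. The sketch below assumes that definition with equal-width bins. The page does not state its exact formula or bin count, so the function name and the default `n_bins` are illustrative.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over equal-width bins."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and the same length")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(conf for conf, _ in bucket) / len(bucket)
        acc = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_conf - acc)
    return ece


# Hypothetical data: ten predictions at 70% confidence, seven of them correct.
well_calibrated = expected_calibration_error([0.7] * 10, [1] * 7 + [0] * 3)

# Hypothetical data: four predictions at 90% confidence, only two correct.
overconfident = expected_calibration_error([0.9] * 4, [1, 1, 0, 0])
```

In the first case the error is essentially zero. In the second, the model claims 90% but is right 50% of the time, so the error is about 0.4 (40%).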
📈 Calibration Chart
Visual comparison of predicted confidence vs. actual outcomes. For a well-calibrated model, the Predicted and Actual bars line up.
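The data behind a chart like this could be built roughly as follows: bucket predictions by confidence, then compare the mean predicted probability with the observed hit rate in each bucket. This is a sketch under assumed choices (five equal-width bins, text bars); the helper names and sample data are hypothetical, and the site's actual charting code is not shown on the page.

```python
def calibration_rows(confidences, outcomes, n_bins=5):
    """Return (bin_low, bin_high, mean_predicted, actual_rate, count) per non-empty bin."""
    buckets = [[] for _ in range(n_bins)]
    for p, o in zip(confidences, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        buckets[idx].append((p, o))

    rows = []
    for i, bucket in enumerate(buckets):
        if not bucket:
            continue
        predicted = sum(p for p, _ in bucket) / len(bucket)
        actual = sum(o for _, o in bucket) / len(bucket)
        rows.append((i / n_bins, (i + 1) / n_bins, predicted, actual, len(bucket)))
    return rows


def render(rows, width=20):
    """Draw one Predicted bar and one Actual bar per confidence bin."""
    lines = []
    for low, high, predicted, actual, count in rows:
        lines.append(f"{low:.0%}-{high:.0%} (n={count})")
        lines.append("  Predicted " + "■" * round(predicted * width))
        lines.append("  Actual    " + "■" * round(actual * width))
    return "\n".join(lines)


# Hypothetical predictions and outcomes.
rows = calibration_rows([0.65, 0.75, 0.95, 0.9], [1, 0, 1, 1])
print(render(rows))
```

Bins where the Actual bar is shorter than the Predicted bar indicate overconfidence; a longer Actual bar indicates underconfidence.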