Public evidence
Drift Monitor
Public benchmark cadence and summaries. Model behavioral health tracked in real time.
Models tracked: 8 | Providers: 3 | Last run: 2026-02-28 | MongoDB: Unknown
Recent runs (last 30 days): No runs published in the last 30 days.
Stable models: 6 | Drift detected (total): 3 | Corrected (total): 3 | Total calls: 360
Live run
Status labels: STABLE | CORRECTED | DRIFT DETECTED
Published 2026-02-28 | 3 providers | 8 models
Category radar
Highest drift category: factual
Model comparison table
20%+ green | 5–20% amber | <5% muted
| Model | Provider | Health | Coherence | Entropy | Avg drift | Min drift | Max drift | Prompts | Drifts detected | Corrections success |
|---|---|---|---|---|---|---|---|---|---|---|
| claude-opus-4.6 | anthropic | -- | -- | -- | 0.279185 | 0.108842 | 0.454999 | 45 | 0 | 0 (0%) |
| gpt-5.2 | openai | -- | -- | -- | 0.271264 | 0.066595 | 0.450000 | 45 | 0 | 0 (0%) |
| claude-sonnet-4.5 | anthropic | -- | -- | -- | 0.267708 | 0.083605 | 0.437624 | 45 | 0 | 0 (0%) |
| deepseek-r1 | deepseek | -- | -- | -- | 0.266721 | 0.062825 | 0.450000 | 45 | 0 | 0 (0%) |
| claude-haiku-4.5 | anthropic | -- | -- | -- | 0.264638 | 0.074218 | 0.569523 | 45 | 2 | 2 (100%) |
| deepseek-v3.2 | deepseek | -- | -- | -- | 0.250527 | 0.083469 | 0.609293 | 45 | 1 | 1 (100%) |
| gpt-4o | openai | -- | -- | -- | 0.241527 | 0.072350 | 0.412773 | 45 | 0 | 0 (0%) |
| gpt-5.2-instant | openai | -- | -- | -- | 0.237899 | 0.070503 | 0.450000 | 45 | 0 | 0 (0%) |
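The summary figures above can be cross-checked against the table with a short script. This is a minimal sketch, not the monitor's own aggregation logic: it assumes that a "stable" model is one with zero drifts detected and that "total calls" is the sum of the Prompts column. Both are assumptions that happen to reproduce the published numbers. The `ROWS` layout and the `summarize` helper are hypothetical names introduced for illustration.

```python
# Rows copied from the model comparison table:
# (model, provider, prompts, drifts detected, corrections)
ROWS = [
    ("claude-opus-4.6", "anthropic", 45, 0, 0),
    ("gpt-5.2", "openai", 45, 0, 0),
    ("claude-sonnet-4.5", "anthropic", 45, 0, 0),
    ("deepseek-r1", "deepseek", 45, 0, 0),
    ("claude-haiku-4.5", "anthropic", 45, 2, 2),
    ("deepseek-v3.2", "deepseek", 45, 1, 1),
    ("gpt-4o", "openai", 45, 0, 0),
    ("gpt-5.2-instant", "openai", 45, 0, 0),
]


def summarize(rows):
    """Recompute the headline counts from per-model rows.

    Assumption: a model counts as stable when no drifts were detected,
    and total calls equals the sum of prompts across models.
    """
    return {
        "models": len(rows),
        "providers": len({provider for _, provider, _, _, _ in rows}),
        "stable_models": sum(1 for row in rows if row[3] == 0),
        "drift_detected": sum(row[3] for row in rows),
        "corrected": sum(row[4] for row in rows),
        "total_calls": sum(row[2] for row in rows),
    }


summary = summarize(ROWS)
print(summary)
```

Under those assumptions the recomputed counts agree with the published totals: 8 models, 3 providers, 6 stable models, 3 drifts detected, 3 corrected, and 360 calls.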
What we publish
- Model-level ranges and averages
- Aggregate drift and corrected counts
- Status labels and benchmark dates
What we do not publish
- Prompt-level details
- Provider attribution analysis
- Internal feature or strategy identifiers