DeepSeek R1
Current drift: 40.4%
Behavioral scorecard
Drift: 40.4%
Behavioral Health: 21.3%
Anomaly: 0.1%
Correctability: 100.0%
Consistency: 0.0%
Complexity: 55.3%
Reasoning Depth: 58.1%
Alignment Stability: 50.0%
Entropy: 72.4%
Coherence: 12.0%
Recent drift trend (chart, 9 Mar 2026 to 23 Mar 2026)
Recent version changes
[behavioral_shift] 8 Mar 2026, 09:13: Anomaly moved down (-99.9%) during run_1772961175.
[behavioral_shift] 8 Mar 2026, 09:11: Drift moved down (-59.8%) during run_1772961050.
[behavioral_shift] 8 Mar 2026, 09:07: Anomaly moved up (+88.8%) during run_1772960821.
[behavioral_shift] 8 Mar 2026, 09:06: Anomaly moved down (-88.8%) during baseline_1772960132.
[behavioral_shift] 8 Mar 2026, 09:06: Correctability moved up (+81.7%) during run_1772960761.
Recent findings