ABIS
Leaderboard

DeepSeek

DeepSeek R1

Current drift: 40.4%

Behavioral scorecard

Drift: 40.4%
Behavioral Health: 21.3%
Anomaly: 0.1%
Correctability: 100.0%
Consistency: 0.0%
Complexity: 55.3%
Reasoning Depth: 58.1%
Alignment Stability: 50.0%
Entropy: 72.4%
Coherence: 12.0%

Recent drift trend (chart, 9 Mar 2026 to 23 Mar 2026)

Recent version changes

behavioral_shift, 8 Mar 2026, 09:13

Anomaly moved down (-99.9%) during run_1772961175.

behavioral_shift, 8 Mar 2026, 09:11

Drift moved down (-59.8%) during run_1772961050.

behavioral_shift, 8 Mar 2026, 09:07

Anomaly moved up (+88.8%) during run_1772960821.

behavioral_shift, 8 Mar 2026, 09:06

Anomaly moved down (-88.8%) during baseline_1772960132.

behavioral_shift, 8 Mar 2026, 09:06

Correctability moved up (+81.7%) during run_1772960761.
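
The run identifiers in the entries above end in a ten-digit number that looks like a Unix timestamp. The page does not say so, so treat this as an assumption: the sketch below reads the suffix as epoch seconds and converts it to UTC. The helper name run_started_at is hypothetical and not part of any ABIS API.

```python
from datetime import datetime, timezone


def run_started_at(run_id: str) -> datetime:
    """Decode the numeric suffix of a run ID as Unix epoch seconds.

    Assumption: the suffix is an epoch timestamp in seconds. The page
    does not document the ID format.
    """
    # Split on the last underscore: "run_1772961175" -> "1772961175".
    _, _, suffix = run_id.rpartition("_")
    return datetime.fromtimestamp(int(suffix), tz=timezone.utc)


if __name__ == "__main__":
    for run_id in (
        "run_1772961175",
        "run_1772961050",
        "run_1772960821",
        "baseline_1772960132",
        "run_1772960761",
    ):
        print(run_id, run_started_at(run_id).isoformat())
```

Under this reading, run_1772961175 decodes to 09:12:55 UTC on 8 Mar 2026, which is consistent with the 09:13 shown for that entry. baseline_1772960132 decodes to 08:55:32 UTC, earlier than the 09:06 listed, so the displayed time may be when the shift was recorded rather than when the run began.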
