Leaderboard

Public rankings

Live scorecards, sorted by drift, shareable evidence

Behavioral stability leaderboard

ABIS ranks the currently monitored models by observed drift. Lower drift means a higher stability index and fewer surprises for teams building on top of them.


Most stable

Claude Haiku 4.5

Current drift is 9.4% in the latest public scorecard.

The leaderboard is built to be checked, cited, and passed around.

Models ranked: 8

Best stability index: 90.6%

Latest public snapshot: 25 Mar 2026, 10:04

Most stable: Claude Haiku 4.5 (90.6%). Highest stability index in the current public scorecard.

Highest drift: GPT-4o (40.6%). Model showing the most instability right now.

Best health: GPT-5.2 (30.0%). Strongest behavioral health score in the latest snapshot.

Rank	Model	Provider	Stability Index	Current Drift	Behavioral Health	Status
1	Claude Haiku 4.5	Anthropic	90.6%	9.4%	21.9%	stable
2	GPT-5.2 Instant	OpenAI	90.5%	9.5%	19.6%	stable
3	GPT-5.2	OpenAI	88.2%	11.8%	30.0%	stable
4	Claude Opus 4.6	Anthropic	78.6%	21.4%	21.7%	drifting
5	Claude Sonnet 4.5	Anthropic	78.2%	21.8%	21.9%	drifting
6	DeepSeek V3.2	DeepSeek	67.3%	32.7%	29.4%	drifting
7	DeepSeek R1	DeepSeek	59.6%	40.4%	21.3%	volatile
8	GPT-4o	OpenAI	59.5%	40.6%	26.1%	volatile

Generated at 25 Mar 2026, 10:04.
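The table suggests that the stability index is roughly 100% minus current drift, and that rank follows ascending drift. The sketch below illustrates that reading with a few rows copied from the table. The relationship is inferred from the published numbers, not from any documented ABIS formula, and the names `Row`, `stability_index`, and `rank_by_drift` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    drift_pct: float  # "Current Drift" column, in percent


def stability_index(drift_pct: float) -> float:
    """Assumed relationship, inferred from the table: stability = 100 - drift."""
    return round(100.0 - drift_pct, 1)


def rank_by_drift(rows):
    """Order rows so the lowest drift (most stable) comes first."""
    return sorted(rows, key=lambda r: r.drift_pct)


# A few rows from the table above, deliberately out of order.
ROWS = [
    Row("GPT-4o", 40.6),
    Row("Claude Haiku 4.5", 9.4),
    Row("GPT-5.2", 11.8),
    Row("DeepSeek R1", 40.4),
]

ranked = rank_by_drift(ROWS)
for position, row in enumerate(ranked, start=1):
    print(position, row.model, f"{stability_index(row.drift_pct)}%")
```

For GPT-4o this assumed formula gives 59.4%, while the table lists 59.5%. One plausible explanation is that the published figures are rounded from unrounded values, so treat the formula as an approximation.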

Want these scores in your own workflow?

Use the public pages for discovery, then start free for API access.
