ABIS

A CIJ Labs product

Live backend data · Public playground · Luxury editorial shell
The models you build on change silently. ABIS catches it.

Public drift rankings, a zero-signup playground, live change records, and developer-ready tooling for teams that need to know when model behavior shifts under them.

Live snapshot

Claude Haiku 4.5 is the steadiest surface today.

Current stability index 90.6% with drift at 9.4%.

Real scorecard feed · No signup required

8 models ranked

1060 benchmark calls logged

SUSTAIN monitoring phase

Latest finding

GPT-5.2 dropped on drift

GPT-5.2 triggered a signal consistent with a likely version update. Largest moves: drift down 85.5%, coherence up 23.5%.

Research layer

Evidence first, then integration.

ABIS is designed to feel like an editorial product surface, not a dashboard fragment. The proof, the scorer, and the integration path all live in one place.

13 models monitored

35 version changes tracked

35 published change events

Zero-signup playground

Let developers feel ABIS before they commit.

This public scorer runs surface-level analysis only. It is fast, surprising, and good enough to turn curiosity into a real integration.

No account needed. 20 requests per hour.
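For anyone scripting against the playground, a client-side guard can help stay under the 20-requests-per-hour allowance. The sketch below is a minimal sliding-window limiter. `HourlyLimiter` is a hypothetical helper, not part of any ABIS SDK, and it assumes the limit behaves like a rolling one-hour window; the server may count requests differently.

```python
import time
from collections import deque


class HourlyLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not an ABIS API)."""

    def __init__(self, max_calls=20, window_seconds=3600.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._stamps = deque()   # timestamps of calls still inside the window

    def _expire(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def try_acquire(self):
        """Record a call and return True if the window has room, else return False."""
        now = self._clock()
        self._expire(now)
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False

    def seconds_until_free(self):
        """Seconds until the next slot opens; 0.0 if one is available now."""
        now = self._clock()
        self._expire(now)
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._stamps[0])
```

Call `try_acquire()` before each playground request and back off for `seconds_until_free()` when it returns False.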

Result card

Score any response in a few seconds.

ABIS returns six normalized scores plus a plain-English interpretation. The full product goes deeper, but this is the first moment of value.

Drift Score: --

Coherence: --

Entropy: --

Complexity: --

Reasoning Depth: --

Alignment Stability: --
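One way to carry the six scores through your own code is a small typed container. This is a sketch only: `ScoreCard` and its field names are hypothetical, derived from the labels on the result card, and the `[0, 1]` range is an assumption about what "normalized" means here. The values in the usage note are illustrative, not real ABIS output.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScoreCard:
    """Hypothetical container for the six scores named on the result card."""

    drift_score: float
    coherence: float
    entropy: float
    complexity: float
    reasoning_depth: float
    alignment_stability: float

    def __post_init__(self):
        # Assumption: "normalized" means every score lies in [0, 1].
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name}={value} is outside [0, 1]")

    def render(self):
        """Format the card as 'Label: value' lines, one per score."""
        return "\n".join(
            f"{f.name.replace('_', ' ').title()}: {getattr(self, f.name):.3f}"
            for f in fields(self)
        )
```

For example, `ScoreCard(0.25, 0.5, 0.5, 0.5, 0.5, 0.5).render()` starts with the line `Drift Score: 0.250`.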

Return surfaces

Public surfaces developers actually come back to.

The goal is not just one conversion moment. It is to make ABIS a place people check, cite, and share whenever something moves in the model ecosystem.

Leaderboard

Live stability rankings that help teams compare providers at a glance.

Updated 25 Mar 2026, 10:04

Findings

A public record of drift events, anomalies, and version changes.

35 published findings

Drift Monitor

Public benchmark cadence and model-by-model benchmark summaries.

1060 benchmark runs

Research

The CIJ Labs research layer that explains the why behind the product.

External credibility layer

Current model table

A live snapshot of who looks stable right now.

This is the habit-forming surface: come here to see who drifted, who held steady, and which models deserve another round of edge-case testing.

Developer quickstart

Start free, then move from public evidence to daily workflow.

The free tier is enough to prove ABIS on your own outputs. Once it becomes part of your model QA loop, the rest of the product has already done its job.

3-line quickstart

pip install abis-sdk

from abis import ABIS
abis = ABIS(api_key="your-free-key")

# Score responses in your own workflow
result = abis.score(model="gpt-4o", response="...")
print(result.drift_score)

Free accounts receive 100 monthly API calls and can still use the public leaderboard, playground, findings, and drift monitor without restriction.
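To make the QA-loop idea concrete, here is a minimal sketch of a drift gate. `flag_drifted`, `fake_scorer`, and the 0.5 threshold are all hypothetical; the scale of `drift_score` is not documented on this page, so choose a threshold from your own baseline. The scorer is passed in as a plain function so the example runs offline. With a free account, each scored response would presumably spend one of the 100 monthly calls.

```python
from typing import Callable, Iterable, List, Tuple


def flag_drifted(
    responses: Iterable[str],
    score_drift: Callable[[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (response, drift) pairs whose drift score exceeds the threshold."""
    flagged = []
    for text in responses:
        drift = score_drift(text)
        if drift > threshold:
            flagged.append((text, drift))
    return flagged


# Stand-in scorer so the sketch runs without network access. In a real workflow
# this could wrap the quickstart call, for example:
#   lambda r: abis.score(model="gpt-4o", response=r).drift_score
def fake_scorer(text: str) -> float:
    return 0.9 if "unexpected" in text else 0.1


flagged = flag_drifted(["usual answer", "unexpected answer"], fake_scorer, threshold=0.5)
print(flagged)  # [('unexpected answer', 0.9)]
```

Keeping the scorer injectable means the same gate can run in CI against recorded scores without spending API calls.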