Live benchmark results across all evaluated models. Ranked by Cognitive Integrity Index (CII).
| Rank | Model / Run | Level | CII Score | Composite | Dimensions | Date |
|---|---|---|---|---|---|---|
| 1 | alpha-test-model RUN-573B4CBFDEB3-v9 · smoke-test | RIS-2 | 0.7479 | | RS 0.75, SC 0.29, DR 0.00, VE 1.00 | 2025-11-18 |
| 2 | alpha-test-model RUN-573B4CBFDEB3-v8 · smoke-test | RIS-2 | 0.7120 | | RS 0.72, SC 0.28, DR 0.00, VE 1.00 | 2025-11-18 |
| 3 | alpha-test-model RUN-573B4CBFDEB3-v7 · smoke-test | RIS-2 | 0.6890 | | RS 0.69, SC 0.27, DR 0.00, VE 1.00 | 2025-11-18 |
| 4 | alpha-test-model RUN-573B4CBFDEB3-v6 · smoke-test | RIS-2 | 0.6610 | | RS 0.66, SC 0.26, DR 0.00, VE 1.00 | 2025-11-18 |
| 5 | alpha-test-model RUN-573B4CBFDEB3-v5 · smoke-test | RIS-2 | 0.6340 | | RS 0.63, SC 0.25, DR 0.00, VE 1.00 | 2025-11-18 |
| 6 | alpha-test-model RUN-573B4CBFDEB3-v4 · smoke-test | RIS-2 | 0.6100 | | RS 0.61, SC 0.24, DR 0.00, VE 1.00 | 2025-11-18 |
| 7 | baseline-model RUN-20251119-1C0914168E · prod | RIS-1 | 0.5230 | | RS 0.52, SC 0.21, DR 0.00, VE 0.85 | 2025-11-19 |
| 8 | baseline-model RUN-20251119-77F8B47150 · prod | RIS-1 | 0.4970 | | RS 0.50, SC 0.19, DR 0.00, VE 0.80 | 2025-11-19 |
| 9 | baseline-model RUN-20251119-AA31F29B12 · prod | RIS-1 | 0.4620 | | RS 0.46, SC 0.18, DR 0.00, VE 0.78 | 2025-11-19 |
| 10 | unverified-agent RUN-20251118-UNVERIFIED1 · test | RIS-0 | 0.2840 | | RS 0.28, SC 0.10, DR 0.00, VE 0.40 | 2025-11-18 |
| 11 | unverified-agent RUN-20251118-UNVERIFIED2 · test | RIS-0 | 0.2110 | | RS 0.21, SC 0.08, DR 0.00, VE 0.30 | 2025-11-18 |
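
For readers who want to work with these results programmatically, the following is a minimal Python sketch of how the ranking could be reproduced from exported rows. It assumes the single numeric score in each row is the CII score, and that the Dimensions cell holds two-letter codes each followed by a decimal value. The field names (`run`, `cii`, `dims`) and both helper functions are hypothetical and are not part of the RIS benchmark suite.

```python
import re

# Two-letter dimension code followed by a decimal value, e.g. "RS 0.75".
# Whitespace between pairs is optional, so fused cells like
# "RS 0.75SC 0.29" are handled as well.
DIM_PATTERN = re.compile(r"([A-Z]{2})\s*(\d+\.\d+)")


def parse_dimensions(cell: str) -> dict[str, float]:
    """Turn a Dimensions cell into a {code: value} mapping."""
    return {code: float(value) for code, value in DIM_PATTERN.findall(cell)}


def rank_by_cii(rows: list[dict]) -> list[dict]:
    """Sort rows by CII score, highest first, and attach a 1-based rank."""
    ordered = sorted(rows, key=lambda row: row["cii"], reverse=True)
    return [{**row, "rank": position} for position, row in enumerate(ordered, start=1)]


# Three rows copied from the leaderboard, deliberately out of order.
rows = [
    {"run": "RUN-20251119-1C0914168E", "cii": 0.5230, "dims": "RS 0.52SC 0.21DR 0.00VE 0.85"},
    {"run": "RUN-573B4CBFDEB3-v9", "cii": 0.7479, "dims": "RS 0.75SC 0.29DR 0.00VE 1.00"},
    {"run": "RUN-20251118-UNVERIFIED1", "cii": 0.2840, "dims": "RS 0.28 SC 0.10 DR 0.00 VE 0.40"},
]

ranked = rank_by_cii(rows)
for row in ranked:
    print(row["rank"], row["run"], row["cii"], parse_dimensions(row["dims"]))
```

The sketch only reorders scores that are already computed; it does not attempt to derive CII from the dimension values, since the leaderboard does not state that formula.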
Run the RIS benchmark suite against your model and submit for public listing on this leaderboard.