RoundtableCI
Changelog
What's New
Stay up to date with the latest features, improvements, and fixes.
v0.7.0 • 2026-05 • beta
Agentic Parallelism: Beta
• Agentic Roundtable now in beta: multi-model parallel execution across task phases
• Phase results streamed in real time via WebSocket
• Agentic mode accessible via API for Pro users
• Improved phase synthesis: judge model now weighs phase-level contributions
v0.6.0 • 2026-03 • new
Judge-N Process Roundtable
• Shipped new token-efficient Roundtable mode: Judge-N Process
• Replaces expensive all-vs-all cross-critique with a single judge evaluation pass
• 70% reduction in token usage per race with equivalent ranking accuracy
• Judge-N is now the default mode; cross-critique available as 'intense' option
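To illustrate why dropping all-vs-all cross-critique saves work, here is a rough sketch that counts model calls per race in each mode. It assumes every model critiques every other model's response once in cross-critique mode, which the changelog does not spell out, and the function names are hypothetical. It counts calls, not tokens, so it does not reproduce the 70% figure above.

```python
def cross_critique_calls(n_models: int) -> int:
    """Calls per race, assuming n responses, n*(n-1) pairwise critiques, 1 judge pass."""
    return n_models + n_models * (n_models - 1) + 1


def judge_n_calls(n_models: int) -> int:
    """Calls per race, assuming n responses followed by a single judge pass."""
    return n_models + 1


for n in (3, 4, 5):
    print(n, cross_critique_calls(n), judge_n_calls(n))
```

Under these assumptions the critique phase grows quadratically with the number of models, while the single judge pass keeps the total linear.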
v0.5.0 • 2026-02 • new
Agentic Mode: Development Begins
• Started building Agentic Roundtable: multi-model collaboration for complex tasks
• Architecture: parallel initial responses → critique phase → synthesis
• Early internal testing with coding and reasoning tasks
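The three-stage architecture above (parallel initial responses, then critique, then synthesis) could be sketched with `asyncio` roughly as follows. The `ask`, `critique`, and `synthesize` functions and the model names are hypothetical stand-ins, not the RoundtableCI API; real model calls would replace the stubs.

```python
import asyncio


async def ask(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    await asyncio.sleep(0)
    return f"{model}: answer to {prompt!r}"


async def critique(model: str, responses: dict[str, str]) -> str:
    # Hypothetical: each model reviews the responses of the other models.
    await asyncio.sleep(0)
    others = [m for m in responses if m != model]
    return f"{model} reviewed {', '.join(others)}"


def synthesize(responses: dict[str, str], critiques: dict[str, str]) -> str:
    # Hypothetical: a real system would hand both sets to a judge model.
    return f"synthesis of {len(responses)} responses and {len(critiques)} critiques"


async def run_pipeline(models: list[str], prompt: str) -> str:
    # Phase 1: all initial responses are requested concurrently.
    answers = await asyncio.gather(*(ask(m, prompt) for m in models))
    responses = dict(zip(models, answers))
    # Phase 2: critiques start only after every initial response is in.
    notes = await asyncio.gather(*(critique(m, responses) for m in models))
    critiques = dict(zip(models, notes))
    # Phase 3: combine everything into one result.
    return synthesize(responses, critiques)


result = asyncio.run(run_pipeline(["model-a", "model-b", "model-c"], "sort a list"))
print(result)
```

Within a phase the calls run concurrently; the phases themselves run in sequence because each one depends on the previous phase's output.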
v0.4.0 • 2025-11 • new
Sage: Intelligent Query Routing
• Shipped Sage: single-query AI routing powered by live rankings
• NLP classifier routes queries to best model per category automatically
• Sage UI launched: chat interface with model attribution and routing transparency
• Bayesian win-rate smoothing applied to rankings for cold-start stability
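One common way to do Bayesian win-rate smoothing is to add pseudo-counts from a prior, so a model with only a handful of races is pulled toward a baseline rate. The sketch below assumes that approach; the changelog does not say which prior RoundtableCI uses, and `prior_rate` and `prior_strength` are illustrative parameters.

```python
def smoothed_win_rate(
    wins: int,
    races: int,
    prior_rate: float = 0.5,       # assumed baseline win rate
    prior_strength: float = 10.0,  # assumed weight of the prior, in pseudo-races
) -> float:
    """Posterior mean win rate under a Beta prior expressed as pseudo-counts."""
    return (wins + prior_rate * prior_strength) / (races + prior_strength)


# A new model with 2 wins in 2 races is not ranked at 100%.
print(round(smoothed_win_rate(2, 2), 4))
# With many races the observed rate dominates the prior.
print(round(smoothed_win_rate(150, 200), 4))
```

A model with no races at all simply gets `prior_rate`, which is what keeps a cold-start ranking stable.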
v0.3.0 • 2025-11 • improvement
Robust Roundtable Mode
• Hardened Roundtable pipeline: retry logic, phase timeouts, graceful degradation
• Cross-critique mode: models evaluate each other's responses before judge scoring
• Workflow state persisted to Redis for real-time progress tracking
• Added WebSocket streaming for live phase updates
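As a minimal sketch of what retry logic, phase timeouts, and graceful degradation can look like together, the helper below retries a failing call, bounds each attempt with a timeout, and returns `None` instead of raising so the race can continue with the remaining models. `run_phase`, its parameters, and the two fake models are assumptions for illustration, not the actual pipeline code.

```python
import asyncio


async def run_phase(call, *, timeout_s=5.0, retries=2, backoff_s=0.0):
    """Run one model call with a per-attempt timeout and a bounded number of retries."""
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(call(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt < retries:
                await asyncio.sleep(backoff_s * (2 ** attempt))
    return None  # graceful degradation: the caller skips this model


attempts = {"flaky": 0}


async def flaky():
    # Fake model that fails twice, then succeeds.
    attempts["flaky"] += 1
    if attempts["flaky"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


async def slow():
    # Fake model that never answers within the timeout.
    await asyncio.sleep(1.0)
    return "too late"


async def main():
    results = {
        "flaky-model": await run_phase(flaky, timeout_s=0.5),
        "slow-model": await run_phase(slow, timeout_s=0.01, retries=1),
    }
    usable = {name: r for name, r in results.items() if r is not None}
    return results, usable


results, usable = asyncio.run(main())
print(results)
print(usable)
```

Here the flaky model recovers on its third attempt, while the slow model is dropped and the phase proceeds with whatever responses are usable.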
v0.2.0 • 2025-10 • new
Rankings Launch
• Shipped live model rankings: win rates computed from real Roundtable races
• Categories: coding, reasoning, math, creative, conversation, and more
• Rankings update in real time after every completed race
• Family-level aggregation: view rankings by model family (GPT, Claude, Gemini, etc.)
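The sketch below shows one way win rates and family-level aggregation could be computed from race records. The race data, model names, and family mapping are made up for illustration, and counting each participating model as one entry for its family is an assumption, not a documented RoundtableCI rule.

```python
from collections import defaultdict

# Made-up race records: (winner, participants).
RACES = [
    ("alpha-large", ["alpha-large", "beta-pro"]),
    ("beta-pro", ["alpha-small", "beta-pro"]),
    ("alpha-small", ["alpha-small", "beta-pro"]),
    ("alpha-large", ["alpha-large", "alpha-small", "beta-pro"]),
]

# Hypothetical mapping from model name to model family.
FAMILY = {"alpha-large": "alpha", "alpha-small": "alpha", "beta-pro": "beta"}


def win_rates(races, key=lambda model: model):
    """Win rate per group, where key maps a model name to its group."""
    wins = defaultdict(int)
    entries = defaultdict(int)
    for winner, participants in races:
        for model in participants:
            entries[key(model)] += 1
        wins[key(winner)] += 1
    return {group: wins[group] / entries[group] for group in entries}


by_model = win_rates(RACES)
by_family = win_rates(RACES, key=FAMILY.__getitem__)
print(by_model)
print(by_family)
```

Recomputing these totals after each completed race is the simplest way to keep rankings current; an incremental update of the two counters would give the same result.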
v0.1.0 • 2025-09 • launch
RoundtableCI: Initial Build
• Started building RoundtableCI: neutral AI benchmarking via live races
• Core Roundtable engine: multiple models respond, judge model scores, rankings update
• First internal races run across GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro