Changelog

What's New

Stay up to date with the latest features, improvements, and fixes.

v0.7.0 · 2026-05 · beta

Agentic Parallelism — Beta

  • Agentic Roundtable now in beta — multi-model parallel execution across task phases
  • Phase results streamed in real time via WebSocket
  • Agentic mode accessible via API for Pro users
  • Improved phase synthesis: judge model now weighs phase-level contributions

v0.6.0 · 2026-03 · new

Judge-N Process Roundtable

  • Shipped new token-efficient Roundtable mode: Judge-N Process
  • Replaces expensive all-vs-all cross-critique with a single judge evaluation pass
  • 70% reduction in token usage per race with equivalent ranking accuracy
  • Judge-N is now the default mode; cross-critique available as 'intense' option
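
A rough count of evaluation calls shows why a single judge pass is cheaper. This sketch assumes that all-vs-all cross-critique means every model critiques every other model's response once, and that Judge-N means one judge call over all responses. Both function names are hypothetical, and the count ignores prompt and response length, so it does not reproduce the 70% token figure above.

```python
def cross_critique_calls(n_models: int) -> int:
    """Evaluation calls if each model critiques every other model once (assumed)."""
    return n_models * (n_models - 1)


def judge_n_calls(n_models: int) -> int:
    """Evaluation calls if one judge scores all responses in a single pass (assumed)."""
    return 1 if n_models > 0 else 0


# Cross-critique grows quadratically with the number of models; the judge pass stays flat.
for n in (3, 5, 8):
    print(n, cross_critique_calls(n), judge_n_calls(n))
```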

v0.5.0 · 2026-02 · new

Agentic Mode — Development Begins

  • Started building Agentic Roundtable: multi-model collaboration for complex tasks
  • Architecture: parallel initial responses → critique phase → synthesis
  • Early internal testing with coding and reasoning tasks
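
The three-phase architecture above can be sketched with `asyncio`. This is an illustration under assumptions, not the product's implementation: `ask` is a hypothetical stand-in for a real model call, the model names are placeholders, and choosing the first model as synthesizer is arbitrary.

```python
import asyncio


async def ask(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"{model}: answer to {prompt!r}"


async def run_roundtable(models: list[str], task: str) -> str:
    # Phase 1: every model answers the task in parallel.
    drafts = await asyncio.gather(*(ask(m, task) for m in models))

    # Phase 2: every model critiques the full set of drafts, again in parallel.
    joined = "\n".join(drafts)
    critiques = await asyncio.gather(
        *(ask(m, f"critique these drafts:\n{joined}") for m in models)
    )

    # Phase 3: one model (here simply the first) synthesizes a final answer.
    notes = "\n".join(critiques)
    return await ask(models[0], f"synthesize a final answer from:\n{notes}")


if __name__ == "__main__":
    print(asyncio.run(run_roundtable(["model-a", "model-b"], "sort a list")))
```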

v0.4.0 · 2025-11 · new

Sage — Intelligent Query Routing

  • Shipped Sage: single-query AI routing powered by live rankings
  • NLP classifier routes queries to best model per category automatically
  • Sage UI launched — chat interface with model attribution and routing transparency
  • Bayesian win-rate smoothing applied to rankings for cold-start stability
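
One common form of Bayesian win-rate smoothing is the posterior mean under a Beta prior, which pulls models with few races toward a neutral rate. The sketch below assumes that form; the prior values, function names, and sample records are all hypothetical and are not taken from Sage.

```python
def smoothed_win_rate(wins: int, races: int,
                      prior_rate: float = 0.5,
                      prior_strength: float = 10.0) -> float:
    """Posterior mean win rate under a Beta prior worth `prior_strength` pseudo-races."""
    return (wins + prior_rate * prior_strength) / (races + prior_strength)


def best_model(records: dict[str, tuple[int, int]]) -> str:
    """Pick the model with the highest smoothed win rate; records map name -> (wins, races)."""
    return max(records, key=lambda name: smoothed_win_rate(*records[name]))


# model-a has a perfect raw record from a single race; model-b has 70 wins in 100 races.
coding = {"model-a": (1, 1), "model-b": (70, 100)}
print(best_model(coding))  # prints "model-b"
```

With these assumed priors, the single-race model scores 6/11 (about 0.55) rather than 1.0, so a router that picks the top-ranked model per category is not misled by a lucky first race.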

v0.3.0 · 2025-11 · improvement

Robust Roundtable Mode

  • Hardened Roundtable pipeline: retry logic, phase timeouts, graceful degradation
  • Cross-critique mode: models evaluate each other's responses before judge scoring
  • Workflow state persisted to Redis for real-time progress tracking
  • Added WebSocket streaming for live phase updates
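
Retry logic, phase timeouts, and graceful degradation can be combined in a small wrapper. The following is a minimal sketch, assuming each model call is an async function; `run_phase`, `run_all`, and their defaults are hypothetical names and values, not the pipeline's actual API.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_phase(call: Callable[[], Awaitable[str]],
                    timeout_s: float = 30.0,
                    retries: int = 2) -> Optional[str]:
    """Run one phase call with a timeout; retry on failure, return None if all attempts fail."""
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(call(), timeout=timeout_s)
        except Exception:  # includes the timeout raised by wait_for
            if attempt < retries:
                await asyncio.sleep(0.01 * 2 ** attempt)  # short exponential backoff
    return None  # graceful degradation: the caller continues without this model


async def run_all(calls: dict[str, Callable[[], Awaitable[str]]],
                  **kwargs) -> dict[str, str]:
    """Run every model's phase call in parallel and drop the ones that failed."""
    results = await asyncio.gather(*(run_phase(c, **kwargs) for c in calls.values()))
    return {name: r for name, r in zip(calls, results) if r is not None}
```

A race run this way still completes when one model times out; the failed model is simply absent from the results passed to the judge.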

v0.2.0 · 2025-10 · new

Rankings Launch

  • Shipped live model rankings — win rates computed from real Roundtable races
  • Categories: coding, reasoning, math, creative, conversation, and more
  • Rankings update in real time after every completed race
  • Family-level aggregation: view rankings by model family (GPT, Claude, Gemini, etc.)
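
Family-level aggregation amounts to pooling race results across the models in each family before computing a win rate. A minimal sketch, assuming each race result is a `(model, won)` pair; the `FAMILY` mapping and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from model name to model family.
FAMILY = {
    "gpt-4o": "GPT",
    "gpt-4o-mini": "GPT",
    "claude-3.5-sonnet": "Claude",
    "gemini-1.5-pro": "Gemini",
}


def family_win_rates(races: list[tuple[str, bool]]) -> dict[str, float]:
    """Aggregate (model, won) race results into one win rate per model family."""
    wins: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for model, won in races:
        family = FAMILY.get(model, "Other")  # unknown models fall into "Other"
        total[family] += 1
        wins[family] += int(won)
    return {family: wins[family] / total[family] for family in total}
```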

v0.1.0 · 2025-09 · launch

RoundtableCI — Initial Build

  • Started building RoundtableCI: neutral AI benchmarking via live races
  • Core Roundtable engine: multiple models respond, judge model scores, rankings update
  • First internal races run across GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro