Technical Summary

Architecture

Market Synthesis uses adversarial orchestration across three model roles:

  1. Advocate (Haiku 4.5) argues for dislocation.
  2. Counterpoint (Haiku 4.5) argues against.
  3. Judge (Sonnet 4.5) synthesizes arguments and outputs:
    • direction (long, short, neutral)
    • confidence score (0–100)
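The three-role flow above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `ModelFn` type, the prompt wording, the `Verdict` class, and the flat "direction,confidence" reply format are all assumptions made for the example. It also assumes the Advocate and Counterpoint argue independently, which the summary does not specify.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: a model call takes a prompt string and returns text.
ModelFn = Callable[[str], str]

VALID_DIRECTIONS = {"long", "short", "neutral"}


@dataclass
class Verdict:
    direction: str   # "long", "short", or "neutral"
    confidence: int  # 0-100


def parse_verdict(raw: str) -> Verdict:
    """Parse an assumed flat 'direction,confidence' reply, e.g. 'long,72'."""
    direction, confidence = (part.strip() for part in raw.split(","))
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    score = int(confidence)
    if not 0 <= score <= 100:
        raise ValueError(f"confidence out of range: {score}")
    return Verdict(direction, score)


def run_debate(context: str, advocate: ModelFn, counterpoint: ModelFn,
               judge: ModelFn) -> Verdict:
    """Collect both arguments, then ask the judge for a verdict."""
    # Assumption: the two debaters see only the context, not each other.
    case_for = advocate(f"Argue that this is a dislocation:\n{context}")
    case_against = counterpoint(f"Argue that this is not a dislocation:\n{context}")
    judge_prompt = (
        f"Context:\n{context}\n\n"
        f"Argument for:\n{case_for}\n\n"
        f"Argument against:\n{case_against}\n\n"
        "Reply as: direction,confidence"
    )
    return parse_verdict(judge(judge_prompt))
```

In use, each `ModelFn` would wrap a call to the corresponding model; passing plain functions keeps the orchestration testable without network access.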

Evaluation setup

Run snapshot

From the latest full-run analysis:

Metric                              Value
Headline accuracy                   91.8%
Adjusted accuracy                   92.8%
Brier score                         0.047
Quiet-period FPR (all 19)           36.8%
Quiet-period FPR (boring subset)    ~15%
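For readers unfamiliar with these metrics, the sketch below shows the standard definitions of accuracy, Brier score, and false positive rate (FPR). The function names and the toy inputs are illustrative only. How the evaluation maps the judge's 0–100 confidence to a probability, and how "adjusted" accuracy is defined, are not stated here, so the example takes probabilities as given and leaves adjusted accuracy out.

```python
def accuracy(predictions, outcomes):
    """Fraction of predictions that match the realized outcome."""
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)


def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probability and 0/1 outcome.

    Lower is better; 0.0 is a perfect forecast.
    """
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(outcomes)


def false_positive_rate(flagged, is_event):
    """Share of non-event (quiet) periods that were flagged anyway."""
    quiet_flags = [f for f, e in zip(flagged, is_event) if not e]
    return sum(quiet_flags) / len(quiet_flags)


# Toy inputs, not data from the run.
toy_brier = brier_score([0.9, 0.2], [1, 0])           # (0.01 + 0.04) / 2
toy_fpr = false_positive_rate([True, False, False],   # one quiet period flagged
                              [False, False, False])  # all three are quiet
```

As an arithmetic note, a rate of 36.8% over 19 quiet periods is consistent with 7 flagged periods (7/19 ≈ 0.368), although the count itself is not given above.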

Key technical insights

  1. Output schema design is performance-critical.
    Overly nested response formats caused truncation and degraded performance.
  2. Score rubrics improve confidence calibration.
    Explicit score anchors reduced under-confidence and made the scores more useful for decisions.
  3. Matching model personality to role matters.
    Decisive models worked best as advocates, and nuanced models worked best as the adjudicator.
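The first two insights can be illustrated with a small sketch: a flat two-key reply that is cheap to emit and easy to validate, plus a rubric of score anchors rendered into the judge prompt. The JSON shape, the `RUBRIC` anchor labels, and the function names are hypothetical; the summary does not give the actual schema or rubric.

```python
import json

# Hypothetical anchors; the real rubric is not given in this summary.
RUBRIC = [
    (0, "no evidence of dislocation"),
    (25, "weak or conflicting signals"),
    (50, "arguments roughly balanced"),
    (75, "strong, mostly uncontested case"),
    (100, "overwhelming evidence"),
]

REQUIRED_KEYS = {"direction", "confidence"}


def rubric_text() -> str:
    """Render the anchors as lines to paste into the judge prompt."""
    return "\n".join(f"{score}: {label}" for score, label in RUBRIC)


def parse_flat_reply(raw: str) -> dict:
    """Validate a flat reply such as {"direction": "long", "confidence": 80}.

    A truncated reply fails in json.loads (JSONDecodeError is a ValueError),
    so truncation surfaces as an error instead of a silently partial result.
    """
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        raise ValueError("reply must contain exactly: direction, confidence")
    if data["direction"] not in {"long", "short", "neutral"}:
        raise ValueError(f"unknown direction: {data['direction']!r}")
    score = data["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("confidence must be an integer from 0 to 100")
    return data
```

Keeping the reply to two scalar fields means there is little output to truncate, and any truncation that does occur is caught at parse time.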

Source files