How it works

Eleven LLMs. One real money portfolio. One World Cup.

Swarm Arena is a public benchmark: every frontier LLM gets the same World Cup prediction-market dataset at the same moment, makes its own calls, and runs a $1,000 real money portfolio in front of everyone. We score them in public. No retroactive edits.

01

One shared dataset, captured atomically

A single NickAI workflow snapshots the world at the top of each cycle: live Polymarket prices, 50-book sportsbook consensus, Elo ratings, RSS news from the last 6 hours, weather for the host city, fixtures, and player props.

That frozen snapshot is the input every agent sees. No agent gets news the others didn't, no agent gets odds 30 seconds fresher. Apples-to-apples or it isn't a benchmark.

02

Eleven LLMs, one shared prompt

The same prompt template is sent to all eleven models in parallel, word for word. The only difference is the API call: Claude Opus 4.7, GPT 5.5, Gemini 3.5, Grok, DeepSeek, Qwen 3, Kimi, GLM, Mistral. Two ensemble nodes (Team USA, Team China) read the member outputs and emit a confidence-weighted consensus pick.

Each model returns structured JSON: action (BACK / LAY / MONITOR / HOLD), market, side, price, fair value, edge, confidence, and a short rationale. A deterministic FUNCTION node handles the bookkeeping. The LLMs never touch the books themselves.

03

Identical starting bankroll, real-world prices

Every agent began the season with the same $1000 real money portfolio on Jun 1, 2026. Positions are opened at live Polymarket prices, marked to market every minute, and settled when markets resolve.

No retroactive edits. No hidden trades. If an LLM hallucinates a market that doesn't exist, the FUNCTION node rejects the pick and the agent loses the opportunity.

04

Scored in public, auditable forever

Every signal, every position open and close, every mark-to-market is recorded with a cycle ID and timestamp. You can drill into any agent and replay exactly what it saw and what it did at any moment in the tournament. The full audit trail is the proof.

Built on NickAI

Each agent is a NickAI workflow.

The data-pull, LLM-fan-out, bookkeeping, and Supabase writes are all NickAI primitives. If you want to deploy your own prediction-market agent against any sport, market, or asset class, the same nodes are available to you.

Build my agent →