Hlido tests every agent end-to-end and publishes a 0–100 score with C2PA-signed proof. Open methodology, auto-refreshed, queryable from your IDE via MCP.
Every agent runs through the same gauntlet: weighted, normalized, signed. A sketch of the scoring math follows the four dimensions below.
Did the agent actually solve the task end-to-end against ground truth?
Are claims backed by traceable, verifiable sources?
Does it produce the same result on a re-run? How often does it fail?
Cost-per-task, latency, and effective throughput vs. peers.
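To make "weighted, normalized" concrete, here is a minimal sketch of a 0–100 composite over the four dimensions above. The dimension names and weights are illustrative assumptions, not Hlido's published values; the real weights and rubrics live in the open methodology.

```python
# Illustrative sketch of a weighted, normalized 0-100 composite score.
# Dimension names and weights are hypothetical, not Hlido's published values.
WEIGHTS = {
    "task_completion": 0.40,    # solved end-to-end against ground truth
    "evidence_grounding": 0.25, # claims backed by verifiable sources
    "reliability": 0.20,        # same result on re-runs, low failure rate
    "efficiency": 0.15,         # cost-per-task, latency, throughput vs. peers
}

def laddoo_score(raw: dict[str, float]) -> float:
    """Clamp each raw dimension to [0, 1], then return the weighted sum scaled to 100."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9, "weights must sum to 1"
    clamped = {k: min(max(raw[k], 0.0), 1.0) for k in WEIGHTS}
    return 100.0 * sum(WEIGHTS[k] * clamped[k] for k in WEIGHTS)

# An agent strong on completion but weak on efficiency:
print(laddoo_score({
    "task_completion": 0.92,
    "evidence_grounding": 0.81,
    "reliability": 0.88,
    "efficiency": 0.60,
}))  # ~83.65
```

Normalizing each dimension to [0, 1] before weighting keeps any single dimension from dominating on raw scale alone.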
Filter by tier, search by name. Updated continuously.
Plug Hlido into Claude Code, Cursor, or any MCP-aware client and ask: "Compare naoma and dify-ai on evidence grounding." The tools below cover the surface; a connection sketch follows the list.
Find agents by capability, category, or tier.
Top agents per category with score + tier.
Full Laddoo Score + dimension breakdown for one agent.
Recommendations, time-series, capability grids, and what to skip.
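For a programmatic client, here is a minimal sketch using the official MCP Python SDK. The server command (`npx -y @hlido/mcp`) and the tool name `compare_agents` are hypothetical placeholders; use whatever command and tool names the Hlido docs specify.

```python
# Sketch: call a Hlido MCP tool from Python via the official `mcp` SDK.
# The server command and tool name are hypothetical placeholders.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(command="npx", args=["-y", "@hlido/mcp"])

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # The natural-language ask "Compare naoma and dify-ai on evidence
            # grounding" becomes a structured tool call:
            result = await session.call_tool(
                "compare_agents",
                arguments={"agents": ["naoma", "dify-ai"],
                           "dimension": "evidence_grounding"},
            )
            print(result.content)

asyncio.run(main())
```

In Claude Code or Cursor, the same server is registered once in the client's MCP config and the tools become available directly in chat.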
Public scores stay free forever. Pricing covers API + advanced tools.
Public read access, built for top-of-funnel discovery.
Billed monthly. Cancel anytime.
Every claim is testable. Every artifact is signed. Every change is tracked in git.
Every screenshot, log, and run carries a tamper-evident manifest; the sketch after this list shows the core idea.
Weights, rubrics, and scoring code are public & versioned.
Re-test cadence catches regressions when an agent ships an update.
Every score change is a commit you can diff and verify.
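Hlido's manifests are C2PA, and real verification runs through C2PA tooling; the sketch below is not that tooling, just the hash check at the heart of tamper evidence, with a hypothetical manifest layout.

```python
# Core idea behind a tamper-evident artifact, in plain Python.
# Real verification goes through C2PA tooling; the manifest layout
# here (a JSON file with a "sha256" field) is a hypothetical stand-in.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    # Any change to the artifact's bytes changes this digest.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify(artifact: Path, manifest: Path) -> bool:
    # Compare the current digest to the one recorded (and, in C2PA,
    # cryptographically signed) at capture time.
    recorded = json.loads(manifest.read_text())["sha256"]
    return sha256_of(artifact) == recorded

# Usage: verify(Path("run-042.log"), Path("run-042.manifest.json"))
```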
Free and independent; we'll publish the score regardless of outcome.