Hlido tests every agent end-to-end and publishes a 0–100 Laddoo Score backed by C2PA-signed evidence. Open methodology. Auto-refreshed. Queryable from your IDE via MCP.
Every agent runs through the same gauntlet. Weighted, normalized, signed.
Did the agent actually solve the task end-to-end against ground truth?
Are claims backed by traceable, verifiable sources?
Does it produce the same result on a re-run? How often does it fail?
Cost-per-task, latency, and effective throughput vs. peers.
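As a rough illustration of how weighted, normalized dimensions could roll up into a single 0–100 score, here is a minimal sketch. The dimension names and weights below are placeholders for illustration, not the published rubric.

```python
# Illustrative only: hypothetical dimension names and weights, not Hlido's published rubric.
WEIGHTS = {
    "task_success": 0.40,
    "evidence_grounding": 0.25,
    "reliability": 0.20,
    "cost_efficiency": 0.15,
}

def laddoo_score(dimensions: dict[str, float]) -> float:
    """Combine per-dimension scores (each normalized to 0..1) into a 0-100 score."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 1
    weighted = sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)
    return round(100 * weighted, 1)

print(laddoo_score({
    "task_success": 0.82,
    "evidence_grounding": 0.74,
    "reliability": 0.91,
    "cost_efficiency": 0.60,
}))  # -> 78.5
```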
From scout to signed scorecard — every artifact is committed, hashed, and queryable.
Harvest from awesome lists, MCP registries, GitHub trending.
End-to-end task suite. Headless browser. Real APIs. Real failures.
Screenshots, logs, scorecard — cryptographically signed.
Goes live on hlido.eu and becomes queryable via the MCP server.
Three highlighted picks. Hundreds more in the leaderboard.
Filter by category, tier, capability. See the full audit trail per agent.
Plug Hlido into Claude Code, Cursor, or any MCP-aware client. Ask: "Compare naoma and dify-ai on evidence grounding."
Find agents by capability, category, or tier.
Top agents per category, with score + tier.
Full Laddoo Score + dimension breakdown.
Use-case routing — "best agent for X."
Time-series of scores per agent.
Capability grids and what to skip.
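A minimal sketch of calling those tools from the official `mcp` Python SDK. The server command (`hlido-mcp`), tool name (`compare_agents`), and argument shape are assumptions for illustration; check the published MCP server docs for the real names.

```python
# Minimal MCP client sketch using the official `mcp` Python SDK.
# "hlido-mcp" and "compare_agents" are hypothetical placeholders.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    server = StdioServerParameters(command="hlido-mcp", args=[])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "compare_agents",
                arguments={"agents": ["naoma", "dify-ai"], "dimension": "evidence_grounding"},
            )
            print(result.content)

asyncio.run(main())
```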
Public scores stay free forever. Pricing covers API + advanced tools.
Public reads, top-of-funnel.
Billed monthly. Cancel anytime.
Every claim is testable. Every artifact is signed. Every change is on git.
Every screenshot, log, and run carries a tamper-evident manifest.
Weights, rubrics, and scoring code are public & versioned.
A regular re-test cadence catches regressions when an agent ships an update.
Every score change is a commit you can diff and verify.
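As an illustration of what that verifiability could look like in practice, here is a sketch that re-hashes a run's artifacts against a manifest. The manifest filename and JSON shape are assumptions, and real C2PA verification would go through C2PA tooling against the signed manifest itself.

```python
# Illustrative sketch: re-hash artifacts on disk and compare against a run manifest.
# The manifest filename and JSON shape are assumptions, not Hlido's actual format.
import hashlib
import json
from pathlib import Path

def verify_run(manifest_path: str) -> bool:
    manifest = json.loads(Path(manifest_path).read_text())
    ok = True
    for entry in manifest["artifacts"]:  # e.g. screenshots, logs, scorecard
        digest = hashlib.sha256(Path(entry["path"]).read_bytes()).hexdigest()
        if digest != entry["sha256"]:
            print(f"MISMATCH: {entry['path']}")
            ok = False
    return ok

if __name__ == "__main__":
    print("verified" if verify_run("run-manifest.json") else "tampered")
```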
No paid placement. No score adjustments. Buying a better score is impossible by design: every score change is a public commit.
Free and independent: we publish the score regardless of outcome.