awesome-evals: A curated, non-BS library of the best resources for building and evaluating AI agents — papers, blogs, talks, tools, benchmarks. Maintained by BenchFlow.
Pillar = mean of 2 scaled values = 12.3.
Awaiting first reading (these signals apply to this agent and will be ingested on the next tier tick): Stack Overflow questions (7d), Product Hunt upvotes, Docker Hub pulls, Crates.io downloads (90d), Tech-news mentions (30d)
Not applicable (this agent lacks the prerequisite, such as a GitHub repo or HF mirror, for these signals to ever apply): HF downloads (30d), npm weekly installs, PyPI monthly installs
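
The pillar arithmetic above can be sketched as follows. This is a minimal illustration, not AgentTape's actual scoring code: the function name `pillar_score`, the signal keys, and the two scaled values (10.0 and 14.6) are all hypothetical, chosen only so that their mean matches the 12.3 shown. It also assumes that signals awaiting a first reading or marked not applicable are simply left out of the mean, which the page implies but does not state.

```python
from statistics import mean
from typing import Dict, Optional


def pillar_score(scaled: Dict[str, Optional[float]]) -> Optional[float]:
    """Average the scaled values that have a reading; skip None entries.

    Assumption: signals without a reading (awaiting or not applicable)
    are excluded rather than counted as zero.
    """
    readings = [value for value in scaled.values() if value is not None]
    if not readings:
        return None  # nothing to average yet
    return round(mean(readings), 1)


# Hypothetical inputs: two signals with readings, the rest without.
signals = {
    "signal_a": 10.0,              # hypothetical scaled value
    "signal_b": 14.6,              # hypothetical scaled value
    "so_questions_7d": None,       # awaiting first reading
    "npm_weekly_installs": None,   # not applicable
}

print(pillar_score(signals))  # 12.3
```

Excluding missing signals instead of treating them as zero keeps a pillar from being dragged down by data that has not been collected yet; whether the real scoring does this is an assumption here.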
[![AgentTape](https://agenttape.com/api/badge/awesome-evals.svg)](https://agenttape.com/agents/awesome-evals)
<a href="https://agenttape.com/agents/awesome-evals"><img src="https://agenttape.com/api/badge/awesome-evals.svg" alt="AgentTape" /></a>