Cohort of 73 admitted agents tagged capability:research. Composite below is the cohort's average AgentScore.
| Rank | Agent | 24h | Score | Δ24h |
|---|---|---|---|---|
| #4 | ECC (mit, mcp-server): The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond. | | 64.7 | +0.02 |
| #10 | hermes-agent (mit): The agent that grows with you. | | 62.2 | +0.02 |
| #21 | everything-claude-code (mit, mcp-server): The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond. | 1 | 56.5 | +0.01 |
| #35 | deer-flow (mit, library): An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of tasks that could take minutes to hours. | 10 | 52.9 | -2.47 |
| #66 | RD-Agent (mit): Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, whi... | 2 | 47.2 | +0.01 |
| #67 | skills (mit, library): Public repository for Agent Skills. | | 47.1 | -0.24 |
| #95 | Auto-claude-code-research-in-sleep (mit, mcp-server): ARIS ⚔️ (Auto-Research-In-Sleep), lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in; works with Claude Code, Codex, OpenClaw, or any LLM agent. | 1 | 39.7 | +0.03 |
| #96 | awesome-generative-ai-guide (mit, ide-plugin): A one stop repository for generative AI research updates, interview resources, notebooks and much more! | 1 | 39.7 | +0.07 |
| #107 | SafeRL-Lab/cheetahclaws (ide-plugin): This paper studies the next major bottleneck in agentic AI as system scaling, not only model scaling: the design of auditable, persistent, modular, and verifiable architectures around foundation models. We refer to this shift as scaling the harness: treating the structured exe... | 1 | 37.3 | -0.10 |
| #197 | arxiv-reader-mcp (mit, mcp-server): Want to search arXiv papers, fetch metadata, and extract full-text PDFs without leaving your editor? This MCP server connects any MCP-compatible client (Claude Code, etc.) directly to arXiv. | 4 | 30.1 | -0.01 |
| #200 | dataroom (mit, saas): Give a query, get a dataroom. Pi + self-hosted Qwen3.6 research harness on a single L4. | 2 | 30.0 | -0.01 |
| #204 | trading-agents (apache-2.0): TradingAgents LLM multi-agent finance trading stocks crypto fintech quantitative algo trading sentiment analysis OpenAI JavaScript Node.js research OSS | 2 | 29.7 | 0.00 |
| #218 | agentscope-ai/Trinity-RFT (library): This work presents a general framework for training large language models (LLMs) to "Connect the Dots" (CoD), a meta-capability required by long-lifecycle agents: as an LLM-based AI agent gets deployed in an environment, it solves a long sequence of tasks while continuously ex... | 3 | 28.6 | -0.08 |
| #222 | tq-trading-agent: 🌮 Trading agent, AI-powered multi-agent stock research & trading strategy orchestration, trading agent - TypeScript, LangGraph, OpenAI-compatible APIs. | 3 | 28.2 | 0.00 |
| #230 | hermes-link-curator (mit): Hermes profile pack for archiving and browsing curated links + web dashboard app. | 5 | 27.8 | 0.00 |
| #239 | aiming-lab/SimpleMem (ide-plugin): Long-term memory is essential for LLM agents that operate across multiple sessions, yet existing memory systems treat retrieval infrastructure as fixed: stored content evolves while scoring functions, fusion strategies, and answer-generation policies remain frozen at deploymen... | 4 | 27.2 | -0.14 |
| #260 | kalshi-trading-bot: 🏗 AI trading system for Kalshi prediction markets. Features Grok-4 integration, multi-agent decision making, portfolio optimization, and real-time market analysis. Educational/research purposes only. | 7 | 26.3 | 0.00 |
| #262 | deepcloak (mit, mcp-server): Local-first deep research agent that reads the whole web, even pages behind Cloudflare, Datadome, Turnstile & reCAPTCHA. Stealth fetch + cited reports. MCP-native, MIT. | 3 | 26.2 | +0.12 |
| #274 | Wolido/OpenAaaS (ide-plugin): The Materials Genome Initiative catalyzed the proliferation of centralized platforms--SaaS, PaaS, and IaaS--that aggregate computational and experimental resources for accelerated materials discovery. In parallel, breakthroughs in large language models (LLMs) and autonomous ag... | 2 | 25.5 | 0.00 |
| #291 | TradingAgents-astock (apache-2.0, library): A-share multi-agent investment research framework adapted to A-share data sources (Dragon-Tiger list, hot money, lock-up expirations, etc.): 7 AI analysts debate and decide under A-share rules, with bull/bear debate and risk assessment. A deep rework of TradingAgents for the A-share market. | 1 | 24.9 | +0.04 |
| #297 | jiarui-liu/overleaf (library): Expert writing feedback from experienced researchers is critical for early-career scholars to improve their manuscripts, yet high-quality feedback often remains scarce because reviewing research papers is labor-intensive. Emerging AI-powered writing assistants largely focus on... | 6 | 24.8 | -0.09 |
| #309 | sisyphus-academica (mit): 20+ agent swarm producing research papers with verified citations, 6 novelty engines, and zero AI-isms. | 2 | 24.2 | -0.03 |
| #326 | fundamental-research-labs/mog: discovered AI agent. | 4 | 23.6 | -0.16 |
| #327 | archora-skills: Academic research agent skills for Claude Code and other Agent Skills-compatible tools. Hypothesis generation, experiment design, paper drafting, peer review simulation, and more. | 2 | 23.6 | 0.00 |
| #331 | Awesome-Vibe-Research (apache-2.0): An open, collaboratively-built repository for AI-assisted scientific research, collecting and curating agents, skills, workflows, tools, and best practices across the full research lifecycle. | 32 | 23.2 | +0.95 |
| #341 | scholar-megasearch (mit, mcp-server): Massive multi-source academic literature search for Claude Code: one skill fans out subagents across 20+ scholarly databases (arXiv, Semantic Scholar, Crossref, OpenAlex, PubMed, …), merges into a deduplicated ranked corpus, and acquires the original PDFs. | | 22.9 | 0.00 |
| #353 | Genspark-AI (mit, ide-plugin): Genspark AI open-source, self-hosted Super Agent. Free alternative to Genspark.ai with multi-agent orchestration, deep research, Sparkpages, AI slides & sheets, image generation and 80+ tools. One-command Windows install. Run locally with any LLM (OpenAI, Anthropic, Gemini, Ol... | 2 | 22.5 | 0.00 |
| #359 | sisyphus-academica-727 (mit): Open-source research pipeline: literature review, novelty generation, citation verification, and adversarial review. | 1 | 22.4 | 0.00 |
| #358 | sisyphus-academica-129 (mit): Open-source research pipeline: literature review, novelty generation, citation verification, and adversarial review. | 1 | 22.4 | 0.00 |
| #425 | scholar-loop (mit): An autonomous AI scientist: a multi-agent loop over literature, experiments, self-critique and write-up, with deterministic guards against reward-hacking and hallucination. | 6 | 20.5 | +0.18 |
| #493 | PaperQuay (agpl): A desktop-first literature manager for PDF reading, translation, paper overviews, and AI agent workflows. | 8 | 19.0 | -0.12 |
| #494 | FigMirror: An Automated AI Agent Tool for Plotting Your Data in Any Paper's Figure Style. | 1 | 19.0 | -0.00 |
| #500 | awesome-bio-agent-skills (cli): A curated collection of AI agent skills for biomedical research, covering genomics, proteomics, single-cell analysis, clinical AI, and protein design. | 75 | 18.9 | +1.58 |
| #551 | Token-Economics (mit): A living literature repository for Token Economics for LLM Agents: A Dual-View Study from Computing and Economics. | 45 | 17.9 | +0.98 |
| #554 | qiaomu-app-review-insights (mit, ide-plugin): Turn App Store reviews into product research evidence: pain points, opportunities, and version risks. | 5 | 17.7 | +0.01 |
| #575 | OpenSearch-VL (apache-2.0, ide-plugin): 🔍 OpenSearch-VL provides a fully open recipe for training strong multimodal deep search agents through high-quality data curation, diverse visual/search tools, and fatal-aware agentic reinforcement learning. | 4 | 17.4 | +0.00 |
| #613 | Goblin-Agent (mit, ide-plugin): a Hermes Agent personality layer that replaces the default agent identity with a persistent, mood-driven goblin persona. | 9 | 16.7 | -0.10 |
| #670 | science-superpowers: Composable computational-science methodology skills for AI research agents: pre-registration over TDD. A science-domain reimplementation of Superpowers. | 12 | 16.0 | -0.03 |
| #676 | claude-fable-5-prompt: A curated collection of Claude Fable 5 system prompts for developers and researchers. Discover more advanced prompt engineering tools at Moely AI. | 6 | 15.7 | 0.00 |
| #684 | Evolutionary-Alpha-Miner (mit, ide-plugin): Family-aware evolutionary alpha mining with LLM-guided symbolic hybridization. | 6 | 15.5 | 0.00 |
| #692 | mcp-x-intelligence (mcp-server): X/Twitter research MCP for Claude, Cursor, Windsurf and any MCP-compatible AI agent. | 4 | 15.4 | 0.00 |
| #710 | AlexFanw/LegalSearch-R1 (ide-plugin): While large language models (LLMs) augmented with agentic search capabilities show promise for legal reasoning, they overlook a fundamental constraint that applicable law must match the temporal context of each case, as retroactive application of statutes violates core legal p... | 10 | 15.1 | -0.05 |
| #721 | sisyphus-academica-485 (mit): Open-source research pipeline: literature review, novelty generation, citation verification, and adversarial review. | 12 | 15.0 | 0.00 |
| #720 | sisyphus-academica-210 (mit): Open-source research pipeline: literature review, novelty generation, citation verification, and adversarial review. | 12 | 15.0 | 0.00 |
| #725 | crypto-ai-research (mit): crypto ai research on solana by Claude AI - AI reasoning, on-chain data. | 64 | 14.9 | -1.00 |
| #792 | retro-harness (mit): RHO: Retrospective Harness Optimization, improving LLM agents from unlabeled past trajectories (arXiv:2606.05922). | 24 | 13.9 | 0.00 |
| #794 | skills-tracker: Real-time tracking of every new GitHub 'skills' repo to capture the AI agent skill ecosystem trend. | 62 | 13.8 | -0.69 |
| #821 | ai-feed: A 4-stage adversarial research auditor that fetches papers from arXiv & HuggingFace, extracts claims, and uses DeepSeek-R1 to verify them against raw abstracts. A self-correcting research & paper digest pipeline powered by local LLMs & reasoning agents | 28 | 13.4 | 0.00 |
| #843 | AARR-bench/AARRI-bench: As foundation models advance and agent scaffolding becomes increasingly sophisticated, agents have demonstrated remarkable proficiency in complex, long-horizon coding tasks and even autonomous experiment execution. Despite their evolution from research assistants into autonomo... | 35 | 13.0 | -0.24 |
| #845 | hermes-agent-desktop (mit): Hermes Desktop github nous research hermes ai agent local ai ollama github download open source pc windows app installation setup workspace llm models runtime. | 40 | 13.0 | -0.28 |
| #848 | Awesome-Offensive-AI-Agentic-Landscape (mit): This document curates open-source projects, academic papers, capability benchmarks, and commercial solutions (international & China) in AI penetration testing, LLM red teaming, autonomous offensive agents, and vulnerability discovery, aimed at helping researchers, security engi... | 28 | 12.9 | -0.10 |
| #884 | fractalsearch: Autonomous AI research: an LLM agent searches for the best algorithm to fit the Mandelbrot set. | 16 | 12.2 | 0.00 |
| #941 | LithiumDA/ReproRepo (ide-plugin): Reproducing research results from papers and released code is central to scientific progress. Existing works have introduced benchmarks to evaluate whether LLM agents can assist with reproducibility, but they are difficult to scale due to their reliance on substantial manual e... | 40 | 10.9 | -0.20 |
| #984 | biomedical-agent-kg: Generated knowledge graph of biomedical LLM agent systems. | 43 | 10.1 | -0.27 |
| #1015 | partner (apache-2.0): Partner 🤝 Your AI Research Companion. "What have you been doing?" | 32 | 9.5 | 0.00 |
| #1019 | EddyLuo1232/AgentLens (ide-plugin): Coding agents based on large language models (LLMs) demonstrate remarkable autonomous capabilities, but they also introduce significant safety and misuse risks during multi-turn interactions with external environments. Existing safety mechanisms mainly rely on external guardra... | 150 | 9.5 | +3.52 |
| #1052 | SanhornC/IRTS-ToolBench: Time series data in real-world deployments is overwhelmingly irregular. Observations are asynchronous, missing values are informative rather than random, and sampling frequencies vary across sensors and operational windows. However, existing Time Series Question Answering (TSQ... | 107 | 8.8 | +2.34 |
| #1065 | awesome-llm-agent-skills-papers: A curated list of papers, blog posts, and systems on skills for LLM agents (reusable, named capability units that an agent can store, retrieve, compose, and improve over time), together with closely adjacent research on tool use, function calling, procedural memory, and skill... | 38 | 8.7 | 0.00 |
| #1072 | cookieApril/EnvSimBench (ide-plugin): Scalable AI agents training relies on interactive environments that faithfully simulate the consequences of agent actions. Manually crafted environments are expensive to build, brittle to extend, and fundamentally limited in diversity. A promising direction is to replace manua... | 37 | 8.5 | 0.00 |
| #1091 | AGI-Eval-Official/DailyReport (ide-plugin): Search Agents (SAs) typically leverage large language models (LLMs) to support complex information-seeking tasks by autonomously exploring web sources and synthesizing information into comprehensive responses. For SAs evaluation, prior benchmarks mainly focus on specialized ta... | 106 | 8.2 | -1.28 |
| #1144 | SC3008_Kickstarter_Research_Project: NLP + LLM pipeline to score ESG authenticity in Kickstarter crowdfunding campaigns. Builds a Crowdfunding Authenticity Index (CAI) across 5 dimensions using web scraping, TF-IDF, BERT, and an LLM agent layer. | 34 | 7.4 | 0.00 |
| #1149 | DSAIL-Memory/EvoMemBench (ide-plugin): Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability remains under-evaluated, largely beca... | 32 | 7.2 | 0.00 |
| #1154 | Snowflake-AI-Research/fastkernels (library): LLM-based agents for GPU kernel generation are advancing rapidly, yet their progress is fundamentally constrained by the benchmarks they optimize against. Existing benchmarks are poorly aligned with production inference frameworks: they evaluate kernels on a single GPU with sy... | 32 | 6.8 | 0.00 |
| #1190 | rangehow/mtr-suite (ide-plugin): Accurate evaluation of conversational retrieval is pivotal for advancing Retrieval-Augmented Generation (RAG) systems. However, existing conversational retrieval benchmarks suffer from costly, sparse human annotation or rigid, unnatural automated heuristics. To address these c... | 33 | 6.5 | 0.00 |
| #1162 | ecomolt (mcp-server): A civilization game for LLM agents: ecology, economy, self-government, and a shared existential deadline. MCP-first, browser-spectated, research-driven. | 32 | 6.5 | 0.00 |
| #1184 | ndb796/BinaryTracking (ide-plugin): This work addresses spatial question answering for service robots traversing long egocentric routes. Given a query such as "where can I find a dry cleaner on the way back home?", the system returns a metric coordinate that downstream navigation components can act on. Prior Spa... | 33 | 6.5 | 0.00 |
| #1163 | foggpoy/Civil-Court (library): Court simulation bridges legal education and judicial practice, yet human-based simulations are costly and difficult to scale. Large language models (LLMs) offer a scalable alternative, but existing court-simulation research mainly focuses on criminal cases. Civil litigation i... | 32 | 6.5 | 0.00 |
| #1177 | llm-agents-for-research-reproducibility: discovered AI agent. | 33 | 6.5 | 0.00 |
| #1173 | llm-agent-skill-optimization-tool (ide-plugin): AI developers and researchers waste hundreds of hours iterating on prompts that fail to generalize, struggling to embed reusable behaviors i. | 33 | 6.5 | 0.00 |
| #1218 | Harry24k/CyBiasBench (ide-plugin): Large language models (LLMs) are increasingly deployed as autonomous agents in offensive cybersecurity. In this paper, we reveal an interesting phenomenon: different agents exhibit distinct attack patterns. Specifically, each agent exhibits an attack-selection bias, disproport... | 34 | 5.0 | 0.00 |
| #1227 | Strolchii/1GC-7RC-Benchmark (ide-plugin): Autonomous AI coding agents are becoming a core tool for ML practitioners in industry and research alike. Despite this growing adoption, no standardized benchmark exists to evaluate their ability to design, implement, and train models from scratch across diverse domains. We in... | 34 | 5.0 | 0.00 |
| #1226 | scholar-search-mcp (mcp-server): An MCP server for academic paper search that integrates with AI assistants (e.g., Claude Code, Cursor), enabling them to search and retrieve academic paper metadata. | 34 | 5.0 | 0.00 |
| #1219 | hj1650782738/Trading (ide-plugin): End-to-end LLM trading agents have moved quickly from research curiosity to a small ecosystem of named systems, including FinCon, FinMem, TradingAgents, FinAgent, QuantAgent, and FLAG-Trader. Several of these report headline Sharpe ratios that would be material if read at face... | 34 | 5.0 | 0.00 |
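
The composite is described as the cohort's average AgentScore, but the page does not say how that average is weighted. As a rough sketch, assuming an unweighted arithmetic mean over the Score column, it could be recomputed from the table rows as follows. The helper names `parse_score` and `cohort_composite` are hypothetical, and the three sample rows are abbreviated copies of the first rows of the table.

```python
from statistics import mean


def parse_score(row: str) -> float:
    """Return the Score value from one leaderboard row.

    Assumption: the last two non-empty cells of a row are Score and Δ24h.
    That holds for the rows in this table whether or not the 24h cell is filled.
    """
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    filled = [c for c in cells if c]
    return float(filled[-2])


def cohort_composite(rows: list[str]) -> float:
    """Unweighted mean of the Score column (assumed definition of the composite)."""
    return mean(parse_score(r) for r in rows)


# Abbreviated sample rows; descriptions are shortened for illustration.
sample_rows = [
    "| #4 | ECC | | 64.7 | +0.02 |",
    "| #10 | hermes-agent | | 62.2 | +0.02 |",
    "| #21 | everything-claude-code | 1 | 56.5 | +0.01 |",
]

composite = cohort_composite(sample_rows)
print(round(composite, 2))
```

Reading Score as the second-to-last filled cell, instead of a fixed column index, keeps the sketch tolerant of rows whose 24h cell is empty. If the site weights the composite (for example by recency or activity), the mean above would not match the published figure.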