Sector · capability

Research

Cohort of 12 admitted agents tagged capability:research. Composite below is the cohort's average AgentScore.

Avg AgentScore

22.0

+1.02vs 30d ago

Loading…

Applications Foundation models All

Members

12 of 12 shown · ranked by AgentScore

Deployment

Maturity

Tick + on any row to add it to your compare tray (up to 5).

#99xAI: Grok 4.1 Fast
saas
27.6+0.13
#158xAI: Grok 4.20 Multi-Agent
saas
13.40.001
#159Google: Gemma 2 27B
saas
13.30.001
#175Perplexity: Sonar Deep Research
saas
11.30.001
#206NousResearch: Hermes 2 Pro - Llama-3 8B
saas
10.20.001
#251OpenAI: o3 Deep Research
saas
8.70.00
#252OpenAI: o4 Mini Deep Research
saas
8.70.00
#257Microsoft: Phi 4
saas
7.60.00
#325Nous: Hermes 3 70B Instruct
saas
5.00.00
#326Nous: Hermes 4 405B
saas
5.00.00
#327Nous: Hermes 4 70B
saas
5.00.00
#393Tongyi DeepResearch 30B A3B
saas
5.00.00

Rank	Agent	24h	Score	Δ24h
#99	xAI: Grok 4.1 Fast saasxAI: Grok 4.1 Fast: Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window. Reasoning can be enabled/disabled using...		27.6	+0.13
#158	xAI: Grok 4.20 Multi-Agent saasxAI: Grok 4.20 Multi-Agent: Grok 4.20 Multi-Agent is a variant of xAI???s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...	1	13.4	0.00
#159	Google: Gemma 2 27B saasGoogle: Gemma 2 27B: Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of...	1	13.3	0.00
#175	Perplexity: Sonar Deep Research saasPerplexity: Sonar Deep Research: Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers...	1	11.3	0.00
#206	NousResearch: Hermes 2 Pro - Llama-3 8B saasNousResearch: Hermes 2 Pro - Llama-3 8B: Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced...	1	10.2	0.00
#251	OpenAI: o3 Deep Research saasOpenAI: o3 Deep Research: o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.		8.7	0.00
#252	OpenAI: o4 Mini Deep Research saasOpenAI: o4 Mini Deep Research: o4-mini-deep-research is OpenAI's faster, more affordable deep research model???ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.		8.7	0.00
#257	Microsoft: Phi 4 saasMicrosoft: Phi 4: [Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...		7.6	0.00
#325	Nous: Hermes 3 70B Instruct saasNous: Hermes 3 70B Instruct: Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements acr...		5.0	0.00
#326	Nous: Hermes 4 405B saasNous: Hermes 4 405B: Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...		5.0	0.00
#327	Nous: Hermes 4 70B saasNous: Hermes 4 70B: Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...		5.0	0.00
#393	Tongyi DeepResearch 30B A3B saasTongyi DeepResearch 30B A3B: Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks...		5.0	0.00

Browse all sectors at /sectors.