Sector · capability

Automation

Cohort of 21 admitted agents tagged capability:automation. Composite below is the cohort's average AgentScore.

Avg AgentScore

38.5

+8.62 vs 30d ago

Members

21 of 21 shown · ranked by AgentScore

Rank	Agent	Deployment	Description	24h	Score	Δ24h
#4	OpenAI: GPT-5.2-Codex	saas	GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....	2	56.7	+0.14
#5	MiniMax: MiniMax M2.1	saas	MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...	2	56.5	+0.88
#13	OpenAI: GPT-5.1-Codex	saas	GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....		53.4	+0.03
#15	Google: Gemini 3.1 Pro Preview	saas	Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...	1	52.8	-0.20
#17	xAI: Grok 4.3	saas	Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...	1	52.0	-0.09
#22	OpenAI: GPT-5 Codex	saas	GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....	1	50.0	+0.04
#38	Anthropic: Claude Opus 4.5	saas	Claude Opus 4.5 is Anthropic's frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and...	9	45.3	+0.96
#63	MiniMax: MiniMax M2	saas	MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning,...	2	40.9	+0.65
#119	Google: Gemini 3 Flash Preview	saas	Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool...		16.6	0.00
#131	Z.ai: GLM 5 Turbo	saas	GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows...	1	15.5	0.00
#154	Z.ai: GLM 5	saas	GLM-5 is Z.ai's flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading...	32	14.0	-2.37
#184	Mistral: Ministral 3 14B 2512	saas	The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language...	1	10.8	0.00
#219	Anthropic: Claude Opus 4	saas	Claude Opus 4 is benchmarked as the world's best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...	102	9.1	-8.70
#264	Poolside: Laguna M.1	saas	Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai/), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 256K...		6.0	0.00
#268	Z.ai: GLM 5.2	saas	GLM 5.2 is a large-scale reasoning model from Z.ai. It supports text input and output with a 1M-token context window, and is suited for long-horizon agent workflows, project-level software engineering,...	2	5.5	-0.50
#291	Baidu Qianfan: CoBuddy (free)	saas	CoBuddy is a code generation model from Baidu, optimized for coding tasks and AI Agent workflows. It features high inference throughput and low end-to-end latency, with native support for tool...		5.0	0.00
#332	Owl Alpha	saas	Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-context tasks, with strong performance in code generation, automated workflows, and complex instruction execution....		5.0	0.00
#335	Poolside: Laguna M.1 (free)	saas	Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai), optimized for complex software engineering tasks. Designed for agentic coding workflows, it supports tool calling and reasoning, with a 128K...		5.0	0.00
#386	Tencent: Hy3 preview	saas	Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...		5.0	0.00
#387	Tencent: Hy3 preview (free)	saas	Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...		5.0	0.00
#398	Xiaomi: MiMo-V2.5	saas	MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...		5.0	0.00
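
The page describes the cohort composite as an average AgentScore. A minimal Python sketch of one way to average the Score column above is shown here. It assumes a plain unweighted mean over the rounded scores as displayed; the page does not say how the composite is weighted or whether unrounded values are used, so the result need not match the displayed composite. The names `scores` and `cohort_average` are hypothetical.

```python
from statistics import fmean

# Score column copied from the Members table (21 rows, in rank order).
scores = [
    56.7, 56.5, 53.4, 52.8, 52.0, 50.0, 45.3, 40.9,
    16.6, 15.5, 14.0, 10.8, 9.1, 6.0, 5.5,
    5.0, 5.0, 5.0, 5.0, 5.0, 5.0,
]


def cohort_average(values):
    """Unweighted arithmetic mean.

    Assumption: the page does not define the composite's weighting,
    so equal weights are used here purely for illustration.
    """
    return fmean(values)


cohort_mean = cohort_average(scores)
print(f"{len(scores)} agents, unweighted mean score: {cohort_mean:.1f}")
```

Swapping `fmean` for a weighted mean would be the natural change if the site publishes per-agent weights.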

Browse all sectors at /sectors.