Where does the model data come from?

Agent Mag's AI model directory combines the public OpenRouter model catalog with the public Ollama model library. OpenRouter entries focus on hosted API pricing and parameters; Ollama entries focus on locally installable open models.

Is the pricing accurate?

Pricing is sourced from OpenRouter and displayed as input and output cost per million tokens. Models with zero prompt and completion cost are labeled Free so builders can quickly identify no-cost options.

How do I use a model?

Open any model detail page to see its model ID, supported parameters, context window, and quick-start commands. Hosted API models include OpenRouter examples, while local models include Ollama and Agent Mag CLI install commands.

Where does the model data come from?

Agent Mag's AI models directory combines public model metadata from OpenRouter and Ollama. Hosted API models and locally installable models are normalized into one searchable directory for side-by-side comparison.

How do I use a model from this directory?

Open any model detail page to see the model ID, supported parameters, context window, and quick-start commands. OpenRouter entries focus on API access, while Ollama entries include local install and runtime details.

Models

AI Models

Agent Mag's AI models directory is a searchable index of hosted API models and local runtimes. Compare pricing, context windows, modalities, supported parameters, and install commands in one place.

TL;DR Agent Mag's AI models directory helps AI agent builders compare hosted and local models by price, context window, modality support, and setup path from one page.

What is the Agent Mag AI models directory?

The Agent Mag AI models directory is a searchable comparison page for hosted API models and local Ollama runtimes. It gives AI agent builders one place to evaluate pricing, context windows, modalities, supported parameters, and setup paths before choosing a model.

npx agentmag add model <name>

curl https://openrouter.ai/api/v1/chat/completions

355 API models·20 local models·375 total·120 loaded

Ask Agent Mag AI

Need implementation help after comparing models? Browse the AI agent skills registry for installable workflows and the free AI agent tools directory for evaluators, prompt generators, and builder utilities.

Showing the first 120 of 375 models. Search, filters, or sort will load the full catalog.

120 models

GPT-5.5 Pro

OpenRouter

1.1M tokens

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

API

byopenaiApr 24, 20261.1M context$30.00/M input tokens$180.00/M output tokens

GPT-5.5

OpenRouter

1.1M tokens

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

API

byopenaiApr 24, 20261.1M context$5.00/M input tokens$30.00/M output tokens

DeepSeek V4 Pro

OpenRouter

1.0M tokens

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

API

bydeepseekApr 24, 20261.0M context$0.435/M input tokens$0.870/M output tokens

DeepSeek V4 Flash

OpenRouter

1.0M tokens

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

API

bydeepseekApr 24, 20261.0M context$0.140/M input tokens$0.280/M output tokens

Ling-2.6-1T (free)

OpenRouter

262K tokens

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale. It uses a “fast...

APIFree

byinclusionaiApr 23, 2026262K context$0/M input tokens$0/M output tokens

Hy3 preview (free)

OpenRouter

262K tokens

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...

APIFree

bytencentApr 22, 2026262K context$0/M input tokens$0/M output tokens

MiMo-V2.5-Pro

OpenRouter

1.0M tokens

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

API

byxiaomiApr 22, 20261.0M context$1.00/M input tokens$3.00/M output tokens

MiMo-V2.5

OpenRouter

1.0M tokens

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

API

byxiaomiApr 22, 20261.0M context$0.400/M input tokens$2.00/M output tokens

GPT-5.4 Image 2

OpenRouter

272K tokens

[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...

API

byopenaiApr 21, 2026272K context$8.00/M input tokens$15.00/M output tokens

Ling-2.6-flash (free)

OpenRouter

262K tokens

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....

APIFree

byinclusionaiApr 21, 2026262K context$0/M input tokens$0/M output tokens

Claude Opus Latest

OpenRouter

1M tokens

This model always redirects to the latest model in the Claude Opus family.

API

byanthropicApr 21, 20261M context$5.00/M input tokens$25.00/M output tokens

Pareto Code Router

OpenRouter

200K tokens

The Pareto Router is a way to have OpenRouter always pick a strong coding model for your needs without committing to a specific one. You express a single `min_coding_score` preference...

APIFree

byopenrouterApr 21, 2026200K context$0/M input tokens$0/M output tokens

Qianfan-OCR-Fast (free)

OpenRouter

66K tokens

Qianfan-OCR-Fast is a domain-specific multimodal large model purpose-built for OCR. By leveraging specialized OCR training data while preserving versatile multimodal intelligence, it provides a powerful performance upgrade over Qianfan-OCR.

APIFree

bybaiduApr 20, 202666K context$0/M input tokens$0/M output tokens

Kimi K2.6

OpenRouter

256K tokens

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...

API

bymoonshot aiApr 20, 2026256K context$0.745/M input tokens$4.66/M output tokens

Claude Opus 4.7

OpenRouter

1M tokens

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...

API

byanthropicApr 16, 20261M context$5.00/M input tokens$25.00/M output tokens

Claude Opus 4.6 (Fast)

OpenRouter

1M tokens

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

API

byanthropicApr 7, 20261M context$30.00/M input tokens$150.00/M output tokens

GLM 5.1

OpenRouter

203K tokens

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...

API

byz.aiApr 7, 2026203K context$1.05/M input tokens$3.50/M output tokens

Gemma 4 26B A4B

OpenRouter

262K tokens

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...

API

bygoogleApr 3, 2026262K context$0.060/M input tokens$0.330/M output tokens

Gemma 4 26B A4B (free)

OpenRouter

262K tokens

APIFree

bygoogleApr 3, 2026262K context$0/M input tokens$0/M output tokens

Gemma 4 31B

OpenRouter

262K tokens

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...

API

bygoogleApr 2, 2026262K context$0.130/M input tokens$0.380/M output tokens

Gemma 4 31B (free)

OpenRouter

262K tokens

APIFree

bygoogleApr 2, 2026262K context$0/M input tokens$0/M output tokens

Qwen3.6 Plus

OpenRouter

1M tokens

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...

API

byalibabaApr 2, 20261M context$0.325/M input tokens$1.95/M output tokens

GLM 5V Turbo

OpenRouter

203K tokens

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding,...

API

byz.aiApr 1, 2026203K context$1.20/M input tokens$4.00/M output tokens

Trinity Large Thinking

OpenRouter

262K tokens

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7

API

byarcee aiApr 1, 2026262K context$0.220/M input tokens$0.850/M output tokens

Grok 4.20 Multi-Agent

OpenRouter

2M tokens

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

API

byxaiMar 31, 20262M context$2.00/M input tokens$6.00/M output tokens

Grok 4.20

OpenRouter

2M tokens

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherance, delivering consistently...

API

byxaiMar 31, 20262M context$2.00/M input tokens$6.00/M output tokens

Lyria 3 Pro Preview

OpenRouter

1.0M tokens

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz...

APIFree

bygoogleMar 30, 20261.0M context$0/M input tokens$0/M output tokens

Lyria 3 Clip Preview

OpenRouter

1.0M tokens

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate...

APIFree

bygoogleMar 30, 20261.0M context$0/M input tokens$0/M output tokens

KAT-Coder-Pro V2

OpenRouter

256K tokens

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...

API

bykwaipilotMar 27, 2026256K context$0.300/M input tokens$1.20/M output tokens

Reka Edge

OpenRouter

16K tokens

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...

API

byrekaaiMar 20, 202616K context$0.100/M input tokens$0.100/M output tokens

MiMo-V2-Omni

OpenRouter

262K tokens

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

API

byxiaomiMar 18, 2026262K context$0.400/M input tokens$2.00/M output tokens

MiMo-V2-Pro

OpenRouter

1.0M tokens

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like...

API

byxiaomiMar 18, 20261.0M context$1.00/M input tokens$3.00/M output tokens

MiniMax M2.7

OpenRouter

197K tokens

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...

API

byminimaxMar 18, 2026197K context$0.300/M input tokens$1.20/M output tokens

GPT-5.4 Nano

OpenRouter

400K tokens

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency...

API

byopenaiMar 17, 2026400K context$0.200/M input tokens$1.25/M output tokens

GPT-5.4 Mini

OpenRouter

400K tokens

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...

API

byopenaiMar 17, 2026400K context$0.750/M input tokens$4.50/M output tokens

Mistral Small 4

OpenRouter

262K tokens

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It combines strong reasoning from...

API

bymistral aiMar 16, 2026262K context$0.150/M input tokens$0.600/M output tokens

GLM 5 Turbo

OpenRouter

203K tokens

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows...

API

byz.aiMar 15, 2026203K context$1.20/M input tokens$4.00/M output tokens

Nemotron 3 Super

OpenRouter

262K tokens

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

API

bynvidiaMar 11, 2026262K context$0.090/M input tokens$0.450/M output tokens

Nemotron 3 Super (free)

OpenRouter

262K tokens

APIFree

bynvidiaMar 11, 2026262K context$0/M input tokens$0/M output tokens

Seed-2.0-Lite

OpenRouter

262K tokens

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...

API

bybytedance seedMar 10, 2026262K context$0.250/M input tokens$2.00/M output tokens

Qwen3.5-9B

OpenRouter

262K tokens

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

API

byalibabaMar 10, 2026262K context$0.100/M input tokens$0.150/M output tokens

GPT-5.4 Pro

OpenRouter

1.1M tokens

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K...

API

byopenaiMar 5, 20261.1M context$30.00/M input tokens$180.00/M output tokens

GPT-5.4

OpenRouter

1.1M tokens

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for...

API

byopenaiMar 5, 20261.1M context$2.50/M input tokens$15.00/M output tokens

Mercury 2

OpenRouter

128K tokens

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...

API

byinceptionMar 4, 2026128K context$0.250/M input tokens$0.750/M output tokens

GPT-5.3 Chat

OpenRouter

128K tokens

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly...

API

byopenaiMar 3, 2026128K context$1.75/M input tokens$14.00/M output tokens

Gemini 3.1 Flash Lite Preview

OpenRouter

1.0M tokens

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across...

API

bygoogleMar 3, 20261.0M context$0.250/M input tokens$1.50/M output tokens

Seed-2.0-Mini

OpenRouter

262K tokens

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256k context, four reasoning effort modes (minimal/low/medium/high), multimodal understanding,...

API

bybytedance seedFeb 26, 2026262K context$0.100/M input tokens$0.400/M output tokens

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

OpenRouter

66K tokens

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines...

API

bygoogleFeb 26, 202666K context$0.500/M input tokens$3.00/M output tokens

Qwen3.5-35B-A3B

OpenRouter

262K tokens

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...

API

byalibabaFeb 25, 2026262K context$0.163/M input tokens$1.30/M output tokens

Qwen3.5-27B

OpenRouter

262K tokens

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of...

API

byalibabaFeb 25, 2026262K context$0.195/M input tokens$1.56/M output tokens

Qwen3.5-122B-A10B

OpenRouter

262K tokens

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...

API

byalibabaFeb 25, 2026262K context$0.260/M input tokens$2.08/M output tokens

Qwen3.5-Flash

OpenRouter

1M tokens

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

API

byalibabaFeb 25, 20261M context$0.065/M input tokens$0.260/M output tokens

LFM2-24B-A2B

OpenRouter

33K tokens

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...

API

byliquidFeb 25, 202633K context$0.030/M input tokens$0.120/M output tokens

Gemini 3.1 Pro Preview Custom Tools

OpenRouter

1.0M tokens

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...

API

bygoogleFeb 25, 20261.0M context$2.00/M input tokens$12.00/M output tokens

GPT-5.3-Codex

OpenRouter

400K tokens

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...

API

byopenaiFeb 24, 2026400K context$1.75/M input tokens$14.00/M output tokens

Aion-2.0

OpenRouter

131K tokens

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong at introducing tension, crises, and conflict into stories, making narratives feel more engaging....

API

byaion labsFeb 23, 2026131K context$0.800/M input tokens$1.60/M output tokens

Gemini 3.1 Pro Preview

OpenRouter

1.0M tokens

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

API

bygoogleFeb 19, 20261.0M context$2.00/M input tokens$12.00/M output tokens

Claude Sonnet 4.6

OpenRouter

1M tokens

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...

API

byanthropicFeb 17, 20261M context$3.00/M input tokens$15.00/M output tokens

Qwen3.5 Plus 2026-02-15

OpenRouter

1M tokens

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...

API

byalibabaFeb 16, 20261M context$0.260/M input tokens$1.56/M output tokens

Qwen3.5 397B A17B

OpenRouter

262K tokens

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...

API

byalibabaFeb 16, 2026262K context$0.390/M input tokens$2.34/M output tokens

MiniMax M2.5

OpenRouter

197K tokens

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...

API

byminimaxFeb 12, 2026197K context$0.150/M input tokens$1.15/M output tokens

MiniMax M2.5 (free)

OpenRouter

197K tokens

APIFree

byminimaxFeb 12, 2026197K context$0/M input tokens$0/M output tokens

GLM 5

OpenRouter

203K tokens

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading...

API

byz.aiFeb 11, 2026203K context$0.600/M input tokens$2.08/M output tokens

Qwen3 Max Thinking

OpenRouter

262K tokens

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...

API

byalibabaFeb 9, 2026262K context$0.780/M input tokens$3.90/M output tokens

Claude Opus 4.6

OpenRouter

1M tokens

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective...

API

byanthropicFeb 4, 20261M context$5.00/M input tokens$25.00/M output tokens

Qwen3 Coder Next

OpenRouter

262K tokens

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per...

API

byalibabaFeb 4, 2026262K context$0.140/M input tokens$0.800/M output tokens

Free Models Router

OpenRouter

200K tokens

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that...

APIFree

byopenrouterFeb 1, 2026200K context$0/M input tokens$0/M output tokens

Step 3.5 Flash

OpenRouter

262K tokens

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....

API

bystepfunJan 29, 2026262K context$0.100/M input tokens$0.300/M output tokens

Trinity Large Preview

OpenRouter

131K tokens

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing. It excels in creative writing,...

API

byarcee aiJan 27, 2026131K context$0.150/M input tokens$0.450/M output tokens

Kimi K2.5

OpenRouter

262K tokens

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...

API

bymoonshot aiJan 27, 2026262K context$0.440/M input tokens$2.00/M output tokens

Solar Pro 3

OpenRouter

128K tokens

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized...

API

byupstageJan 27, 2026128K context$0.150/M input tokens$0.600/M output tokens

MiniMax M2-her

OpenRouter

66K tokens

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...

API

byminimaxJan 23, 202666K context$0.300/M input tokens$1.20/M output tokens

Palmyra X5

OpenRouter

1.0M tokens

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million...

API

bywriterJan 21, 20261.0M context$0.600/M input tokens$6.00/M output tokens

LFM2.5-1.2B-Thinking (free)

OpenRouter

33K tokens

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...

APIFree

byliquidJan 20, 202633K context$0/M input tokens$0/M output tokens

LFM2.5-1.2B-Instruct (free)

OpenRouter

33K tokens

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

APIFree

byliquidJan 20, 202633K context$0/M input tokens$0/M output tokens

GPT Audio

OpenRouter

128K tokens

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced...

API

byopenaiJan 19, 2026128K context$2.50/M input tokens$10.00/M output tokens

GPT Audio Mini

OpenRouter

128K tokens

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...

API

byopenaiJan 19, 2026128K context$0.600/M input tokens$2.40/M output tokens

GLM 4.7 Flash

OpenRouter

203K tokens

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

API

byz.aiJan 19, 2026203K context$0.060/M input tokens$0.400/M output tokens

GPT-5.2-Codex

OpenRouter

400K tokens

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....

API

byopenaiJan 14, 2026400K context$1.75/M input tokens$14.00/M output tokens

Olmo 3.1 32B Instruct

OpenRouter

66K tokens

Olmo 3.1 32B Instruct is a large-scale, 32-billion-parameter instruction-tuned language model engineered for high-performance conversational AI, multi-turn dialogue, and practical instruction following. As part of the Olmo 3.1 family, this...

API

byallenaiJan 6, 202666K context$0.200/M input tokens$0.600/M output tokens

Seed 1.6 Flash

OpenRouter

262K tokens

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of...

API

bybytedance seedDec 23, 2025262K context$0.075/M input tokens$0.300/M output tokens

Seed 1.6

OpenRouter

262K tokens

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

API

bybytedance seedDec 23, 2025262K context$0.250/M input tokens$2.00/M output tokens

MiniMax M2.1

OpenRouter

197K tokens

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...

API

byminimaxDec 23, 2025197K context$0.290/M input tokens$0.950/M output tokens

GLM 4.7

OpenRouter

203K tokens

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while...

API

byz.aiDec 22, 2025203K context$0.380/M input tokens$1.74/M output tokens

Gemini 3 Flash Preview

OpenRouter

1.0M tokens

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool...

API

bygoogleDec 17, 20251.0M context$0.500/M input tokens$3.00/M output tokens

Mistral Small Creative

OpenRouter

33K tokens

Mistral Small Creative is an experimental small model designed for creative writing, narrative generation, roleplay and character-driven dialogue, general-purpose instruction following, and conversational agents.

API

bymistral aiDec 16, 202533K context$0.100/M input tokens$0.300/M output tokens

MiMo-V2-Flash

OpenRouter

262K tokens

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a...

API

byxiaomiDec 14, 2025262K context$0.090/M input tokens$0.290/M output tokens

Nemotron 3 Nano 30B A3B

OpenRouter

262K tokens

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

API

bynvidiaDec 14, 2025262K context$0.050/M input tokens$0.200/M output tokens

Nemotron 3 Nano 30B A3B (free)

OpenRouter

256K tokens

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

APIFree

bynvidiaDec 14, 2025256K context$0/M input tokens$0/M output tokens

GPT-5.2 Chat

OpenRouter

128K tokens

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

API

byopenaiDec 10, 2025128K context$1.75/M input tokens$14.00/M output tokens

GPT-5.2 Pro

OpenRouter

400K tokens

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning,...

API

byopenaiDec 10, 2025400K context$21.00/M input tokens$168.00/M output tokens

GPT-5.2

OpenRouter

400K tokens

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically, responding quickly...

API

byopenaiDec 10, 2025400K context$1.75/M input tokens$14.00/M output tokens

Devstral 2 2512

OpenRouter

262K tokens

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring...

API

bymistral aiDec 9, 2025262K context$0.400/M input tokens$2.00/M output tokens

Relace Search

OpenRouter

256K tokens

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to RAG, relace-search performs agentic...

API

byrelaceDec 8, 2025256K context$1.00/M input tokens$3.00/M output tokens

GLM 4.6V

OpenRouter

131K tokens

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts...

API

byz.aiDec 8, 2025131K context$0.300/M input tokens$0.900/M output tokens

DeepSeek V3.1 Nex N1

OpenRouter

131K tokens

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent autonomy, tool use, and real-world productivity. Nex-N1 demonstrates competitive performance across...

API

bynex agiDec 8, 2025131K context$0.135/M input tokens$0.500/M output tokens

Rnj 1 Instruct

OpenRouter

33K tokens

Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and trained from scratch with a focus on programming, math, and scientific reasoning. The model demonstrates strong performance...

API

byessentialaiDec 7, 202533K context$0.150/M input tokens$0.150/M output tokens

Body Builder (beta)

OpenRouter

128K tokens

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example:...

APIFree

byopenrouterDec 5, 2025128K context$0/M input tokens$0/M output tokens

GPT-5.1-Codex-Max

OpenRouter

400K tokens

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic...

API

byopenaiDec 4, 2025400K context$1.25/M input tokens$10.00/M output tokens

Nova 2 Lite

OpenRouter

1M tokens

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing...

API

byamazonDec 2, 20251M context$0.300/M input tokens$2.50/M output tokens

Ministral 3 14B 2512

OpenRouter

262K tokens

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language...

API

bymistral aiDec 2, 2025262K context$0.200/M input tokens$0.200/M output tokens

Ministral 3 8B 2512

OpenRouter

262K tokens

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

API

bymistral aiDec 2, 2025262K context$0.150/M input tokens$0.150/M output tokens

Ministral 3 3B 2512

OpenRouter

131K tokens

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

API

bymistral aiDec 2, 2025131K context$0.100/M input tokens$0.100/M output tokens

Mistral Large 3 2512

OpenRouter

262K tokens

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and released under the Apache 2.0 license.

API

bymistral aiDec 1, 2025262K context$0.500/M input tokens$1.50/M output tokens

Trinity Mini

OpenRouter

131K tokens

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function...

API

byarcee aiDec 1, 2025131K context$0.045/M input tokens$0.150/M output tokens

DeepSeek V3.2 Speciale

OpenRouter

164K tokens

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context processing, then scales post-training reinforcement learning...

API

bydeepseekDec 1, 2025164K context$0.400/M input tokens$1.20/M output tokens

DeepSeek V3.2

OpenRouter

131K tokens

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

API

bydeepseekDec 1, 2025131K context$0.252/M input tokens$0.378/M output tokens

INTELLECT-3

OpenRouter

131K tokens

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math,...

API

byprime intellectNov 27, 2025131K context$0.200/M input tokens$1.10/M output tokens

Claude Opus 4.5

OpenRouter

200K tokens

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and...

API

byanthropicNov 24, 2025200K context$5.00/M input tokens$25.00/M output tokens

Olmo 3 32B Think

OpenRouter

66K tokens

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and...

API

byallenaiNov 21, 202566K context$0.150/M input tokens$0.500/M output tokens

Nano Banana Pro (Gemini 3 Pro Image Preview)

OpenRouter

66K tokens

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and...

API

bygoogleNov 20, 202566K context$2.00/M input tokens$12.00/M output tokens

Grok 4.1 Fast

OpenRouter

2M tokens

Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window. Reasoning can be enabled/disabled using...

API

byxaiNov 19, 20252M context$0.200/M input tokens$0.500/M output tokens

Cogito v2.1 671B

OpenRouter

128K tokens

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance of frontier closed and open models. This model is trained using self play with reinforcement learning...

API

bydeepcogitoNov 13, 2025128K context$1.25/M input tokens$1.25/M output tokens

GPT-5.1

OpenRouter

400K tokens

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning...

API

byopenaiNov 13, 2025400K context$1.25/M input tokens$10.00/M output tokens

GPT-5.1 Chat

OpenRouter

128K tokens

GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on...

API

byopenaiNov 13, 2025128K context$1.25/M input tokens$10.00/M output tokens

GPT-5.1-Codex

OpenRouter

400K tokens

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks....

API

byopenaiNov 13, 2025400K context$1.25/M input tokens$10.00/M output tokens

GPT-5.1-Codex-Mini

OpenRouter

400K tokens

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex

API

byopenaiNov 13, 2025400K context$0.250/M input tokens$2.00/M output tokens

Kimi K2 Thinking

OpenRouter

262K tokens

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in...

API

bymoonshot aiNov 6, 2025262K context$0.600/M input tokens$2.50/M output tokens

Nova Premier 1.0

OpenRouter

1M tokens

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

API

byamazonOct 31, 20251M context$2.50/M input tokens$12.50/M output tokens

Sonar Pro Search

OpenRouter

200K tokens

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based...

API

byperplexityOct 30, 2025200K context$3.00/M input tokens$15.00/M output tokens

Stay in the know

AI Models

What is the Agent Mag AI models directory?

GPT-5.5 Pro

GPT-5.5

DeepSeek V4 Pro

DeepSeek V4 Flash

Ling-2.6-1T (free)

Hy3 preview (free)

MiMo-V2.5-Pro

MiMo-V2.5

GPT-5.4 Image 2

Ling-2.6-flash (free)

Claude Opus Latest

Pareto Code Router

Qianfan-OCR-Fast (free)

Kimi K2.6

Claude Opus 4.7

Claude Opus 4.6 (Fast)

GLM 5.1

Gemma 4 26B A4B

Gemma 4 26B A4B (free)

Gemma 4 31B

Gemma 4 31B (free)

Qwen3.6 Plus

GLM 5V Turbo

Trinity Large Thinking

Grok 4.20 Multi-Agent

Grok 4.20

Lyria 3 Pro Preview

Lyria 3 Clip Preview

KAT-Coder-Pro V2

Reka Edge

MiMo-V2-Omni

MiMo-V2-Pro

MiniMax M2.7

GPT-5.4 Nano

GPT-5.4 Mini

Mistral Small 4

GLM 5 Turbo

Nemotron 3 Super

Nemotron 3 Super (free)

Seed-2.0-Lite

Qwen3.5-9B

GPT-5.4 Pro

GPT-5.4

Mercury 2

GPT-5.3 Chat

Gemini 3.1 Flash Lite Preview

Seed-2.0-Mini

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Qwen3.5-35B-A3B

Qwen3.5-27B

Qwen3.5-122B-A10B

Qwen3.5-Flash

LFM2-24B-A2B

Gemini 3.1 Pro Preview Custom Tools

GPT-5.3-Codex

Aion-2.0

Gemini 3.1 Pro Preview

Claude Sonnet 4.6

Qwen3.5 Plus 2026-02-15

Qwen3.5 397B A17B

MiniMax M2.5

MiniMax M2.5 (free)

GLM 5

Qwen3 Max Thinking

Claude Opus 4.6

Qwen3 Coder Next

Free Models Router

Step 3.5 Flash

Trinity Large Preview

Kimi K2.5

Solar Pro 3

MiniMax M2-her

Palmyra X5

LFM2.5-1.2B-Thinking (free)

LFM2.5-1.2B-Instruct (free)

GPT Audio

GPT Audio Mini