AI Agent API Pricing Comparator

Compare LLM API costs for your agent workload. Pick providers, set your token volume, and see the cheapest option ranked by monthly cost.

Workload

Requests / month	Input tokens / request	Output tokens / request	Cached input %

Prices per 1M tokens. Last updated: 2026-06-29. See notes.

Provider / Model	Input	Cached input	Output	Monthly cost	Best for
OpenAI GPT-4o gpt-4o	$2.50	$1.250	$10.00	—	General agent tasks, vision, tool use
OpenAI GPT-4o-mini gpt-4o-mini	$0.15	$0.075	$0.60	—	Cheap high-volume agents
Anthropic Claude 3.5 Sonnet claude-3-5-sonnet-20240620	$3.00	$0.300	$15.00	—	Long context, reasoning, coding
Anthropic Claude 3 Haiku claude-3-haiku-20240307	$0.25	$0.030	$1.25	—	Fast, cheap agent calls
Google Gemini 1.5 Pro gemini-1.5-pro	$3.50	—	$10.50	—	Massive context, multimodal
Google Gemini 1.5 Flash gemini-1.5-flash	$0.35	—	$1.05	—	Low-latency, high-volume
Groq Llama 3.1 70B llama-3.1-70b-versatile	$0.59	—	$0.79	—	Fast open-weight inference
Together AI Llama 3.1 70B llama-3.1-70b	$0.90	—	$0.90	—	Open-weight fine-tuning hub
Cohere Command R+ command-r-plus	$2.50	—	$10.00	—	RAG and enterprise search

Cheapest option

Enter a workload to see the cheapest provider.

Frequently asked questions

Which LLM API is cheapest for high-volume agents?

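For most high-volume text agents, GPT-4o-mini, Claude 3 Haiku, and Gemini 1.5 Flash have the lowest input prices per 1M tokens in the table above. Groq Llama 3.1 70B costs more on input but has a lower output price than Haiku or Flash, so it can be competitive on output-heavy workloads and for fast open-weight inference. Which model is cheapest depends on your input/output mix, so use this calculator with your actual token counts.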
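As a rough illustration of how the ranking depends on the input/output mix, here is a minimal Python sketch. The PRICES dictionary copies four rows from the table above, the function name rank_by_monthly_cost is hypothetical, and caching is ignored to keep the example short; it is not the calculator's actual code.

```python
# Prices in USD per 1M tokens, copied from the table above: (input, output).
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3-haiku-20240307": (0.25, 1.25),
    "gemini-1.5-flash": (0.35, 1.05),
    "llama-3.1-70b-versatile": (0.59, 0.79),  # Groq
}


def rank_by_monthly_cost(requests, input_tokens, output_tokens, prices=PRICES):
    """Return (model, monthly_cost_usd) pairs sorted from cheapest to priciest.

    Assumes no cached input and no fees other than token charges.
    """
    ranked = []
    for model, (in_price, out_price) in prices.items():
        per_request = input_tokens * in_price + output_tokens * out_price
        cost = requests * per_request / 1_000_000
        ranked.append((model, round(cost, 2)))
    return sorted(ranked, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Input-heavy workload: 100,000 requests, 2,000 input and 500 output tokens each.
    for model, cost in rank_by_monthly_cost(100_000, 2_000, 500):
        print(f"{model}: ${cost:.2f}")
    # Output-heavy workload: same volume, 500 input and 2,000 output tokens each.
    for model, cost in rank_by_monthly_cost(100_000, 500, 2_000):
        print(f"{model}: ${cost:.2f}")
```

With these table prices, GPT-4o-mini is cheapest in both cases, but the order of the remaining models changes between the input-heavy and output-heavy workloads.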

How does cached input pricing work?

OpenAI and Anthropic discount input tokens that hit a context cache. Enter your typical cache-hit percentage and the calculator applies the cached-input rate to that portion while billing the rest at the standard input rate.
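The split described above can be sketched in a few lines of Python. The function name monthly_cost and its parameters are hypothetical, and falling back to the standard input rate when a provider lists no cached price is an assumption, not confirmed behavior of this calculator.

```python
def monthly_cost(requests, input_tokens, output_tokens, cached_pct,
                 input_price, output_price, cached_price=None):
    """Estimate monthly cost in USD. All prices are per 1M tokens.

    cached_pct is the cache-hit percentage (0-100) of input tokens.
    Assumption: if no cached price is listed, cached tokens are billed
    at the standard input rate.
    """
    if not 0 <= cached_pct <= 100:
        raise ValueError("cached_pct must be between 0 and 100")
    if cached_price is None:
        cached_price = input_price

    total_input = requests * input_tokens
    total_output = requests * output_tokens
    cached_input = total_input * cached_pct / 100   # billed at the cached rate
    fresh_input = total_input - cached_input        # billed at the standard rate

    usd = (fresh_input * input_price
           + cached_input * cached_price
           + total_output * output_price) / 1_000_000
    return round(usd, 2)


if __name__ == "__main__":
    # GPT-4o-mini row: $0.15 input, $0.075 cached input, $0.60 output.
    # 100,000 requests, 2,000 input + 500 output tokens each, 50% cache hits.
    print(monthly_cost(100_000, 2_000, 500, 50, 0.15, 0.60, 0.075))
```

For that GPT-4o-mini example the input side is 100M fresh tokens ($15.00) plus 100M cached tokens ($7.50), and the output side is 50M tokens ($30.00), for about $52.50 per month.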

What is not included in this estimate?

This estimate covers token and request fees only. It does not include image or audio tokens, fine-tuning, storage, bandwidth, vector database costs, or fallback and retry overhead. Use the LLM Cost Calculator and Agent Memory Cost Calculator for a fuller estimate of agent total cost of ownership (TCO).

How often are prices updated?

Prices are reviewed monthly. Last updated: 2026-06-29. Always confirm current rates on the provider's pricing page before committing to a budget.

Prices are per 1M tokens in USD. Check provider sites for current rates.