Google API Pricing (2026)

Google offers the Gemini family of multimodal models via Google AI Studio and Vertex AI, featuring some of the largest context windows and competitive pricing. Founded in 1998 and headquartered in Mountain View, CA, Google offers 12 active models through their API. All pricing data below is sourced directly from the Google pricing page and is updated regularly.

API Documentation Official Pricing Page

Prices verified Apr 5, 2026

Google Model Pricing — All Models

All prices are in USD per 1 million tokens ($/M tokens). Input tokens are the text you send to the model; output tokens are the generated response.

Model	Tier	Input	Output	Cached Input	Batch Input	Batch Output
Gemini 3.1 Pro Google's newest and most capable model in preview, with a 2M token context window and advanced reasoning.	Premium	$2.00/M	$12.00/M	$0.20/M	$1.00/M	$6.00/M
Gemini 3 Flash Google's latest Flash model in preview, offering strong capabilities at fast speed and low cost.	Mid	$0.50/M	$3.00/M	$0.05/M	$0.25/M	$1.50/M
Gemini 2.5 Pro Google's most capable model with a 1 million token context window and strong performance on complex reasoning and coding tasks.	Premium	$1.25/M	$10.00/M	$0.13/M	$0.63/M	$5.00/M
Gemini 2.5 Flash Google's most efficient model with hybrid reasoning capabilities, balancing high speed with strong intelligence at low cost.	Mid	$0.30/M	$2.50/M	$0.03/M	$0.15/M	$1.25/M
Gemini 2.5 Flash-Lite Google's most cost-efficient 2.5 series model, optimized for high-volume low-latency tasks.	Budget	$0.10/M	$0.40/M	$0.01/M	$0.05/M	$0.20/M
Gemini 2.0 Flash Google's next-generation multimodal model with native tool use, code execution, and real-time streaming support.	Mid	$0.10/M	$0.40/M	$0.03/M	$0.05/M	$0.20/M
Gemini 2.0 Flash-Lite The most cost-efficient Gemini model, optimized for high-volume low-latency tasks with a 1M context window.	Budget	$0.08/M	$0.30/M	—	$0.04/M	$0.15/M
Gemini 1.5 Pro(deprecated) Google's previous-generation Pro model with a 2 million token context window, ideal for large document analysis.	Premium	$1.25/M	$5.00/M	$0.31/M	—	—
Gemini 1.5 Flash(deprecated) A fast, versatile model optimized for high-frequency tasks with a generous 1M token context window.	Budget	$0.08/M	$0.30/M	$0.02/M	—	—
Gemini 1.5 Flash-8B(deprecated) Google's smallest model, an 8B parameter variant of Flash optimized for high-volume low-complexity tasks.	Budget	$0.04/M	$0.15/M	$0.01/M	—	—
Gemini 3.1 Flash-Lite Google's most cost-efficient Gemini 3.1 model, optimized for high-volume lightweight tasks at extremely low cost.	Budget	$0.25/M	$1.50/M	$0.03/M	$0.13/M	$0.75/M
Gemini 3.1 Flash Image Preview Gemini 3.1 Flash Image Preview by Google.	Budget	$0.50/M	$3.00/M	—	—	—
Gemini 3.1 Pro Preview Custom Tools Gemini 3.1 Pro Preview Custom Tools by Google.	Mid	$2.00/M	$12.00/M	—	—	—
Gemma 4 31B Google DeepMind's 30.7B dense multimodal model supporting text, image, and video input. Features 256K context window, configurable reasoning mode, native function calling, and multilingual support across 140+ languages. Open-weight (Apache 2.0).	Budget	$0.14/M	$0.40/M	—	—	—
Gemma 4 26B A4B Gemma 4 26B A4B by Google.	Budget	$0.13/M	$0.40/M	—	—	—

Prices last verified: 2026-04-05. Always confirm on the official pricing page before production use.

Google Model Capabilities

Key technical capabilities across all Google models. Use this table to identify which models support the features your application requires.

Model	Context	Max Output	Vision	Functions	JSON Mode	Caching	Batch API	Reasoning
Gemini 3.1 Pro	2,097,152	65,536	✓	✓	✓	✓	✓	✓
Gemini 3 Flash	1,048,576	65,536	✓	✓	✓	✓	✓	✓
Gemini 2.5 Pro	1,048,576	65,536	✓	✓	✓	✓	✓	✓
Gemini 2.5 Flash	1,048,576	65,536	✓	✓	✓	✓	✓	✓
Gemini 2.5 Flash-Lite	1,048,576	8,192	✓	✓	✓	✓	✓	—
Gemini 2.0 Flash	1,048,576	8,192	✓	✓	✓	✓	✓	—
Gemini 2.0 Flash-Lite	1,048,576	8,192	✓	✓	✓	—	✓	—
Gemini 3.1 Flash-Lite	1,048,576	65,536	✓	✓	✓	✓	✓	—
Gemini 3.1 Flash Image Preview	65,536	65,536	✓	✓	✓	—	—	—
Gemini 3.1 Pro Preview Custom Tools	1,048,576	65,536	✓	✓	✓	—	—	—
Gemma 4 31B	262,144	131,072	✓	✓	✓	—	—	✓
Gemma 4 26B A4B	262,144	262,144	✓	✓	✓	—	—	✓

When to Choose Google

Cost-sensitive workloads

Use Gemini 2.0 Flash-Lite — Google's most affordable model at $0.08/M input / $0.30/M output. Estimated cost at 100K requests/month: $22.50.

Complex or high-accuracy tasks

Use Gemini 3.1 Pro — Google's most capable model (premium tier) with a 2,097,152-token context window. Priced at $2.00/M input / $12.00/M output.

High-volume pipelines

Google models are well-suited for high-throughput workloads. Consider batch API (where supported) for an additional 50% cost reduction on asynchronous pipelines. Check the pricing table above for batch pricing availability.

Compare Google Models vs Competitors

See detailed cost comparisons between Google models and models from other providers, including side-by-side pricing tables and monthly cost projections.

Browse use case guides where Google models are recommended, including cost-effective model rankings and monthly cost estimates for each workload.

Customer Support Bot Code Generation Content & Copywriting Data Extraction RAG / Semantic Search Document Summarization General Chatbot Text Classification

Google Free Tier Details

The following Google models include a free tier for development and prototyping. Free tier limits apply to unpaid usage only.

Model	Requests/min	Requests/day	Expires
Gemini 3.1 Pro	5	25	No expiry
Gemini 3 Flash	10	500	No expiry
Gemini 2.5 Pro	5	25	No expiry
Gemini 2.5 Flash	10	500	No expiry
Gemini 2.5 Flash-Lite	30	1,500	No expiry
Gemini 2.0 Flash	15	1,500	No expiry
Gemini 2.0 Flash-Lite	30	1,500	No expiry
Gemini 3.1 Flash-Lite	30	1,500	No expiry

Deprecated Google Models

The following Google models are deprecated and should not be used in new projects. Historical pricing is shown for reference only.

Gemini 1.5 Pro — deprecated 2026-03-01
Gemini 1.5 Flash — deprecated 2026-03-01
Gemini 1.5 Flash-8B — deprecated 2026-03-01

Frequently Asked Questions: Google API Pricing

What is the cheapest Google model?

The cheapest Google model by input token price is Gemini 2.0 Flash-Lite, priced at $0.08/M for input tokens and $0.30/M for output tokens. It is best suited for high-volume, cost-sensitive workloads where speed and price matter most.

What is the most capable Google model?

Gemini 3.1 Pro is Google's most capable model, classified as a premium-tier model. It supports a context window of 2,097,152 tokens and is priced at $2.00/M input / $12.00/M output per million tokens.

How does Google API pricing work?

Google charges per token consumed — separately for input tokens (the text you send) and output tokens (the text the model generates). Prices are listed in USD per 1 million tokens ($/M). Google was founded in 1998 and is headquartered in Mountain View, CA. All pricing shown here is sourced from the official Google pricing page.

Does Google offer batch API pricing?

Yes — 8 Google models support batch processing at reduced rates: Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, and others. Batch API is ideal for asynchronous workloads like bulk data extraction, document processing, or offline classification pipelines. Batch pricing is shown in the pricing table above.

Which Google models support prompt caching?

7 Google models support prompt caching: Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.0 Flash, Gemini 3.1 Flash-Lite. Prompt caching stores repeated context (system prompts, knowledge base content) in memory and applies a discount to cached input tokens, making it highly effective for applications that reuse the same context across many requests.

Which Google models support vision / image inputs?

12 Google models support vision (image input): Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.0 Flash, Gemini 2.0 Flash-Lite, Gemini 3.1 Flash-Lite, Gemini 3.1 Flash Image Preview, Gemini 3.1 Pro Preview Custom Tools, Gemma 4 31B, Gemma 4 26B A4B . These models can analyze images, charts, screenshots, and documents alongside text prompts.

Google pricing data last verified: 2026-04-05. Prices may change — verify on the official Google pricing page before production use.