Google API Pricing (2026)
Google offers the Gemini family of multimodal models through Google AI Studio and Vertex AI, with context windows of up to 2,097,152 tokens and competitive pricing. Founded in 1998 and headquartered in Mountain View, CA, Google has 12 active models available through its API. All pricing data below is sourced directly from the Google pricing page and is updated regularly.
Prices verified Apr 5, 2026
Google Model Pricing — All Models
All prices are in USD per 1 million tokens ($/M tokens). Input tokens are the text you send to the model; output tokens are the generated response.
| Model | Tier | Input | Output | Cached Input | Batch Input | Batch Output |
|---|---|---|---|---|---|---|
| Gemini 3.1 Pro: Google's newest and most capable model in preview, with a 2M token context window and advanced reasoning. | Premium | $2.00/M | $12.00/M | $0.20/M | $1.00/M | $6.00/M |
| Gemini 3 Flash: Google's latest Flash model in preview, offering strong capabilities at fast speed and low cost. | Mid | $0.50/M | $3.00/M | $0.05/M | $0.25/M | $1.50/M |
| Gemini 2.5 Pro: Google's most capable model with a 1 million token context window and strong performance on complex reasoning and coding tasks. | Premium | $1.25/M | $10.00/M | $0.13/M | $0.63/M | $5.00/M |
| Gemini 2.5 Flash: Google's most efficient model with hybrid reasoning capabilities, balancing high speed with strong intelligence at low cost. | Mid | $0.30/M | $2.50/M | $0.03/M | $0.15/M | $1.25/M |
| Gemini 2.5 Flash-Lite: Google's most cost-efficient 2.5 series model, optimized for high-volume low-latency tasks. | Budget | $0.10/M | $0.40/M | $0.01/M | $0.05/M | $0.20/M |
| Gemini 2.0 Flash: Google's next-generation multimodal model with native tool use, code execution, and real-time streaming support. | Mid | $0.10/M | $0.40/M | $0.03/M | $0.05/M | $0.20/M |
| Gemini 2.0 Flash-Lite: The most cost-efficient Gemini model, optimized for high-volume low-latency tasks with a 1M context window. | Budget | $0.08/M | $0.30/M | — | $0.04/M | $0.15/M |
| Gemini 1.5 Pro (deprecated): Google's previous-generation Pro model with a 2 million token context window, ideal for large document analysis. | Premium | $1.25/M | $5.00/M | $0.31/M | — | — |
| Gemini 1.5 Flash (deprecated): A fast, versatile model optimized for high-frequency tasks with a generous 1M token context window. | Budget | $0.08/M | $0.30/M | $0.02/M | — | — |
| Gemini 1.5 Flash-8B (deprecated): Google's smallest model, an 8B parameter variant of Flash optimized for high-volume low-complexity tasks. | Budget | $0.04/M | $0.15/M | $0.01/M | — | — |
| Gemini 3.1 Flash-Lite: Google's most cost-efficient Gemini 3.1 model, optimized for high-volume lightweight tasks at extremely low cost. | Budget | $0.25/M | $1.50/M | $0.03/M | $0.13/M | $0.75/M |
| Gemini 3.1 Flash Image Preview | Budget | $0.50/M | $3.00/M | — | — | — |
| Gemini 3.1 Pro Preview Custom Tools | Mid | $2.00/M | $12.00/M | — | — | — |
| Gemma 4 31B: Google DeepMind's 30.7B dense multimodal model supporting text, image, and video input. Features 256K context window, configurable reasoning mode, native function calling, and multilingual support across 140+ languages. Open-weight (Apache 2.0). | Budget | $0.14/M | $0.40/M | — | — | — |
| Gemma 4 26B A4B | Budget | $0.13/M | $0.40/M | — | — | — |
Prices last verified: 2026-04-05. Always confirm on the official pricing page before production use.
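To illustrate how the per-million-token rates in the table translate into a charge for a single request, here is a minimal Python sketch. The function name `request_cost` and the token counts are hypothetical; the rates are copied from the Gemini 2.5 Flash row. It assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate, with no separate cache storage fee. That is a simplification to confirm against the official pricing page.

```python
def request_cost(input_tokens, output_tokens, input_rate, output_rate,
                 cached_tokens=0, cached_rate=None):
    """Estimate the USD cost of one request from $/M-token rates.

    Assumption: cached_tokens are part of input_tokens and are billed at
    cached_rate instead of input_rate (storage fees, if any, are ignored).
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    if cached_tokens and cached_rate is None:
        raise ValueError("cached_rate is required when cached_tokens > 0")

    uncached_tokens = input_tokens - cached_tokens
    total = uncached_tokens * input_rate + output_tokens * output_rate
    if cached_tokens:
        total += cached_tokens * cached_rate
    return total / 1_000_000  # rates are quoted per 1 million tokens


# Gemini 2.5 Flash rates from the table: $0.30/M input, $2.50/M output.
# The 20,000 / 1,000 token counts are made up for the example.
cost = request_cost(20_000, 1_000, input_rate=0.30, output_rate=2.50)
print(f"${cost:.4f}")  # $0.0085
```

Passing `cached_tokens=15_000, cached_rate=0.03` to the same call shows how much of the input charge moves to the cheaper cached rate.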
Google Model Capabilities
Key technical capabilities across all Google models. Use this table to identify which models support the features your application requires.
| Model | Context | Max Output | Vision | Functions | JSON Mode | Caching | Batch API | Reasoning |
|---|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | 2,097,152 | 65,536 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Gemini 3 Flash | 1,048,576 | 65,536 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Gemini 2.5 Pro | 1,048,576 | 65,536 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Gemini 2.5 Flash | 1,048,576 | 65,536 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Gemini 2.5 Flash-Lite | 1,048,576 | 8,192 | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| Gemini 2.0 Flash | 1,048,576 | 8,192 | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| Gemini 2.0 Flash-Lite | 1,048,576 | 8,192 | ✓ | ✓ | ✓ | — | ✓ | — |
| Gemini 3.1 Flash-Lite | 1,048,576 | 65,536 | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| Gemini 3.1 Flash Image Preview | 65,536 | 65,536 | ✓ | ✓ | ✓ | — | — | — |
| Gemini 3.1 Pro Preview Custom Tools | 1,048,576 | 65,536 | ✓ | ✓ | ✓ | — | — | — |
| Gemma 4 31B | 262,144 | 131,072 | ✓ | ✓ | ✓ | — | — | ✓ |
| Gemma 4 26B A4B | 262,144 | 262,144 | ✓ | ✓ | ✓ | — | — | ✓ |
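One way to use the capabilities table programmatically is to encode the rows and filter on the features an application needs. The sketch below is illustrative only: the `Model` dataclass, the `find_models` helper, and the choice of rows are hypothetical, and the values are copied from a subset of the table above rather than fetched from any Google API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context: int      # context window, in tokens
    max_output: int   # maximum output, in tokens
    caching: bool
    batch: bool
    reasoning: bool


# A subset of the capabilities table, copied by hand.
MODELS = [
    Model("Gemini 3.1 Pro", 2_097_152, 65_536, True, True, True),
    Model("Gemini 2.5 Flash", 1_048_576, 65_536, True, True, True),
    Model("Gemini 2.5 Flash-Lite", 1_048_576, 8_192, True, True, False),
    Model("Gemini 2.0 Flash-Lite", 1_048_576, 8_192, False, True, False),
    Model("Gemma 4 31B", 262_144, 131_072, False, False, True),
]


def find_models(models, min_context=0, min_output=0, **required_flags):
    """Return names of models meeting the size limits and every flag set to True."""
    matches = []
    for model in models:
        if model.context < min_context or model.max_output < min_output:
            continue
        if all(getattr(model, flag)
               for flag, needed in required_flags.items() if needed):
            matches.append(model.name)
    return matches


print(find_models(MODELS, min_context=1_000_000, caching=True, reasoning=True))
# ['Gemini 3.1 Pro', 'Gemini 2.5 Flash']
```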
When to Choose Google
Cost-sensitive workloads
Use Gemini 2.0 Flash-Lite, Google's most affordable active model at $0.08/M input and $0.30/M output. Estimated cost at 100K requests/month: $22.50 (the per-request token counts behind this estimate are not stated).
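A monthly estimate like the one above depends entirely on how many tokens each request uses, which the figure does not specify. The sketch below shows the arithmetic under an assumed request size of 1,000 input and 500 output tokens; both numbers and the function name `monthly_cost` are hypothetical. With the listed rates this yields $23.00, close to but not identical to the quoted $22.50, so the original estimate presumably used slightly different assumptions or unrounded rates.

```python
def monthly_cost(requests, input_tokens, output_tokens, input_rate, output_rate):
    """Estimate monthly USD spend from a per-request token profile and $/M rates."""
    total_input = requests * input_tokens
    total_output = requests * output_tokens
    return (total_input * input_rate + total_output * output_rate) / 1_000_000


# Gemini 2.0 Flash-Lite rates from the table: $0.08/M input, $0.30/M output.
# 1,000 input / 500 output tokens per request is an assumed profile.
estimate = monthly_cost(100_000, 1_000, 500, input_rate=0.08, output_rate=0.30)
print(f"${estimate:.2f}")  # $23.00
```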
Complex or high-accuracy tasks
Use Gemini 3.1 Pro, Google's most capable model (premium tier, currently in preview), with a 2,097,152-token context window. Priced at $2.00/M input and $12.00/M output.
High-volume pipelines
For asynchronous, high-throughput pipelines, consider the batch API where supported: the batch rates listed above are roughly half the standard input and output rates. Check the Batch Input and Batch Output columns in the pricing table above for availability.
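The batch discount can be checked directly against the pricing table. The following sketch compares standard and batch cost for a single workload; the `RATES` dictionary, the `batch_savings` helper, and the token volumes are hypothetical, while the rates are copied from the Gemini 2.5 Flash row. Some rows in the table are rounded (for example $0.63/M), so the computed saving for other models may differ slightly from 50%.

```python
# $/M-token rates copied from the Gemini 2.5 Flash row of the pricing table.
RATES = {
    "input": 0.30,
    "output": 2.50,
    "batch_input": 0.15,
    "batch_output": 1.25,
}


def batch_savings(rates, input_tokens, output_tokens):
    """Return (standard_cost, batch_cost, fractional_saving) in USD."""
    standard = (input_tokens * rates["input"]
                + output_tokens * rates["output"]) / 1_000_000
    batch = (input_tokens * rates["batch_input"]
             + output_tokens * rates["batch_output"]) / 1_000_000
    return standard, batch, 1 - batch / standard


# Assumed workload: 50M input tokens and 10M output tokens per month.
standard, batch, saving = batch_savings(RATES, 50_000_000, 10_000_000)
print(f"standard ${standard:.2f}, batch ${batch:.2f}, saving {saving:.0%}")
# standard $40.00, batch $20.00, saving 50%
```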
Google Free Tier Details
The following Google models include a free tier for development and prototyping. Free tier limits apply to unpaid usage only.
| Model | Requests/min | Requests/day | Expires |
|---|---|---|---|
| Gemini 3.1 Pro | 5 | 25 | No expiry |
| Gemini 3 Flash | 10 | 500 | No expiry |
| Gemini 2.5 Pro | 5 | 25 | No expiry |
| Gemini 2.5 Flash | 10 | 500 | No expiry |
| Gemini 2.5 Flash-Lite | 30 | 1,500 | No expiry |
| Gemini 2.0 Flash | 15 | 1,500 | No expiry |
| Gemini 2.0 Flash-Lite | 30 | 1,500 | No expiry |
| Gemini 3.1 Flash-Lite | 30 | 1,500 | No expiry |
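As a quick way to judge whether a prototype fits within these limits, the sketch below checks a workload against the per-minute and per-day caps and derives a minimum spacing between requests. The `FREE_TIER_LIMITS` dictionary and both helper functions are hypothetical, the limits are copied from three rows of the table above, and the sketch assumes the limits are simple request caps. Any additional quotas that may apply, such as token-based limits, are not modeled.

```python
# (requests per minute, requests per day), copied from the free tier table.
FREE_TIER_LIMITS = {
    "Gemini 2.5 Pro": (5, 25),
    "Gemini 2.5 Flash": (10, 500),
    "Gemini 2.5 Flash-Lite": (30, 1_500),
}


def fits_free_tier(model, peak_per_minute, per_day):
    """True if the workload stays within both free tier request caps."""
    rpm, rpd = FREE_TIER_LIMITS[model]
    return peak_per_minute <= rpm and per_day <= rpd


def min_interval_seconds(model):
    """Smallest even spacing between requests that respects the per-minute cap."""
    rpm, _ = FREE_TIER_LIMITS[model]
    return 60 / rpm


print(fits_free_tier("Gemini 2.5 Flash-Lite", peak_per_minute=20, per_day=1_200))  # True
print(fits_free_tier("Gemini 2.5 Pro", peak_per_minute=2, per_day=100))            # False
print(min_interval_seconds("Gemini 2.5 Flash"))                                    # 6.0
```

The second call fails on the daily cap even though the per-minute rate is fine, which is the more common constraint for the Pro models in the table.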
Deprecated Google Models
The following Google models are deprecated and should not be used in new projects. Historical pricing is shown for reference only.
- Gemini 1.5 Pro — deprecated 2026-03-01
- Gemini 1.5 Flash — deprecated 2026-03-01
- Gemini 1.5 Flash-8B — deprecated 2026-03-01
Frequently Asked Questions: Google API Pricing
What is the cheapest Google model?
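The cheapest active Google model by input token price is Gemini 2.0 Flash-Lite, priced at $0.08/M for input tokens and $0.30/M for output tokens. (The deprecated Gemini 1.5 Flash-8B is listed at a lower $0.04/M input, but should not be used in new projects.) Gemini 2.0 Flash-Lite is best suited for high-volume, cost-sensitive workloads where speed and price matter most.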
What is the most capable Google model?
Gemini 3.1 Pro is Google's most capable model, classified as a premium-tier model. It supports a context window of 2,097,152 tokens and is priced at $2.00 input / $12.00 output per million tokens.
How does Google API pricing work?
Google charges per token consumed, with separate rates for input tokens (the text you send) and output tokens (the text the model generates). Prices are listed in USD per 1 million tokens ($/M). All pricing shown here is sourced from the official Google pricing page.
Does Google offer batch API pricing?
Yes. 8 Google models support batch processing at reduced rates: Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.0 Flash, Gemini 2.0 Flash-Lite, and Gemini 3.1 Flash-Lite. Batch API is ideal for asynchronous workloads like bulk data extraction, document processing, or offline classification pipelines. Batch pricing is shown in the pricing table above.
Which Google models support prompt caching?
7 active Google models support prompt caching: Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.0 Flash, and Gemini 3.1 Flash-Lite. Prompt caching stores repeated context (system prompts, knowledge base content) and bills cached input tokens at a discounted rate, which makes it most effective for applications that reuse the same context across many requests.
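To get a feel for when caching pays off, the sketch below compares input cost with and without a cached shared prefix. It is a simplified model, not Google's billing logic: it assumes the first request pays the standard input rate for the shared context, every later request reads it at the cached rate, and it ignores any cache storage fees or expiry. The function name `caching_comparison` and the token counts are hypothetical; the rates are copied from the Gemini 2.5 Pro row of the pricing table.

```python
def caching_comparison(shared_tokens, unique_tokens, requests,
                       input_rate, cached_rate):
    """Return (cost_without_cache, cost_with_cache) for input tokens, in USD.

    Assumptions: the first request pays input_rate for the shared context,
    later requests pay cached_rate for it, and storage fees are ignored.
    """
    without_cache = requests * (shared_tokens + unique_tokens) * input_rate
    with_cache = (
        shared_tokens * input_rate                      # first request, uncached
        + (requests - 1) * shared_tokens * cached_rate  # later cache reads
        + requests * unique_tokens * input_rate         # per-request new text
    )
    return without_cache / 1_000_000, with_cache / 1_000_000


# Gemini 2.5 Pro rates from the table: $1.25/M input, $0.13/M cached input.
# Assumed workload: a 50,000-token shared prompt, 500 new tokens, 1,000 requests.
without_cache, with_cache = caching_comparison(
    50_000, 500, 1_000, input_rate=1.25, cached_rate=0.13)
print(f"without cache ${without_cache:.3f}, with cache ${with_cache:.3f}")
```

Under these assumptions the input bill falls from about $63 to about $7, because nearly all input tokens are the reused prefix. The saving shrinks as the share of unique tokens per request grows.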
Which Google models support vision / image inputs?
12 Google models support vision (image input): Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.0 Flash, Gemini 2.0 Flash-Lite, Gemini 3.1 Flash-Lite, Gemini 3.1 Flash Image Preview, Gemini 3.1 Pro Preview Custom Tools, Gemma 4 31B, and Gemma 4 26B A4B. These models can analyze images, charts, screenshots, and documents alongside text prompts.