Gemini
Our most intelligent AI models, built for the agentic era
Gemini 2.5 models are capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.
Model family
Gemini 2.5 builds on the best of Gemini — with native multimodality and a long context window.
Hands-on with 2.5 Pro
See how Gemini 2.5 Pro uses its reasoning capabilities to create interactive simulations and do advanced coding.
Performance
Gemini 2.5 is state-of-the-art across a wide range of benchmarks.
Benchmarks
Gemini 2.5 Pro demonstrates significantly improved performance across a wide range of benchmarks.
Benchmark | Setting | Gemini 2.5 Pro Experimental (03-25) | OpenAI o3-mini High | OpenAI GPT-4.5 | Claude 3.7 Sonnet 64k Extended thinking | Grok 3 Beta Extended thinking | DeepSeek R1 |
---|---|---|---|---|---|---|---|
Reasoning & knowledge: Humanity's Last Exam (no tools) | | 18.8% | 14.0%* | 6.4% | 8.9% | — | 8.6%* |
Science: GPQA diamond | single attempt (pass@1) | 84.0% | 79.7% | 71.4% | 78.2% | 80.2% | 71.5% |
| multiple attempts | — | — | — | 84.8% | 84.6% | — |
Mathematics: AIME 2025 | single attempt (pass@1) | 86.7% | 86.5% | — | 49.5% | 77.3% | 70.0% |
| multiple attempts | — | — | — | — | 93.3% | — |
Mathematics: AIME 2024 | single attempt (pass@1) | 92.0% | 87.3% | 36.7% | 61.3% | 83.9% | 79.8% |
| multiple attempts | — | — | — | 80.0% | 93.3% | — |
Code generation: LiveCodeBench v5 | single attempt (pass@1) | 70.4% | 74.1% | — | — | 70.6% | 64.3% |
| multiple attempts | — | — | — | — | 79.4% | — |
Code editing: Aider Polyglot | | 74.0% / 68.6% (whole / diff) | 60.4% (diff) | 44.9% (diff) | 64.9% (diff) | — | 56.9% (diff) |
Agentic coding: SWE-bench Verified | | 63.8% | 49.3% | 38.0% | 70.3% | — | 49.2% |
Factuality: SimpleQA | | 52.9% | 13.8% | 62.5% | — | 43.6% | 30.1% |
Visual reasoning: MMMU | single attempt (pass@1) | 81.7% | no MM support | 74.4% | 75.0% | 76.0% | no MM support |
| multiple attempts | — | no MM support | — | — | 78.0% | no MM support |
Image understanding: Vibe-Eval (Reka) | | 69.4% | no MM support | — | — | — | no MM support |
Long context: MRCR | 128k (average) | 94.5% | 61.4% | 64.0% | — | — | — |
| 1M (pointwise) | 83.1% | — | — | — | — | — |
Multilingual performance: Global MMLU (Lite) | | 89.8% | — | — | — | — | — |
Building responsibly in the agentic era
As we develop these new technologies, we recognize the responsibility this work entails and aim to prioritize safety and security in all our efforts.
For developers
Gemini’s advanced thinking, native multimodality and massive context window empower developers to build next-generation experiences.
Developer ecosystem
Build with cutting-edge generative AI models and tools to make AI helpful for everyone.
Accessing our latest AI models
We want developers to gain access to our models as quickly as possible. We’re making these available through Google AI Studio.