Gemini vs Kimi for Coding

Gemini's 1M token context window and native code execution make it the clear winner for reviewing large codebases and running code directly, while Kimi delivers solid coding benchmarks (85% LiveCodeBench, 76.8% SWE-bench) at a fraction of the cost. Pick Gemini for enterprise-scale projects and full development workflows; pick Kimi for everyday coding tasks and budget-conscious development.

Head-to-Head for Coding

Code Execution (winner: Gemini)
  Gemini: Built-in sandbox
  Kimi: No execution

Context Window (winner: Gemini)
  Gemini: 1M tokens
  Kimi: 128K tokens

Coding Benchmarks (winner: Kimi)
  Gemini: Strong reasoning ability
  Kimi: 76.8% on SWE-bench

API Cost (winner: Kimi)
  Gemini: $2-12 per million tokens
  Kimi: ~$0.60-3 per million tokens

Web Search (winner: Gemini)
  Gemini: Google Search included
  Kimi: Not available

See Them Answer a Coding Question

We asked both models the same coding-focused prompt.

Write a Python function that takes a nested JSON object and flattens it into a single-level dictionary with dot-notation keys, handling arrays by including the index in the key path.

Gemini 3.1 Pro

Here is the Python function to achieve this. It uses recursion to traverse the nested dictionary and lists, building the dot-notation keys along the way.

def flatten_json(nested_data, parent_key='', sep='.'):
    """
    Flattens a nested JSON object (dictionary/list) into a single-level dictionary.
    
    Args:
        nested_data (dict | list): The parsed JSON data to flatten.
        parent_key (str): The base key upon which to build the dot-notation path.
        sep (str): The separator to use between key levels (default is '.').
        
    Returns:
        dict: A flat dictionary with dot-notation keys.
    """
    flattened = {}
    
    def _flatten(obj, current_key):
        # Handle dictionaries
        if isinstance(obj, dict):
            for k, v in obj.items():
                # Append the key, separated by a dot if it's not the root
                new_key = f"{current_key}{sep}{k}" if current_key else str(k)
                _flatten(v, new_key)
                
        # Handle arrays (lists)
        elif isinstance(obj, list):
            for i, v in enumerate(obj):
                # Append the array index as part of the key path
                new_key = f"{current_key}{sep}{i}" if current_key else str(i)
                _flatten(v, new_key)
                
        # Handle base cases (strings, ints, booleans, None)
        else:
            flattened[current_key] = obj
            
    _flatten(nested_data, parent_key)
    return flattened

Example Usage:

# Sample nested JSON object
json_data = {
    "user": {
        "id": 123,
        "name": {
            "first": "Alice",
            "last": "Smith"
        }
    },
    "tags": ["python", "data", "json"],
    "active": True,
    "settings": [
        {"theme": "dark"},
        {"notifications": False}
    ]
}

flat_data = flatten_json(json_data)

# Print the result nicely formatted 
import json
print(json.dumps(flat_data, indent=4))

Output:

{
    "user.id": 123,
    "user.name.first": "Alice",
    "user.name.last": "Smith",
    "tags.0": "python",
    "tags.1": "data",
    "tags.2": "json",
    "active": true,
    "settings.0.theme": "dark",
    "settings.1.notifications": false
}

How it works:

  1. Helper Function (_flatten): Keeps all state in a single shared dictionary, so the recursion never creates and merges intermediate dictionaries. Each leaf value is written directly into the enclosing flattened dictionary.
  2. isinstance(obj, dict): Loops through key-value pairs, appending the new key to the parent path string.
  3. isinstance(obj, list): Iterates via enumerate() to extract the index i alongside the value v, adding the index into the dot-notation path natively.
  4. Base Case (else): Once it reaches an ending value (like a string, integer, or boolean), it adds the accumulated key path and value to the flattened dictionary.
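One edge case worth noting: because the function only records leaf values, keys whose value is an empty dict or empty list produce no leaf and silently disappear from the output. A minimal sketch demonstrating this, using a compact copy of the flatten_json function above:

```python
# Compact copy of flatten_json (same logic as above) to demonstrate an
# edge case: empty containers produce no leaves, so they vanish.
def flatten_json(nested_data, parent_key='', sep='.'):
    flattened = {}

    def _flatten(obj, current_key):
        if isinstance(obj, dict):
            for k, v in obj.items():
                _flatten(v, f"{current_key}{sep}{k}" if current_key else str(k))
        elif isinstance(obj, list):
            for i, v in enumerate(obj):
                _flatten(v, f"{current_key}{sep}{i}" if current_key else str(i))
        else:
            flattened[current_key] = obj

    _flatten(nested_data, parent_key)
    return flattened

data = {"a": {}, "b": [], "c": {"d": 1}}
print(flatten_json(data))  # {'c.d': 1} -- "a" and "b" are dropped
```

If a round trip back to the original structure matters, you could record empty dicts and lists as leaf values themselves (e.g. store obj whenever a dict or list has no items) instead of discarding them.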

Try coding tasks with both models

See Gemini and Kimi answer side by side in Multichat

Try it yourself — free

Detailed Breakdown

When it comes to coding assistance, Kimi holds a meaningful edge over Gemini — at least on paper. Its K2.5 model scores 76.8% on SWE-bench Verified, one of the most respected real-world software engineering benchmarks, and posts an 85.0% on LiveCodeBench v6. These aren't abstract math scores; SWE-bench measures an AI's ability to resolve actual GitHub issues, which closely mirrors day-to-day developer work. Gemini doesn't publish comparable SWE-bench numbers, making direct comparison difficult — but Kimi's results are genuinely competitive with the best models available.

Gemini's coding strength comes from a different angle. Its 1 million token context window is transformative for large codebases. You can load an entire monorepo, multiple files of documentation, and a long conversation history simultaneously — something that's simply not possible with Kimi's 128K window. For developers working on legacy systems, large refactors, or complex multi-file debugging sessions, Gemini's context advantage is real and practical. It also supports native code execution, meaning it can run snippets, test logic, and iterate on solutions without switching tools.
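To gauge whether a codebase actually fits in either window, you can estimate its token count from raw character counts. This is a rough sketch: the 4-characters-per-token ratio is a common heuristic for English text and code, not an exact tokenizer count, and the file-extension filter here is an illustrative assumption.

```python
import os

# Heuristic: ~4 characters per token for English text and source code.
# Real tokenizers vary, so treat the result as an order-of-magnitude guide.
CHARS_PER_TOKEN = 4

def estimate_tokens(root, extensions=(".py", ".js", ".ts", ".go", ".java")):
    """Walk a source tree and roughly estimate its total token count."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                try:
                    with open(os.path.join(dirpath, name),
                              encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue  # skip unreadable files
    return total_chars // CHARS_PER_TOKEN

def fits(tokens, window):
    return tokens <= window

GEMINI_WINDOW = 1_000_000  # 1M tokens, per the comparison above
KIMI_WINDOW = 128_000      # 128K tokens

# A ~2 MB codebase is roughly 500K tokens: inside Gemini's window,
# well beyond Kimi's.
tokens = 2_000_000 // CHARS_PER_TOKEN
print(fits(tokens, GEMINI_WINDOW), fits(tokens, KIMI_WINDOW))  # True False
```

In practice you would also budget tokens for documentation, the conversation history, and the model's output, so the usable code budget is somewhat smaller than the raw window size.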

Kimi counters with strong reasoning and multi-step task coordination. Its ability to break down complex engineering problems into parallel sub-tasks makes it well-suited for tasks like designing a system architecture, writing a full feature end-to-end, or debugging intricate logic chains. Its AIME 2025 score of 96.1% signals excellent mathematical reasoning, which translates to algorithmic problem-solving, data structures, and competitive programming scenarios.

For real-world use cases: if you're doing competitive programming or LeetCode-style problems, Kimi's raw reasoning performance makes it a strong pick. If you're building with Google's ecosystem — deploying to Cloud Run, writing Apps Script, or working inside Google Colab — Gemini's integrations offer genuine workflow advantages. For frontend developers, Gemini's multimodal input lets you upload a screenshot and get code from it directly.

Cost is also a factor. Kimi's API is significantly cheaper (~$0.60/1M input tokens vs ~$2.00 for Gemini), which matters for teams running automated pipelines, code review bots, or CI-integrated assistants.
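That price gap compounds quickly at pipeline scale. A minimal sketch of the arithmetic, using the input-token prices quoted above; the workload numbers (8K tokens per request, 200 requests per day) are hypothetical, and output-token pricing and volume tiers are ignored for simplicity:

```python
# Input-token prices per million tokens, as quoted in this article.
PRICE_PER_MTOK = {"kimi": 0.60, "gemini": 2.00}

def monthly_cost(model, tokens_per_request, requests_per_day, days=30):
    """Estimate monthly input-token spend for an automated pipeline."""
    total_mtok = tokens_per_request * requests_per_day * days / 1_000_000
    return total_mtok * PRICE_PER_MTOK[model]

# Hypothetical code-review bot: 8K input tokens per PR, 200 PRs per day.
for model in ("kimi", "gemini"):
    print(model, round(monthly_cost(model, 8_000, 200), 2))
```

At this volume (48M input tokens a month) the bill is roughly $29 on Kimi versus $96 on Gemini, before output tokens are counted.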

Recommendation: For most coding tasks — especially backend development, debugging, and algorithmic work — Kimi K2.5 is the stronger technical performer. Its benchmark results and reasoning capabilities are purpose-built for code. Choose Gemini if you need to work across massive codebases, benefit from Google ecosystem integrations, or require code execution in a managed environment.

