Gemini vs Qwen for Math

Gemini edges out Qwen overall for mathematical reasoning: it posts the higher GPQA Diamond score (94% vs 88.4%) and offers a 1M-token context window suited to long, complex problems. Qwen's AIME 2025 score of 91.3% still makes it a formidable competitor on advanced, competition-style math. For the cost-conscious, Qwen offers nearly equivalent math capabilities at about a fifth of the input price ($0.40 vs $2.00 per 1M input tokens), but Gemini's larger context window and broader feature set make it the slightly stronger all-around choice for demanding mathematical tasks.

Head-to-Head for Math

Math Competition Performance (winner: Qwen)

  • Gemini: No dedicated benchmark
  • Qwen: 91.3% AIME 2025

General Mathematical Reasoning (winner: Gemini)

  • Gemini: 94% GPQA Diamond
  • Qwen: 88.4% GPQA Diamond

Extended Thinking for Complex Problems (tie)

  • Gemini: Available
  • Qwen: Available

Context Window for Large Problems (winner: Gemini)

  • Gemini: 1M token capacity
  • Qwen: 256K token capacity

Cost-Effectiveness for Math Users (winner: Qwen)

  • Gemini: $2.00 per 1M input tokens
  • Qwen: $0.40 per 1M input tokens

See Them Answer a Math Question

We asked both models the same math-focused prompt.

Solve the integral ∫(x²·ln(x))dx step by step, then verify your answer by differentiating the result.

Gemini 3.1 Pro

Here is the step-by-step solution to evaluate the integral, followed by the verification.

Part 1: Evaluating the Integral

We want to solve the integral: ∫(x² · ln(x)) dx

Since we are integrating the product of two different types of functions (algebraic and logarithmic), we should use Integration by Parts. The formula for integration by parts is: ∫ u dv = u·v - ∫ v du

Step 1: Choose u and dv. To decide which function should be $u$, we can use the LIATE acronym (Logarithmic, Inverse trigonometric, Algebraic, Trigonometric, Exponential). Logarithmic comes before Algebraic, so:

  • Let u = ln(x)
  • Let dv = x² dx

Step 2: Differentiate u and integrate dv.

  • Differentiate $u$ to find $du$: du = $\frac{1}{x}$ dx
  • Integrate $dv$ to find $v$: v = ∫ x² dx = $\frac{x^3}{3}$

Step 3: Apply the integration by parts formula. $\int x^2 \ln(x)\,dx = \ln(x) \cdot \frac{x^3}{3} - \int \frac{x^3}{3} \cdot \frac{1}{x}\,dx$

Step 4: Simplify and solve the new integral. $= \frac{x^3}{3} \ln(x) - \int \frac{x^2}{3}\,dx$

Pull the constant ($\frac{1}{3}$) out of the integral: $= \frac{x^3}{3} \ln(x) - \frac{1}{3} \int x^2\,dx$

Integrate $x^2$: $= \frac{x^3}{3} \ln(x) - \frac{1}{3} \left(\frac{x^3}{3}\right) + C$

Step 5: Final simplification. $= \frac{x^3}{3} \ln(x) - \frac{x^3}{9} + C$

(Optional: You can also factor out $\frac{x^3}{9}$ to write it as $\frac{x^3}{9}(3 \ln(x) - 1) + C$)


Part 2: Verifying the Answer

To verify, we will take the derivative of our result with respect to $x$. If our result is correct, the derivative will equal the original integrand, $x^2 \ln(x)$.

Find $\frac{d}{dx}$ [ $\frac{x^3}{3}$ ln(x) - $\frac{x^3}{9}$ + C ]

Step 1: Apply the derivative rules.

  • For the first term ($\frac{x^3}{3}$ ln(x)), we must use the Product Rule: $\frac{d}{dx}[f(x)g(x)] = f'(x)g(x) + f(x)g'(x)$
  • For the second term (-$\frac{x^3}{9}$), we use the basic power rule.
  • For the constant (C), the derivative is 0.

Step 2: Differentiate the first term using the Product Rule.

  • Let f(x) = $\frac{x^3}{3}$, so f'(x) = $x^2$
  • Let g(x) = ln(x), so g'(x) = $\frac{1}{x}$
  • Apply the rule: ($x^2$)(ln(x)) + ($\frac{x^3}{3}$)($\frac{1}{x}$) = $x^2$ ln(x) + $\frac{x^2}{3}$

Step 3: Differentiate the second term.

  • $\frac{d}{dx}$ [-$\frac{x^3}{9}$] = -$\frac{3x^2}{9}$ = -$\frac{x^2}{3}$

Step 4: Combine everything. Derivative = [ $x^2$ ln(x) + $\frac{x^2}{3}$ ] - $\frac{x^2}{3}$ + 0

Notice that the $+\frac{x^2}{3}$ and $-\frac{x^2}{3}$ cancel each other out.

Derivative = x² · ln(x)

Because taking the derivative mathematically brings us perfectly back to the original function inside the integral, our answer is verified as exactly correct!
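As an independent check on the answer above (this is not part of the model's output), the differentiation step can also be tested numerically. The sketch below uses only Python's standard library and compares a finite-difference derivative of the antiderivative against the original integrand at a few sample points. The function names, the sample points, and the step size are illustrative assumptions.

```python
import math


def antiderivative(x: float) -> float:
    """F(x) = x^3/3 * ln(x) - x^3/9, the result derived above (taking C = 0)."""
    return x**3 / 3 * math.log(x) - x**3 / 9


def integrand(x: float) -> float:
    """f(x) = x^2 * ln(x), the original integrand."""
    return x**2 * math.log(x)


def central_difference(func, x: float, h: float = 1e-5) -> float:
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def max_derivative_error(points) -> float:
    """Largest gap between the numerical F'(x) and f(x) over the given points."""
    return max(
        abs(central_difference(antiderivative, x) - integrand(x))
        for x in points
    )


# Sample points are arbitrary, chosen only to keep x > 0 so ln(x) is defined.
sample_points = [0.5, 1.0, 2.0, 3.5, 10.0]
print(f"max |F'(x) - f(x)| = {max_derivative_error(sample_points):.2e}")
```

A small maximum error (limited by floating-point precision and the step size) is consistent with the symbolic verification, though it is evidence rather than a proof.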

Detailed Breakdown

When it comes to mathematical tasks, Qwen holds a meaningful edge over Gemini — at least on the benchmarks that matter most for math. Qwen3.5 Plus scored 91.3% on AIME 2025, one of the most demanding math competition benchmarks available, and 88.4% on GPQA Diamond. These numbers place it firmly in the top tier for quantitative reasoning. Gemini 3.1 Pro scores 94% on GPQA Diamond, which covers graduate-level science and reasoning, but lacks a published AIME score for direct comparison on pure math competition problems.

For everyday math use — solving equations, working through calculus, checking proofs, or explaining statistics — both models perform well. The difference emerges at the edges. Qwen's extended thinking mode allows it to work through multi-step problems with careful, deliberate reasoning, which is critical for competition-level or graduate-level mathematics where a single misstep cascades into a wrong answer. Gemini also supports extended thinking, so both can tackle complex derivations, but Qwen's stronger AIME performance suggests it handles olympiad-style problem structures with greater reliability.

Gemini has one practical advantage: code execution. If your math workflow involves running Python scripts, verifying numerical results, or using libraries like NumPy or SymPy, Gemini can execute code directly in the conversation. This is genuinely useful for applied math — think checking an integral numerically, simulating probability distributions, or validating a matrix operation. Qwen currently lacks this capability, meaning you'd need to copy its output into a separate environment to verify computational results.
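To make "checking an integral numerically" concrete, here is a minimal sketch of the kind of script one might run in a code-execution environment or a local interpreter. It is not taken from either model; it applies composite Simpson's rule to the example integrand from earlier on the interval [1, e] and compares the result with the closed form F(e) - F(1) = 2e³/9 + 1/9. The helper name `simpson` and the interval are assumptions made for illustration.

```python
import math


def simpson(func, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        # Interior points alternate weights 4 (odd index) and 2 (even index).
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def integrand(x: float) -> float:
    """f(x) = x^2 * ln(x)."""
    return x**2 * math.log(x)


# Closed form from the antiderivative x^3/3 * ln(x) - x^3/9 evaluated on [1, e].
exact = 2 * math.e**3 / 9 + 1 / 9
approx = simpson(integrand, 1.0, math.e)

print(f"Simpson estimate: {approx:.10f}")
print(f"Closed form:      {exact:.10f}")
```

Agreement between the two values to many decimal places is the sort of sanity check that code execution makes convenient; with a model that cannot run code, the same script has to be run separately.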

For students working through homework, tutoring themselves on linear algebra, or preparing for standardized tests like the SAT or GRE, either model will serve well. For researchers, engineers, or competitive math students tackling harder problems, Qwen's benchmark profile is more reassuring. Its AIME 2025 score in particular reflects the kind of structured, multi-step mathematical reasoning that separates capable models from exceptional ones.

Cost is also worth noting. Qwen's input pricing is $0.40 per 1M tokens against Gemini's $2.00, a fivefold difference that matters if you're building a math tutoring tool or running large volumes of problem sets programmatically.
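A rough way to size that difference for a batch workload is sketched below. Only the per-1M input prices come from the comparison above; the workload figures (5,000 problems at about 800 input tokens each) are hypothetical, and output-token pricing is left out because it is not listed here.

```python
# Input prices in USD per 1M tokens, as listed in the comparison above.
INPUT_PRICE_PER_MILLION = {
    "Gemini": 2.00,
    "Qwen": 0.40,
}


def input_cost(total_tokens: int, price_per_million: float) -> float:
    """Cost in USD for sending total_tokens input tokens at the given rate."""
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical workload: these numbers are assumptions for illustration only.
problems = 5_000
tokens_per_problem = 800
total_tokens = problems * tokens_per_problem  # 4,000,000 input tokens

for model, price in INPUT_PRICE_PER_MILLION.items():
    print(f"{model}: ${input_cost(total_tokens, price):.2f} for {total_tokens:,} input tokens")
```

Under these assumptions the input side comes to about $8.00 for Gemini and $1.60 for Qwen; real totals depend on output tokens and any extended-thinking usage, which this sketch does not model.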

Recommendation: Choose Qwen if pure mathematical reasoning is your priority — its AIME performance is the strongest signal here. Choose Gemini if you need code execution alongside math, or if you're already embedded in the Google ecosystem and want seamless integration with Docs or Sheets for applied quantitative work.
