DeepSeek vs Qwen for Math

DeepSeek edges out Qwen on competition math (AIME: 93.1% vs 91.3%), but Qwen scores higher on the broader reasoning and knowledge benchmarks GPQA Diamond (88.4% vs 82.4%) and MMLU Pro (87.8% vs 85.0%). For general mathematical problem-solving, Qwen is the more well-rounded choice, while DeepSeek excels at contest-style math with its dedicated R1 reasoning mode. Both are highly capable and affordable options.

Head-to-Head for Math

| Category | Winner | DeepSeek | Qwen |
| --- | --- | --- | --- |
| Math Competition Performance | DeepSeek | Outstanding (93.1% AIME) | Very strong (91.3% AIME) |
| Quantitative Reasoning | Qwen | Strong (82.4% GPQA) | Excellent (88.4% GPQA) |
| Extended Reasoning Capability | Tie | Yes (R1 reasoning mode) | Yes (extended thinking) |
| Context for Long Mathematical Proofs | Qwen | Adequate (128K tokens) | Superior (256K tokens) |
| API Cost for Math Solutions | DeepSeek | More economical (cheaper output) | Higher output costs |

See Them Answer a Math Question

We asked both models the same math-focused prompt.

Solve this step-by-step: Find all integer solutions to the equation x³ - 6x² + 11x - 6 = 0, then prove that the sum of cubes of the roots equals 36.

DeepSeek V3.2

Alright, let's work through this step-by-step.


Step 1 – Solve $x^3 - 6x^2 + 11x - 6 = 0$ for integer solutions.

We look for integer roots using the Rational Root Theorem: possible integer roots are divisors of the constant term $-6$: $\pm 1, \pm 2, \pm 3, \pm 6$.

Try $x = 1$:

$$1^3 - 6(1)^2 + 11(1) - 6 = 1 - 6 + 11 - 6 = 0$$

Yes, $x = 1$ is a root.

Now perform polynomial division: $x^3 - 6x^2 + 11x - 6$ divided by $x - 1$:

Using synthetic division with $1$:

$$\begin{array}{r|rrrr} 1 & 1 & -6 & 11 & -6 \\ & & 1 & -5 & 6 \\ \hline & 1 & -5 & 6 & 0 \\ \end{array}$$

So the quotient is $x^2 - 5x + 6$.

Factor $x^2 - 5x + 6$:

$$(x - 2)(x - 3) = 0$$

Roots: $x = 2$ and $x = 3$.

Thus, integer solutions: $x = 1, \; x = 2, \; x = 3$.
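The root search and the division in Step 1 can be reproduced mechanically. The sketch below is not part of DeepSeek's answer; the function names (`integer_roots`, `synthetic_division`) are illustrative, and it assumes integer coefficients listed from the highest degree down with a nonzero constant term.

```python
def divisors(n: int) -> list[int]:
    """Positive divisors of |n| (n must be nonzero)."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(coeffs: list[int], x: int) -> int:
    """Evaluate a polynomial (highest-degree coefficient first) with Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def integer_roots(coeffs: list[int]) -> list[int]:
    """Integer roots: test every signed divisor of the constant term."""
    candidates = {sign * d for d in divisors(coeffs[-1]) for sign in (1, -1)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)


def synthetic_division(coeffs: list[int], root: int) -> tuple[list[int], int]:
    """Divide by (x - root); return (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for c in coeffs[1:]:
        row.append(c + root * row[-1])
    return row[:-1], row[-1]


cubic = [1, -6, 11, -6]              # x^3 - 6x^2 + 11x - 6
print(integer_roots(cubic))          # [1, 2, 3]
print(synthetic_division(cubic, 1))  # ([1, -5, 6], 0)
```

The bottom row of the synthetic division table above is exactly the `row` list built inside `synthetic_division`: the quotient coefficients followed by the remainder.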


Step 2 – Sum of cubes of the roots

The roots are $r_1 = 1, r_2 = 2, r_3 = 3$.

We want $r_1^3 + r_2^3 + r_3^3$:

$$1^3 + 2^3 + 3^3 = 1 + 8 + 27 = 36$$

Numerically, it's already $36$.


Step 3 – Prove algebraically using Vieta's formulas

For $x^3 - 6x^2 + 11x - 6 = 0$, sum of roots $S_1 = r_1 + r_2 + r_3 = 6$.

Sum of roots taken two at a time: $S_2 = r_1 r_2 + r_2 r_3 + r_3 r_1 = 11$.

Product of roots: $S_3 = r_1 r_2 r_3 = 6$.

We know the identity:

$$r_1^3 + r_2^3 + r_3^3 = S_1^3 - 3 S_1 S_2 + 3 S_3$$

Let's verify the identity first (standard identity for sum of cubes of roots in terms of elementary symmetric sums).

Actually, known formula:

$$r_1^3 + r_2^3 + r_3^3 = S_1^3 - 3 S_1 S_2 + 3 S_3$$

Where $S_1 = \sum r_i$, $S_2 = \sum_{i<j} r_i r_j$, $S_3 = r_1 r_2 r_3$.

Check: From $r_i^3 = r_i \cdot r_i^2$, but better:
Use $\sum r_i^3 = (\sum r_i)^3 - 3\sum r_i \sum_{i<j} r_i r_j + 3 r_1 r_2 r_3$ (since $r_1^2 r_2 + r_1 r_2^2 = r_1 r_2 (r_1 + r_2)$ etc., sum over all symmetric).

Derivation (known identity):

$$(r_1 + r_2 + r_3)^3 = r_1^3 + r_2^3 + r_3^3 + 3(r_1 + r_2 + r_3)(r_1 r_2 + r_2 r_3 + r_3 r_1) - 3 r_1 r_2 r_3$$

Rearrange:

$$r_1^3 + r_2^3 + r_3^3 = S_1^3 - 3 S_1 S_2 + 3 S_3$$
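The identity in Step 3 is stated rather than expanded term by term, so a quick numerical spot-check can be reassuring. The sketch below is not part of DeepSeek's answer and is not a proof: it compares both sides on random integer triples. The function names and the trial range are illustrative choices.

```python
import random


def sum_of_cubes_via_symmetric(r1: int, r2: int, r3: int) -> int:
    """Right-hand side of the identity: S1^3 - 3*S1*S2 + 3*S3."""
    s1 = r1 + r2 + r3
    s2 = r1 * r2 + r2 * r3 + r3 * r1
    s3 = r1 * r2 * r3
    return s1**3 - 3 * s1 * s2 + 3 * s3


def identity_holds(trials: int = 1000, seed: int = 0) -> bool:
    """Compare both sides on random integer triples (exact integer arithmetic)."""
    rng = random.Random(seed)
    for _ in range(trials):
        r1, r2, r3 = (rng.randint(-50, 50) for _ in range(3))
        if r1**3 + r2**3 + r3**3 != sum_of_cubes_via_symmetric(r1, r2, r3):
            return False
    return True


print(identity_holds())                     # True
print(sum_of_cubes_via_symmetric(1, 2, 3))  # 36
```

Integer arithmetic keeps the comparison exact, so a mismatch would indicate a wrong sign or coefficient rather than rounding error.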


Step 4 – Plug values

Here: $S_1 = 6, \quad S_2 = 11, \quad S_3 = 6$

$$r_1^3 + r_2^3 + r_3^3 = 6^3 - 3 \cdot 6 \cdot 11 + 3 \cdot 6 = 216 - 198 + 18 = 36$$
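As a cross-check that is not part of DeepSeek's answer, the same value follows without the sum-of-cubes identity: each root satisfies the original equation, so its cube can be rewritten in lower powers and the results summed. The only extra ingredient is the standard relation $\sum r_i^2 = S_1^2 - 2S_2$.

```latex
\begin{align*}
r_i^3 &= 6r_i^2 - 11r_i + 6 \quad (i = 1, 2, 3) \\
\sum_i r_i^2 &= S_1^2 - 2S_2 = 6^2 - 2 \cdot 11 = 14 \\
\sum_i r_i^3 &= 6 \sum_i r_i^2 - 11 S_1 + 3 \cdot 6 = 6 \cdot 14 - 11 \cdot 6 + 18 = 36
\end{align*}
```

Both routes use only the coefficients of the cubic, so neither depends on knowing the roots explicitly.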


Step 5 – Conclusion

We found integer solutions $1, 2, 3$, and using Vieta's formulas we proved the sum of their cubes is $36$.

$$\boxed{36}$$

Detailed Breakdown

Both DeepSeek and Qwen are serious contenders for mathematical work, and the benchmarks tell an interesting story. DeepSeek scores 93.1% on AIME 2025 — a competition-level math benchmark — edging out Qwen's 91.3%. That gap, while small, is meaningful: AIME problems require multi-step algebraic and combinatorial reasoning, so DeepSeek's advantage here reflects genuine strength in structured mathematical problem-solving.

DeepSeek's edge in pure math stems partly from its R1 reasoning model, which uses extended chain-of-thought thinking to work through proofs, calculus problems, and number theory step by step. For tasks like solving differential equations, verifying mathematical proofs, or tackling olympiad-style problems, DeepSeek R1 is one of the strongest open-source options available. Students preparing for competitions or researchers working through formal derivations will find DeepSeek particularly capable.

Qwen closes the gap quickly when math intersects with broader knowledge domains. Its MMLU Pro score of 87.8% (versus DeepSeek's 85.0%) and GPQA Diamond score of 88.4% (versus 82.4%) suggest Qwen performs better on applied and interdisciplinary problems — think physics word problems, statistics in data science contexts, or financial mathematics. Qwen's image understanding capability is also a practical differentiator: you can photograph a handwritten equation or a textbook page and have Qwen parse and solve it directly. DeepSeek cannot do this.

For classroom or tutoring use cases, Qwen's 256K context window gives it a significant advantage. It can hold an entire problem set, worked solutions, and ongoing student dialogue in a single session without losing context. DeepSeek's 128K window is still generous but may hit limits in lengthy tutoring sessions or when working through a full chapter of problems.

Both models are highly cost-effective compared to commercial alternatives, with API pricing under $2 per million output tokens. Qwen is slightly cheaper on input ($0.40/M vs DeepSeek's $0.56/M), while DeepSeek is cheaper on output, so either is viable for high-volume math tutoring applications.
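Because math solutions tend to be output-heavy, the input and output rates matter in different proportions. The sketch below shows the arithmetic only. The input prices are the figures quoted above, but the output prices are hypothetical placeholders (the article gives only "under $2 per million"), and the `Pricing` class and `request_cost` function are illustrative names, not part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given per-million-token rates."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Input rates come from the comparison above.
# Output rates are HYPOTHETICAL placeholders: substitute the current published prices.
deepseek = Pricing(input_per_million=0.56, output_per_million=1.00)
qwen = Pricing(input_per_million=0.40, output_per_million=1.50)

# Assumed workload: a short problem statement and a long step-by-step solution.
for name, pricing in (("DeepSeek", deepseek), ("Qwen", qwen)):
    cost = request_cost(pricing, input_tokens=2_000, output_tokens=8_000)
    print(f"{name}: ${cost:.5f} per problem")
```

With long worked solutions the output rate dominates the total, which is why a lower input price alone does not decide the comparison.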

The recommendation depends on the use case. For pure mathematical reasoning — proofs, competition problems, symbolic manipulation — DeepSeek with its R1 model is the stronger choice. For applied math, STEM problem-solving that involves diagrams or images, or long-form tutoring sessions, Qwen's broader capabilities and larger context window give it the edge. Power users doing serious mathematical work would do well to keep both accessible and route tasks accordingly.
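The routing advice can be sketched as a small decision function. This is a hypothetical illustration of the recommendation, not an API of either model: the `MathTask` fields, the `choose_model` name, and the returned labels are all assumptions, and the context limits are the 128K and 256K figures quoted in this comparison.

```python
from dataclasses import dataclass

DEEPSEEK_CONTEXT = 128_000  # tokens, as quoted in this comparison
QWEN_CONTEXT = 256_000      # tokens, as quoted in this comparison


@dataclass
class MathTask:
    prompt_tokens: int               # estimated prompt plus session history
    has_image: bool = False          # e.g. a photographed handwritten equation
    competition_style: bool = False  # proofs, olympiad or AIME-type problems


def choose_model(task: MathTask) -> str:
    """Pick a model label following the article's recommendation."""
    if task.prompt_tokens > QWEN_CONTEXT:
        raise ValueError("task exceeds both context windows; split it first")
    if task.has_image:
        return "qwen"      # only Qwen is described as handling images
    if task.prompt_tokens > DEEPSEEK_CONTEXT:
        return "qwen"      # too long for the 128K window
    if task.competition_style:
        return "deepseek"  # pure reasoning, contest-style math
    return "qwen"          # applied or interdisciplinary math by default


print(choose_model(MathTask(prompt_tokens=5_000, competition_style=True)))  # deepseek
print(choose_model(MathTask(prompt_tokens=200_000)))                        # qwen
```

Hard constraints (image input, context length) are checked before the preference for contest-style problems, so a task is never routed to a model that cannot accept it.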
