DeepSeek V4-Pro Hits 1.6T Parameters and 80.6% on SWE-Verified, Trailing Frontier Models by 3–6 Months

DeepSeek's V4-Pro becomes the largest open-weight model available at 1.6 trillion parameters, posting 80.6% on SWE-Verified to match Claude Opus 4.6 — but knowledge benchmarks and missing modalities suggest the open-weight stack still trails closed frontier labs by roughly three to six months.

DeepSeek V4-Pro architecture and benchmark scores: at 1.6T parameters, the open-weight stack is the largest of its class, with benchmarks within 3–6 months of frontier closed-source models.

Beneath the pricing headline, DeepSeek V4-Pro represents a structural shift in the open-weight tier. At 1.6 trillion total parameters with 49 billion active per inference, it is now the largest open-weight model publicly released — paired with a 1 million-token context window that previously existed only behind closed APIs.

Architecture worth attention

V4-Pro is a Mixture-of-Experts model that pairs compressed sparse attention with a new, more heavily compressed attention layer aimed at long-context efficiency. DeepSeek claims V4-Pro-Max exceeds GPT-5.2 and Gemini 3.0 Pro on selected reasoning benchmarks; coding performance is described as "comparable to GPT-5.4." V4-Flash, the lighter sibling, ships at 284B total / 13B active. Both add an "interleaved thinking" mode positioned for multi-step agent workflows.
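The economics behind those total-versus-active parameter counts come down to top-k expert routing: each token only touches a handful of experts, so most of the network sits idle per inference. The toy sketch below illustrates the idea — all sizes and the routing scheme are illustrative assumptions, not DeepSeek's published design:

```python
import numpy as np

# Toy top-k Mixture-of-Experts routing. Sizes are illustrative
# assumptions, not V4-Pro's actual configuration.
rng = np.random.default_rng(0)

n_experts = 64   # total experts in the layer (toy value)
top_k = 2        # experts activated per token
d_model = 16     # hidden size (toy value)

# Router: a linear layer that scores each expert for a given token.
router_w = rng.standard_normal((d_model, n_experts))

# Each expert: a toy feed-forward weight matrix.
experts = rng.standard_normal((n_experts, d_model, d_model))

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ router_w                # (n_experts,) router scores
    top = np.argsort(logits)[-top_k:]    # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts
    # Weighted sum of the chosen experts' outputs; the remaining
    # n_experts - top_k experts are never computed for this token.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

token = rng.standard_normal(d_model)
out = moe_forward(token)

# Fraction of expert parameters touched per token — the same ratio
# that makes 49B active out of 1.6T total (~3%) possible.
active_fraction = top_k / n_experts
```

The scaling appeal is that total capacity (and with it, knowledge) grows with `n_experts`, while per-token compute grows only with `top_k`.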

Where the gap remains

Knowledge tests still trail. V4-Pro-Max scores 87.5% on MMLU-Pro against Gemini 3.1 Pro at 91.0%, and lags GPT-5.4 on broad-domain knowledge. Both DeepSeek models remain text-only, while frontier closed-source competitors handle audio, video, and images natively. The pattern aligns with what TechCrunch's analysis frames as a 3–6 month trailing gap versus state-of-the-art frontier labs — a gap that has held roughly steady across V3.2 → V4.

DeepSeek previews new AI model that 'closes the gap' with frontier models
TechCrunch on V4-Pro/Flash architecture, benchmark deltas, and the open-weight trajectory.

Our Take

The 3–6 month gap framing is the durable insight here. If that delta holds, every closed-source frontier release becomes a leading indicator for what arrives in open-weight form by the next quarter. For enterprise architects, that translates into a planning rule: any workload that can tolerate a one-quarter capability lag can be locked to an open-weight track at a 90%+ cost savings. The remaining wedge for closed-source labs is in modalities (audio/video/image) and tool-use reliability — which is exactly where the next six months of differentiation will play out.

Want every AI × Web3 signal the moment it breaks? Subscribe to the BlockAI News daily brief.

How we report: This article cites primary sources, regulatory filings, and on-chain data where available. BlockAI News uses AI tools to assist with research and first-draft generation; every article is reviewed and edited by a human editor before publication. Read our full How We Report page, Editorial Policy, AI Use Policy, and Corrections Policy.

Keep Reading

Claude Overtakes ChatGPT in Enterprise: Ramp's 34.4% vs 32.3% Bombshell


TL;DR

  • Ramp's May 2026 AI Index shows Anthropic at 34.4% enterprise adoption vs OpenAI's 32.3% — the first-ever crossover across 50,000+ businesses.
  • Claude Code alone hit ~$2.5B annualized run-rate; Anthropic quadrupled enterprise adoption year-over-year while OpenAI grew just 0.3%.
  • Tokenized Anthropic pre-IPO shares on Solana crashed ~34–39% after Anthropic voided SPV-based share transfers, exposing private-market fragility.

It took twelve months and a coding tool to reshape enterprise AI's entire competitive map. On May 13, 2026, Ramp — the corporate card

Read full story →

Stay Ahead of the Market

Daily AI & crypto briefings — straight to your inbox, your phone, and your timeline.