DeepSeek V4-Flash
V4-Flash brings the V4 architecture down to 284B total / 13B active parameters, keeping the 1M-token context while drastically reducing inference cost.
Use V4-Flash for high-throughput chat, classification and coding workloads where you want V4 quality at a lower price point.
V4-Flash uses 284B total / 13B active parameters vs V4-Pro's 1.6T / 49B. Flash is significantly cheaper and faster while sharing the same hybrid attention architecture.
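Since only the active parameters are engaged per generated token in a mixture-of-experts forward pass, the cost gap can be sketched from the figures above. The snippet below is a back-of-the-envelope illustration using the common 2 × active-parameters FLOPs-per-token rule of thumb, not a measured benchmark; the helper name and constants are illustrative.

```python
# Rough per-token decode compute for the two models, using the active
# parameter counts stated above. In a mixture-of-experts model, roughly
# 2 * active_params FLOPs are spent per generated token, so cost scales
# with active parameters, not total parameters.

def flops_per_token(active_params_b: float) -> float:
    """Approximate decode FLOPs per token, given active params in billions."""
    return 2 * active_params_b * 1e9

flash_active_b = 13.0  # V4-Flash: 13B active (of 284B total)
pro_active_b = 49.0    # V4-Pro:   49B active (of 1.6T total)

flash = flops_per_token(flash_active_b)
pro = flops_per_token(pro_active_b)

print(f"V4-Flash: {flash:.2e} FLOPs/token")
print(f"V4-Pro:   {pro:.2e} FLOPs/token")
print(f"Flash uses ~{flash / pro:.0%} of Pro's per-token compute")
```

By this estimate V4-Flash needs roughly a quarter of V4-Pro's per-token compute, which is where the "significantly cheaper and faster" claim comes from; real serving costs also depend on memory footprint, batching, and attention over long contexts.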
DeepSeek makes V4-Flash available through its API, and the open-weight release enables enterprise deployments that need cost-efficient V4 quality on their own infrastructure.