DeepSeek: DeepSeek V3.2 Exp

  • 160K context
  • $0.27/M input tokens
  • $0.40/M output tokens

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism designed to improve training and inference efficiency in long-context scenarios while maintaining output quality. Users can control reasoning behaviour with the `reasoning` `enabled` boolean; see the docs for details.
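As a sketch of how the reasoning toggle described above might be passed to an OpenAI-compatible chat completions endpoint (the exact field name, schema, and model slug here are assumptions, not confirmed API; check the provider's docs):

```python
import json

# Hypothetical request payload for an OpenAI-compatible endpoint.
# The "reasoning" object mirrors the boolean toggle described above;
# its exact schema is an assumption for illustration only.
payload = {
    "model": "deepseek/deepseek-v3.2-exp",  # assumed model slug
    "messages": [
        {"role": "user", "content": "Summarize sparse attention in one sentence."}
    ],
    "reasoning": {"enabled": True},  # set False to skip the thinking phase
}

# The payload would be POSTed as JSON to the chat completions endpoint.
print(json.dumps(payload, indent=2))
```

Whether reasoning is on or off changes latency and output-token usage, so it is worth toggling per request rather than globally.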

The model was trained under conditions aligned with V3.1-Terminus to enable direct comparison. Benchmarking shows performance roughly on par with V3.1 across reasoning, coding, and agentic tool-use tasks, with minor tradeoffs and gains depending on the domain. This release focuses on validating architectural optimizations for extended context lengths rather than advancing raw task accuracy, making it primarily a research-oriented model for exploring efficient transformer designs.
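Given the listed rates ($0.27/M input tokens, $0.40/M output tokens), per-request cost is simple arithmetic over token counts; a minimal sketch (the helper function and default rates are illustrative, taken from the pricing above):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float = 0.27, output_rate: float = 0.40) -> float:
    """Estimate request cost in dollars, given per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# e.g. a 100K-token long-context prompt with a 2K-token reply
print(round(estimate_cost(100_000, 2_000), 4))  # → 0.0278
```

Long-context workloads are dominated by the input term, which is where DSA's efficiency gains are aimed.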


Related Posts

DeepSeek V3, a mixture-of-experts model with 685B parameters, is the latest in the DeepSeek team's flagship chat model series. It builds on the DeepSeek V3 model and performs well across a wide range of tasks. ...

DeepSeek: DeepSeek V3 0324
DeepSeek
62.5K context $0.27/M input tokens $1.1/M output tokens
FREE

DeepSeek V3, a mixture-of-experts model with 685B parameters, is the latest in the DeepSeek team's flagship chat model series. It builds on the DeepSeek V3 model and performs well across a wide range of tasks. ...

DeepSeek: DeepSeek V3 0324 (free)
DeepSeek
62.5K context $0 input tokens $0 output tokens
FREE

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase ...

DeepSeek: DeepSeek V3.1 (free)
DeepSeek
159.96K context $0 input tokens $0 output tokens

1. Introduction: We present DeepSeek-V3, a strong mixture-of-experts (MoE) language model with 671B total parameters, of which 37B are activated per token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture, both thoroughly validated in DeepSeek-V2. In addition, DeepSeek-V3 pioneers an auxiliary-loss-free ...

DeepSeek V3
DeepSeek
62.5K context $0.14/M input tokens $0.28/M output tokens

DeepSeek-V3 is the DeepSeek team's latest model, building on the instruction-following and coding capabilities of previous versions. The model was pre-trained on nearly 15 trillion tokens, and reported evaluations show it outperforming other open-source models and rivaling leading closed-source models. For details, visit the DeepSeek-V3 repository. DeepSeek-V2 Chat is the conversational fine-tuned version of DeepSeek-V2, a mixture-of-experts (MoE) language model. ...

DeepSeek V3
DeepSeek
62.5K context $0.14/M input tokens $0.28/M output tokens
FREE

DeepSeek-R1 1. Introduction: We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capability. Through RL, DeepSeek-R1-Zero naturally exhibits many powerful and interesting reasoning behaviours. However, DeepSeek-R ...

DeepSeek: R1 0528 (free)
DeepSeek
160K context $0 input tokens $0 output tokens