China’s DeepSeek releases ‘intermediate’ AI model en route to next generation

(Reuters) – Chinese AI developer DeepSeek has released its latest “experimental” model, which it said was more efficient to train and better at processing long sequences of text than previous iterations of its large language models.

The Hangzhou-based company called DeepSeek-V3.2-Exp an “intermediate step toward our next-generation architecture” in a post on developer forum Hugging Face.

That architecture will likely be DeepSeek’s most important product release since V3 and R1 shocked Silicon Valley and tech investors outside China.

The V3.2-Exp model includes a mechanism called DeepSeek Sparse Attention, which the Chinese firm says can cut computing costs and boost some types of model performance. DeepSeek said in a post on X on Monday that it is cutting API prices by “50%+”.

While DeepSeek’s next-generation architecture is unlikely to roil markets the way previous versions did in January, it could still put significant pressure on domestic rivals such as Alibaba’s Qwen and U.S. counterparts such as OpenAI if it can repeat the success of DeepSeek R1 and V3.

That would require it to demonstrate high capability at a fraction of what competitors charge and spend on model training.