DeepSeek's New V3.1 Release Points To Potent New Chinese Chips Coming Soon

Chinese AI darling DeepSeek unveiled an update to its flagship large language model that the company claims is already optimized for use with a new generation of homegrown silicon.

According to DeepSeek, it trained the new V3.1 model using the UE8M0 data format, a scaling scheme built on top of the FP8 formats already supported by the likes of Nvidia.

In a WeChat comment, the org clarified that the change was made in anticipation of a new generation of silicon. "UE8M0 FP8 is designed for the next generation of domestically produced chips to be released soon," the company wrote.

Lower-precision data types offer several benefits, including reduced memory consumption and higher throughput for both inference and training. However, it's worth noting DeepSeek was already using FP8, specifically the E4M3 type. As such, the switch to UE8M0 appears to be more about compatibility than efficiency.
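To make the distinction concrete, here's a minimal Python sketch of how the two encodings decode. It assumes the common conventions for these formats — E4M3 in its "FN" variant (bias 7, no infinities, maximum 448) and UE8M0 as a pure power-of-two scale per the OCP Microscaling spec — and is illustrative only, not taken from DeepSeek's actual kernels.

```python
def decode_e4m3(byte: int) -> float:
    """Decode one E4M3 byte: 1 sign bit, 4 exponent bits, 3 mantissa bits."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0xF and mant == 0x7:       # the only NaN encoding in E4M3FN
        return float("nan")
    if exp == 0:                         # subnormal: no implicit leading 1
        return sign * (mant / 8) * 2.0 ** -6
    return sign * (1 + mant / 8) * 2.0 ** (exp - 7)

def decode_ue8m0(byte: int) -> float:
    """Decode one UE8M0 byte: unsigned, 8 exponent bits, no mantissa."""
    if byte == 0xFF:                     # reserved NaN encoding
        return float("nan")
    return 2.0 ** (byte - 127)           # a pure power-of-two scale factor

print(decode_e4m3(0x38))   # 1.0
print(decode_e4m3(0x7E))   # 448.0, the E4M3 maximum
print(decode_ue8m0(128))   # 2.0 — scales a block of values up one binade
```

The point of a mantissa-free scale like UE8M0 is that multiplying by it is just an exponent add, which is cheap to implement in hardware — one plausible reason a format choice like this would track a specific chip design.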

DeepSeek hasn't named the source of the chips its new model can use, but the AI startup has reportedly been working closely with Huawei on training and inference using its Ascend family of neural processing units (NPUs).

Huawei's Ascend 910C, which powers the CloudMatrix rack systems we looked at last month, doesn't support FP8 natively, suggesting the IT giant may have even more powerful accelerators on the way.

Last week, it was reported that DeepSeek had attempted to train its next-gen R2 model on Huawei's Ascend accelerators but struggled to make them work and reverted to using Nvidia H20 accelerators. DeepSeek is now said to be evaluating Huawei's accelerators for inference duty.

It's not clear whether the so-called R2 refers to the V3.1 model released this week or to a forthcoming model.

Not really so new

DeepSeek V3.1 isn't really a new model. It was trained from an earlier V3 checkpoint.

Despite this, the LLM does promise notable improvements. With V3.1, DeepSeek is no longer differentiating between its "thinking" and "non-thinking" models. V3.1 supports both paradigms in a single model and uses a pair of chat templates to toggle between the two. As such, the company’s chatbot interface now omits any reference to R1.

The idea of a unified model capable of reasoning and non-reasoning outputs isn't new. Alibaba attempted something like this earlier this year but abandoned the idea after finding the functionality degraded the quality of its Qwen 3 models.

At least in benchmarking, DeepSeek's V3.1 appears to have avoided that problem. Compared to V3, the point release's non-thinking model achieved significant gains across the board.

Here's how DeepSeek says its new hybrid reasoning model compares to R1


With thinking enabled, the model's gains were more modest. However, that doesn't quite tell the full story, as DeepSeek notes the model now requires far fewer thinking tokens to arrive at an answer than before, which should help cut the cost of serving it.

Speaking of tokens, DeepSeek has doubled the model's context window — which you can think of as its short-term memory — from 65,536 to 131,072 tokens. While that's a significant improvement, it still trails other Chinese models like Qwen3, which can handle million-token contexts.
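Doubling the context window roughly doubles the per-request memory needed for the KV cache, which is one reason vendors don't extend it casually. Here's a back-of-the-envelope sketch; the layer, head, and dimension numbers below are hypothetical placeholders, not DeepSeek V3.1's actual configuration.

```python
def kv_cache_bytes(tokens, layers=60, kv_heads=8, head_dim=128, dtype_bytes=1):
    """Per-request KV-cache size: keys + values, per layer, per token.
    All architecture parameters here are illustrative, not V3.1's real ones."""
    return tokens * layers * 2 * kv_heads * head_dim * dtype_bytes

old = kv_cache_bytes(65_536)    # previous context window
new = kv_cache_bytes(131_072)   # doubled context window
print(new / old)                # 2.0 — cache grows linearly with context
print(new / 2**30)              # 15.0 GiB for a full window, with these numbers
```

Real deployments shrink this further with tricks like multi-head latent attention and low-precision cache formats, but the linear scaling with token count holds regardless.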

DeepSeek also boasted of significant gains in tool and function calling capabilities crucial for agentic AI workloads where external tools and data must be retrieved on the fly.

For example, in BrowseComp, a benchmark aimed at autonomous browser-use tasks, DeepSeek V3.1 achieved a score of 30, where the May refresh of R1 managed just 8.9.

Along with access via its chatbot service and API endpoint, DeepSeek has also made the model weights for both the base and instruct-tuned models available for download on Hugging Face and ModelScope. ®
