The Chinese company DeepSeek has introduced its new open AI model, DeepSeek V3.1, with 685 billion parameters, and it quickly became popular among researchers and developers worldwide. The model appeared on the Hugging Face platform and is available for free download, which distinguishes it from the products of American industry leaders, which typically require paid access through an API.
DeepSeek V3.1 supports a context window of up to 128,000 tokens, allowing it to work with large volumes of text, such as documents hundreds of pages long. It is distributed in multiple precision formats, including BF16 and FP8, so developers can tailor it to their hardware and technical needs. The model is built on a hybrid architecture that combines chat, coding, and logical reasoning in a single model.
Testing showed that DeepSeek V3.1 achieves 71.6% on the well-known Aider benchmark, about one percentage point above Claude Opus 4's score, while being significantly cheaper to use. The community took particular note of the model's new special tokens, which enable real-time search integration and internal reasoning steps, enhancing its flexibility across a variety of tasks.
DeepSeek has dropped its separate model lines and now offers a single V3.1 version for all users. The model weighs in at about 700 GB, which demands powerful computing resources, but cloud service providers are already preparing solutions for its deployment. The openness and high quality of DeepSeek V3.1 have already shifted the balance of power among AI developers, making advanced capabilities accessible to a far wider range of users.
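The article's figures are easy to sanity-check: the ~700 GB download size follows directly from the parameter count and the precision format. The sketch below is a back-of-the-envelope estimate, not an official sizing tool; the 685-billion-parameter count and the BF16/FP8 formats come from the article, and the bytes-per-parameter values are the standard widths of those formats.

```python
# Rough estimate of DeepSeek V3.1's raw weight size from its parameter count.
# 685B parameters is the figure reported for the model; BF16 stores each
# parameter in 2 bytes, FP8 in 1 byte. Overheads (metadata, sharding) are ignored.
PARAMS = 685e9  # 685 billion parameters

def weight_size_gb(bytes_per_param: float) -> float:
    """Approximate size of the raw weights in gigabytes (decimal GB)."""
    return PARAMS * bytes_per_param / 1e9

bf16_gb = weight_size_gb(2.0)  # BF16: 2 bytes/param
fp8_gb = weight_size_gb(1.0)   # FP8:  1 byte/param

print(f"BF16: ~{bf16_gb:.0f} GB, FP8: ~{fp8_gb:.0f} GB")
```

At FP8 the estimate lands at roughly 685 GB, which matches the "about 700 GB" figure; the BF16 variant would be around twice that, which is why the lower-precision format matters for anyone planning to host the model.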