The San Francisco-based startup Deep Cogito, founded by former Google employees, has introduced Cogito v2, a family of four new large language models whose headline feature is the ability to improve their own reasoning skills over time. The models are available for download on Hugging Face, for local deployment via Unsloth, and through APIs from Together AI, Baseten, and RunPod. Users can choose between model sizes ranging from 70 billion to 671 billion parameters, in both dense and Mixture-of-Experts (MoE) variants tailored to different tasks and resource budgets.
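For teams calling the hosted endpoints, providers such as Together AI expose an OpenAI-compatible chat completions interface. The sketch below assembles a request payload for such an endpoint; the model identifier and endpoint URL are illustrative assumptions, not values confirmed by Deep Cogito, so check the provider's model catalog before use.

```python
import json

# Hypothetical model id and endpoint -- verify the exact Cogito v2
# identifiers in the provider's catalog before sending real requests.
MODEL_ID = "deepcogito/cogito-v2-preview-llama-70B"
ENDPOINT = "https://api.together.xyz/v1/chat/completions"

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-style chat completions payload."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

# The payload serializes to JSON and would be POSTed to ENDPOINT
# with an Authorization header holding the provider API key.
payload = json.dumps(build_request("Summarize Mixture-of-Experts briefly."))
print(payload[:60])
```

The same payload shape works across Together AI, Baseten, and RunPod deployments that follow the OpenAI schema; only the endpoint and model id change.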
Cogito v2-671B MoE is the flagship of the series: it combines a large parameter count with efficient expert routing, achieving high accuracy with noticeably shorter reasoning chains. An FP8-quantized version of the model is also available for developers, which simplifies deployment on less powerful hardware and reduces infrastructure costs while retaining up to 99% of the full-precision performance.
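The Mixture-of-Experts idea behind the flagship can be sketched in a few lines: a gating network scores the experts for each token, and only the top-k experts are actually evaluated, so compute scales with k rather than with the total number of experts. This is a generic illustrative sketch of sparse MoE routing, not Deep Cogito's actual router.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, gate_weights, k=2):
    """Route a scalar 'token' to its top-k experts and mix their outputs.

    experts: list of callables, each standing in for one expert network.
    gate_weights: toy linear gate, score_i = gate_weights[i] * token.
    """
    scores = [w * token for w in gate_weights]
    probs = softmax(scores)
    # Sparse activation: keep only the k highest-probability experts.
    topk = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)
    # Weighted combination of just the selected experts' outputs.
    return sum(probs[i] / norm * experts[i](token) for i in topk)

# Four toy experts, each a different scalar transformation.
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: x * x, lambda x: -x]
gate = [0.5, -0.2, 1.0, 0.1]
print(moe_forward(3.0, experts, gate, k=2))
```

With k=2 of 4 experts active, only half the expert compute runs per token, which is how MoE models keep inference cost well below what their total parameter count suggests.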
Each Cogito v2 model can not only answer a query quickly but also, when needed, “ponder” an answer at length, and these reasoning processes are fed back into the model during training. As a result, the models gradually learn to choose the shortest and most efficient logical paths, which improves performance, reduces latency, and preserves correctness even on complex tasks. Testing showed that Cogito v2-671B MoE outperforms other open models on tasks that demand strategy and logical analysis, and handles multi-step questions and mathematical problems reliably.
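The "learn the shortest efficient path" behavior can be illustrated with a toy selection rule: among several reasoning traces that reach an answer, keep the shortest one that is actually correct as the target to train on. This is a simplified illustration of the general idea, not Deep Cogito's training pipeline; the trace format and function name are invented for the example.

```python
def pick_distillation_target(traces, correct_answer):
    """From candidate (reasoning_steps, answer) traces, return the
    shortest trace whose final answer is correct, or None if none are."""
    correct = [t for t in traces if t[1] == correct_answer]
    if not correct:
        return None
    # Prefer fewer reasoning steps: shorter chains mean lower latency.
    return min(correct, key=lambda t: len(t[0]))

traces = [
    (["expand", "simplify", "check", "answer"], 42),  # correct, 4 steps
    (["guess"], 17),                                  # wrong answer
    (["factor", "answer"], 42),                       # correct, 2 steps
]
print(pick_distillation_target(traces, 42))  # the 2-step correct trace
```

Repeatedly training on such targets biases the model toward concise chains that stay correct, which matches the latency and accuracy behavior the benchmarks describe.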
Deep Cogito emphasizes that training all eight of its models, including the previous versions, cost less than $3.5 million in total, far below the budgets of other market leaders. This efficiency was achieved by training the models to skip unnecessary or erroneous reasoning steps and to form a kind of “machine intuition.” The Cogito v2 models are now available to developers, researchers, and corporate teams, and the company promises to keep all future versions openly accessible.