The startup Nous Research has introduced Hermes 4, a new line of language models focused on open interaction with the user and minimal content restrictions. Hermes 4 supports “hybrid reasoning,” which lets it switch between quick responses and detailed step-by-step reasoning. When this mode is activated, the model shows its thought process inside special tags before the final answer, making its reasoning transparent.
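The tag-wrapped reasoning described above can be separated from the final answer with a few lines of Python. This is a minimal sketch, not Nous Research's own tooling: the article does not name the tags, so the `<think>...</think>` delimiters are an assumption, and the `split_reasoning` helper and the sample string are purely illustrative.

```python
import re
from typing import Optional, Tuple

# Assumption: reasoning is wrapped in <think>...</think>.
# The article only says "special tags"; adjust the pattern if the tags differ.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(response: str) -> Tuple[Optional[str], str]:
    """Return (reasoning, answer); reasoning is None when no tag block is found."""
    match = THINK_PATTERN.search(response)
    if match is None:
        # Quick-response mode: the whole text is the answer.
        return None, response.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first tag block is treated as the final answer.
    answer = (response[:match.start()] + response[match.end():]).strip()
    return reasoning, answer


# Hypothetical model output, for illustration only.
sample = "<think>2 + 2 is 4, and 4 * 3 is 12.</think>\nThe answer is 12."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the two parts separate lets an application log or display the reasoning trace while showing end users only the final answer.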
In tests, the largest Hermes 4 model, with 405 billion parameters, scored 96.3% on the MATH-500 benchmark and 81.9% on AIME’24, a level comparable to closed systems. On the new RefusalBench test, which measures how often an AI refuses to respond to requests (a higher score means fewer refusals), Hermes 4 scored 57.1% in reasoning mode, well ahead of GPT-4o and Claude Sonnet 4.
The model was trained using two in-house systems: DataForge, which generates complex training examples, and Atropos, an open platform for reinforcement learning with thousands of specialized environments. Training the largest version of Hermes 4 took 192 Nvidia B200 GPUs and more than 70,000 hours of computation.
Hermes 4 is available for download on Hugging Face, through an API, and in the new Nous Chat interface, which supports parallel dialogues and a memory system. The model is aimed at tasks that call for flexibility, few hard restrictions, and the handling of sensitive content.
Nous Research emphasizes maximum user control and transparency and has published a detailed technical report with test results and response examples. This approach makes Hermes 4 attractive to developers and researchers who value openness and the ability to customize AI for their own needs.