A team of Microsoft researchers has introduced BitNet b1.58 2B4T, the largest 1-bit generative AI model to date; the name decodes to 2 billion parameters trained on 4 trillion tokens. The model is freely available under the MIT license, and unlike most models of its class it can run on ordinary CPUs, including Apple's M2 chip, opening the technology to users with minimal hardware resources.
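For readers who want to try it, the checkpoint is published on Hugging Face. The sketch below assumes the microsoft/bitnet-b1.58-2B-4T model id from that release and a transformers build recent enough to recognize the architecture; loading it this way demonstrates that the model runs on a plain CPU, though not the speed gains discussed later:

```python
# Minimal sketch: loading BitNet b1.58 2B4T on a CPU via Hugging Face transformers.
# Assumes the "microsoft/bitnet-b1.58-2B-4T" checkpoint and a transformers build
# that supports this architecture; check the model card for exact requirements.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "Explain what a 1-bit language model is."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```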
BitNet b1.58 2B4T takes an aggressive compression approach: every weight is stored as one of just three values, minus one, zero, or one. Three values encode about 1.58 bits of information each (log2 of 3), which is where the "1.58" in the name comes from, and this ternary structure keeps the model exceptionally frugal with memory and fast at inference compared with conventional models of similar size. Training it still required a massive amount of data: four trillion tokens, roughly the text of thirty-three million books.
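To make the three-value scheme concrete, here is a simplified NumPy sketch of absmean-style ternary quantization in the spirit of the BitNet b1.58 paper; the production kernels are fused into the matrix multiplications and pack weights far more densely than the int8 array used here:

```python
# Simplified sketch of absmean ternary quantization (per the BitNet b1.58 paper):
# scale weights by their mean absolute value, then round and clip to {-1, 0, +1}.
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-6):
    gamma = np.abs(w).mean() + eps           # per-tensor scale
    q = np.clip(np.round(w / gamma), -1, 1)  # every entry becomes -1, 0, or +1
    return q.astype(np.int8), gamma

w = np.random.randn(4, 4).astype(np.float32)
q, gamma = ternarize(w)
print(q)              # only -1, 0, and 1 appear
print(w - gamma * q)  # quantization error that training learns to absorb
```

Because every weight is minus one, zero, or one, multiplying an activation by a weight reduces to an addition, a subtraction, or skipping the value entirely, which is largely where the CPU speed and memory savings come from.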
In benchmark tests, BitNet b1.58 2B4T outperformed rivals with a similar parameter count, including Meta's Llama 3.2 1B, Google's Gemma 3 1B, and Alibaba's Qwen 2.5 1.5B. It was particularly strong on commonsense reasoning (PIQA) and grade-school math problems (GSM8K), and in some cases it ran at twice the speed of comparable models while using a fraction of the memory.
These gains come with a catch: achieving them requires Microsoft's custom inference framework, bitnet.cpp, which currently supports only a limited range of hardware (a sketch of using it appears below). GPUs, the hardware that dominates the AI field, are not yet supported, which restricts where the model can actually be deployed.
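For completeness, this is roughly what running the model through bitnet.cpp looks like. The helper script names and flags below reflect the microsoft/BitNet GitHub repository as published and should be treated as illustrative assumptions rather than a definitive interface; consult the repository for the current usage:

```python
# Illustrative sketch of driving bitnet.cpp's helper scripts from Python.
# Script names and flags (setup_env.py, run_inference.py) are assumptions based
# on the microsoft/BitNet repository and may change; check its README first.
import subprocess

# One-time setup: download the model and convert it to the framework's format.
subprocess.run(
    ["python", "setup_env.py",
     "--hf-repo", "microsoft/BitNet-b1.58-2B-4T-gguf",
     "-q", "i2_s"],
    check=True,
)

# Run CPU inference against the converted model.
subprocess.run(
    ["python", "run_inference.py",
     "-m", "models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf",
     "-p", "What is a ternary weight?",
     "-n", "64"],
    check=True,
)
```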