Mistral AI has announced the anticipated Pixtral Large, a 124-billion-parameter open-weights multimodal model. It is the next generation after Mistral Large 2, combining text and image analysis: 123 billion parameters in the multimodal decoder and 1 billion in the vision encoder. Its 128,000-token context window fits at least 30 high-resolution images.
Pixtral Large outperforms competing frontier models on MathVista, DocVQA, and ChartQA. On MathVista, which evaluates mathematical reasoning over visual data, it scores 69.4%, ahead of GPT-4o and Gemini-1.5 Pro. The model also performs strongly on multilingual OCR and chart understanding.
The updated Le Chat platform now uses Pixtral Large for document and image analysis, and also enables workflow automation with agents. New features include web search with citations, a Canvas tool for content creation, and image generation powered by Flux Pro.
Pixtral Large is available for testing via API or self-hosting, and all Le Chat features remain free in the beta version.
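For the API route, a minimal sketch along these lines should work with the official mistralai Python client; the model identifier (`pixtral-large-latest`) and the image URL here are assumptions to verify against Mistral's current documentation:

```python
import os

from mistralai import Mistral  # official Mistral Python SDK (v1.x)

# API key from the Mistral console, read from the environment.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# One text prompt plus one image URL sent to the multimodal model.
# "pixtral-large-latest" is the assumed model identifier.
response = client.chat.complete(
    model="pixtral-large-latest",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the chart in this image."},
                {"type": "image_url", "image_url": "https://example.com/chart.png"},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```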