Mistral just updated its small open source model from 3.1 to 3.2: here's why



French AI darling Mistral is keeping the new versions coming this summer.

Just days after announcing its own AI-optimized cloud service, the well-funded company has issued an update to its open source model Mistral Small, jumping from version 3.1 to 3.2-24B Instruct-2506.

The new version builds directly on Mistral Small 3.1, aiming to improve specific behaviors such as instruction following, output stability, and function calling robustness. While the overall architecture remains unchanged, the update delivers targeted refinements that show up in both internal evaluations and public benchmarks.

According to Mistral AI, Small 3.2 is better at adhering to precise instructions and reduces the likelihood of infinite or repetitive generations, a problem occasionally seen in earlier versions when handling long or ambiguous prompts.

Likewise, the function calling template has been upgraded to support more reliable tool-use scenarios, particularly in frameworks like vLLM.
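To make the tool-use improvement concrete, here is a minimal sketch of the kind of JSON tool definition a function calling template consumes, plus the validation step a robust client performs on the model's emitted call. The tool name and schema shape are hypothetical, following the common JSON-function convention; the exact format Mistral expects is documented in the model repository.

```python
import json

# Hypothetical tool definition in the common JSON-function style.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name, for illustration only
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

def is_valid_call(raw: str, tool: dict) -> bool:
    """Check that the model's emitted call parses and matches the tool schema."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False
    fn = tool["function"]
    return call.get("name") == fn["name"] and all(
        k in call.get("arguments", {}) for k in fn["parameters"]["required"]
    )

print(is_valid_call('{"name": "get_weather", "arguments": {"city": "Paris"}}',
                    get_weather_tool))  # well-formed call
```

A more reliable function calling template reduces how often this validation step has to reject malformed output.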

Meanwhile, the model can run on a single NVIDIA A100/H100 80GB GPU, which greatly opens up options for companies with tight compute resources and/or budgets.

An updated model after only 3 months

Mistral Small 3.1 was announced in March 2025 as a flagship open release in the 24B-parameter range. It offered full multimodal capabilities, multilingual understanding, and long-context processing of up to 128K tokens.

The model was explicitly positioned against proprietary peers such as GPT-4o mini, Claude 3.5 Haiku, and Gemma 3-it, and, according to Mistral, outperformed them on many tasks.

Small 3.1 also emphasized efficient deployment, with claims of inference at 150 tokens per second and support for on-device use with 32 GB of RAM.

That release came with both base and instruct checkpoints, offering flexibility for fine-tuning across domains such as legal, medical, and technical fields.

In contrast, Small 3.2 focuses on surgical improvements to behavior and reliability. It does not aim to introduce new capabilities or architectural changes. Instead, it acts as a maintenance release: cleaning up edge cases in output generation, tightening instruction compliance, and refining system-prompt interactions.

Small 3.2 vs. Small 3.1: what changed?

Instruction-following benchmarks show a small but measurable improvement. Mistral's internal accuracy rose from 82.75% in Small 3.1 to 84.78% in Small 3.2.

Likewise, performance improved significantly on external datasets such as WildBench v2 and Arena Hard v2: WildBench increased by nearly 10 percentage points, while Arena Hard more than doubled, jumping from 19.56% to 43.10%.

Internal benchmarks also suggest reduced output repetition. The rate of infinite generations dropped from 2.11% in Small 3.1 to 1.29% in Small 3.2, approaching a 2x reduction. This makes the model more reliable for developers building applications that require consistent, bounded responses.
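The two headline deltas above can be sanity-checked with quick arithmetic; a minimal sketch using only the figures quoted in this article:

```python
# Quick sanity checks on the reported benchmark deltas.
arena_31, arena_32 = 19.56, 43.10  # Arena Hard v2 scores, percent
loop_31, loop_32 = 2.11, 1.29      # infinite-generation rate, percent

arena_ratio = arena_32 / arena_31  # "more than doubled"
loop_ratio = loop_31 / loop_32     # relative drop in runaway generations

print(f"Arena Hard ratio: {arena_ratio:.2f}x")       # ~2.20x
print(f"Repetition reduction: {loop_ratio:.2f}x")    # ~1.64x
```

The Arena Hard claim checks out exactly; the repetition figure works out to roughly 1.6x, which the announcement rounds toward 2x.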

Performance across text and coding benchmarks paints a more nuanced picture. Small 3.2 posted gains on HumanEval Plus (88.99% to 92.90%), MBPP Pass@5 (74.63% to 78.33%), and SimpleQA. MMLU Pro and MATH results also improved.

Vision benchmarks remain mostly consistent, with slight fluctuations. ChartQA and DocVQA saw marginal gains, while AI2D and MathVista declined by less than two percentage points. Average vision performance dipped slightly from 81.39% in Small 3.1 to 81.00% in Small 3.2.

This aligns with Mistral's stated intent: Small 3.2 is not a model overhaul but a polish. As such, most benchmarks fall within expected variance, and some regressions appear to be trade-offs from targeted improvements elsewhere.

However, as AI power user and influencer @chatgpt21 posted on X: "It got worse on MMLU." That refers to the Massive Multitask Language Understanding benchmark, a multidisciplinary test spanning 57 subjects designed to assess broad LLM performance across domains. Indeed, Small 3.2 scored 80.50%, just below Small 3.1's 80.62%.

The open source license will make it more appealing to cost-conscious and customization-focused users

Both Small 3.1 and 3.2 are available under the Apache 2.0 license and can be accessed via the popular AI code sharing hub Hugging Face (itself a startup with roots in France and New York).

Small 3.2 is supported by frameworks such as vLLM and Transformers and requires roughly 55 GB of GPU RAM to run in bf16 or fp16 precision.
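The ~55 GB figure is roughly what the model weights alone imply; a back-of-the-envelope sketch, assuming 24B parameters at 2 bytes each:

```python
# Back-of-the-envelope GPU memory estimate for a 24B-parameter model in bf16/fp16.
params = 24e9
bytes_per_param = 2  # bf16 and fp16 both store 2 bytes per weight
weights_gib = params * bytes_per_param / 1024**3

print(f"weights alone: {weights_gib:.1f} GiB")  # ~44.7 GiB
```

The gap between the ~45 GiB of raw weights and the ~55 GB recommendation covers the KV cache, activations, and framework overhead, which is also why a single 80 GB A100/H100 suffices with room to spare.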

For developers looking to build or serve applications, system prompts and inference examples are provided in the model repository.

While Mistral Small 3.1 is already integrated into platforms like Google Cloud Vertex AI and is scheduled to land on NVIDIA NIM and Microsoft Azure, Small 3.2 currently appears limited to self-serve access via Hugging Face and direct deployment.

What enterprises should know when considering it

Mistral Small 3.2 may not shift competitive positioning in the open-weight model space, but it represents Mistral AI's commitment to iterative model improvement.

With noticeable improvements in reliability and task handling, especially around instruction accuracy and tool use, Small 3.2 offers a cleaner user experience for developers and enterprises building on the Mistral ecosystem.

The fact that it comes from a French startup compliant with EU rules and regulations such as the GDPR and the EU AI Act also makes it appealing to enterprises operating in that part of the world.

Still, for those seeking the biggest jumps in benchmark performance, Small 3.1 remains a reference point, especially given that in some cases, such as MMLU, Small 3.2 does not outperform its predecessor. That makes the update more of a stability-focused option than a pure upgrade, depending on the use case.



