Mistral AI, the prominent French artificial intelligence company, has introduced its latest model, Mistral Large 2. The new generation significantly surpasses its predecessor in several areas, including code generation, mathematical problem-solving, and multilingual support, and aims to set new standards for performance, cost efficiency, and speed in the AI landscape.
Mistral Large 2 Enhances Performance and Multilingual Capabilities
The Mistral Large 2 model boasts an impressive 128k context window and supports a wide array of languages. These include major global languages such as French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with over 80 coding languages like Python, Java, C, C++, JavaScript, and Bash. This extensive language support positions Mistral Large 2 as a versatile tool for global applications.
The model is engineered for single-node inference, designed to handle long-context applications efficiently. With 123 billion parameters, Mistral Large 2 is optimized for high throughput on a single node. It is released under the Mistral Research License, which permits usage and modification for research and non-commercial purposes. For commercial deployments requiring self-hosting, users must obtain a Mistral Commercial License.
New Frontiers in Performance and Cost Efficiency
Mistral Large 2 sets a new standard for performance relative to serving cost. The pretrained model achieved an accuracy of 84.0% on the MMLU benchmark, establishing a new point on the performance/cost Pareto frontier of open models. This result underscores Mistral AI’s commitment to pushing the boundaries of AI technology.
Advancements in Code Generation and Reasoning
Building on the success of Codestral 22B and Codestral Mamba, Mistral Large 2 has been trained on an extensive code dataset. It outperforms the previous Mistral Large model and competes closely with leading models such as GPT-4o, Claude 3 Opus, and Llama 3.1 405B. A key focus during training was enhancing the model’s reasoning abilities. Mistral Large 2 has been fine-tuned to reduce “hallucination”, the generation of plausible but incorrect or irrelevant information, so that it delivers more reliable and accurate outputs.
Furthermore, Mistral Large 2 is trained to acknowledge when it cannot provide a confident answer or when the available information is insufficient. This behavior is particularly beneficial on mathematical benchmarks, where it supports accurate problem-solving. The model’s improved instruction-following and conversational abilities are reflected in its scores on the MT-Bench, Wild Bench, and Arena Hard benchmarks. And while longer responses tend to inflate scores on such benchmarks, Mistral Large 2 was trained to generate concise outputs where possible, which matters for efficient business applications.
Multilingual Excellence and Enhanced Function Calling
Mistral Large 2 excels in handling multilingual documents, outperforming many models that focus primarily on English. It has been trained on a large multilingual dataset and performs exceptionally well in languages such as English, French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, and Hindi. Its performance on the multilingual MMLU benchmark is superior to both the previous Mistral Large and competing models like Llama 3.1 and Cohere’s Command R+.
The new model features enhanced function calling and retrieval capabilities, supporting both parallel and sequential function executions. This makes Mistral Large 2 a powerful engine for complex business applications. Users can access Mistral Large 2 on la Plateforme under the name mistral-large-2407 (version 24.07); the API model identifier is likewise mistral-large-2407, and the weights for the instruct model are hosted on Hugging Face.
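To illustrate how the function-calling support described above is typically exercised through the API, the sketch below assembles a chat request body that offers the model a single tool. This is a minimal, hedged sketch: the endpoint URL, the tools/tool_choice schema, and the get_weather tool are assumptions for illustration (modeled on common OpenAI-style chat APIs), not details confirmed by the article; only the model name mistral-large-2407 comes from the text.

```python
import json

# Assumed endpoint for Mistral's chat API; verify against official docs.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_request(user_message: str) -> dict:
    """Assemble a chat request body that exposes one callable tool."""
    return {
        "model": "mistral-large-2407",  # model name from the announcement
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Hypothetical tool, included only to illustrate the schema.
                    "name": "get_weather",
                    "description": "Return the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        # Let the model decide whether to call the tool or answer directly.
        "tool_choice": "auto",
    }

body = build_request("What is the weather in Paris?")
print(json.dumps(body, indent=2))
```

In a real deployment the body would be POSTed to the API with an authorization header; parallel function calling would then surface as multiple tool calls in a single response, which the client executes before returning results to the model.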
Consolidation and Expansion of Offerings
Mistral AI is consolidating its offerings on la Plateforme around two general-purpose models: Mistral Nemo and Mistral Large, along with two specialist models: Codestral and Embed. Older models, including Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral Mamba, and Mathstral, remain available for deployment and fine-tuning using the SDKs mistral-inference and mistral-finetune.
Fine-tuning capabilities for Mistral Large, Mistral Nemo, and Codestral are now available on la Plateforme, providing users with expanded customization options.
Global Expansion and Cloud Partnerships
Mistral AI is expanding its reach through partnerships with major cloud service providers. The company is extending its collaboration with Google Cloud Platform to offer Mistral AI’s models on Vertex AI via a Managed API. In addition to Google Cloud Platform, Mistral AI’s models are now available on Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai. This expansion ensures that Mistral Large 2 and other models are accessible to a global audience, enhancing their availability for diverse applications.
Discover more from Business-News-Today.com