mistralai/Mistral-Medium-3.5-128B · Hugging Face

Reddit r/LocalLLaMA / 4/30/2026

📰 News · Tools & Practical Usage · Models & Research

Key Points

  • Mistral AI has released Mistral Medium 3.5 128B on Hugging Face, aiming for stronger instruction following, reasoning, and coding in a single unified model.
  • The model is dense with 128B parameters, supports a context length of up to 256k, and handles multimodal input including images (with text output) and visual understanding.
  • Its Reasoning Mode lets compute (reasoning effort) be adjusted per request, so the same model covers everything from short replies to complex agentic runs.
  • For agentic use it offers native function calling and JSON output, with strengthened adherence to system prompts.
  • It is released under a Modified MIT License and is positioned to replace Mistral Medium 3.1 and Magistral (in Le Chat) as well as Devstral 2 (the coding agent in Vibe).

https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF

Mistral Medium 3.5 128B

Mistral Medium 3.5 is our first flagship merged model. It is a dense 128B model with a 256k context window, handling instruction following, reasoning, and coding in a single set of weights. Mistral Medium 3.5 replaces its predecessors Mistral Medium 3.1 and Magistral in Le Chat. It also replaces Devstral 2 in our coding agent Vibe. Concretely, expect better performance on instruct, reasoning, and coding tasks from one unified model, compared with our previously released models.

Reasoning effort is configurable per request, so the same model can answer a quick chat reply or work through a complex agentic run. We trained the vision encoder from scratch to handle variable image sizes and aspect ratios.

Find more information on our blog.

Key Features

Mistral Medium 3.5 includes the following architectural choices:

  • Dense 128B parameters.
  • 256k context length.
  • Multimodal input: Accepts both text and image input, with text output.
  • Instruct and reasoning functionality with function calling (reasoning effort configurable per request).
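Since the card states the model accepts both text and image input, a multimodal request might be shaped as below. This is a minimal sketch assuming an OpenAI-style content-parts convention; the exact schema Mistral's servers expect is an assumption, not taken from the card.

```python
import json

def build_multimodal_message(prompt: str, image_url: str) -> dict:
    """Build a single user message combining text and an image reference.

    Uses the common OpenAI-style content-parts layout; the wire format
    Mistral actually expects may differ -- check the official docs.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Model id taken from the card; endpoint/payload shape is illustrative.
payload = {
    "model": "mistralai/Mistral-Medium-3.5-128B",
    "messages": [
        build_multimodal_message("Describe this chart.",
                                 "https://example.com/chart.png"),
    ],
}
print(json.dumps(payload, indent=2))
```

The text-output-only design means responses come back as ordinary assistant messages regardless of whether an image was attached.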

Mistral Medium 3.5 offers the following capabilities:

  • Reasoning Mode: Toggle between a fast instant-reply mode and a reasoning mode, boosting performance with test-time compute when requested.
  • Vision: Analyzes images and provides insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
  • System Prompt: Strong adherence and support for system prompts.
  • Agentic: Best-in-class agentic capabilities with native function calling and JSON output.
  • Large Context Window: Supports a 256k context window.
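Native function calling means the model can emit a structured tool call instead of free text. Here is a minimal sketch of defining a tool and decoding the model's JSON arguments; the tool-schema layout follows the widely used JSON-schema convention and is an assumption, not quoted from the card.

```python
import json

# Hypothetical tool definition in the common JSON-schema style.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def parse_tool_call(raw_arguments: str) -> dict:
    """Models typically return tool arguments as a JSON string; decode
    defensively so a malformed call doesn't crash the agent loop."""
    try:
        return json.loads(raw_arguments)
    except json.JSONDecodeError:
        return {}

# What a returned tool call's arguments might look like:
args = parse_tool_call('{"city": "Paris"}')
print(args.get("city"))
```

Pairing this with the model's JSON-output mode keeps the agent loop fully machine-parseable end to end.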

We release this model under a Modified MIT License: an open-source license for both commercial and non-commercial use, with exceptions for companies with large revenue.

Recommended Settings

  • Reasoning Effort:
    • 'none' → Do not use reasoning.
    • 'high' → Use reasoning (recommended for complex prompts and agentic coding).
  • Temperature: 0.7 for reasoning_effort="high". Between 0.0 and 0.7 for reasoning_effort="none", depending on the task: lower temperatures give answers that are more to the point, while higher temperatures allow the model to be more creative. It is good practice to try different values to find what best meets your needs.
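The recommended settings above can be captured in a small helper that picks request parameters per call. The reasoning_effort values and temperatures come from the card; how a given server accepts the field, and the `creative` switch, are illustrative assumptions.

```python
def sampling_params(reasoning_effort: str, creative: bool = False) -> dict:
    """Map the card's recommended settings to request parameters.

    - 'high' -> temperature 0.7 (complex prompts, agentic coding)
    - 'none' -> 0.0-0.7 depending on task; lower = more to the point
      (the `creative` flag here is a hypothetical convenience switch)
    """
    if reasoning_effort == "high":
        temperature = 0.7
    elif reasoning_effort == "none":
        temperature = 0.7 if creative else 0.0
    else:
        raise ValueError("reasoning_effort must be 'none' or 'high'")
    return {"reasoning_effort": reasoning_effort, "temperature": temperature}

print(sampling_params("high"))  # {'reasoning_effort': 'high', 'temperature': 0.7}
```

Because effort is configurable per request, an agent can use "none" for quick intermediate replies and switch to "high" only for the steps that need deliberate reasoning.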
submitted by /u/jacek2023