Minimax 2.7 をローカルでサブエージェントとして実行する

Reddit r/LocalLLaMA / 2026/4/13

💬 オピニオンDeveloper Stack & InfrastructureSignals & Early TrendsTools & Practical UsageModels & Research

共有:

要点

Minimax 2.7 は「サブエージェント」をローカルで実行できると報告されており、Apple M3 Ultra 上で多数の並列エージェントタスクを迅速に処理します。
ユーザーは Minimax 2.7 を Opencode に接続し、バッチ化が利用可能なハードウェア性能を最大化するのに役立つことを確認します。
llama.cpp のログでは、LCP の類似性に基づくスロット選択、タスク用スロットの起動、最大で約 196k トークンまでの大規模コンテキストウィンドウと、反復的なプロンプト処理の進捗更新が示されています。
設定では、量子化モデル（llama.cpp、unsloth IQ2_XXS UD）と、一般的なサンプリング制御（top-k/top-p/min-p/typical など）を含むサンプラーチェーンが参照されています。

ローカルの Minimax 2.7 を自分の M3 Ultra にある Opencode に接続してみました。これだけ多くのエージェントを並列に動かして、ものすごく速く処理できるのはかなり感心しました！このようなバッチ処理は、ハードウェアを最大限に活用できている感じがします。

編集: もう少し詳しく

llama.cpp、unsloth IQ2_XXS UD

slot get_availabl: id 3 | task -1 | LCP の類似度によって選ばれたスロット、sim_best = 0.708 (> 0.100 thold)、f_keep = 1.000 slot launch_slot_: id 3 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> ?temp-ext -> dist slot launch_slot_: id 3 | task 2488 | 処理タスク、is_child = 0 slot update_slots: id 3 | task 2488 | 新しいプロンプト、n_ctx_slot = 196608、n_keep = 0、task.n_tokens = 49213 slot update_slots: id 3 | task 2488 | n_tokens = 34849、memory_seq_rm [34849, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 36897、batch.n_tokens = 2048、progress = 0.749741 slot update_slots: id 3 | task 2488 | n_tokens = 36897、memory_seq_rm [36897, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 38945、batch.n_tokens = 2048、progress = 0.791356 slot update_slots: id 3 | task 2488 | n_tokens = 38945、memory_seq_rm [38945, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 40993、batch.n_tokens = 2048、progress = 0.832971 slot update_slots: id 3 | task 2488 | n_tokens = 40993、memory_seq_rm [40993, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 43041、batch.n_tokens = 2048、progress = 0.874586 slot update_slots: id 3 | task 2488 | n_tokens = 43041、memory_seq_rm [43041, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 45089、batch.n_tokens = 2048、progress = 0.916201 slot update_slots: id 3 | task 2488 | n_tokens = 45089、memory_seq_rm [45089, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 47137、batch.n_tokens = 2048、progress = 0.957816 slot update_slots: id 3 | task 2488 | n_tokens = 47137、memory_seq_rm [47137, end) slot update_slots: id 3 | task 2488 | プロンプト処理の進捗、n_tokens = 49185、batch.n_tokens = 2048、progress = 0.999431 slot update_slots: id 3 | task 2488 | n_tokens = 49185、memory_seq_rm [49185, end) reasoning-budget: 有効化、budget=2147483647 tokens reasoning-budget: 無効化（自然終了） slot init_sampler: id 3 | task 2488 | init sampler、4.23 ms かかった、tokens: text = 49213, total = 49213 slot update_slots: id 3 | task 2488 | プロンプト処理完了、n_tokens = 49213、batch.n_tokens = 28 srv log_server_r: done request: POST /v1/chat/completions 200 slot print_timing: id 3 | task 2488 | prompt eval time = 72627.76 ms / 14364 tokens（1トークンあたり 5.06 ms、1秒あたり 197.78 tokens）eval time = 4712.60 ms / 118 tokens（1トークンあたり 39.94 ms、1秒あたり 25.04 tokens）total time = 77340.36 ms / 14482 tokens slot release: id 3 | task 2488 | 処理を停止: n_tokens = 49330、truncated = 0 srv update_slots: 利用可能な全スロットがアイドル状態

提出者 /u/-dysangel-
[リンク] [コメント]

Black Hat USA

AI Business

Black Hat Asia

AI Business

日本三大秘境の現場で最先端技術の活用、建機の遠隔・自律操作

日経XTECH

ヒューマノイドが建設現場にやってくる、フィジカルAIは人手不足を救うか

日経XTECH

人型ロボット、中国が圧倒的に先行日本はコア部品技術で挽回へ

日経XTECH

Minimax 2.7 をローカルでサブエージェントとして実行する

要点

関連記事

Black Hat USA

Black Hat Asia

日本三大秘境の現場で最先端技術の活用、建機の遠隔・自律操作

ヒューマノイドが建設現場にやってくる、フィジカルAIは人手不足を救うか

人型ロボット、中国が圧倒的に先行日本はコア部品技術で挽回へ

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

要点

関連記事

Black Hat USA

Black Hat Asia

日本三大秘境の現場で最先端技術の活用、建機の遠隔・自律操作

ヒューマノイドが建設現場にやってくる、フィジカルAIは人手不足を救うか

人型ロボット、中国が圧倒的に先行 日本はコア部品技術で挽回へ

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

人型ロボット、中国が圧倒的に先行日本はコア部品技術で挽回へ