AI Navigate

Mac Mini M4 32GB Local LLM Performance

Reddit r/LocalLLaMA / 3/18/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • The author posts concrete performance figures for a Mac Mini M4 with 32GB RAM running a local LLM setup.
  • The setup uses OpenClaw 2026.3.8, LM Studio 0.4.6+1, and Unsloth gpt-oss-20b-Q4_K_S.gguf.
  • All model settings were left at their defaults: GPU offload 18, CPU thread pool size 7, max concurrents 4, number of experts 4, and flash attention enabled.
  • They report a context size of 26035 tokens and, after the first prompt, 34 tokens per second with a 0.7-second time to first token.

It is hard to find any concrete performance figures so I am posting mine:

  • OpenClaw 2026.3.8
  • LM Studio 0.4.6+1
  • Unsloth gpt-oss-20b-Q4_K_S.gguf
  • Context size 26035
  • All other model settings are at the defaults (GPU offload = 18, CPU thread pool size = 7, max concurrents = 4, number of experts = 4, flash attention = on)
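For reference, here is a minimal sketch of how these settings could map onto llama-cpp-python's `Llama` constructor, which runs the same GGUF format. The parameter names follow llama-cpp-python; LM Studio's internal names may differ, and the model path is a placeholder.

```python
# Hypothetical mapping of the reported LM Studio settings onto
# llama-cpp-python parameters (an assumption, not the author's setup).
settings = {
    "model_path": "gpt-oss-20b-Q4_K_S.gguf",  # placeholder path
    "n_gpu_layers": 18,   # GPU offload
    "n_threads": 7,       # CPU thread pool size
    "n_ctx": 26035,       # context size
    "flash_attn": True,   # flash attention enabled
}

# Loading requires the model file locally, e.g.:
# from llama_cpp import Llama
# llm = Llama(**settings)

print(settings["n_ctx"])
```

Note that `max concurrents` and `number of experts` have no direct equivalent in this constructor; they are LM Studio / MoE-runtime settings.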

With this, after the first prompt I get 34 tok/s and a 0.7 s time to first token.
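To put those two numbers together: total response latency is roughly the time to first token plus the decode time for the reply. A quick estimate for a hypothetical 500-token reply (the reply length is an assumption for illustration):

```python
# Estimate end-to-end latency from the reported figures.
ttft = 0.7           # seconds to first token (reported)
rate = 34.0          # tokens per second (reported)
reply_tokens = 500   # hypothetical reply length

total_seconds = ttft + reply_tokens / rate
print(f"{total_seconds:.1f}")  # ~15.4 seconds for a 500-token reply
```

So at these speeds, typical chat-length replies land in the 5-20 second range on this hardware.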

submitted by /u/groover75