Best Local LLMs - Apr 2026

Reddit r/LocalLLaMA / 4/14/2026


Key Points

  • The post is a community “Best Local LLMs” megathread asking users to share which open-weight local models they are running and why.
  • It highlights recent releases and notable claims about performance, including Qwen3.5, Gemma4, GLM-5.1, Minimax-M2.7, and PrismML Bonsai 1-bit models.
  • The thread emphasizes realistic evaluation challenges, requesting detailed reporting of setup, tooling/frameworks, prompts, and usage patterns beyond unreliable benchmarks.
  • It restricts recommendations to open-weight models and organizes responses into application areas like general use, agentic/tool use/coding, creative writing/RP, and speciality.
  • Participants are encouraged to classify recommendations by VRAM/memory footprint tiers (S through Unlimited) to help others pick feasible models for their hardware.

We're back with another Best Local LLMs Megathread!

We have continued feasting in the months since the previous thread, with the much-anticipated release of the Qwen3.5 and Gemma4 series. If that wasn't enough, we are having some scarcely believable moments: GLM-5.1 boasting SOTA-level performance, Minimax-M2.7 being the accessible Sonnet at home, PrismML Bonsai 1-bit models that actually work, etc. Tell us what your favorites are right now!

The standard spiel:

Share what you are running right now and why. Given the nature of the beast in evaluating LLMs (untrustworthiness of benchmarks, immature tooling, intrinsic stochasticity), please be as detailed as possible in describing your setup, nature of your usage (how much, personal/professional use), tools/frameworks/prompts etc.

Rules

  1. Only open weights models

Please thread your responses under the top-level comment for each Application below to keep the thread readable

Applications

  1. General: Includes practical guidance, how-tos, encyclopedic Q&A, search engine replacement/augmentation
  2. Agentic/Agentic Coding/Tool Use/Coding
  3. Creative Writing/RP
  4. Speciality

If a category is missing, please create a top level comment under the Speciality comment

Notes

Useful breakdown of how folk are using LLMs: https://preview.redd.it/i8td7u8vcewf1.png?width=1090&format=png&auto=webp&s=423fd3fe4cea2b9d78944e521ba8a39794f37c8d

Bonus points if you break down/classify your recommendations by model memory footprint (you can, and should, be using multiple models in each size range for different tasks):

  • Unlimited: >128GB VRAM
  • XL: 64 to 128GB VRAM
  • L: 32 to 64GB VRAM
  • M: 8 to 32GB VRAM
  • S: <8GB VRAM
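If it helps to sanity-check which tier a model lands in before posting, here is a minimal sketch. The tier boundaries come from the list above; the weights-only footprint formula (parameters × bits per weight / 8) and the ~20% overhead factor for KV cache and runtime are rough rules of thumb I'm assuming, not anything stated in the thread, and real usage varies with context length and framework.

```python
def tier(vram_gb: float) -> str:
    """Map an estimated memory footprint in GB to the thread's size tiers."""
    if vram_gb < 8:
        return "S"
    if vram_gb < 32:
        return "M"
    if vram_gb < 64:
        return "L"
    if vram_gb <= 128:
        return "XL"
    return "Unlimited"


def footprint_gb(params_b: float, bits_per_weight: float,
                 overhead: float = 1.2) -> float:
    """Rough estimate: weights-only size (params * bits / 8), inflated by
    an assumed ~20% for KV cache and runtime overhead."""
    return params_b * bits_per_weight / 8 * overhead


# Example: a 70B-parameter model at 4-bit quantization
# 70 * 4 / 8 = 35 GB of weights, ~42 GB with overhead -> "L" tier
print(tier(footprint_gb(70, 4)))  # -> L
```

So a 70B model at Q4 is an L-tier recommendation, while the same model at 8-bit (~84 GB) would land in XL.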
submitted by /u/rm-rf-rm