What model would you choose for your core?

Reddit r/LocalLLaMA / 3/29/2026

💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Models & Research

Key Points

  • A user is experimenting with multiple LLM options (including Qwen, Mistral, and Gemma) on a single GPU (an RTX 5090) and wants guidance on selecting one “core brain” model for an agentic build.
  • The proposal assumes the agent framework (memory, system prompt, and tool integrations) is already implemented, making the model choice the main remaining decision.
  • They believe 32B models lack sufficient headroom to support an evolving multi-agent ecosystem and are seeking a better-performing alternative.
  • The post is essentially a community question asking which model to choose and why, tailored to local deployment constraints and an agentic architecture.

I've been experimenting lately with different models on a single-GPU 5090. I'm kinda shooting for the moon on a multi-agent experiment; I've tried Qwen variants, Mistral, Gemma, etc. If you were going to pick one model for your core agentic build, what would it be? I have the memory, system, and tools all ready to go, but I really can't decide on the best "brain" for this project. I know 32B models don't give me enough headroom to build the evolving ecosystem… what would you choose, and why? Best core brain?
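For anyone weighing the same trade-off: the headroom question mostly comes down to whether the quantized weights plus context leave slack on the card. Here's a rough back-of-envelope sketch (the 32 GB VRAM figure for the 5090 and the flat 4 GB margin for KV cache/runtime overhead are assumptions, not measurements):

```python
# Back-of-envelope VRAM estimate for a quantized LLM.
# Rule of thumb: 1B params at 8 bits/weight ~= 1 GB of weights.

def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate GB needed just for the model weights."""
    return params_billion * bits_per_weight / 8

def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float = 32, overhead_gb: float = 4) -> bool:
    """Crude fit check: reserve a flat margin (assumed 4 GB) for
    KV cache, activations, and runtime on a 32 GB card."""
    return weight_vram_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb

print(weight_vram_gb(32, 4))  # 16.0 -> a 32B model at 4-bit is ~16 GB of weights
print(fits(32, 4))            # True  -> fits with room for context
print(fits(70, 4))            # False -> ~35 GB of weights alone exceeds the card
```

By this estimate a 4-bit 32B model leaves roughly half the card free, so the "not enough headroom" concern is less about fitting one model and more about how much context and scratch memory a multi-agent setup burns per turn.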

submitted by /u/RealFangedSpectre