Submitted by /u/Responsible_Case_376
Running 8B Llama locally on Jetson Orin Nano (using only 2.5GB of memory)
Reddit r/LocalLLaMA / 3/13/2026
📰 News | Developer Stack & Infrastructure | Tools & Practical Usage | Models & Research
Key Points
- A Reddit user claims to run an 8B Llama model locally on a Jetson Orin Nano using only about 2.5 GB of memory.
- The post links to the Reddit submission and a video/demo, indicating a practical edge deployment.
- This suggests potential for running mid-size LLMs on low-memory edge devices, expanding on-device AI possibilities.
- The summary reports the claim but includes no detailed benchmarks or reproducible steps.
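
To put the 2.5 GB figure in perspective, the back-of-the-envelope arithmetic below estimates how much memory the weights of an 8-billion-parameter model occupy at a few common precisions, and what average bits per weight a 2.5 GB budget would imply. This is only a sketch under stated assumptions: the post does not say how the footprint was achieved, so the code assumes all weights are resident in memory at once, uses decimal gigabytes (1 GB = 10^9 bytes), and ignores the KV cache, activations, and runtime overhead. The function names are hypothetical and exist only for this illustration.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory needed to hold the weights alone, in decimal GB (assumption: 1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def bits_per_weight_for_budget(n_params: float, budget_gb: float) -> float:
    """Average bits per weight that would fit entirely inside the given memory budget."""
    return budget_gb * 1e9 * 8 / n_params


N_PARAMS = 8e9  # "8B" taken as exactly 8 billion parameters (assumption)

# Weight-only footprint at a few common precisions.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4), ("2-bit", 2)]:
    print(f"{label:>6}: {weight_footprint_gb(N_PARAMS, bits):.1f} GB")

# Average precision implied if all weights had to fit in 2.5 GB.
implied_bits = bits_per_weight_for_budget(N_PARAMS, 2.5)
print(f"2.5 GB budget: about {implied_bits:.2f} bits per weight")
```

Under these assumptions, a 2.5 GB budget works out to roughly 2.5 bits per weight, well below the 4-bit level at which the weights alone would take about 4 GB. Since real inference also needs room for the KV cache and activations, the reported figure would point to either very aggressive quantization or a setup in which not all weights are resident at once (for example, memory-mapped or streamed weights). The summary does not say which.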