It is Audio/Image/Video/Text -> Text
Nemotron-3-Nano-Omni-30B-A3B-Reasoning, New model?
Reddit r/LocalLLaMA / 4/29/2026
📰 News · Signals & Early Trends · Tools & Practical Usage · Models & Research
Key Points
- The post claims a new reasoning-capable multimodal model named “Nemotron-3-Nano-Omni-30B-A3B-Reasoning.”
- It describes the model as taking audio, image, video, and text inputs and producing text output.
- The excerpt points to an original BF16 release hosted on Hugging Face from NVIDIA’s Nemotron 3 lineup.
- It also provides a GGUF link via an unsloth-hosted repository, suggesting availability in a popular local deployment format.
- The only source is a Reddit submission that links to Hugging Face assets, so details beyond the links are limited in the excerpt.