Phosphene is a free desktop panel for generating video on Apple Silicon Macs. It wraps Lightricks' LTX 2.3 model running natively on Apple's MLX framework, and exposes a one-click install through Pinokio.

The differentiator is audio. LTX 2.3 generates video and audio in a single forward pass: they share the same diffusion process, so timing is tied at the frame level. Footsteps land on the correct frame. Lip movement matches dialogue. Ambient sound is conditioned on the visual content. Most other local video models (Wan, Hunyuan, Mochi) generate silent video; you add audio in post.

What it can do

Four generation modes:

- Text-to-video
- Image-to-video
- First/last-frame interpolation
- Clip extension with continuous audio
Plus prompt rewriting via a local Gemma 3 12B 4-bit text encoder. The same model that reads your prompt for the diffusion stage can also rewrite it in the format LTX 2.3 was trained on. Runs offline, takes a few seconds.
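For a sense of what a local rewrite step like this looks like mechanically, here is a minimal sketch using the mlx-lm package. This is not Phosphene's internal code; the model id, the rewrite instruction, and the example prompt are placeholders.

```python
# Illustrative sketch only, not Phosphene's internal code. Assumes the
# mlx-lm package and a community 4-bit Gemma 3 12B conversion; the model id,
# instruction text, and example prompt are placeholders.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gemma-3-12b-it-4bit")  # assumed repo id

user_prompt = "a rainy street at night, neon reflections, slow dolly shot"
instruction = (
    "Rewrite this video prompt as one detailed paragraph covering the scene, "
    "camera motion, and soundscape:\n\n" + user_prompt
)

# Gemma is instruction-tuned, so wrap the request in its chat template.
messages = [{"role": "user", "content": instruction}]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

rewritten = generate(model, tokenizer, prompt=chat_prompt, max_tokens=256)
print(rewritten)
```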
Quality tiers

Three quality levels, picked per-job: Draft, Standard, and High. High uses a two-stage, TeaCache-accelerated setup and may require an additional on-demand model download.

Hardware compatibility

Apple Silicon only. The panel detects your Mac's RAM at boot and gates features accordingly, with tiers at 32 GB, 64 GB, and 96 GB; lower tiers get shorter clips and fewer features.
This is enforced because LTX 2.3's working tensor footprint is real: there is no way to run a full 1280×704 5-second generation in less than ~30 GB of resident memory. The tier system is honest about that rather than letting users queue jobs that will just be killed by the OOM killer. Intel Macs and other platforms are not supported, and there is no port path for them, because MLX is Apple-only by design.
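To make the gating concrete, here is a minimal sketch of how RAM-tier detection could work on macOS. This is not Phosphene's actual logic; the cutoffs simply mirror the 32/64/96 GB tiers mentioned above, and what each tier unlocks here is illustrative.

```python
# Minimal sketch of RAM-based feature gating on macOS; not Phosphene's actual
# code. The tier cutoffs mirror the 32/64/96 GB tiers; the feature notes per
# tier are illustrative.
import subprocess

def physical_ram_gb() -> float:
    # hw.memsize reports installed physical memory in bytes on macOS.
    out = subprocess.check_output(["sysctl", "-n", "hw.memsize"], text=True)
    return int(out.strip()) / (1024 ** 3)

def tier(ram_gb: float) -> str:
    if ram_gb >= 96:
        return "96GB"        # everything enabled
    if ram_gb >= 64:
        return "64GB"        # mid tier
    if ram_gb >= 32:
        return "32GB"        # base tier: shorter clips, fewer features
    return "unsupported"     # below the minimum working-set requirement

if __name__ == "__main__":
    ram = physical_ram_gb()
    print(f"{ram:.0f} GB detected -> tier {tier(ram)}")
```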
Audio behavior

Audio quality is conditioned on the prompt. A visual-only prompt produces faint ambient sound, which can read as "near-silent." A prompt with explicit audio cues produces layered foreground sound; the illustrative pair of prompts below shows the difference. This is documented behavior of LTX 2.3, not a Phosphene quirk. Describe the soundscape in your prompt the same way you describe the visual.
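These two prompts are illustrative only and are not taken from the Phosphene or LTX documentation; they just show the visual-only versus audio-cued contrast described above.

```python
# Illustrative prompts, not from the Phosphene docs. The first describes only
# visuals; the second adds explicit audio cues, which is what pushes LTX 2.3
# toward layered foreground sound instead of faint ambience.
visual_only = (
    "A woman walks across a wooden bridge at dusk, handheld camera, warm light."
)

with_audio_cues = (
    "A woman walks across a wooden bridge at dusk, handheld camera, warm light. "
    "Her boots thud on the planks, a river rushes below, crickets chirp, "
    "and she hums a quiet tune."
)
```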
How it differs from existing tools

Compared to other locally-runnable video models on a Mac (Wan, Hunyuan, Mochi), the main difference is the soundtrack: those models generate silent video and leave audio to post-production, while LTX 2.3 produces synchronized audio in the same pass.

Output format

Lossless H.264 by default (yuv444p, CRF 0), so your archive is the highest fidelity the renderer can produce; web and social platforms will re-encode anyway. Override via environment variables (LTX_OUTPUT_PIX_FMT, LTX_OUTPUT_CRF) if you want yuv420p directly. The +faststart movflag is on, so the moov atom sits at the front of the file and gallery thumbnails can decode the first frame instantly without downloading the full clip.
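To see what those defaults mean in practice, here is a sketch of an equivalent ffmpeg encode. It is not the command Phosphene actually runs; the input file names are placeholders, and only the env variable names come from the post.

```python
# Sketch of an encode matching the defaults described above; not the exact
# command Phosphene runs. Input file names are placeholders.
import os
import subprocess

pix_fmt = os.environ.get("LTX_OUTPUT_PIX_FMT", "yuv444p")  # default: no chroma subsampling
crf = os.environ.get("LTX_OUTPUT_CRF", "0")                # CRF 0 is lossless in libx264

subprocess.run([
    "ffmpeg", "-y",
    "-framerate", "24", "-i", "frames_%05d.png",   # placeholder frame sequence
    "-i", "audio.wav",                             # placeholder audio track
    "-c:v", "libx264", "-crf", crf, "-pix_fmt", pix_fmt,
    "-c:a", "aac",
    "-movflags", "+faststart",                     # moov atom at the front of the file
    "out.mp4",
], check=True)
```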
Install

Search for Phosphene in Pinokio's Discover tab and click Install. Pinokio handles the venv, the Python 3.11 pin, the MLX pipeline install, codec patches, and ~31 GB of model downloads (Q4 LTX 2.3 + the Gemma text encoder). Downloads are resumable: if one is interrupted, hitting Install again picks up where it left off. Optional: run "hf auth login" in Terminal first to authenticate the Hugging Face downloads. Anonymous downloads are throttled; authenticated downloads are roughly 10× faster, which matters for the optional 25 GB Q8 model.

[ATTACH VIDEO: phosphene_hero_x.mp4]

License + credits

Phosphene panel: MIT. Built on LTX 2.3 (Lightricks), Apple's MLX, Gemma 3, and Pinokio.

Source: github.com/mrbizarro/phosphene. Issues and PRs welcome.
Phosphene: local video and audio generation for Apple Silicon, open source (LTX 2.3) [P]
Reddit r/MachineLearning / 5/1/2026
Key Points
- Phosphene is an open-source desktop app for Apple Silicon Macs that generates video using Lightricks’ LTX 2.3 model via the MLX framework, with one-click installation through Pinokio.
- A key differentiator is integrated audio generation: LTX 2.3 produces video and audio together in a single forward pass, keeping timing aligned at the frame level for events like footsteps and lip-sync.
- The tool supports multiple workflows—text-to-video, image-to-video, first/last-frame interpolation, and clip extension with continuous audio—plus local prompt rewriting using a Gemma 3 12B 4-bit encoder.
- Users can choose between Draft, Standard, and High quality tiers, with High featuring a two-stage TeaCache-accelerated setup that may require an additional on-demand model download.
- Feature availability and clip length are adapted to the user’s Mac RAM (32 GB, 64 GB, and 96 GB tiers); generation runs fully offline, and the local prompt rewrite takes only a few seconds per job.