Spent the weekend reading a local agent runtime repo. The TS-only packaging and persistent MCP ports are both very smart.

Reddit r/LocalLLaMA / 4/5/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Ideas & Deep Analysis

Key Points

  • The article describes a local agent runtime repo that supports local LLM providers like Ollama, highlighting runtime engineering work rather than launch hype.
  • It emphasizes a shift to TypeScript-only packaging: the runtime now contains the API layer, orchestration, workspace MCP hosting, and packaging, and no longer ships Python sources or Python dependencies.
  • It notes a key reliability improvement: MCP ports are persisted in SQLite with uniqueness constraints and are restored across restarts using bootstrap-time merging of prepared MCP servers.
  • The author argues that as local models mature, the main differentiator becomes “harness quality,” including packaging, sidecar lifecycle, service discovery, and persisted runtime state.
  • The post ends with a question to the community about whether others still use static port/config management or persist orchestration state.

I like reading local LLM infra repos more than launch posts, and I ended up deep in one this weekend because it supports local providers like Ollama.

Two things gave me the “okay, someone actually cared about runtime engineering” reaction.

First, the runtime path was moved fully into TypeScript. The API layer, runner orchestration, workspace MCP hosting, and packaging all live there now, and the packaged runtime no longer ships Python source or Python deps. For local/self-hosted stacks that matters more than it sounds: smaller bundle, fewer moving pieces, less cross-language drift.
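To make the single-language claim concrete, here's my own minimal sketch of that shape, not the repo's code: the API layer and sidecar supervision live in one Node process, so the packaged artifact is pure TS/JS. The script name and port are illustrative.

```ts
// Sketch only: one Node process serving the API and supervising MCP
// sidecars. "mcp-server.js" and port 8787 are illustrative placeholders.
import { createServer } from "node:http";
import { spawn, type ChildProcess } from "node:child_process";

const sidecars = new Map<string, ChildProcess>();

// Launch a workspace MCP server as a supervised child process.
function startMcpSidecar(appId: string, port: number): void {
  const child = spawn("node", ["mcp-server.js", "--port", String(port)], {
    stdio: "inherit",
  });
  sidecars.set(appId, child);
}

startMcpSidecar("workspace-mcp", 13101);

// The API layer runs in the same process: no Python interpreter,
// no cross-language IPC, one artifact to package.
createServer((req, res) => {
  res.writeHead(200, { "content-type": "application/json" });
  res.end(JSON.stringify({ sidecars: [...sidecars.keys()] }));
}).listen(8787);
```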

Second, they stopped doing hardcoded MCP port math. Ports are persisted in SQLite with UNIQUE(port) and (workspace_id, app_id) as the key, and the runner merges prepared MCP servers during bootstrap. So local sidecars come back on stable, collision-resistant ports across restarts instead of the usual 13100 + i guesswork.
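Rough sketch of that pattern as I read it, using better-sqlite3. The table and column names are my guesses from the post, not the repo's actual schema:

```ts
// Hypothetical sketch of persisted MCP port assignment: UNIQUE(port)
// plus a (workspace_id, app_id) key, as described above.
import Database from "better-sqlite3";

const db = new Database("runtime.db");

db.exec(`
  CREATE TABLE IF NOT EXISTS mcp_ports (
    workspace_id TEXT NOT NULL,
    app_id       TEXT NOT NULL,
    port         INTEGER NOT NULL UNIQUE,
    PRIMARY KEY (workspace_id, app_id)
  )
`);

// Restore a previously assigned port, or claim a fresh one.
function getOrAssignPort(workspaceId: string, appId: string): number {
  const existing = db
    .prepare("SELECT port FROM mcp_ports WHERE workspace_id = ? AND app_id = ?")
    .get(workspaceId, appId) as { port: number } | undefined;
  if (existing) return existing.port; // stable across restarts

  // Probe candidate ports; UNIQUE(port) turns a collision into a
  // constraint violation instead of a silent double-assignment.
  for (let port = 13100; port < 13200; port++) {
    try {
      db.prepare(
        "INSERT INTO mcp_ports (workspace_id, app_id, port) VALUES (?, ?, ?)"
      ).run(workspaceId, appId, port);
      return port;
    } catch {
      // port already taken; try the next candidate
    }
  }
  throw new Error("no free MCP port in range");
}
```

On bootstrap, the runner can then walk this table and merge the prepared MCP servers back into its in-memory registry, which is how sidecars come back on the same ports after a restart.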

The bigger takeaway for me is that once local models are good enough, a lot of the pain shifts from model quality to harness quality. Packaging, sidecar lifecycle, local service discovery, and runtime state are boring topics, but they decide whether a local agent stack actually feels solid.

For people here building on Ollama / llama.cpp / LM Studio + MCP, are you still doing static port/config management, or are you persisting orchestration state somewhere?

Repo if anyone wants to read through the same code:

https://github.com/holaboss-ai/holaboss-ai

submitted by /u/Hungry-Treat8953