I was testing OpenCode and Roo Code with Gemma 26B on llama.cpp yesterday for about 10 hours. I was able to make progress on my project; both solutions work. But OpenCode is kind of broken at the moment, which often leads to long prompt processing. Roo Code works correctly but has different issues (thinking takes longer; OpenCode probably has better prompts). The OpenCode problem looks unsolvable on the llama.cpp side. I need to test it with other engines to confirm that, and then I will probably have to fix it on the OpenCode side. Maybe improving Roo Code's prompts would be a better choice? My current command (after lots of experimenting) is: [link] [comments]
opencode with gemma 26B
Reddit r/LocalLLaMA / 4/20/2026
💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage
Key Points
- A user tested OpenCode and Roo Code with the Gemma 26B model using llama.cpp for about 10 hours and reported that both can help progress on a coding project.
- OpenCode currently has a major issue where prompt processing sometimes takes a long time; Roo Code behaves correctly but can be slower because the model spends longer "thinking."
- The user suspects the OpenCode problem may be difficult to fix on the llama.cpp side and plans to validate it with other inference engines.
- If confirmed, the user expects the remedy would likely involve changes on the OpenCode side, and they also consider improving Roo Code’s prompts as an alternative.
- The post includes the user’s current llama-server command with specific performance and decoding parameters (e.g., large context settings, caching, and sampling options).
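The poster's actual command sits behind the [link] and is not reproduced in this digest. For orientation, a hypothetical llama-server invocation of the kind described (large context, KV-cache reuse, explicit sampling options) might look like the sketch below; the flags are real llama.cpp options, but the model filename and all values are assumptions, not the poster's settings.

```shell
# Hypothetical llama-server command — illustrative only, not the poster's.
llama-server \
  -m gemma-26b-Q4_K_M.gguf \            # model file name is a placeholder
  -c 32768 \                            # large context window
  -ngl 99 \                             # offload all layers to the GPU
  -fa \                                 # enable flash attention
  --cache-reuse 256 \                   # reuse matching KV-cache prefixes across requests
  --temp 0.7 --top-k 64 --top-p 0.95 \  # sampling options
  --host 127.0.0.1 --port 8080
```

Agent tools like OpenCode and Roo Code resend long, slightly varying prompts on every turn, which is why cache-related flags such as `--cache-reuse` matter for the prompt-processing delays the post complains about.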

