PixelClaw: an LLM agent for image manipulation

Reddit r/artificial / 4/22/2026

Key Points

  • PixelClaw is a free and open-source LLM agent dedicated to image manipulation, combining conversational planning and tool use in one workflow.
  • The system supports multiple LLM backends for reasoning, and it includes AI-assisted image generation/editing via gpt-image.
  • It provides practical editing utilities such as background removal (rembg), pixelization (pyxelate), and additional effects like posterization and defringing using custom algorithms.
  • PixelClaw also adds multimodal capabilities, including speech-to-text (Whisper) and text-to-speech using Kokoro plus the HALO project.
  • A Raylib-based UI with features like file drag-and-drop is included, and the project offers demo videos and a GitHub repository for further exploration.

I'm making an LLM agent specialized for image processing. It combines:

  • an LLM for conversation, planning, and tool use (supports a variety of LLMs)
  • image generation/AI-based editing via gpt-image
  • background removal via rembg (several specialized models available)
  • pixelization using pyxelate
  • posterization and defringing using custom algorithms
  • speech-to-text (Whisper) and text-to-speech (Kokoro plus HALO)
  • a nice UI based on Raylib, including file drag-and-drop
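
For readers unfamiliar with posterization, here is a minimal Python sketch of the general idea: each color channel is snapped to one of a small number of evenly spaced levels. This is not PixelClaw's actual algorithm, which the post describes only as custom. The function names `posterize_channel` and `posterize` and the `levels` parameter are hypothetical, chosen for illustration.

```python
# Illustrative sketch only; not taken from the PixelClaw source.

def posterize_channel(value: int, levels: int) -> int:
    """Snap one 0-255 channel value to the nearest of `levels` evenly spaced values."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)          # distance between allowed output values
    return round(round(value / step) * step)


def posterize(pixels, levels=4):
    """Posterize an image given as rows of (R, G, B) tuples."""
    return [
        [tuple(posterize_channel(c, levels) for c in px) for px in row]
        for row in pixels
    ]


if __name__ == "__main__":
    image = [[(100, 130, 255), (12, 200, 64)]]
    print(posterize(image, levels=4))
```

A real tool would more likely work on NumPy arrays or Pillow images and might pick levels adaptively from the image's palette; the nested-list version above just keeps the sketch dependency-free.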

PixelClaw is free and open source; the code is at https://github.com/JoeStrout/PixelClaw/ and you can find more demo videos there too. While you're there, if you find it interesting, please click the star ⭐️ at the top of the page; that helps me gauge interest.

submitted by /u/JoeStrout