Multi-Agent Object Detection Framework Based on Raspberry Pi YOLO Detector and Slack-Ollama Natural Language Interface

arXiv cs.CV / 4/16/2026


Key Points

  • The paper proposes a multi-agent, edge-based object detection and tracking framework implemented on a resource-constrained Raspberry Pi platform using a YOLO-based vision agent.
  • System control and communication are handled through a local Slack channel chatbot agent paired with a locally run Ollama LLM reporting agent, enabling natural-language interaction.
  • Agent coordination is achieved via a custom event-based message exchange subsystem, positioned as an alternative to fully autonomous orchestration patterns used in other LLM agent frameworks.
  • The authors emphasize a fast-prototyping development approach enabled by generative AI, applying these principles during both the development and implementation stages.
  • Experiments analyze the practical limitations of low-cost edge testbeds for centralized multi-agent AI systems, and the paper discusses how the approach differs from designs that would require cloud resources.
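
To make the event-based coordination idea concrete, here is a minimal Python sketch of a publish/subscribe message exchange between three agents. It is an illustration under stated assumptions, not the authors' implementation: the `Event` schema, the topic names, and the `VisionAgent`, `ReportingAgent`, and `ChatAgent` classes are all hypothetical stand-ins, and no real YOLO, Slack, or Ollama calls are made.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    """A message exchanged between agents (hypothetical schema)."""
    topic: str
    payload: Dict[str, Any] = field(default_factory=dict)


class EventBus:
    """Minimal synchronous publish/subscribe hub (an assumed design)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, event: Event) -> None:
        # Iterate over a copy so handlers may subscribe during dispatch.
        for handler in list(self._handlers.get(event.topic, [])):
            handler(event)


class VisionAgent:
    """Stand-in for the YOLO agent; it only relays labels handed to it."""

    def __init__(self, bus: EventBus) -> None:
        self.bus = bus

    def on_frame(self, labels: List[str]) -> None:
        # A real agent would run the detector here; labels come from the caller.
        self.bus.publish(Event("detection", {"labels": labels}))


class ReportingAgent:
    """Stand-in for the LLM reporting agent; it counts labels, no model call."""

    def __init__(self, bus: EventBus) -> None:
        self.bus = bus
        self.counts: Dict[str, int] = defaultdict(int)
        bus.subscribe("detection", self._on_detection)
        bus.subscribe("report_request", self._on_request)

    def _on_detection(self, event: Event) -> None:
        for label in event.payload["labels"]:
            self.counts[label] += 1

    def _on_request(self, event: Event) -> None:
        summary = ", ".join(f"{k}: {v}" for k, v in sorted(self.counts.items()))
        self.bus.publish(Event("report", {"text": summary or "no detections"}))


class ChatAgent:
    """Stand-in for the Slack chatbot; it records replies instead of posting."""

    def __init__(self, bus: EventBus) -> None:
        self.bus = bus
        self.outbox: List[str] = []
        bus.subscribe("report", lambda e: self.outbox.append(e.payload["text"]))

    def on_user_message(self, text: str) -> None:
        if text.strip().lower() == "report":
            self.bus.publish(Event("report_request"))


def demo() -> List[str]:
    bus = EventBus()
    vision = VisionAgent(bus)
    ReportingAgent(bus)  # kept alive by its subscriptions on the bus
    chat = ChatAgent(bus)
    vision.on_frame(["person", "dog"])
    vision.on_frame(["person"])
    chat.on_user_message("report")
    return chat.outbox


if __name__ == "__main__":
    print(demo())
```

In this sketch the agents never call each other directly. Each one only publishes and subscribes to topics, so the control flow stays explicit in the message exchange rather than being delegated to an autonomous orchestrator.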

Abstract

The paper presents the design and prototype implementation of an edge-based object detection system within the emerging paradigm of AI agent orchestration. It goes beyond traditional design approaches by using an LLM-based natural-language interface for system control and communication, and it demonstrates in practice the integration of all system components on a single resource-constrained hardware platform. The method is based on the proposed multi-agent object detection framework, which tightly integrates different AI agents in the shared task of providing object detection and tracking capabilities. The proposed design principles highlight the fast-prototyping approach characteristic of the transformational potential of generative AI systems; these principles are applied during both the development and implementation stages. Instead of a specialized communication and control interface, the system uses a Slack channel chatbot agent and an accompanying Ollama LLM reporting agent, both run locally on the same Raspberry Pi platform alongside a dedicated YOLO-based computer vision agent that performs real-time object detection and tracking. Agent orchestration is implemented through a specially designed event-based message exchange subsystem, which represents an alternative to the fully autonomous agent orchestration and control characteristic of contemporary LLM-based frameworks such as the recently proposed OpenClaw. The experimental investigation provides insights into the limitations of low-cost testbed platforms in the design of fully centralized multi-agent AI systems. The paper also discusses how the presented approach differs from a solution that would require additional cloud-based external resources.
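
As a rough illustration of how a locally run Ollama reporting agent might turn detections into a natural-language report, the following Python sketch aggregates detected labels into a prompt and sends it to Ollama's local HTTP endpoint (`/api/generate` on the default port 11434). The prompt wording, the function names, and the `llama3.2` model name are assumptions made for this example; the abstract does not state which model or prompt the authors use.

```python
import json
import urllib.request
from collections import Counter
from typing import Dict, Iterable

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Placeholder model name (assumption); the abstract does not name the model.
MODEL_NAME = "llama3.2"


def build_prompt(labels: Iterable[str]) -> str:
    """Aggregate detected labels into a short prompt (hypothetical wording)."""
    counts = Counter(labels)
    lines = [f"- {label}: {n}" for label, n in sorted(counts.items())]
    listing = "\n".join(lines) if lines else "- (nothing detected)"
    return "Summarize these object detections in one sentence:\n" + listing


def build_request(prompt: str, model: str = MODEL_NAME) -> Dict[str, object]:
    """Build the JSON body for a single non-streaming generation request."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to a running local Ollama server and return its text."""
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


if __name__ == "__main__":
    # Requires an Ollama server running locally with the chosen model pulled.
    print(ask_ollama(build_prompt(["person", "dog", "person"])))
```

Aggregating detections into counts before calling the model keeps the prompt short, which matters on a resource-constrained board where LLM inference is likely the slowest step.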