My balcony has a pigeon problem → Built an AI tool to scare them away with YOLO + CLIP on a Chromebook 🐦

Reddit r/LocalLLaMA / 3/30/2026

💬 OpinionDeveloper Stack & InfrastructureSignals & Early TrendsTools & Practical Usage

Key Points

  • A developer built a two-stage “dove/pigeon detector and scarer” that uses YOLO for fast bird detection and CLIP for pigeon-vs-non-pigeon classification on a CPU-only Chromebook setup.
  • The system captures video via an Android phone IP webcam, runs detection (~50ms for YOLO) and classification (~80ms for CLIP), and plays a loud alarm only when a pigeon/dove is identified.
  • CLIP zero-shot classification is highlighted as sufficiently accurate for the binary dove/pigeon task, avoiding the need to fine-tune a custom classifier.
  • An optional fallback “vision LLM” path (Qwen3-VL-4B via LM Studio) remains in the code but is described as slower/overkill compared with the CLIP approach for small-device performance.
  • The project is shared as open source on GitHub and includes example logs showing detections being saved and an alert triggered with a cooldown to prevent repeated scares.

Hey, r/LocalLLaMA !

I'm back with a - let's say - interesting new AI thing: an AI dove detector and scarer

So my balcony has a pigeon problem. They sit at my bird feeder, eat everything, and poop on absolutely everything else. Sparrows, blackbirds and tits are welcome – but pigeons? No.

So naturally I did the reasonable thing and built an AI system to scare them away with a loud noise. 🔊

How it works:

It's a two-stage hybrid pipeline:

  1. YOLOv8/YOLO26 watches the camera feed (I'm using my Android phone as an IP webcam via the "IP Webcam" app) and detects if there's any bird in the frame – super fast, ~50ms on CPU
  2. Only if YOLO sees a bird, CLIP (ViT-B/32) classifies the crop: pigeon/dove or not? This runs in ~80ms on CPU with only ~400MB RAM
  3. If it's a pigeon → 🔊 loud alarm sound plays (raptor scream should work great but you can use you own sound → you'll have to save it as `alarm.wav` in the same folder as the .py file)

The Vision LLM path (via LM Studio + Qwen3-VL-4B (or what model you want)) is still in the code as an optional fallback (USE_CLIP = False) if you want to go full overkill – but honestly CLIP is so much faster and works just as well for this binary task especially on small devices without a GPU in CPU-only mode.

Stack:

  • YOLO26m/l (Ultralytics) for bird detection
  • OpenCLIP ViT-B/32 for pigeon classification
  • Optional: Qwen3-VL-4B via LM Studio (OpenAI-compatible API)
  • OpenCV + Python, runs on a Chromebook (Crostini/Linux) or any other computer
  • Android phone as IP webcam via "IP Webcam" app → you can of course also use any other camera connected to your computer like a webcam

Why not just fine-tune a classifier? I thought about it, but CLIP zero-shot works surprisingly well here – it correctly distinguishes pigeons from sparrows, blackbirds, etc...

Actual output:

SCSS[11:47:31] 🐤 1 bird(s) recognized! → Checking with CLIP... Bird #1 (YOLO: 94%) → CLIP... 🕊️ DOVE DETECTED! (Rock Dove, HIGH, 87% confidence) [Overall dove count: 1] 💾 Saved: detections/20260330_114743_*.jpg 🔊 ALERT played! ⏸️ Cooldown 30s... [11:48:21] 🐤 1 bird(s) recognized! → Checking with CLIP... Bird #1 (YOLO: 89%) → CLIP... ✅ No problem (Sparrow, LOW confidence) 

Works on CPU-only, no GPU needed. First run downloads ~450MB of model data automatically.

GitHub: https://github.com/LH-Tech-AI/dove-detector

Feedback welcome – especially if anyone has ideas for improving the CLIP label set or threshold tuning! 🐦

Built on a Chromebook. With a phone as a camera. Pointing at a picture of a pigeon on my monitor for testing. AI is wild.

submitted by /u/LH-Tech_AI
[link] [comments]