Hey, r/LocalLLaMA !
I'm back with a - let's say - interesting new AI thing: an AI dove detector and scarer
So my balcony has a pigeon problem. They sit at my bird feeder, eat everything, and poop on absolutely everything else. Sparrows, blackbirds and tits are welcome – but pigeons? No.
So naturally I did the reasonable thing and built an AI system to scare them away with a loud noise. 🔊
How it works:
It's a two-stage hybrid pipeline:
- YOLOv8/YOLO26 watches the camera feed (I'm using my Android phone as an IP webcam via the "IP Webcam" app) and detects if there's any bird in the frame – super fast, ~50ms on CPU
- Only if YOLO sees a bird, CLIP (ViT-B/32) classifies the crop: pigeon/dove or not? This runs in ~80ms on CPU with only ~400MB RAM
- If it's a pigeon → 🔊 loud alarm sound plays (raptor scream should work great but you can use you own sound → you'll have to save it as `alarm.wav` in the same folder as the .py file)
The Vision LLM path (via LM Studio + Qwen3-VL-4B (or what model you want)) is still in the code as an optional fallback (USE_CLIP = False) if you want to go full overkill – but honestly CLIP is so much faster and works just as well for this binary task especially on small devices without a GPU in CPU-only mode.
Stack:
- YOLO26m/l (Ultralytics) for bird detection
- OpenCLIP ViT-B/32 for pigeon classification
- Optional: Qwen3-VL-4B via LM Studio (OpenAI-compatible API)
- OpenCV + Python, runs on a Chromebook (Crostini/Linux) or any other computer
- Android phone as IP webcam via "IP Webcam" app → you can of course also use any other camera connected to your computer like a webcam
Why not just fine-tune a classifier? I thought about it, but CLIP zero-shot works surprisingly well here – it correctly distinguishes pigeons from sparrows, blackbirds, etc...
Actual output:
SCSS[11:47:31] 🐤 1 bird(s) recognized! → Checking with CLIP... Bird #1 (YOLO: 94%) → CLIP... 🕊️ DOVE DETECTED! (Rock Dove, HIGH, 87% confidence) [Overall dove count: 1] 💾 Saved: detections/20260330_114743_*.jpg 🔊 ALERT played! ⏸️ Cooldown 30s... [11:48:21] 🐤 1 bird(s) recognized! → Checking with CLIP... Bird #1 (YOLO: 89%) → CLIP... ✅ No problem (Sparrow, LOW confidence) Works on CPU-only, no GPU needed. First run downloads ~450MB of model data automatically.
GitHub: https://github.com/LH-Tech-AI/dove-detector
Feedback welcome – especially if anyone has ideas for improving the CLIP label set or threshold tuning! 🐦
Built on a Chromebook. With a phone as a camera. Pointing at a picture of a pigeon on my monitor for testing. AI is wild.
[link] [comments]




