Auto-creation of agent SKILLs from observing your screen via Gemma 4 for any agent to execute and self-improve

Reddit r/LocalLLaMA / 4/7/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • AgentHandover is an open-source macOS menu bar app that uses local Gemma 4 (via Ollama) to watch the user’s screen and convert repeated workflows into structured “Skill” files for agents to execute.
  • It supports both manual recording for specific tasks (Focus Record) and automatic background discovery of recurring actions (Passive Discovery), with Skills improving after each observation.
  • The system is described as a fully on-device 11-stage pipeline: screen data never leaves the machine and is encrypted at rest.
  • Skills can be integrated with one click via MCP, so any MCP-compatible agent tool (e.g., Claude Code, Cursor, OpenClaw) can use the learned Skills; a CLI is also available.
  • The project is intended to reduce the need to re-explain common processes to agents by “learning” them from the user’s behavior and refining steps, guardrails, and confidence scores over time.

AgentHandover is an open-source Mac menu bar app that watches your screen through Gemma 4 (running locally via Ollama) and turns your repeated workflows into structured Skill files that any agent can follow.

I built it because every time I wanted an agent to handle something for me I had to explain the whole process from scratch, even for stuff I do daily. So AgentHandover just watches instead. You can either hit record for a specific task (Focus Record) or let it run in the background where it starts picking up patterns after seeing you repeat something a few times (Passive Discovery).
Skills get sharper with every observation, updating steps, guardrails, and confidence scores as it learns more. The whole thing is an 11-stage pipeline running fully on-device: nothing leaves your machine, and everything is encrypted at rest. One-click agent integration through MCP means Claude Code, Cursor, OpenClaw, or anything else that speaks MCP can just pick up your Skills. There's also a CLI if you prefer the terminal.
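The post doesn't show the actual Skill file schema, but going off the fields it mentions (steps, guardrails, confidence scores that update with each observation), a learned Skill might look roughly like this. Every field name and value here is a hypothetical illustration, not taken from the repo:

```json
{
  "name": "export-weekly-report",
  "observations": 4,
  "confidence": 0.82,
  "steps": [
    { "action": "open_app",  "target": "Numbers" },
    { "action": "open_file", "target": "~/Reports/weekly.numbers" },
    { "action": "menu",      "path": ["File", "Export To", "PDF"] },
    { "action": "save_as",   "target": "~/Reports/weekly.pdf" }
  ],
  "guardrails": [
    "never overwrite an existing export without confirmation",
    "abort if the source document has unsaved changes"
  ]
}
```

The idea being that an agent consuming this via MCP gets an executable recipe plus the constraints learned from watching you, rather than a free-form prompt it has to re-derive every time.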

Simple illustrative demo in the video. Apache 2.0 licensed, repo: https://github.com/sandroandric/AgentHandover
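For reference, MCP-compatible clients register servers through a small JSON config. In Claude Code, for example, a project-level `.mcp.json` entry would look something like the sketch below; the server name and launch command are hypothetical, since the post doesn't show AgentHandover's actual setup:

```json
{
  "mcpServers": {
    "agenthandover": {
      "command": "agenthandover",
      "args": ["mcp", "serve"]
    }
  }
}
```

Once registered, the client lists the server's tools automatically, which is presumably what makes the "one-click" integration possible.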

Would love feedback on the approach, and I'm curious whether anyone has tried other local vision or OS models for screen understanding. Thanks!

submitted by /u/Objective_River_5218