Building a daily AI news brief in 325 lines of Python

Dev.to / 5/4/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • The article explains how to build a simple, self-hosted daily AI news brief in 325 lines of Python, run once a day on a low-cost VPS.
  • The pipeline has three stages—Collect, Synthesize, and Deliver—without any framework, orchestrator, or complex infrastructure.
  • For collection, it scrapes Hacker News top posts (via the unauthenticated Firebase endpoint) and filters titles against an AI/LLM keyword list, plus the arXiv cs.AI Atom feed for recent papers.
  • For synthesis, it sends the collected raw stories to an LLM using a strict prompt to generate a one-screen summary brief.
  • The deliver stage posts the generated brief to a public Telegram channel and archives the markdown output to disk, costing about half a cent per brief.

I read too many AI newsletters. Most of them are 4,000 words of sponsor copy and "thought leadership" wrapped around two actually-useful items. So I wrote a script that does the compression itself, and now I read it instead.

It's 325 lines of Python. It runs once a day on a $5 VPS. It costs about half a cent per brief. The output goes to a public Telegram channel. Here's how it's put together.

The shape of it

Three stages, no framework, no orchestrator:

  1. Collect — pull stories from a few sources. No API keys for this part.
  2. Synthesize — feed the raw stories to an LLM with a strict prompt and get back a one-screen brief.
  3. Deliver — post to a Telegram channel, archive the markdown to disk.

That's the whole thing. The interesting part is how little code each stage needs.
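Stitched together, the three stages are just a function call chain. A minimal sketch, with the stage functions passed in as plain callables — the names here are illustrative, not from the article's source:

```python
def run_pipeline(collect, synthesize, deliver):
    """Run the three stages in order and return the brief text."""
    stories = collect()          # Stage 1: gather raw stories, no API keys
    brief = synthesize(stories)  # Stage 2: LLM compresses into one screen
    deliver(brief)               # Stage 3: Telegram post + markdown archive
    return brief
```

There's deliberately no retry logic or state between runs; if a day fails, the next day's cron tick just runs the chain again.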

Collection

Two sources are doing 90% of the work:

Hacker News top stories. The Firebase API is unauthenticated and unrestricted. Pull the top story IDs, fetch each one, filter titles against a keyword list (ai, llm, model, agent, gpt, claude, etc.), keep the ones that match.

import requests

# Title filter for AI-related stories (the keyword list from above).
KEYWORDS = ("ai", "llm", "model", "agent", "gpt", "claude")

# Top 30 story IDs from the unauthenticated Firebase endpoint.
ids = requests.get(
    "https://hacker-news.firebaseio.com/v0/topstories.json", timeout=15
).json()[:30]
stories = []
for sid in ids:
    s = requests.get(
        f"https://hacker-news.firebaseio.com/v0/item/{sid}.json", timeout=15
    ).json() or {}  # deleted items come back as null
    if any(k in s.get("title", "").lower() for k in KEYWORDS):
        stories.append({"title": s["title"], "url": s.get("url", ""), "score": s.get("score", 0)})

arXiv cs.AI feed. Atom XML, also unauthenticated. The only gotcha is the namespace — element lookups silently return None if you forget the atom: prefix.

import requests
import xml.etree.ElementTree as ET

# Atom namespace map -- find() silently returns None without it.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}
url = (
    "http://export.arxiv.org/api/query"
    "?search_query=cat:cs.AI&sortBy=submittedDate&max_results=10"
)
root = ET.fromstring(requests.get(url, timeout=15).text)
papers = [{
    "title": e.find("atom:title", ATOM).text.strip(),
    "url": e.find("atom:id", ATOM).text,
    "summary": e.find("atom:summary", ATOM).text.strip()[:400],
} for e in root.findall("atom:entry", ATOM)]

That's collection. Total runtime ~3 seconds. Zero API spend.

Synthesis

This is the only paid step. I'm using OpenRouter as the gateway because it's pay-as-you-go, has every model behind one API, and I can swap models by changing a string. Currently on deepseek/deepseek-chat because it's cheap, fast, and follows formatting instructions reliably.

The prompt is the actual product. It took ~20 iterations to get right. The current shape:

You are a senior intelligence analyst writing for indie AI builders.
Compress the following raw stories into a one-screen brief with this exact structure:

# TL;DR (3 bullets, max 15 words each)
## Top Stories (3-4 items, each with "Why it matters" one-liner)
## Research Drop (2 papers, with "builder takeaway")
## Action for Indies (1 concrete action a developer can take today)
## Full Sources (numbered list, all URLs)

No emojis except section markers. No hedging language. If you're uncertain about
something, omit it rather than waffle.

Temperature 0.3, max_tokens 1200. Cost: ~$0.005/brief. Runtime: 15-20 seconds.
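The article doesn't show the synthesis call itself, but against OpenRouter's OpenAI-compatible chat completions endpoint it would look roughly like this. `PROMPT` stands for the analyst prompt above; the `OPENROUTER_API_KEY` env var name is my assumption:

```python
import os
import requests

def build_payload(prompt: str, stories_text: str) -> dict:
    """Request body using the article's parameters: deepseek/deepseek-chat,
    temperature 0.3, max_tokens 1200. Split out so it can be inspected
    without a network call."""
    return {
        "model": "deepseek/deepseek-chat",
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": stories_text},
        ],
        "temperature": 0.3,
        "max_tokens": 1200,
    }

def synthesize(prompt: str, stories_text: str) -> str:
    r = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json=build_payload(prompt, stories_text),
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]
```

Swapping models really is just editing the `"model"` string, which is the main reason to sit behind a gateway instead of a vendor SDK.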

The "Action for Indies" line is what makes the brief stick. Most AI digests tell you what happened. This one tries to tell you what to do about it. Sometimes the model nails it. Sometimes it suggests building a Chrome extension to fix something that's already a Chrome extension. The hit rate is maybe 60%.

Delivery

Telegram bot API. The whole delivery layer is one short function:

def send_to_telegram(text, chat_id, token):
    r = requests.post(
        f"https://api.telegram.org/bot{token}/sendMessage",
        json={"chat_id": chat_id, "text": text, "parse_mode": "Markdown"},
        timeout=15,
    )
    return r.ok

Two pitfalls I tripped over:

  • A bot has to be added as an administrator to a channel before it can post. Just being a member returns 403.
  • Telegram's parse_mode: Markdown is a stricter dialect than CommonMark. Unbalanced asterisks make the API reject the entire message. I sanitize by stripping any character outside a small allow-list before sending.
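A minimal sketch of that allow-list sanitizer. The exact character set below is my assumption — the article only says the list is small — but the key property is that the Markdown control characters (* _ ` [) never get through, so nothing can be unbalanced:

```python
import string

# Hypothetical allow-list: letters, digits, whitespace, and common
# punctuation, deliberately excluding Telegram Markdown controls (* _ ` [).
ALLOWED = set(string.ascii_letters + string.digits + " \n\t.,:;!?()#'\"/-%$@&+=")

def sanitize(text: str) -> str:
    """Drop every character outside the allow-list before sending."""
    return "".join(ch for ch in text if ch in ALLOWED)
```

Stripping formatting entirely is cruder than escaping it, but it means a malformed LLM output can never turn into a 400 from the Bot API.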

The brief also gets archived to data/briefs/brief_{timestamp}.md with YAML frontmatter. That's the searchable history.
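A sketch of that archive step. The frontmatter fields are a guess — the article only says the file carries YAML frontmatter:

```python
from datetime import datetime, timezone
from pathlib import Path

def archive_brief(text: str, out_dir: str = "data/briefs") -> Path:
    """Write the brief to {out_dir}/brief_{timestamp}.md with minimal
    YAML frontmatter, creating the directory if needed."""
    ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    path = Path(out_dir) / f"brief_{ts}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    front = f"---\ndate: {ts}\nsource: brief_bot\n---\n\n"
    path.write_text(front + text, encoding="utf-8")
    return path
```

Plain markdown files on disk are trivially grep-able, which is all the "searchable history" needs to be.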

Scheduling

Cron, with flock so a long run can't get clobbered by the next tick:

0 6 * * * flock -n /tmp/brief.lock python3 /home/aiuser/agentic-agenting/brief_bot.py >> brief.log 2>&1

That's the whole "infrastructure." 06:00 UTC is the EU breakfast / Asia evening overlap. Output is in the channel by 06:01.

What it costs

Per brief:

  • Compute: ~25 seconds on a $5/mo VPS, so essentially zero marginal cost
  • LLM tokens: $0.005
  • Storage: 5KB markdown

Per month: about 15 cents in API fees. The VPS runs other things too, so I don't allocate that to this.

What it doesn't do

Things I considered and rejected, in case you're tempted:

  • Personalization. Easy to add (filter sources per user, custom prompts). Adds complexity, doesn't improve the core compression. Killed.
  • Web UI. The brief is the UI. Adding a dashboard means I have to maintain a dashboard.
  • Multi-source RAG. Tried it. The brief got worse, not better — more sources means more noise to compress, and the model started hallucinating connections between unrelated stories.
  • Sentiment / "trending" scores. Looks impressive in screenshots, makes the brief less honest. Killed.

The whole project is opinionated about staying small. Every feature I've added past the original 200 lines made it worse on some axis.

The honest part

The synthesis step is an LLM. The bullets are written by a model from scraped source text. I link the original source on every item so a reader can verify before quoting. I curate the prompt and I read every brief before I trust it, but I don't write the prose.

If you object to AI-generated digests on principle, this isn't for you. If you object to bad AI-generated digests, that's fair, and the only answer is whether the output is actually good. The samples are public.

See it run

The output goes to a public Telegram channel: t.me/Agentic_Intel. Today's brief is pinned. If you want the source code or have ideas for sources I should be pulling, leave a comment.

Written by a guy who got tired of newsletter sponsor blocks.