I built AI agents that play Pokemon Showdown autonomously using free LLM APIs via tool-calling [P]

Reddit r/MachineLearning / 5/1/2026

Key Points

The article describes an AI-agent system that plays Pokémon Showdown autonomously using LLMs such as Llama 3, Qwen, and Gemma.
Rather than relying on a single prompt and response, the agents analyze the complete battle state each turn (type matchups, HP, weather, field conditions, and revealed opponent information) and decide whether to attack or switch.
Decisions are expressed as structured tool calls, and model requests are routed through LiteLLM.
The implementation uses only free API tiers from providers such as Groq, Cerebras, OpenRouter, and Google AI Studio, so it can be run locally at no inference cost.
The project includes features such as human-vs-AI, AI-vs-AI battles, support for 15+ free models, and full observability through Langfuse, with a GitHub repo shared for feedback.

I've built a system where models like Llama 3, Qwen, and Gemma play Pokémon Showdown battles autonomously. Instead of simple prompt-response, they analyze the full battle state every turn (type matchups, HP, weather, field conditions, revealed opponent info) and decide whether to attack or switch using structured tool calls.
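A minimal sketch of what "attack or switch via structured tool calls" could look like, assuming OpenAI-style function schemas. The tool names `use_move` and `switch_pokemon`, the `reasoning` field, and the `resolve_action` helper are hypothetical illustrations, not taken from the repo. The point is that a model's tool call is checked against the legal options for the turn before anything is sent to the battle.

```python
import json

# Hypothetical tool schemas in the OpenAI-style function-calling format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "use_move",
            "description": "Attack with one of the active Pokemon's available moves.",
            "parameters": {
                "type": "object",
                "properties": {
                    "move": {"type": "string", "description": "Name of the move to use."},
                    "reasoning": {"type": "string", "description": "Short justification."},
                },
                "required": ["move"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "switch_pokemon",
            "description": "Switch to a healthy Pokemon on the bench.",
            "parameters": {
                "type": "object",
                "properties": {
                    "pokemon": {"type": "string", "description": "Name of the Pokemon to switch to."},
                    "reasoning": {"type": "string", "description": "Short justification."},
                },
                "required": ["pokemon"],
            },
        },
    },
]


def resolve_action(tool_name, raw_arguments, legal_moves, legal_switches):
    """Turn a model tool call into ("move", name) or ("switch", name).

    If the call names an option that is not legal this turn (or the
    arguments are not valid JSON), fall back to the first legal option.
    """
    try:
        args = json.loads(raw_arguments)
    except (TypeError, json.JSONDecodeError):
        args = {}
    if not isinstance(args, dict):
        args = {}

    if tool_name == "use_move" and args.get("move") in legal_moves:
        return ("move", args["move"])
    if tool_name == "switch_pokemon" and args.get("pokemon") in legal_switches:
        return ("switch", args["pokemon"])

    # Fallback: prefer attacking, then switching.
    if legal_moves:
        return ("move", legal_moves[0])
    if legal_switches:
        return ("switch", legal_switches[0])
    raise ValueError("no legal action available this turn")
```

With this shape, a hallucinated move name degrades to a legal default instead of stalling the battle; a real agent might prefer to re-prompt the model with the error instead.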

The cool part: I routed everything through LiteLLM and exclusively used models with free API tiers (Groq, Cerebras, OpenRouter, Google AI Studio). So anyone can run this locally with zero inference cost.
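As a rough illustration of routing through LiteLLM across free-tier providers, the sketch below tries a list of models in order and moves on when one fails (for example on a rate limit). The model identifiers are assumptions in LiteLLM's `provider/model` form and may need updating; the `complete_with_fallback` helper is hypothetical and not the repo's code. It assumes provider keys such as `GROQ_API_KEY`, `CEREBRAS_API_KEY`, and `GEMINI_API_KEY` are set in the environment.

```python
# Assumed model identifiers; check each provider's current free-tier catalog.
CANDIDATE_MODELS = [
    "groq/llama-3.3-70b-versatile",
    "cerebras/llama3.1-8b",
    "gemini/gemini-2.0-flash",
]


def _litellm_completion(**kwargs):
    # Imported lazily so the helper below can be exercised without litellm installed.
    import litellm

    return litellm.completion(**kwargs)


def complete_with_fallback(messages, tools, models=CANDIDATE_MODELS, complete=_litellm_completion):
    """Try each model in order; return (model_name, response) from the first that succeeds."""
    errors = []
    for model in models:
        try:
            response = complete(
                model=model,
                messages=messages,
                tools=tools,
                tool_choice="auto",
            )
            return model, response
        except Exception as exc:  # e.g. rate limit or provider outage
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Because LiteLLM returns responses in the OpenAI format, the chosen action would typically be read from `response.choices[0].message.tool_calls` regardless of which provider answered.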

Features:

- Human vs. AI (play against the bot)
- AI vs. AI (pit two models against each other)
- 15+ free models supported out of the box
- Full observability via Langfuse to see the exact tool calls and reasoning per turn

https://i.redd.it/lzx2fd2s0eyg1.gif

▶️ Watch the full video demo with audio on YouTube: https://youtu.be/8ZNadmh-Sy8

GitHub Repo: https://github.com/MohamedMostafa259/pokemon-ai-agent

Would love feedback on the architecture or ideas for improving their reasoning during complex board states!

submitted by /u/ReplacementMoney2484
