Gemini vs Grok: Playing Towers of Annoy

Reddit r/artificial / 4/24/2026

Opinion, Signals & Early Trends, Tools & Practical Usage, Models & Research

Key Points

Large language models were tasked with writing a Python client to play a two-player adversarial version of Towers of Hanoi, where the villain must immediately move the same disk to an adjacent tower.
The contest design makes mistakes costly by giving the hero a move budget of 2^m + 1, only slightly above the solo optimum, so wasted moves typically lead to losses.
Models competed in a round-robin tournament with head-to-head matchups that used multiple rounds (including sudden death) and ran two simultaneous games per round with swapped hero/villain roles.
The challenge scaled in difficulty from 4 towers/3 disks up to 12 towers/7 disks, testing the models’ ability to handle increasingly complex adversarial planning.
The linked write-up reports that Gemini "aced" the challenge and covers the results across the full tournament.
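The round-robin structure described in the key points can be sketched roughly as follows. This is only an illustration: the model names are placeholders, and the `schedule` and `round_games` helpers are hypothetical, not taken from the contest code. It shows every pair of models meeting once, with each round consisting of two games in which the hero and villain roles are swapped.

```python
from itertools import combinations


def round_games(a: str, b: str) -> list[tuple[str, str]]:
    """Two simultaneous games per round, as (hero, villain) pairs with roles swapped."""
    return [(a, b), (b, a)]


def schedule(models: list[str]) -> dict[tuple[str, str], list[tuple[str, str]]]:
    """Round-robin: every unordered pair of models meets in one matchup."""
    return {pair: round_games(*pair) for pair in combinations(models, 2)}


if __name__ == "__main__":
    # Placeholder names; the actual list of participants is in the write-up.
    for pair, games in schedule(["model_a", "model_b", "model_c"]).items():
        print(pair, games)
```

The sketch covers a single round per matchup; the up-to-5-rounds and sudden-death logic is left out because the post does not spell out how a matchup is decided.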

LLMs were asked to write a Python 3.10 client that plays a two-player adversarial variant of the Towers of Hanoi.

Rules: The hero moves a disk; the villain must then immediately move that same disk to an adjacent tower, or pass if no legal move exists. The hero's budget is 2^m + 1 moves, barely more than the 2^m - 1 solo optimum, so almost any wasted move loses. The tournament is a round-robin with penalty-shootout matchups: up to 5 rounds (plus sudden death), with 2 simultaneous games per round in which the hero and villain roles are swapped. Round configurations grow from 4 towers / 3 disks up to 12 towers / 7 disks.
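To make the rules concrete, here is a minimal sketch of the villain's forced reply and the hero's budget. Several things here are assumptions not stated in the post: that m is the number of disks, that towers sit in a line so "adjacent" means index ±1 with no wrap-around, that the usual Hanoi rule (no larger disk on a smaller one) defines a legal move, and the state representation itself. The function names are hypothetical.

```python
Towers = list[list[int]]  # each tower lists disk sizes from bottom to top


def hero_budget(m: int) -> int:
    """Hero's move budget from the post: 2^m + 1 (m assumed to be the disk count)."""
    return 2 ** m + 1


def solo_optimum(m: int) -> int:
    """Solo optimum quoted in the post: 2^m - 1."""
    return 2 ** m - 1


def villain_moves(towers: Towers, disk_tower: int) -> list[int]:
    """Destinations for the disk the hero just moved, now on top of towers[disk_tower].

    Assumes linear adjacency (index - 1 and index + 1) and the standard
    rule that a disk may only land on an empty tower or a larger disk.
    An empty result means the villain has to pass.
    """
    disk = towers[disk_tower][-1]
    destinations = []
    for dst in (disk_tower - 1, disk_tower + 1):
        if 0 <= dst < len(towers):
            if not towers[dst] or towers[dst][-1] > disk:
                destinations.append(dst)
    return destinations


if __name__ == "__main__":
    # Smallest config from the post: 4 towers, 3 disks.
    print(hero_budget(3), solo_optimum(3))
    # Hero has just moved disk 1 from tower 0 to tower 2.
    print(villain_moves([[3, 2], [], [1], []], 2))
```

Under these assumptions the hero has only two spare moves over the quoted optimum at every size (for example 9 versus 7 at 3 disks), which is why each villain reply that undoes progress is so costly.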

Full writeup

submitted by /u/reditzer