Kiwi-chan Devlog #007: The Audit Never Sleeps (and Neither Does My GPU)

Dev.to / 2026/4/23


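Key points

  • Kiwi-chan development continues. The "gather wood, build a base" survival loop should be simple, but the audit is extremely strict and has become the bottleneck.
  • The item-collection rules were tightened so that the bot must walk to the center of the block where a log dropped before picking it up. The author says this makes the behavior more "forced."
  • explore_forward sometimes produces code that stalls partway or errors out immediately, so the exploration success check now verifies that the bot moved at least 5 blocks and raises an error otherwise.
  • When self-verification steps, such as checking item counts before and after an action, are missing, the safety checks often reject the output as a code generation failure. Getting the AI to internalize this remains a challenge.
  • Reasoning and code generation are costly, and hitting the Gemini API quota limit raised a serious operational concern. Still, there are signs that the "Coach Decision" is starting to prioritize base building.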

Okay, folks, buckle up. It's been a busy four hours with Kiwi-chan. We're still battling the core survival loop: get wood, build a base. Sounds simple, right? Wrong. So, so wrong.

The biggest issue continues to be the relentless auditing. I've cranked up the rules around item collection again. Apparently, Kiwi-chan was getting a little too lax about actually picking up the logs after chopping them down. The new rule (number 8 in the logs) forces it to walk to the exact center of the block where the log dropped. It's... intense.
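To make the "exact center" idea concrete, here's a minimal sketch of how that check could look. This is not Kiwi-chan's actual code: the Vec3 type, the function names, and the 0.3-block tolerance are all made up for illustration. The only solid fact it leans on is that a Minecraft block at integer coordinates has its horizontal center at +0.5 on x and z.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    """Hypothetical position type; a real bot API would supply its own."""
    x: float
    y: float
    z: float


def block_center(pos: Vec3) -> Vec3:
    """Return the horizontal center of the block that contains pos.

    x and z are snapped to the block's midpoint; y is floored to the block base.
    """
    return Vec3(
        math.floor(pos.x) + 0.5,
        float(math.floor(pos.y)),
        math.floor(pos.z) + 0.5,
    )


def at_block_center(bot_pos: Vec3, drop_pos: Vec3, tolerance: float = 0.3) -> bool:
    """True if the bot is within `tolerance` blocks (horizontally) of the
    center of the block where the item dropped. The tolerance is an assumed value."""
    center = block_center(drop_pos)
    return math.hypot(bot_pos.x - center.x, bot_pos.z - center.z) <= tolerance
```

A rule like number 8 would then boil down to "keep walking until at_block_center(...) is true, and only then count the pickup attempt."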

We've also had a lot of trouble with explore_forward. It keeps getting stuck, or generating code that immediately breaks. I've reinforced the pathfinding rules (number 3 in the logs) to ensure it actually moves a significant distance before declaring exploration successful. The bot is now explicitly checking if it moved at least 5 blocks. If not, it throws an error. No more pretending to explore!
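Roughly, the distance check looks like the sketch below. The 5-block threshold is the real one from the rule; everything else (the function name, the exception type, measuring horizontal distance only) is an assumption for illustration rather than the bot's actual implementation.

```python
import math

MIN_EXPLORE_DISTANCE = 5.0  # blocks; the threshold described in this post


class ExplorationError(Exception):
    """Raised when an exploration step did not cover enough ground (hypothetical name)."""


def verify_exploration(start, end, min_distance=MIN_EXPLORE_DISTANCE):
    """Check that the bot really moved between two (x, y, z) positions.

    Only horizontal distance is measured here, which is an assumption:
    climbing a hill in place should not count as exploring.
    Returns the distance moved, or raises ExplorationError if it is too short.
    """
    moved = math.hypot(end[0] - start[0], end[2] - start[2])
    if moved < min_distance:
        raise ExplorationError(
            f"explore_forward moved only {moved:.1f} blocks "
            f"(need at least {min_distance:.0f})"
        )
    return moved
```

The idea is to record the position before explore_forward runs, record it again afterwards, and let the exception surface as a normal failure instead of a silent "success."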

The logs show a lot of "Audit Failed" errors related to item verification. This is because I'm being extremely strict about confirming that the bot actually gained the item it was supposed to. It's frustrating, but necessary. We need to be 100% sure the actions are happening as intended.

There's also a recurring theme of code generation failures, often flagged by the safety checks. This is usually due to missing self-verification steps (checking item counts before and after actions). I'm trying to get the AI to internalize this, but it's a constant battle.
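For anyone wondering what "self-verification" means in practice, here's a minimal sketch of the before/after pattern the safety checks look for. The inventory is modeled as a plain dict and the names (run_with_item_audit, AuditFailed) are hypothetical; the real bot reads its inventory from the game, not from a dictionary.

```python
from typing import Callable, Dict


class AuditFailed(Exception):
    """Raised when an action did not produce the expected item gain (hypothetical name)."""


def run_with_item_audit(
    inventory: Dict[str, int],
    item: str,
    action: Callable[[], None],
    expected_gain: int = 1,
) -> int:
    """Run `action` and confirm the count of `item` rose by at least `expected_gain`.

    The count is read before and after the action; if the gain falls short,
    AuditFailed is raised instead of assuming the action worked.
    """
    before = inventory.get(item, 0)
    action()
    after = inventory.get(item, 0)
    gained = after - before
    if gained < expected_gain:
        raise AuditFailed(
            f"expected +{expected_gain} {item}, got {gained:+d} "
            f"(before={before}, after={after})"
        )
    return gained
```

Generated code that skips the "before" read has nothing to compare against, which is the kind of missing step that gets flagged.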

And… we hit a Gemini API quota limit. Apparently, all this reasoning and code generation is expensive! (See the log excerpt – it's a long one). This is a serious concern.

Despite the hiccups, there's steady progress. The AI is learning (slowly) to adhere to the increasingly complex rules. The recent "Coach Decision" shows it's starting to prioritize base building when it detects a lack of essential resources. That's a good sign!

It's a constant cycle of code, failure, audit, and refinement. But hey, that's AI development, right?

Call to Action: This constant debugging and API usage is melting my GPU! ☕ If you're enjoying following Kiwi-chan's journey, please consider supporting the project via https://www.buymeacoffee.com/kiwi_tech to help keep the lights on (and the GPU cool!). Every little bit helps!