SLM to control an NPC in a game world

Reddit r/LocalLLaMA / 3/29/2026



Key Points

A developer is using a small instruct/reasoning model (Qwen 1.5B) to convert player prompts into grammar-enforced JSON actions for NPCs operating in a JSON-described game world (with distances, directions, object types, and IDs).
The approach works about 80% of the time, but the remaining outputs are described as completely random despite structured input/output and reasoning requirements.
They ask whether the world state and action targets should be represented as JSON vs. natural-language encodings, and whether numeric distance values should be replaced with semantic terms like adjacent/near/far.
They seek recommendations for a better small-model family under 2B parameters to improve reliability and generation speed for this command-to-action task.

Hello everybody,

I am working on a project where the player gives commands to a creature in a structured game world, and the creature should react to the player's prompt in a sensible way.
The world is described as JSON with distances, directions, object types, and unique IDs.
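
To make the setup concrete, here is a minimal sketch of what such a world description could look like. The field names (`objects`, `id`, `type`, `distance`, `direction`) and all values are assumptions made up for illustration, not the actual schema of the project.

```python
import json

# Hypothetical world state: every field name and value here is illustrative.
world = {
    "objects": [
        {"id": "stone_01", "type": "stone", "distance": 7, "direction": "north"},
        {"id": "stone_02", "type": "stone", "distance": 3, "direction": "east"},
        {"id": "tree_01", "type": "tree", "distance": 12, "direction": "north"},
        {"id": "wolf_01", "type": "wolf", "distance": 5, "direction": "west"},
    ]
}


def closest(world, object_type):
    """Return the nearest object of the given type, or None if there is none."""
    candidates = [o for o in world["objects"] if o["type"] == object_type]
    return min(candidates, key=lambda o: o["distance"], default=None)


# The string that would be placed into the model prompt.
world_json = json.dumps(world, indent=2)
```

A helper like `closest` also gives a ground-truth answer for a prompt such as "Get the closest stone", which can be compared against what the model picks.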

The prompt examples are:

- Get the closest stone

- Go to the tree in the north

- Attack the wolf

- Get any stone but avoid the wolf

The output is grammar-enforced JSON containing an action (move, attack, idle, etc.), the target, and a reasoning string for debugging.
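
As a rough sketch of that output, and of one way to sanity-check it, the snippet below parses a model response and verifies the action and target. The keys `action`, `target`, and `reasoning`, the function `validate_action`, and the restriction to three actions are assumptions for illustration. A grammar can guarantee valid JSON, but unless it also enumerates the IDs it may still allow a target that does not exist in the world.

```python
import json

# Assumed action set; the post lists "move, attack, idle, etc".
ALLOWED_ACTIONS = {"move", "attack", "idle"}


def validate_action(raw, known_ids):
    """Parse model output and return (action_dict, error); exactly one is None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "output is not a JSON object"
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return None, f"unknown action: {action!r}"
    target = data.get("target")
    # "idle" needs no target; every other action must point at a known object.
    if action != "idle" and (not isinstance(target, str) or target not in known_ids):
        return None, f"unknown target: {target!r}"
    return data, None


raw = '{"action": "move", "target": "stone_02", "reasoning": "closest stone"}'
result, error = validate_action(raw, {"stone_01", "stone_02", "wolf_01"})
```

Rejected outputs could be logged together with the prompt, which makes it easier to see whether the random cases share a pattern.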

I tried the Qwen 1.5B instruct and reasoning models, and they work moderately well: about 80% of the time both the action and the reasoning are correct, and the rest of the time the output is completely random.

I have some general questions about working with these kinds of models:

- Is JSON input and output a good idea, or should I encode the world state and the output in natural language instead? For example: "I move to stone_01 at distance 7 in north direction"

- Are numeric values for distances good practice, or is a semantic encoding like "adjacent", "close", "near", "far" better?

- Is there a better model family for my task? I want to stay below 2B parameters if possible because of generation time and model size.
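
To illustrate the two encoding alternatives asked about above, here is a minimal sketch that maps numeric distances to semantic labels and renders each object as a natural-language line instead of JSON. The thresholds, the function names `distance_bucket` and `describe`, and the line format are arbitrary assumptions and would need tuning against the actual game scale.

```python
def distance_bucket(distance, adjacent=1, close=3, near=8):
    """Map a numeric distance to a coarse label; thresholds are assumptions."""
    if distance <= adjacent:
        return "adjacent"
    if distance <= close:
        return "close"
    if distance <= near:
        return "near"
    return "far"


def describe(obj):
    """Render one world object as a short natural-language line."""
    bucket = distance_bucket(obj["distance"])
    return f"{obj['id']}: {obj['type']}, {bucket}, to the {obj['direction']}"


# Hypothetical objects, using the same illustrative fields as a JSON world state.
objects = [
    {"id": "stone_01", "type": "stone", "distance": 7, "direction": "north"},
    {"id": "wolf_01", "type": "wolf", "distance": 1, "direction": "west"},
]
prompt_lines = [describe(o) for o in objects]
```

Running both encodings over the same set of test prompts would show whether either one actually reduces the share of random outputs.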

Thanks for any advice.

submitted by /u/DrJamgo