We added OpenSimula to our open-source dataset tool AfterImage: an experimental Python implementation of the Simula mechanism-design recipe from Davidson et al. (published in TMLR; the framing is also covered in their research blog).
Problem it targets:
For some SFT/eval setups you care less about “one prompt → one answer” and more about controlled diversity over a reasoning space: which axes of variation exist, how to jointly sample over them, and how to stress-test generations before they land in a JSONL file.
What the code actually does (high level):
LLM-built factor taxonomies → weighted mix sampling over factors → meta-prompt diversification (+ optional complexification) → requirement critic loop with refinement → optional double-critic gate for verifiable MCQ. Artifacts are a versioned opensimula/ checkpoint (manifest, taxonomy bundle, sampling strategy) plus append-only JSONL for accepted points. You can plug in the same GenerationMonitor we use elsewhere for observability into generation metrics, or bridge scenarios into ConversationGenerator via a small callback.
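To make the pipeline concrete, here is a minimal sketch of the first few stages (weighted mix sampling over a factor taxonomy, meta-prompt diversification, a critic gate, append-only JSONL output). All names here are hypothetical stand-ins, not the actual OpenSimula API; see the repo and API reference for the real interfaces, and note that a real run would call an LLM where indicated.

```python
import json
import random

# A tiny hand-written factor taxonomy: axes of variation, each with
# weighted values. In OpenSimula the taxonomy is LLM-built; this
# hard-coded dict is just an illustration.
TAXONOMY = {
    "domain":     [("physics", 0.5), ("biology", 0.3), ("history", 0.2)],
    "difficulty": [("intro", 0.4), ("intermediate", 0.4), ("advanced", 0.2)],
}

def sample_factors(taxonomy, rng):
    """Weighted mix sampling: draw one value per axis."""
    return {
        axis: rng.choices([v for v, _ in options],
                          weights=[w for _, w in options])[0]
        for axis, options in taxonomy.items()
    }

def meta_prompt(factors):
    """Diversify a base prompt by injecting the sampled factors."""
    return (f"Write a {factors['difficulty']}-level question "
            f"about {factors['domain']}.")

def critic_ok(candidate):
    """Stand-in for the requirement critic; here it only rejects
    empty outputs, where the real loop refines and re-checks."""
    return bool(candidate.strip())

def generate(n, out_path, seed=0):
    rng = random.Random(seed)
    with open(out_path, "a") as f:   # append-only JSONL for accepted points
        for _ in range(n):
            factors = sample_factors(TAXONOMY, rng)
            prompt = meta_prompt(factors)
            candidate = prompt        # a real run would call an LLM here
            if critic_ok(candidate):
                f.write(json.dumps({"factors": factors,
                                    "text": candidate}) + "\n")

generate(3, "accepted.jsonl")
```

The point of the structure is that diversity is controlled at the factor level (weights on the taxonomy) rather than by prompt wording alone, and every accepted point carries its factor assignment for later auditing.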
Hard disclaimers (please read):
- This is not a Google product and not a reference port of anything internal; it is just our read of the published recipe in the paper.
- API is explicitly experimental and may change.
- Cost and latency explode if you remove the caps on taxonomy width/depth; a wide tree means many structured LLM calls unless you tune the bounds.
- “Mechanism design” here helps structure the data-generating process; it does not magically fix model collapse or bad teacher models.
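On the cost caveat: if you assume roughly one structured call per taxonomy node (our assumption, not a measured cost model), the call count grows geometrically with branching factor and depth, which is why the width/depth caps matter:

```python
def taxonomy_calls(branching: int, depth: int) -> int:
    """Nodes in a full tree with the given branching factor and depth,
    assuming ~one structured LLM call per node (illustrative only)."""
    # 1 + b + b^2 + ... + b^depth
    return sum(branching ** d for d in range(depth + 1))

print(taxonomy_calls(3, 2))  # capped: 1 + 3 + 9 = 13 calls
print(taxonomy_calls(8, 4))  # uncapped-ish: 4681 calls
```

Real trees are rarely full and real costs depend on tokens per call, but the shape of the growth is the point.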
Code & docs:
- Repo (whole library): https://github.com/altaidevorg/afterimage
- Simula examples: https://github.com/altaidevorg/afterimage/tree/main/examples/simula
- Short overview: https://afterimage.altai.dev/opensimula.html
- API reference: https://afterimage.altai.dev/api/simula.html
I'd genuinely love to hear any feedback you have.