What it does:
- Takes natural language tasks ("copy logs to backup")
- Detects task type (atomic, repetitive, clarification)
- Generates execution plans (CLI commands + hotkeys)
- Runs entirely locally on CPU (no GPU, no cloud APIs)
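To make the flow concrete: the model takes a prompt wrapping the task and emits a typed plan. The exact template and output format used by ace aren't shown in this post, so the sketch below is a hypothetical stand-in that only illustrates the build-prompt → parse-response loop:

```python
# Hypothetical prompt/response format -- ace's real template is not
# shown in the post; this only illustrates the overall flow.
PROMPT_TEMPLATE = (
    "Task: {task}\n"
    "Respond with:\n"
    "TYPE: atomic | repetitive | clarification\n"
    "PLAN: one shell command per line\n"
)

def build_prompt(task: str) -> str:
    """Wrap a natural-language task in the (assumed) instruction template."""
    return PROMPT_TEMPLATE.format(task=task)

def parse_response(text: str) -> dict:
    """Pull the detected task type and command list out of a model reply."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    task_type, plan = "clarification", []
    for i, ln in enumerate(lines):
        if ln.upper().startswith("TYPE:"):
            task_type = ln.split(":", 1)[1].strip().lower()
        elif ln.upper().startswith("PLAN:"):
            plan = lines[i + 1:]
            break
    return {"type": task_type, "plan": plan}
```

Defaulting to `clarification` when parsing fails is a deliberate choice here: better to ask the user again than to execute a half-parsed command.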
Technical details:
- Base: Qwen2-0.5B
- Training: LoRA fine-tuning on ~1000 custom task examples
- Quantization: GGUF Q4_K_M (300MB)
- Inference: llama.cpp (3-10 sec on i3/i5)
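For reference, the convert → quantize → run pipeline looks roughly like this with stock llama.cpp tooling. Tool names (`convert_hf_to_gguf.py`, `llama-imatrix`, `llama-quantize`, `llama-cli`) are current llama.cpp conventions; all paths and the calibration file are placeholders, not the repo's actual scripts:

```shell
# 1. Export the merged fine-tune to GGUF at BF16
python convert_hf_to_gguf.py ./ace-qwen2-0.5b --outtype bf16 --outfile ace-bf16.gguf

# 2. Build an importance matrix, then quantize to Q4_K_M
./llama-imatrix -m ace-bf16.gguf -f calibration.txt -o imatrix.dat
./llama-quantize --imatrix imatrix.dat ace-bf16.gguf ace-Q4_K_M.gguf Q4_K_M

# 3. CPU inference
./llama-cli -m ace-Q4_K_M.gguf -p "copy logs to backup" -n 128
```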
Main challenges during training:
- Data quality: had to regenerate the dataset 2-3 times due to garbage examples
- Overfitting: took multiple iterations to get validation loss stable
- EOS token handling: model wouldn't stop generating until I fixed the tokenizer config
- GGUF conversion: needed BF16 dtype + imatrix quantization to get stable outputs
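For anyone hitting the same EOS problem: besides fixing the tokenizer config (which was the actual fix here), a belt-and-suspenders fallback is to cut generated text at the first stop marker. A stdlib-only sketch, assuming Qwen2's `<|im_end|>` / `<|endoftext|>` chat markers:

```python
# Stop markers from the Qwen2 chat template; a fine-tune's actual EOS
# setup may differ -- this is an illustrative fallback, not the
# tokenizer_config fix described above.
STOP_TOKENS = ("<|im_end|>", "<|endoftext|>")

def truncate_at_eos(text: str, stops=STOP_TOKENS) -> str:
    """Return text up to the earliest stop marker (unchanged if none found)."""
    cut = len(text)
    for s in stops:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()
```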
Limitations (v0.1):
- Requires full file paths (no smart file search yet)
- CPU inference only (slower on old hardware)
- Basic execution (no visual understanding)
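On the full-path limitation: a "smart file search" for v0.2 could start as a recursive glob over a few root directories before falling back to asking the user. A minimal stdlib sketch (nothing like this exists in v0.1; the function name is made up):

```python
from pathlib import Path

def resolve_file(name: str, roots=("~",)) -> list:
    """Find candidate paths for a bare filename under the given roots."""
    hits = []
    for root in roots:
        base = Path(root).expanduser()
        if base.is_dir():
            hits.extend(str(p) for p in sorted(base.rglob(name)))
    return hits
```

Multiple hits would still need disambiguation (pick newest? ask the user?), which is why it's a v0.2 problem and not a one-liner.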
Performance:
- i5 (2018+) + SSD: 3-5 seconds
- i3 (2015+) + SSD: 5-10 seconds
- Older hardware: 30-90 seconds (tested on Pentium + HDD)
Feedback welcome! Especially interested in:
- Performance on different hardware
- Edge cases that break the model
- Feature requests for v0.2
Links:
- GitHub: https://github.com/ansh0x/ace
Happy to answer questions about the training process or architecture!




