What's Changed
- mlxrunner: batch the sampler across multiple sequences by @jessegross in #15736
- tokenizer: fix multi-regex BPE offset handling by @dhiltgen in #15844
- mlx: Support NVIDIA TensorRT Model Optimizer import by @dhiltgen in #15566
- app/server: fix desktop app startup killing active ollama launch sessions by @hoyyeva in #15657
- Model support for batching by @jessegross in #15814
- New models by @dhiltgen in #15861
Full Changelog: v0.21.3-rc0...v0.22.1-rc0
