What's Changed
- bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in #15158
- model/parsers: fix gemma4 arg parsing when quoted strings contain " by @drifkin in #15254
- ggml: skip cublasGemmBatchedEx during graph reservation by @jessegross in #15301
- gemma4: enable flash attention by @dhiltgen in #15296
- ggml: fix ROCm build for cublasGemmBatchedEx reserve wrapper by @jessegross in #15305
- model/parsers: rework gemma4 tool call handling by @drifkin in #15306
Full Changelog: v0.20.0...v0.20.1
