ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA H100 80GB HBM3, compute capability 9.0, VMM: yes

| model                          |       size |     params | backend    | ngl | fa |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| qwen3 8B Q1_0_g128             |   1.07 GiB |     8.19 B | CUDA       | 999 |  1 |           pp512 |     9061.72 ± 652.18 |
| qwen3 8B Q1_0_g128             |   1.07 GiB |     8.19 B | CUDA       | 999 |  1 |           tg128 |        253.57 ± 0.35 |

build: 1179bfc82 (8194)




