[P] fastrad: GPU-native radiomics library — 25× faster than PyRadiomics, 100% IBSI-compliant, all 8 feature classes

Reddit r/MachineLearning / 3/31/2026


Key Points

  • fastrad is introduced as a PyTorch-native radiomics feature-extraction library designed to run entirely on torch.Tensor with automatic routing to GPU/CPU.
  • The library implements all eight IBSI feature classes (first-order, shape 2D/3D, GLCM, GLRLM, GLSZM, GLDM, NGTDM) as tensor operations.
  • Benchmarks on an RTX 4070 Ti report an end-to-end runtime of 0.116 s per scan versus about 2.90 s for PyRadiomics, a ~25× speedup, with per-class speedups ranging from ~13× (GLRLM) to ~49× (first-order).
  • Correctness is validated against the IBSI Phase 1 digital phantom (105 features; deviation ≤ 10⁻¹³%) and shows agreement with PyRadiomics on TCIA NSCLC CT data within ~10⁻¹¹ for all 105 features.
  • The post notes that numerical parity with PyRadiomics was hardest to achieve for the GLCM and GLSZM kernels, and it links a preprint and a GitHub repository.

PyRadiomics is the de facto standard for radiomic feature extraction, but it's CPU-only and takes ~3 seconds per scan. At scale, that's a bottleneck.

I built fastrad — a PyTorch-native library that implements all 8 IBSI feature classes (first-order, shape 2D/3D, GLCM, GLRLM, GLSZM, GLDM, NGTDM) as native tensor operations. Everything runs on torch.Tensor with transparent device routing (auto/cuda/cpu).
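To make "texture features as tensor operations" concrete, here is a minimal sketch of a gray-level co-occurrence matrix (GLCM) for a single 2D offset, built from slicing and `torch.bincount`, with an "auto" device switch. This is not fastrad's actual kernel or API: `resolve_device` and `glcm_2d` are hypothetical names, and the sketch ignores ROI masks, 3D offsets, and discretization.

```python
import torch


def resolve_device(device: str = "auto") -> torch.device:
    """Hypothetical helper: 'auto' picks CUDA when available, otherwise CPU."""
    if device == "auto":
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")
    return torch.device(device)


def glcm_2d(image: torch.Tensor, levels: int, dy: int = 0, dx: int = 1) -> torch.Tensor:
    """Symmetric, unnormalized GLCM for one non-negative offset (dy, dx).

    `image` must be a 2D integer tensor with values in [0, levels).
    """
    h, w = image.shape
    ref = image[: h - dy, : w - dx].reshape(-1)  # reference pixels
    nbr = image[dy:, dx:].reshape(-1)            # neighbours at (dy, dx)
    # Encode each (ref, nbr) pair as one index, then count all pairs at once.
    pair_index = ref * levels + nbr
    counts = torch.bincount(pair_index, minlength=levels * levels)
    counts = counts.reshape(levels, levels)
    return counts + counts.T                     # count both directions


device = resolve_device("auto")
image = torch.tensor(
    [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 2, 2, 2],
     [2, 2, 3, 3]],
    dtype=torch.int64,
    device=device,
)
glcm = glcm_2d(image, levels=4).cpu()
print(glcm)
```

The whole count is one vectorized pass with no Python loop over pixels, which is the property that lets the same code run unchanged on CPU or GPU.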

Key numbers on an RTX 4070 Ti vs PyRadiomics:

• End-to-end: 0.116s vs 2.90s → 25× speedup

• Per-class gains range from 12.9× (GLRLM) to 49.3× (first-order)

• CPU only: single-threaded fastrad is 2.63× faster than 32-thread PyRadiomics on x86, and 3.56× faster on Apple Silicon

• Peak VRAM: 654 MB
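As a quick sanity check on the end-to-end figures, the arithmetic below reproduces the 25× ratio and projects it onto a cohort. The cohort size is a made-up illustration, and the projection assumes scans are processed one after another with no I/O or preprocessing overhead.

```python
# Per-scan timings reported in the post (seconds).
PYRADIOMICS_S = 2.90
FASTRAD_S = 0.116

# Hypothetical cohort size, chosen only for illustration.
N_SCANS = 10_000


def hours_for(seconds_per_scan: float, n_scans: int) -> float:
    """Total wall-clock hours if scans are extracted sequentially."""
    return seconds_per_scan * n_scans / 3600


speedup = PYRADIOMICS_S / FASTRAD_S
print(f"speedup: {speedup:.1f}x")
print(f"PyRadiomics: {hours_for(PYRADIOMICS_S, N_SCANS):.2f} h")
print(f"fastrad:     {hours_for(FASTRAD_S, N_SCANS):.2f} h")
```

Under those assumptions, 10,000 scans drop from roughly 8 hours to roughly 20 minutes of extraction time.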

Correctness: validated against the IBSI Phase 1 digital phantom (105 features, max deviation ≤ 10⁻¹³%) and against PyRadiomics on a TCIA NSCLC CT — all 105 features agree to within 10⁻¹¹.
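A parity check of this kind can be sketched as a comparison of two feature dictionaries by maximum relative deviation. The sketch assumes the 10⁻¹¹ figure is a relative tolerance (the post does not say whether it is relative or absolute). The function name, feature names, and values are hypothetical and are not taken from fastrad, PyRadiomics, or the IBSI phantom.

```python
def max_relative_deviation(
    reference: dict[str, float], candidate: dict[str, float]
) -> tuple[str, float]:
    """Return the feature with the largest relative deviation and that deviation."""
    worst_name, worst_dev = "", 0.0
    for name, ref in reference.items():
        got = candidate[name]
        # Fall back to absolute deviation when the reference value is zero.
        denom = abs(ref) if ref != 0 else 1.0
        dev = abs(got - ref) / denom
        if dev > worst_dev:
            worst_name, worst_dev = name, dev
    return worst_name, worst_dev


# Toy values for illustration only.
reference = {"firstorder_Mean": 2.5, "glcm_Contrast": 0.75}
candidate = {"firstorder_Mean": 2.5 + 1e-13, "glcm_Contrast": 0.75}

worst_feature, worst_dev = max_relative_deviation(reference, candidate)
assert worst_dev <= 1e-11, f"{worst_feature} deviates by {worst_dev:.2e}"
print(worst_feature, f"{worst_dev:.2e}")
```

Note that a deviation quoted as a percentage, like the 10⁻¹³% phantom figure, corresponds to a relative deviation 100 times smaller.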

Happy to answer questions on the implementation — the GLCM and GLSZM kernels were the trickiest to get numerically identical to PyRadiomics.

Pre-print: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6436486

GitHub repo: https://github.com/helloerikaaa/fastrad

submitted by /u/helloerikaaa