Learning Tree-Based Models with Gradient Descent
arXiv cs.LG · March 13, 2026
📰 News · Models & Research
Key Points
- The thesis introduces a method for learning hard, axis-aligned decision trees (DTs) with gradient descent: backpropagation with a straight-through operator is applied to a dense DT representation, making the tree's structure and parameters trainable end to end.
- This enables joint optimization of all tree parameters, sidestepping the combinatorial, non-differentiable optimization problem that forces traditional DT induction methods such as CART to rely on greedy, locally optimal splits.
- Because training is ordinary gradient descent, the approach integrates with existing gradient-based ML pipelines, including multimodal and reinforcement learning tasks.
- The authors report state-of-the-art results across multiple domains: interpretable trees for small tabular datasets, competitive models for complex tabular data, and improvements in multimodal and interpretable reinforcement learning, achieved without information loss.
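The core trick in the first bullet can be sketched in a few lines: route samples through a hard (step-function) split in the forward pass, but backpropagate through a sigmoid relaxation as the straight-through surrogate. The toy data, the single-node parameterisation (softmax logits over feature axes plus a threshold and two leaf values), and all hyperparameters below are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): the true concept is an axis-aligned split, y = 1 iff x[1] > 0.3.
X = rng.uniform(-1.0, 1.0, size=(256, 2))
y = (X[:, 1] > 0.3).astype(float)

# Dense parameters of a single internal node (hypothetical minimal parameterisation):
# softmax logits select the split axis; a threshold and two leaf values complete the tree.
feat_logits = np.zeros(2)
thresh = 0.0
leaf = np.zeros(2)                     # (left prediction, right prediction)

TEMP, lr, n = 4.0, 0.5, len(X)
for _ in range(300):
    p_feat = np.exp(feat_logits) / np.exp(feat_logits).sum()
    z = X @ p_feat - thresh            # signed distance to the split
    hard = (z > 0).astype(float)       # forward pass: hard routing (step function)
    soft = 1.0 / (1.0 + np.exp(-TEMP * z))  # backward surrogate: sigmoid relaxation
    pred = leaf[0] * (1.0 - hard) + leaf[1] * hard
    err = pred - y                     # gradient of 0.5 * squared error w.r.t. pred
    # Straight-through operator: differentiate as if `hard` were `soft`.
    d_route = err * (leaf[1] - leaf[0]) * soft * (1.0 - soft) * TEMP
    g_feat = X.T @ d_route                                        # grad w.r.t. mixed feature
    g_logits = (np.diag(p_feat) - np.outer(p_feat, p_feat)) @ g_feat  # softmax Jacobian
    g_thresh = -d_route.sum()
    g_leaf = np.array([(err * (1.0 - hard)).sum(), (err * hard).sum()])
    feat_logits -= lr * g_logits / n
    thresh -= lr * g_thresh / n
    leaf -= lr * g_leaf / n

# Evaluate the learned tree with hard routing only (no relaxation at inference).
p_feat = np.exp(feat_logits) / np.exp(feat_logits).sum()
hard = ((X @ p_feat - thresh) > 0).astype(float)
pred = leaf[0] * (1.0 - hard) + leaf[1] * hard
acc = ((pred > 0.5) == (y > 0.5)).mean()
```

Because the forward pass is always hard, the model evaluated at inference time is a genuine decision tree, while the sigmoid appears only in the backward pass to provide nonzero gradients through the step function.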