xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

arXiv stat.ML / 4/7/2026


Key Points

  • The paper introduces xRFM, a new feature-learning model for tabular data that combines kernel machines with a tree structure to capture local data patterns while scaling to very large training sets.
  • It argues that tabular ML practices have lagged behind recent AI advances, where Gradient Boosted Decision Trees (GBDTs) remain dominant, and positions xRFM as a modern neural-feature-learning alternative.
  • In experiments against 31 baselines, xRFM reportedly achieves the best performance across 100 regression datasets and is competitive with the strongest existing methods across 200 classification datasets, including newer tabular foundation models and GBDTs.
  • The method also claims native interpretability via the Average Gradient Outer Product, aiming to address a common drawback of many neural tabular models.
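The Average Gradient Outer Product mentioned above is, in its general form, M = (1/n) Σᵢ ∇f(xᵢ)∇f(xᵢ)ᵀ averaged over the training data; its diagonal (and eigenvectors) indicate which feature directions the learned predictor is actually sensitive to. A minimal numpy sketch of this general definition follows, using a toy predictor and numerical gradients; it illustrates the AGOP idea only and is not xRFM's implementation.

```python
import numpy as np

# Sketch of the Average Gradient Outer Product (AGOP):
#   M = (1/n) * sum_i grad f(x_i) grad f(x_i)^T
# for a toy predictor f. Hypothetical illustration, not xRFM's code.

def predictor(x):
    # Toy model that depends only on the first two coordinates.
    return np.sin(x[0]) + 0.5 * x[1] ** 2

def numerical_gradient(f, x, eps=1e-5):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = eps
        grad[j] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

def agop(f, X):
    # Average the outer products of per-sample gradients.
    d = X.shape[1]
    M = np.zeros((d, d))
    for x in X:
        g = numerical_gradient(f, x)
        M += np.outer(g, g)
    return M / X.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
M = agop(predictor, X)

# The diagonal of M ranks feature relevance: only the first two
# coordinates influence the predictor, so the remaining entries are ~0.
print(np.round(np.diag(M), 3))
```

Because M is computed directly from the trained predictor's gradients, this kind of feature attribution comes "for free", which is what the paper means by native interpretability.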

Abstract

Inference from tabular data, collections of continuous and categorical variables organized into matrices, is a foundation for modern technology and science. Yet, in contrast to the explosive changes in the rest of AI, best practice for these predictive tasks has been relatively unchanged and is still primarily based on variations of Gradient Boosted Decision Trees (GBDTs). Very recently, there has been renewed interest in developing state-of-the-art methods for tabular data based on recent developments in neural networks and feature learning methods. In this work, we introduce xRFM, an algorithm that combines feature learning kernel machines with a tree structure to both adapt to the local structure of the data and scale to essentially unlimited amounts of training data. We show that, compared to 31 other methods, including recently introduced tabular foundation models (TabPFNv2) and GBDTs, xRFM achieves the best performance across 100 regression datasets and is competitive with the best methods across 200 classification datasets, outperforming GBDTs. Additionally, xRFM provides interpretability natively through the Average Gradient Outer Product.
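The scaling idea in the abstract, pairing a tree structure with local kernel machines, can be sketched as follows: recursively partition the training set, then fit a small kernel regressor on each leaf, so no single kernel system ever exceeds the leaf size. This is a conceptual numpy sketch under simplifying assumptions (variance-based median splits, an RBF kernel, kernel ridge at the leaves); xRFM's actual split rule, kernel, and feature-learning step differ.

```python
import numpy as np

# Conceptual "tree of local kernel machines" sketch. NOT xRFM's algorithm:
# the split rule, kernel, and leaf model are simplified stand-ins.

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared distances -> Gaussian kernel matrix.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

class LeafKernelModel:
    """Kernel ridge regression fit on one leaf's data."""
    def __init__(self, X, y, reg=1e-3):
        self.X = X
        K = rbf_kernel(X, X)
        self.alpha = np.linalg.solve(K + reg * np.eye(len(X)), y)

    def predict(self, X):
        return rbf_kernel(X, self.X) @ self.alpha

class TreeOfKernels:
    """Recursively split data; each leaf holds a small kernel machine."""
    def __init__(self, X, y, max_leaf=128):
        if len(X) <= max_leaf:
            self.leaf = LeafKernelModel(X, y)
        else:
            self.leaf = None
            # Split on the highest-variance feature at its median.
            self.dim = int(np.argmax(X.var(axis=0)))
            self.thresh = np.median(X[:, self.dim])
            mask = X[:, self.dim] <= self.thresh
            self.left = TreeOfKernels(X[mask], y[mask], max_leaf)
            self.right = TreeOfKernels(X[~mask], y[~mask], max_leaf)

    def predict(self, X):
        if self.leaf is not None:
            return self.leaf.predict(X)
        out = np.empty(len(X))
        mask = X[:, self.dim] <= self.thresh
        if mask.any():
            out[mask] = self.left.predict(X[mask])
        if (~mask).any():
            out[~mask] = self.right.predict(X[~mask])
        return out

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(1000, 3))
y = np.sin(X[:, 0]) * X[:, 1]
model = TreeOfKernels(X, y, max_leaf=128)
mse = np.mean((model.predict(X) - y) ** 2)
print(f"train MSE: {mse:.4f}")
```

The payoff is computational: solving one kernel system on all n points costs O(n³), while k leaves of size n/k cost k·O((n/k)³), which is why the tree lets kernel methods reach very large training sets.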