SCOPE-FE: Structured Control of Operator and Pairwise Exploration for Feature Engineering

arXiv cs.LG / 5/1/2026

Key Points

The paper proposes SCOPE-FE, a structured search-space control framework to make automatic feature engineering for tabular data more efficient as dimensionality increases.
It addresses combinatorial explosion from operator-feature combinations by jointly regulating both the operator space and the feature-pair candidate space before generating features.
OperatorProbing estimates operator utility on the specific dataset and removes low-contribution operators in advance to shrink the search space.
FeatureClustering uses spectral embedding and fuzzy c-means clustering to group related features, limiting feature-pair combinations to within clusters.
A ReliabilityScoring mechanism uses variance across subsamples to stabilize pruning decisions.
Experiments on ten benchmark datasets show substantially reduced feature engineering time with competitive predictive performance; the efficiency gains are most pronounced on high-dimensional datasets.
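
To make the search-space argument concrete, here is a minimal counting sketch, not the paper's implementation. It assumes binary operators applied to unordered feature pairs and hard cluster assignments (fuzzy c-means allows soft membership, so real counts would differ). The function names and the example numbers (100 features, 10 operators, 4 equal clusters, 6 operators surviving pruning) are hypothetical.

```python
from math import comb


def count_pair_candidates(n_features, n_operators):
    """Candidates when every binary operator is applied to every unordered feature pair."""
    return n_operators * comb(n_features, 2)


def count_clustered_candidates(cluster_sizes, n_operators):
    """Candidates when pairs are only formed inside each cluster.

    Assumption: each feature belongs to exactly one cluster (hard assignment).
    """
    return n_operators * sum(comb(size, 2) for size in cluster_sizes)


# Hypothetical dataset: 100 features, 10 binary operators.
full = count_pair_candidates(100, 10)

# Hypothetical pruning result: 6 operators kept, 4 clusters of 25 features each.
pruned = count_clustered_candidates([25, 25, 25, 25], 6)

print(full, pruned)  # 49500 7200
```

In this toy setting, pruning operators and restricting pairs to clusters together cut the candidate count by roughly a factor of seven before any feature is generated.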

Abstract

Automatic feature engineering is an effective approach for improving predictive performance in tabular learning. However, expand-and-reduce methods, such as OpenFE, become increasingly computationally expensive as the input dimensionality grows. This limitation arises primarily from the combinatorial explosion of candidate features generated through operator-feature combinations. To address this issue, we propose SCOPE-FE, a structured search space control framework that improves efficiency by reducing the candidate space prior to feature generation. SCOPE-FE jointly regulates two major sources of combinatorial growth: the operator space and feature-pair space. First, OperatorProbing estimates the dataset-specific utility of candidate operators and eliminates low-contribution operators in advance. Second, FeatureClustering employs spectral embedding and fuzzy c-means clustering to group structurally related features, thereby restricting candidate generation to relevant within-cluster combinations. In addition, we introduce ReliabilityScoring, which incorporates variance across subsamples to stabilize pruning decisions. Experiments on ten benchmark datasets demonstrate that SCOPE-FE substantially reduces feature engineering time while maintaining competitive predictive performance relative to existing baselines. The efficiency gains are particularly pronounced for high-dimensional datasets. These results indicate that structured control of the search space is an effective strategy for scalable automatic feature engineering. The code will be made publicly available upon acceptance.
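
The abstract does not give the ReliabilityScoring formula. As an illustration only, one common way to fold subsample variance into a pruning decision is to penalize the mean estimated gain by its standard deviation. The sketch below assumes that form; `reliability_score`, `prune_operators`, `penalty`, `threshold`, and the gain values are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def reliability_score(gains, penalty=1.0):
    """Mean gain across subsamples, penalized by its spread.

    Assumption: a mean-minus-std form; the paper's actual score may differ.
    """
    return mean(gains) - penalty * pstdev(gains)


def prune_operators(gains_by_operator, threshold=0.0, penalty=1.0):
    """Keep operators whose penalized score exceeds the threshold."""
    return [
        name
        for name, gains in gains_by_operator.items()
        if reliability_score(gains, penalty) > threshold
    ]


# Hypothetical per-subsample validation gains for three operators.
gains_by_operator = {
    "add": [0.020, 0.021, 0.019],    # small but consistent gain
    "div": [0.100, -0.080, 0.010],   # positive on average, but unstable
    "log": [-0.010, -0.020, -0.015], # consistently harmful
}

kept = prune_operators(gains_by_operator)
print(kept)  # ['add']
```

Under this assumed score, an operator with a positive but highly variable gain ("div") is pruned, while a smaller, stable gain ("add") is kept, which is the kind of stabilization the abstract describes.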
