ArXiv TLDR

SCOPE-FE: Structured Control of Operator and Pairwise Exploration for Feature Engineering

2604.27025

Minhee Park, Seongyeon Son, Yonghyun Lee, Eunchan Kim

stat.ML, cs.LG

TLDR

SCOPE-FE cuts the computational cost of automatic feature engineering by pruning the operator and feature-pair search spaces before candidate features are generated.

Key contributions

  • OperatorProbing estimates and eliminates low-utility operators to reduce search space.
  • FeatureClustering groups structurally related features (via spectral embedding and fuzzy c-means) so that candidates are generated only from within-cluster pairs.
  • ReliabilityScoring stabilizes pruning decisions by incorporating variance across subsamples.
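
The summary does not give the ReliabilityScoring formula, so the following is only a minimal sketch of one way variance across subsamples could stabilize pruning. It assumes a mean-minus-spread score; the names `reliability_score`, `prune`, and `penalty`, and all the numbers, are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def reliability_score(subsample_gains, penalty=1.0):
    """Mean gain across subsamples minus a penalty on its spread.

    Assumed form for illustration; the paper's actual score may differ.
    """
    return mean(subsample_gains) - penalty * pstdev(subsample_gains)


def prune(candidates, n_keep, penalty=1.0):
    """Return the n_keep candidate names with the highest reliability score."""
    ranked = sorted(
        candidates,
        key=lambda name: reliability_score(candidates[name], penalty),
        reverse=True,
    )
    return ranked[:n_keep]


# Made-up per-subsample gains for three hypothetical candidates.
# "stable" and "noisy" have the same average gain but very different spread.
gains = {
    "stable": [0.10, 0.11, 0.09, 0.10],
    "noisy": [0.30, -0.10, 0.25, -0.05],
    "weak": [0.01, 0.02, 0.01, 0.02],
}

kept = prune(gains, n_keep=2)
print(kept)
```

Ranking by the mean alone could not separate "stable" from "noisy"; penalizing the spread pushes the noisy candidate below even the weak but consistent one.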

Why it matters

Expand-and-reduce feature engineering becomes increasingly expensive as input dimensionality grows, because the number of operator-feature combinations explodes. SCOPE-FE shrinks the candidate space before features are generated, which makes automatic feature engineering more practical for large, high-dimensional datasets.

Original Abstract

Automatic feature engineering is an effective approach for improving predictive performance in tabular learning. However, expand-and-reduce methods, such as OpenFE, become increasingly computationally expensive as the input dimensionality grows. This limitation arises primarily from the combinatorial explosion of candidate features generated through operator-feature combinations. To address this issue, we propose SCOPE-FE, a structured search space control framework that improves efficiency by reducing the candidate space prior to feature generation. SCOPE-FE jointly regulates two major sources of combinatorial growth: the operator space and feature-pair space. First, OperatorProbing estimates the dataset-specific utility of candidate operators and eliminates low-contribution operators in advance. Second, FeatureClustering employs spectral embedding and fuzzy c-means clustering to group structurally related features, thereby restricting candidate generation to relevant within-cluster combinations. In addition, we introduce ReliabilityScoring, which incorporates variance across subsamples to stabilize pruning decisions. Experiments on ten benchmark datasets demonstrate that SCOPE-FE substantially reduces feature engineering time while maintaining competitive predictive performance relative to existing baselines. The efficiency gains are particularly pronounced for high-dimensional datasets. These results indicate that structured control of the search space is an effective strategy for scalable automatic feature engineering. The code will be made publicly available upon acceptance.
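
To make the two sources of combinatorial growth concrete, here is a small counting sketch. It is not the authors' implementation: the cluster assignments, operator names, and the helper `candidate_pairs` are all hypothetical, and it simplifies by counting one unordered feature pair per binary operator.

```python
from itertools import combinations


def candidate_pairs(features, clusters=None):
    """Unordered feature pairs; restricted to within-cluster pairs if clusters is given."""
    if clusters is None:
        return set(combinations(sorted(features), 2))
    pairs = set()
    for members in clusters:
        pairs.update(combinations(sorted(members), 2))
    return pairs


features = [f"f{i}" for i in range(12)]

# Hypothetical clusters. "f3" appears twice because fuzzy memberships
# can place a feature in more than one group.
clusters = [
    {"f0", "f1", "f2", "f3"},
    {"f3", "f4", "f5", "f6", "f7"},
    {"f8", "f9", "f10", "f11"},
]

binary_ops = ["add", "sub", "mul", "div"]  # assumed full operator set
kept_ops = ["mul", "div"]                  # assumed survivors of operator pruning

full = len(candidate_pairs(features)) * len(binary_ops)
reduced = len(candidate_pairs(features, clusters)) * len(kept_ops)
print(full, reduced)
```

With these toy numbers the unrestricted space has 66 pairs times 4 operators (264 candidates), while within-cluster pairs and two surviving operators leave 22 times 2 (44 candidates). The gap widens as the feature count grows, since all-pairs generation scales quadratically.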
