ArXiv TLDR

MinShap: A Modified Shapley Value Approach for Feature Selection

2604.15107

Chenghui Zheng, Garvesh Raskutti

stat.ML · cs.LG

TLDR

MinShap modifies Shapley values for accurate and stable feature selection, outperforming existing methods by focusing on minimum marginal contributions.

Key contributions

  • Introduces MinShap, a novel feature selection method adapting Shapley values for complex, non-linear data.
  • MinShap considers the minimum marginal contribution across feature permutations, unlike traditional Shapley averaging.
  • Provides theoretical guarantees, including Type I error control, motivated by faithfulness in DAG models.
  • Outperforms state-of-the-art feature selection algorithms like LOCO, GCM, and Lasso in accuracy and stability.
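The average-vs-minimum distinction at the heart of MinShap can be sketched with a toy value function. The `value_fn` interface below is hypothetical (the paper's actual estimator and theoretical machinery are not reproduced); it just illustrates why taking the minimum marginal contribution zeroes out features whose apparent effect is routed through other features.

```python
import itertools

def marginal_contributions(value_fn, features, j):
    """Marginal contribution of feature j under every feature permutation.

    value_fn(subset) -> a model-quality score for that feature subset
    (a hypothetical interface, not the paper's exact estimator).
    """
    contribs = []
    for perm in itertools.permutations(features):
        k = perm.index(j)
        before = frozenset(perm[:k])  # features entering ahead of j
        contribs.append(value_fn(before | {j}) - value_fn(before))
    return contribs

def shapley(value_fn, features, j):
    # Classic Shapley value: average marginal contribution over permutations.
    c = marginal_contributions(value_fn, features, j)
    return sum(c) / len(c)

def minshap(value_fn, features, j):
    # MinShap idea: minimum marginal contribution over permutations, so a
    # feature with no direct effect (given the others) scores (near) zero.
    return min(marginal_contributions(value_fn, features, j))
```

For example, if feature `a` fully determines the response and `b` only appears useful when `a` is absent, `b` gets a positive Shapley value but a MinShap value of zero, which is the behavior wanted for selection.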

Why it matters

Feature selection remains challenging, especially with complex, dependent data where traditional Shapley values fall short. MinShap addresses this by adapting Shapley values for robust selection and offers theoretical guarantees, including Type I error control. In the authors' simulations and real-data experiments, it improves both accuracy and stability over current state-of-the-art techniques.

Original Abstract

Feature selection is a classical problem in statistics and machine learning, and it remains extremely challenging, especially in the context of unknown non-linear relationships with dependent features. On the other hand, Shapley values are a classic solution concept from cooperative game theory that is widely used for feature attribution in general non-linear models with highly dependent features. However, Shapley values are not naturally suited for feature selection since they tend to capture both direct effects from each feature to the response and indirect effects through other features. In this paper, we combine the advantages of Shapley values and adapt them to feature selection by proposing \emph{MinShap}, a modification of the Shapley value framework, along with a suite of other related algorithms. In particular, MinShap, instead of taking the average marginal contribution over permutations of features, considers the minimum marginal contribution across permutations. We provide a theoretical foundation motivated by the faithfulness assumption in DAGs (directed acyclic graphical models), a guarantee for the Type I error of MinShap, and show through numerical simulations and real-data experiments that MinShap tends to outperform state-of-the-art feature selection algorithms such as LOCO, GCM and Lasso in terms of both accuracy and stability. We also introduce a suite of algorithms related to MinShap that use a multiple testing/p-value perspective to improve performance in low-sample settings, and we provide supporting theoretical guarantees.
