Feature Selection via Mutual Information: New Theoretical Insights / M. Beraha, A.M. Metelli, M. Papini, A. Tirinzoni, M. Restelli. - In: International Joint Conference on Neural Networks. - [s.l.] : IEEE, 2019. - ISBN 978-1-7281-1985-4. - pp. 1-9 (International Joint Conference on Neural Networks, IJCNN 2019, Budapest, 2019) [10.1109/IJCNN.2019.8852410].

Feature Selection via Mutual Information: New Theoretical Insights

M. Papini
2019

Abstract

Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevance of a subset of features in predicting the target variable and its redundancy with respect to other variables. However, existing algorithms are mostly heuristic and offer no guarantee on the proposed solution. In this paper, we provide novel theoretical results showing that conditional mutual information naturally arises when bounding the ideal regression/classification errors achieved by different subsets of features. Building on these insights, we propose a novel stopping condition for backward and forward greedy methods which ensures that the ideal prediction error using the selected feature subset remains bounded by a user-specified threshold. We provide numerical simulations to support our theoretical claims and compare our approach with common heuristic methods.
classification; feature selection; machine learning; mutual information; regression; supervised learning
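
As a rough illustration of the kind of procedure the abstract describes, the sketch below runs a greedy backward elimination on discrete data and stops once the accumulated conditional mutual information of the removed features would exceed a budget. It is not the authors' algorithm: the function names (`entropy`, `cond_mutual_info`, `backward_select`), the plug-in entropy estimator, the `budget` parameter, and the toy data are all assumptions made for this example, and the mapping from a prediction-error threshold to such a budget is given in the paper and not reproduced here.

```python
from collections import Counter
from math import log


def entropy(rows):
    """Plug-in (empirical) Shannon entropy, in nats, of a list of hashable rows."""
    n = len(rows)
    return -sum((c / n) * log(c / n) for c in Counter(rows).values())


def cond_mutual_info(X, y, i, rest):
    """Plug-in estimate of I(Y; X_i | X_rest) for discrete data.

    Uses I(Y; A | B) = H(A, B) + H(Y, B) - H(Y, A, B) - H(B).
    """
    b = [tuple(row[j] for j in rest) for row in X]
    ab = [(row[i],) + bj for row, bj in zip(X, b)]
    yb = [(yi,) + bj for yi, bj in zip(y, b)]
    yab = [(yi,) + abj for yi, abj in zip(y, ab)]
    value = entropy(ab) + entropy(yb) - entropy(yab) - entropy(b)
    # Clamp and round so floating-point noise does not decide ties.
    return round(max(0.0, value), 12)


def backward_select(X, y, budget):
    """Greedy backward elimination with a hypothetical CMI budget.

    At each step, drop the feature whose conditional mutual information
    with y, given the other remaining features, is smallest. Stop when
    dropping it would push the accumulated score above `budget`.
    """
    selected = list(range(len(X[0])))
    spent = 0.0
    while selected:
        scores = {
            i: cond_mutual_info(X, y, i, [j for j in selected if j != i])
            for i in selected
        }
        worst = min(scores, key=scores.get)
        if spent + scores[worst] > budget:
            break
        spent += scores[worst]
        selected.remove(worst)
    return selected, spent


# Toy data (an assumption for illustration): y = x0 XOR x1,
# x2 is a copy of x0 (redundant), x3 is independent of y.
X = [(x0, x1, x0, x3) for x0 in (0, 1) for x1 in (0, 1) for x3 in (0, 1)]
y = [row[0] ^ row[1] for row in X]

selected, spent = backward_select(X, y, budget=0.01)
print(selected, spent)
```

On this toy data a small budget leaves one copy of the duplicated feature together with `x1`, while a large budget lets the procedure discard every feature. The plug-in estimator is only reasonable for discrete variables with plenty of samples; continuous data would need a different estimator.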
Sector IINF-05/A - Information Processing Systems
Sector INFO-01/A - Computer Science
Book Part (author)
Files in this item:
File: 1907.07384v1.pdf
Access: open access
Type: Pre-print (manuscript submitted to the publisher)
License: Creative Commons
Size: 577.78 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1225939
Citations
  • PMC: ND
  • Scopus: 82
  • ISI: 48
  • OpenAlex: 8