Feature selection via Boolean Independent Component analysis

Apolloni, B.; Bassis, S.; Brega, A.A.F.

doi:10.1016/j.ins.2009.07.002

We devise a feature selection method as a follow-on utility of a special classification procedure. In turn, we base the latter on binary features that we extract from the input patterns with a wrapper method. The whole construction results in a procedure that is progressive in two respects. As for the features, we first compute a very compact representation of them in terms of Boolean independent components in order to reduce their entropy. Then we invert the representation mapping to discover the subset of the original features supporting a successful classification. As for the classification, we split it into two easier tasks. In the first we look for a clustering of input patterns that satisfies loose consistency constraints and benefits from the conciseness of the binary representation. In the second we attribute labels to the clusters through the combined use of essentially linear separators. We implement the method through a relatively quick numerical procedure by assembling a set of connectionist and symbolic routines. We test these on the benchmark of feature selection from DNA microarray data in cancer diagnosis and on other ancillary datasets.
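The flow described in the abstract (binary encoding, clustering on the binary codes, labelling the clusters, then mapping back to the original features) can be illustrated with a toy sketch. This is not the authors' implementation. The weights and thresholds below are hand-picked assumptions standing in for the learned Boolean independent components, a majority vote stands in for the combination of linear separators, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

def encode(pattern, weights, thresholds):
    """Map a real-valued pattern to a tuple of bits, one thresholded linear unit per bit."""
    return tuple(
        int(sum(w * x for w, x in zip(row, pattern)) > t)
        for row, t in zip(weights, thresholds)
    )

def label_clusters(codes, labels):
    """Group patterns sharing a binary code and label each group by majority vote."""
    groups = defaultdict(list)
    for code, y in zip(codes, labels):
        groups[code].append(y)
    return {code: Counter(ys).most_common(1)[0][0] for code, ys in groups.items()}

def accuracy(codes, labels):
    """Training accuracy of the majority-vote labelling of the clusters."""
    table = label_clusters(codes, labels)
    return sum(table[c] == y for c, y in zip(codes, labels)) / len(labels)

def select_features(weights, codes, labels, eps=1e-9):
    """Keep the original features feeding any bit whose removal hurts accuracy."""
    base = accuracy(codes, labels)
    selected = set()
    for b in range(len(weights)):
        reduced = [c[:b] + c[b + 1:] for c in codes]  # drop bit b from every code
        if accuracy(reduced, labels) < base:
            selected.update(j for j, w in enumerate(weights[b]) if abs(w) > eps)
    return sorted(selected)

# Toy data (assumed): four patterns with four real-valued features each.
X = [[0.9, 0.1, 0.5, 0.2],
     [0.8, 0.2, 0.1, 0.9],
     [0.1, 0.9, 0.4, 0.3],
     [0.2, 0.8, 0.9, 0.7]]
labels = [1, 1, 0, 0]

# Hand-picked linear units (assumed, not learned): bit 0 compares features 0 and 1,
# bit 1 thresholds feature 2.
weights = [[1, -1, 0, 0], [0, 0, 1, 0]]
thresholds = [0.0, 0.45]

codes = [encode(x, weights, thresholds) for x in X]
print(select_features(weights, codes, labels))  # → [0, 1]
```

In this toy case only the first bit is needed to separate the two classes, so only the features feeding it (indices 0 and 1) are retained.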

Feature selection via Boolean Independent Component analysis / B. Apolloni, S. Bassis, A.A.F. Brega. - In: INFORMATION SCIENCES. - ISSN 0020-0255. - 179:22(2009 Nov), pp. 3815-3831.



Keywords: Boolean independent component analysis; Classification; Clustering; DNA microarray; Feature extraction; Feature selection; SVM ensemble
Scientific-disciplinary sector of the article: Sector INF/01 - Computer Science
Publication date: Nov 2009
Journal (ANCE): INFORMATION SCIENCES
DOI: https://dx.doi.org/10.1016/j.ins.2009.07.002
Type: Article (author)
Appears in types: 01 - Journal article


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/66719
