Feature selection via Boolean Independent Component analysis / B. Apolloni, S. Bassis, A.A.F. Brega. - In: INFORMATION SCIENCES. - ISSN 0020-0255. - 179:22(2009 Nov), pp. 3815-3831.

Feature selection via Boolean Independent Component analysis

B. Apolloni (first author); S. Bassis (second author); A.A.F. Brega (last author)
2009

Abstract

We devise a feature selection method as a follow-on utility of a special classification procedure. This procedure, in turn, rests on binary features that we extract from the input patterns with a wrapper method. The overall scheme is progressive in two respects. As for the features, we first compute a highly compact representation of them in terms of Boolean independent components in order to reduce their entropy; we then invert the representation mapping to discover the subset of the original features that supports a successful classification. As for the classification, we split it into two easier tasks. In the first, we look for a clustering of the input patterns that satisfies loose consistency constraints and benefits from the conciseness of the binary representation. In the second, we assign labels to the clusters through the combined use of essentially linear separators. We implement the method as a relatively quick numerical procedure by assembling a set of connectionist and symbolic routines. We test these routines on the benchmark task of feature selection for DNA microarray data in cancer diagnosis and on other ancillary datasets.
Keywords: Boolean independent component analysis; Classification; Clustering; DNA microarray; Feature extraction; Feature selection; SVM ensemble
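The two-stage idea in the abstract (binarize the patterns, cluster them on their binary codes, label the clusters, then trace back which original features are needed) can be illustrated with a toy sketch. This is not the authors' algorithm. As loudly stated assumptions, a per-feature median threshold stands in for Boolean independent component analysis, a majority vote per cluster stands in for the ensemble of linear separators, and accuracy is measured on the training patterns themselves. All function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median


def binarize(patterns):
    """Map each feature to one bit: 1 if the value exceeds that feature's median.
    (Assumption: a crude stand-in for Boolean independent components.)"""
    thresholds = [median(col) for col in zip(*patterns)]
    return [tuple(int(x > t) for x, t in zip(row, thresholds)) for row in patterns]


def label_clusters(codes, labels, keep):
    """Cluster patterns that share the same bits on the kept features and
    give each cluster its majority label (stand-in for linear separators)."""
    groups = defaultdict(list)
    for code, y in zip(codes, labels):
        groups[tuple(code[j] for j in keep)].append(y)
    return {key: Counter(ys).most_common(1)[0][0] for key, ys in groups.items()}


def accuracy(codes, labels, keep):
    """Fraction of training patterns whose cluster label matches their own."""
    table = label_clusters(codes, labels, keep)
    hits = sum(table[tuple(c[j] for j in keep)] == y for c, y in zip(codes, labels))
    return hits / len(labels)


def select_features(patterns, labels):
    """Wrapper-style backward elimination: drop a feature whenever the
    classification accuracy on the binary codes does not decrease."""
    codes = binarize(patterns)
    keep = list(range(len(patterns[0])))
    base = accuracy(codes, labels, keep)
    for j in list(keep):
        trial = [k for k in keep if k != j]
        if trial and accuracy(codes, labels, trial) >= base:
            keep = trial
    return keep


# Hypothetical data: only feature 0 separates the two classes.
patterns = [
    [0.1, 5.0, 3.0],
    [0.2, 1.0, 7.0],
    [0.9, 5.5, 2.0],
    [0.8, 0.5, 8.0],
]
labels = [0, 0, 1, 1]
print(select_features(patterns, labels))  # [0]
```

In this toy run the selection keeps only feature 0, because the binary code of that feature alone already yields clusters with consistent labels. A faithful implementation would replace both stand-ins and evaluate on held-out data.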
Academic discipline: INF/01 - Computer Science
Nov 2009
Article (author)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/66719