BICA: a Boolean Independent Component Analysis Approach / B. Apolloni, S. Bassis, A. Brega. - In: Artificial Neural Networks : ICANN 2008 : 18th International Conference, Prague, Czech Republic, September 3-6, 2008 : Proceedings, Part I / [edited by] V. Kurkova, R. Neruda, J. Koutník. - Berlin : Springer, 2008. - ISBN 9783540875352. - pp. 99-108 (Paper presented at the 18th International Conference on Artificial Neural Networks, held in Prague, Czech Republic, in 2008).

BICA: a Boolean Independent Component Analysis Approach

B. Apolloni (first author); S. Bassis (second author); A. Brega (last author)
2008

Abstract

We analyze the potential of an approach that represents general data records through Boolean vectors, in the spirit of ICA. We envisage these vectors as an intermediate step of a clustering procedure aimed at making decisions from data. With a “divide and conquer” strategy, we first look for a suitable representation of the data and then assign them to clusters. We assume a Boolean coding to be a proper representation of the input of the discrete function computing the assignments. We demand the following of this coding: that it preserve most of the information, so as to prove appropriate independently of the particular clustering task; that it be concise, in order to yield understandable assignment rules; and that it be sufficiently random, to prime statistical classification methods. In the paper we cast these properties in terms of entropic features and connectionist procedures, and validate them on a series of benchmarks.
Sector INF/01 - Computer Science (Informatica)
Book Part (author)

Use this identifier to cite or link to this document: http://hdl.handle.net/2434/55491