IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Evolutionary ANNs for improving accuracy and efficiency in document classification methods / A. Azzini, P. Ceravolo. - In: Knowledge-based intelligent information and engineering systems : 10. international conference, KES 2006 : Bournemouth, UK, October 9-11, 2006 : proceedings / [edited by] B. Gabrys, R.J. Howlett, L.C. Jain. - Berlin : Springer, 2006. - ISBN 9783540465423. - pp. 1111-1118. (International Conference on Knowledge-based & Intelligent Information & Engineering Systems, held in Bournemouth, UK, in 2006.) [10.1007/11893011_140]

Evolutionary ANNs for improving accuracy and efficiency in document classification methods

A. Azzini (first author); P. Ceravolo (last author)

2006

Abstract

Approaches to document classification belong to two major families: similarity-based (crisp) classification methods and neural network (gradual) ones. For gradual techniques, a major open issue is controlling the dimension of the search space. While similarity-based methods identify clusters based on the same number of variables used for document encoding, neural networks automatically identify the variables that cause distinctions among clusters. Therefore, the number of variables may vary depending on the structure and content of the documents, and it is difficult to estimate a priori. This paper proposes a hybrid classification method suitable for heterogeneous document bases like the ones commonly encountered in business and knowledge management applications. Our method is based on an evolutionary algorithm that tunes both the structure and the weights of a neural network. While searching for the optimal network configuration, it is possible to determine the minimal number of variables needed to classify the given set of documents.
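The idea summarized in the abstract can be illustrated with a small toy sketch. It is not the authors' algorithm: the synthetic data, the network encoding (an input mask plus hidden and output weights), the mutation rates, the truncation selection, and all function names below are assumptions made for illustration only. It shows how an evolutionary loop can tune a network's structure and weights together while a small penalty on active inputs pushes the search toward fewer variables.

```python
import math
import random


def make_data(rng, n_docs=200, n_features=6):
    """Synthetic stand-in for encoded documents: only features 0 and 1 matter."""
    data = []
    for _ in range(n_docs):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        data.append((x, 1 if x[0] + x[1] > 0 else 0))
    return data


def predict(individual, x):
    """Forward pass of a one-hidden-layer network with masked inputs."""
    mask, hidden_w, out_w = individual
    inputs = [xi if m else 0.0 for xi, m in zip(x, mask)]
    hidden = [math.tanh(sum(w * v for w, v in zip(row, inputs))) for row in hidden_w]
    return 1 if sum(w * h for w, h in zip(out_w, hidden)) > 0 else 0


def accuracy(individual, data):
    return sum(predict(individual, x) == y for x, y in data) / len(data)


def fitness(individual, data, penalty=0.01):
    """Accuracy minus a small cost per active input variable (assumed trade-off)."""
    return accuracy(individual, data) - penalty * sum(individual[0])


def random_individual(rng, n_features, max_hidden=4):
    n_hidden = rng.randint(1, max_hidden)
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    hidden_w = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_hidden)]
    out_w = [rng.gauss(0, 1) for _ in range(n_hidden)]
    return (mask, hidden_w, out_w)


def mutate(rng, individual, sigma=0.3, flip_prob=0.1, max_hidden=4):
    """Return a mutated copy: perturb weights, flip inputs, resize the hidden layer."""
    mask, hidden_w, out_w = individual
    mask = [(not m) if rng.random() < flip_prob else m for m in mask]
    hidden_w = [[w + rng.gauss(0, sigma) for w in row] for row in hidden_w]
    out_w = [w + rng.gauss(0, sigma) for w in out_w]
    r = rng.random()
    if r < 0.1 and len(hidden_w) < max_hidden:    # structural change: add a unit
        hidden_w.append([rng.gauss(0, 1) for _ in mask])
        out_w.append(rng.gauss(0, 1))
    elif r < 0.2 and len(hidden_w) > 1:           # structural change: drop a unit
        i = rng.randrange(len(hidden_w))
        del hidden_w[i]
        del out_w[i]
    return (mask, hidden_w, out_w)


def evolve(data, n_features, generations=30, pop_size=20, seed=0):
    """Truncation selection with elitism: the top half survives unchanged."""
    rng = random.Random(seed)
    population = [random_individual(rng, n_features) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, data), reverse=True)
        history.append(fitness(population[0], data))
        parents = population[: pop_size // 2]
        children = [mutate(rng, rng.choice(parents)) for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=lambda ind: fitness(ind, data))
    return best, history


if __name__ == "__main__":
    data = make_data(random.Random(1))
    best, history = evolve(data, n_features=6)
    print("active inputs:", [i for i, m in enumerate(best[0]) if m])
    print("hidden units:", len(best[1]), "accuracy:", accuracy(best, data))
```

In this sketch the number of `True` entries in the best individual's mask plays the role of the number of variables retained for classification, and the length of its hidden layer reflects the evolved structure. The penalty weight is an arbitrary choice; a larger value favors smaller input sets at the cost of accuracy.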

Keywords: Ontology construction; Formal concept analysis; Fuzzy bags; Neural networks; Genetic algorithms.

Publication date: 2006

DOI: https://dx.doi.org/10.1007/11893011_140

Type: Book Part (author)

Appears in types: 03 - Contribution in volume


Use this identifier to cite or link to this document: https://hdl.handle.net/2434/49885
