A fuzzy K-nearest neighbor classifier to deal with imperfect data

Cadenas, J.M.; Garrido, M.C.; Martínez, R.; Muñoz, E.; Bonissone, P.P.

doi:10.1007/s00500-017-2567-x

The k-nearest neighbors method (kNN) is a nonparametric, instance-based method used for regression and classification. To classify a new instance, the kNN method finds its k nearest neighbors and generates a class value from them. Usually, this method requires that the information available in the datasets be precise and accurate, except for the existence of missing values. However, data imperfection is inevitable when dealing with real-world scenarios. In this paper, we present the kNN(Formula presented.) classifier, a k-nearest neighbors method to perform classification from datasets with imperfect values. The importance of each neighbor in the output decision is based on its relative distance and its degree of imperfection. Furthermore, by using external parameters, the classifier enables us to define the maximum allowed imperfection, and to decide whether the final output is derived solely from the class with the greatest weight (the best class) or from the best class and a weighted combination of the classes closest to it. To test the proposed method, we performed several experiments with both synthetic and real-world datasets with imperfect data. The results, validated through statistical tests, show that the kNN(Formula presented.) classifier is robust when working with imperfect data and maintains good performance when compared with other methods in the literature, applied to datasets with or without imperfection.
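The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so the weight function, the per-instance imperfection score in [0, 1], and the names `knn_imperfect_sketch`, `max_imperfection`, and `closeness` are all assumptions made here for illustration.

```python
from collections import defaultdict
from math import dist


def knn_imperfect_sketch(train, query, k=3, max_imperfection=0.5, closeness=0.0):
    """Hypothetical weighted kNN vote; NOT the formulation used in the paper.

    train: list of (features, label, imperfection), imperfection in [0, 1].
    closeness: 0.0 returns only the best class (plus exact ties); larger
    values also return classes whose weight is close to the best one.
    """
    # Assumption: instances above the allowed imperfection are ignored.
    usable = [
        (dist(x, query), label, imp)
        for x, label, imp in train
        if imp <= max_imperfection
    ]
    # Keep the k nearest of the remaining instances.
    neighbors = sorted(usable, key=lambda t: t[0])[:k]
    if not neighbors:
        return {}

    d_max = max(d for d, _, _ in neighbors)
    weights = defaultdict(float)
    for d, label, imp in neighbors:
        # Relative distance: 0 for an exact match, 1 for the farthest neighbor.
        rel = d / d_max if d_max > 0 else 0.0
        # Assumed weight: closer and less imperfect neighbors count more.
        weights[label] += (1.0 - imp) / (1.0 + rel)

    best = max(weights.values())
    # Keep the best class and any class within the closeness tolerance.
    kept = {c: w for c, w in weights.items() if w >= (1.0 - closeness) * best}
    total = sum(kept.values())
    # Normalize the kept weights so they sum to 1.
    return {c: w / total for c, w in kept.items()}
```

With `closeness=0.0` the function returns a single class unless weights tie exactly; raising it yields a normalized combination of the best class and the classes nearest to it in weight, mirroring the two output modes the abstract describes.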

A fuzzy K-nearest neighbor classifier to deal with imperfect data / J.M. Cadenas, M.C. Garrido, R. Martínez, E. Muñoz, P.P. Bonissone. - In: SOFT COMPUTING. - ISSN 1432-7643. - (2017 Apr 01), pp. 1-18. [Epub ahead of print] [10.1007/s00500-017-2567-x]



Keywords: classification; combination methods; distance/dissimilarity measures; imperfect data; k-nearest neighbors; theoretical computer science; software; geometry and topology

Scientific-disciplinary sector of the article (display only): Settore INF/01 - Informatica (Computer Science)

Publication date: 1 Apr 2017

Ahead-of-print or print date: 1 Apr 2017

Journal in ANCE: SOFT COMPUTING

DOI: https://dx.doi.org/10.1007/s00500-017-2567-x

Type: Article (author)

Appears in types: 01 - Journal article


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/501168

