Bagged ensembles of Support Vector Machines for gene expression data analysis / G. Valentini, M. Muselli, F. Ruffino. - In: Neural Networks, 2003 : Proceedings of the International Joint Conference on. - [s.l] : IEEE, 2003. - ISBN 0780378989. - pp. 1844-1849. (Conference: International Joint Conference on Neural Networks, held in Portland in 2003.) [10.1109/IJCNN.2003.1223688]

Bagged ensembles of Support Vector Machines for gene expression data analysis

G. Valentini (first author); 2003

Abstract

Extracting information from gene expression data is a difficult task, as these data are characterized by very high dimensionality, small sample sizes, and a large degree of biological variability. However, feature selection algorithms offer a possible way of dealing with the curse of dimensionality, while variance problems arising from small samples and biological variability can be addressed through ensemble methods based on resampling techniques. These two approaches have been combined to improve the accuracy of Support Vector Machines (SVMs) in the classification of malignant tissues from DNA microarray data. Suitable measures have been introduced to assess the accuracy and the confidence of the predictions. The results presented show that bagged ensembles of SVMs are more reliable than single SVMs and achieve equal or better classification accuracy, while feature selection methods can further enhance classification accuracy.
Keywords: classification; cancer; prediction; networks; arrays
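
The combination described in the abstract (feature selection plus bagging of SVMs) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature selection method, the kernel, or the confidence measures, so the sketch assumes scikit-learn, a univariate F-test filter (SelectKBest with f_classif), a linear kernel, and the fraction of ensemble votes as a stand-in confidence score. The function names fit_bagged_svm and predict_with_votes, all parameter values, and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def fit_bagged_svm(X, y, n_estimators=25, n_genes=20, seed=0):
    """Fit one (feature selection + linear SVM) pipeline per bootstrap replicate.

    Assumption: selection is repeated inside every replicate; the paper may
    instead select genes once, before resampling.
    """
    rng = np.random.default_rng(seed)
    base = make_pipeline(SelectKBest(f_classif, k=n_genes), SVC(kernel="linear"))
    models = []
    while len(models) < n_estimators:
        idx = rng.integers(0, len(y), size=len(y))  # sample with replacement
        if len(np.unique(y[idx])) < 2:
            continue  # skip degenerate replicates that contain a single class
        models.append(clone(base).fit(X[idx], y[idx]))
    return models


def predict_with_votes(models, X):
    """Majority vote over the ensemble, plus the fraction of votes for class 1."""
    votes = np.array([m.predict(X) for m in models])  # (n_estimators, n_samples)
    frac_pos = votes.mean(axis=0)
    return (frac_pos >= 0.5).astype(int), frac_pos


# Synthetic stand-in for microarray data: few samples, many "genes".
rng = np.random.default_rng(42)
n_samples, n_features = 60, 500
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_features))
X[y == 1, :10] += 2.0  # only the first 10 features carry class signal

# Even-indexed samples for training, odd-indexed samples for testing.
models = fit_bagged_svm(X[::2], y[::2])
labels, frac_pos = predict_with_votes(models, X[1::2])
accuracy = float((labels == y[1::2]).mean())
print(f"test accuracy: {accuracy:.2f}")
```

Vote fractions close to 0 or 1 indicate that the ensemble members agree, while values near 0.5 flag predictions the ensemble is unsure about. This is only one simple way to quantify confidence and should not be read as the measures introduced in the paper.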
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
2003
Book Part (author)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/440767