An application of low bias bagged SVMs to the classification of heterogeneous malignant tissues / G. Valentini. - In: Neural Nets / [edited by] B. Apolloni, M. Marinaro, R. Tagliaferri. - [s.l.] : Springer, 2003. - ISBN 9783540202271. - pp. 316-321 (Paper presented at the 14th Italian Workshop on Neural Nets, held in Vietri sul Mare in 2003.)
An application of low bias bagged SVMs to the classification of heterogeneous malignant tissues
G. Valentini
2003
Abstract
DNA microarray data are high-dimensional but come in small samples: typically only a few tens of microarray experiments, each involving thousands of genes, are available for data processing. Given also the large biological variability of gene expression and the noise introduced by the bio-technological machinery, robust and variance-reducing data analysis methods are needed. To this end, we propose an application of a new ensemble method based on the bias-variance decomposition of the error, using Support Vector Machines (SVMs) as base learners. This approach, which we named Low bias bagging (Lobag), tries to reduce both the bias and the variance components of the error by selecting the base learners with the lowest bias and combining them through bootstrap aggregating techniques. We applied Lobag to the classification of normal and heterogeneous malignant tissues, using DNA microarray gene expression data. Preliminary results on this challenging two-class classification problem show that Lobag, in association with simple feature selection methods, outperforms both single SVMs and bagged ensembles of SVMs.
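The two steps the abstract names for Lobag (pick the base-learner setting with the lowest bias, then combine learners by bootstrap aggregating) can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the paper's implementation. Several things here are assumptions: bias is estimated as the out-of-bag error of the majority ("main") prediction under 0/1 loss; a small k-nearest-neighbour classifier stands in for the SVM base learner so the example needs only the standard library; and the names `estimate_bias`, `lobag`, `make_knn` and `toy_data` are hypothetical. The data are synthetic, not microarray measurements.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    # Draw n indices with replacement (one bootstrap replicate).
    return [rng.randrange(n) for _ in range(n)]


def majority(labels):
    # Most frequent label; ties go to the label seen first.
    return Counter(labels).most_common(1)[0][0]


def estimate_bias(fit, X, y, n_boot, rng):
    # Assumed estimator: train on bootstrap replicates, predict each
    # out-of-bag point, and count points whose majority prediction is wrong.
    votes = [[] for _ in X]
    for _ in range(n_boot):
        idx = bootstrap_indices(len(X), rng)
        in_bag = set(idx)
        model = fit([X[i] for i in idx], [y[i] for i in idx])
        for i in range(len(X)):
            if i not in in_bag:
                votes[i].append(model(X[i]))
    scored = [(majority(v), y[i]) for i, v in enumerate(votes) if v]
    if not scored:  # no point was ever out of bag
        return 1.0
    return sum(main != true for main, true in scored) / len(scored)


def lobag(make_fit, params, X, y, n_boot=30, seed=0):
    rng = random.Random(seed)
    # Step 1: estimate the bias of the base learner for each setting.
    biases = {p: estimate_bias(make_fit(p), X, y, n_boot, rng) for p in params}
    best = min(params, key=lambda p: biases[p])
    # Step 2: bag base learners trained with the lowest-bias setting.
    fit = make_fit(best)
    models = []
    for _ in range(n_boot):
        idx = bootstrap_indices(len(X), rng)
        models.append(fit([X[i] for i in idx], [y[i] for i in idx]))

    def predict(x):
        return majority([m(x) for m in models])

    return predict, best, biases


def make_knn(k):
    # Stand-in base learner (NOT an SVM): k-nearest neighbours.
    def fit(X_train, y_train):
        def predict(x):
            order = sorted(
                range(len(X_train)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
            )
            return majority([y_train[i] for i in order[:k]])
        return predict
    return fit


def toy_data(n_per_class=20, seed=1):
    # Two synthetic Gaussian clusters, labelled 0 and 1.
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


X, y = toy_data()
predict, best_k, biases = lobag(make_knn, [1, 3, 5], X, y)
```

In a setting closer to the abstract, `make_fit` would return an SVM trainer and `params` would range over kernel and regularization settings, with feature selection applied beforehand; none of those details are specified in this record.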