SVM with Random Labels / B. Apolloni, S. Bassis, D. Malchiodi (Lecture Notes in Artificial Intelligence). - In: Knowledge-Based Intelligent Information and Engineering Systems / edited by B. Apolloni, R. Howlett and L. Jain. - Berlin : Springer, 2007. - ISBN 9783540748281. - pp. 184-193 (joint conference KES 2007, held in Vietri sul Mare (SA), Italy, in 2007) [10.1007/978-3-540-74829-8_23].

SVM with Random Labels

B. Apolloni (first author); S. Bassis (second author); D. Malchiodi (last author)
2007

Abstract

We devise an SVM for partitioning a sample space affected by random binary labels. Under the hypothesis that a smooth, possibly symmetric, conditional label distribution governs the gradual passage from the all-0-label domain to the all-1-label domain, and under other regularity conditions, the algorithm supplies an estimate of these label probabilities. Within the Algorithmic Inference framework, the randomness of the labels preserves the main features of the binary classification problem while adding a further dimension to the search space. Namely, for each point in the original space the new dimension hosts the uniform seed accounting for the randomness of its label, so that the problem becomes that of separating the points in the augmented space. We solve it with a new kind of bootstrap technique. As for the error bounds of the proposed algorithm, we obtain confidence intervals that are up to an order of magnitude narrower than those supplied in the literature. This benefit arises because: (i) we devise a special algorithm that takes into account the random profile of the labels; (ii) we know the number of support vectors actually employed, as an ancillary output of the learning procedure; and (iii) we can compute confidence intervals for the misclassification probability exactly as a function of the number of these vectors. We check these results numerically by measuring the coverage of the confidence intervals.
Algorithmic inference; Classification; SVM; Uncertain labels
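
The augmented-space idea in the abstract can be illustrated with a minimal sketch. It assumes a logistic curve as the smooth, symmetric conditional label distribution; this choice, along with the function names and parameters, is hypothetical and not taken from the chapter. The sketch shows only that labels which are random as a function of x become deterministic once the uniform seed u is added as an extra coordinate: the curve u = P(label = 1 | x) separates the two classes in the (x, u) plane. It implements neither the SVM nor the bootstrap technique of the chapter.

```python
import math
import random


def label_probability(x, center=0.0, scale=1.0):
    """Hypothetical smooth, symmetric P(label = 1 | x): a logistic curve."""
    return 1.0 / (1.0 + math.exp(-(x - center) / scale))


def sample_augmented_points(n, rng, low=-6.0, high=6.0):
    """Draw n triples (x, u, label), where u is the uniform seed behind the label."""
    points = []
    for _ in range(n):
        x = rng.uniform(low, high)
        u = rng.random()  # uniform seed in [0, 1)
        # The label is 1 exactly when the seed falls below the curve at x.
        label = 1 if u < label_probability(x) else 0
        points.append((x, u, label))
    return points


def separated_in_augmented_space(points):
    """Check that the curve u = P(label = 1 | x) splits the two label classes."""
    return all(
        (u < label_probability(x)) == (label == 1) for x, u, label in points
    )


rng = random.Random(0)  # fixed seed so the run is reproducible
sample = sample_augmented_points(1000, rng)
print(separated_in_augmented_space(sample))
```

The check holds by construction here, since the labels are generated from the same curve that is then used as separator. In the learning problem only x and the label are observed, so the separating surface has to be estimated.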
Sector INF/01 - Computer Science
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/41457