Large-scale adoption of Artificial Intelligence and Machine Learning (AI-ML) models fed by heterogeneous, possibly untrustworthy data sources has spurred interest in estimating degradation of such models due to spurious, adversarial or low quality data assets. We propose a quantitative estimate of the severity of classifiers’ training set degradation: an index expressing the deformation of the convex hulls of the classes computed on an held-out data set generated via an unsupervised technique. We show that our index is computationally light, can be calculated incrementally and complements well existing ML data assets’ quality measures. As an experimentation, we present the computation of our index on a benchmark convolutional image classifier.

Estimating Degradation of Machine Learning Data Assets / L. Mauri, E. Damiani. - In: ACM JOURNAL OF DATA AND INFORMATION QUALITY. - ISSN 1936-1955. - 14:2(2022 Jun), pp. 9.1-9.15. [10.1145/3446331]

Estimating Degradation of Machine Learning Data Assets

L. Mauri
Primo
;
E. Damiani
Secondo
2022

Abstract

Large-scale adoption of Artificial Intelligence and Machine Learning (AI-ML) models fed by heterogeneous, possibly untrustworthy data sources has spurred interest in estimating degradation of such models due to spurious, adversarial or low quality data assets. We propose a quantitative estimate of the severity of classifiers’ training set degradation: an index expressing the deformation of the convex hulls of the classes computed on an held-out data set generated via an unsupervised technique. We show that our index is computationally light, can be calculated incrementally and complements well existing ML data assets’ quality measures. As an experimentation, we present the computation of our index on a benchmark convolutional image classifier.
Machine Learning; Data Assets; data assets; ML models;
Settore INF/01 - Informatica
11-dic-2021
Article (author)
File in questo prodotto:
File Dimensione Formato  
PDF39967661-805881599.pdf

accesso aperto

Tipologia: Post-print, accepted manuscript ecc. (versione accettata dall'editore)
Dimensione 1.46 MB
Formato Adobe PDF
1.46 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/890056
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 0
social impact