Statistical learning theory of structured data / M. Pastore, P. Rotondo, V. Erba, M. Gherardi. - In: PHYSICAL REVIEW. E. - ISSN 2470-0045. - 102:3(2020 Sep 14). [10.1103/PhysRevE.102.032119]

Statistical learning theory of structured data

M. Pastore (first author); P. Rotondo (second author); V. Erba (penultimate author); M. Gherardi (last author)
2020

Abstract

The traditional approach of statistical physics to supervised learning routinely assumes unrealistic generative models for the data: usually, inputs are independent random variables, uncorrelated with their labels. Only recently have statistical physicists started to explore more complex forms of data, such as equally labeled points lying on (possibly low-dimensional) object manifolds. Here we provide a bridge between this recently established research area and the framework of statistical learning theory, a branch of mathematics devoted to inference in machine learning. The overarching motivation is the inadequacy of the classic rigorous results in explaining the remarkable generalization properties of deep learning. We propose a way to integrate physical models of data into statistical learning theory and address, with both combinatorial and statistical mechanics methods, the computation of the Vapnik-Chervonenkis entropy, which counts the number of different binary classifications compatible with the loss class. As a proof of concept, we focus on kernel machines and on two simple realizations of data structure introduced in the recent physics literature: k-dimensional simplexes with prescribed geometric relations and spherical manifolds (equivalent to margin classification). For structured data the entropy is nonmonotonic in the sample size, contrary to what happens for unstructured data and in contrast with the rigorous bounds. Moreover, data structure induces a transition beyond the storage capacity, which we advocate as a proxy of the nonmonotonicity and, ultimately, as a cue of low generalization error. The identification of a synaptic volume that vanishes at the transition allows the impact of data structure to be quantified within replica theory, which remains applicable in cases where combinatorial methods are not available, as we demonstrate for margin learning.
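As a point of reference for the unstructured case mentioned in the abstract, the following minimal sketch evaluates Cover's classic counting formula for the number of linearly separable dichotomies of n points in general position in d dimensions (hyperplanes through the origin), and takes its logarithm as an entropy. It is not taken from the paper: the function names `cover_dichotomies` and `entropy` are hypothetical, and the structured-data cases (simplexes, spherical manifolds) studied by the authors are not implemented here.

```python
from math import comb, log


def cover_dichotomies(n: int, d: int) -> int:
    """Cover's count C(n, d) = 2 * sum_{k=0}^{d-1} binom(n-1, k).

    Number of dichotomies of n points in general position in R^d that a
    hyperplane through the origin can realize. This is the unstructured
    baseline only (an assumption of this sketch, not the paper's model).
    """
    if n < 1 or d < 1:
        raise ValueError("n and d must be positive integers")
    return 2 * sum(comb(n - 1, k) for k in range(d))


def entropy(n: int, d: int) -> float:
    """Natural logarithm of the number of realizable dichotomies."""
    return log(cover_dichotomies(n, d))


if __name__ == "__main__":
    d = 5  # illustrative input dimension
    for n in (1, 5, 10, 20, 40):
        # Fraction of all 2^n labelings that are linearly realizable.
        fraction = cover_dichotomies(n, d) / 2**n
        print(n, entropy(n, d), fraction)
```

In this baseline every labeling is realizable while n ≤ d, exactly half of them are realizable at n = 2d, and the entropy never decreases as n grows. That monotonic behavior is what the abstract contrasts with the nonmonotonic entropy found for structured data.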
Sector FIS/02 - Theoretical Physics, Mathematical Models and Methods
14 Sep 2020
Article (author)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/776162
Citations
  • PMC: n/a
  • Scopus: 16
  • ISI: 11