Sampling the space of solutions of an artificial neural network / A. Zambon, E.M. Malatesta, G. Tiana, R. Zecchina. - In: PHYSICAL REVIEW. E. - ISSN 2470-0053. - 112:(2025), pp. 1-25. [10.1103/qs48-jzyq]

Sampling the space of solutions of an artificial neural network

E.M. Malatesta; G. Tiana
2025

Abstract

The weight space of an artificial neural network can be systematically explored using tools from statistical mechanics. We combine a hybrid Monte Carlo algorithm that performs long exploration steps, a ratchet-based algorithm that probes connectivity paths, and simulations of coupled replica models that target subdominant flat regions. Our analysis focuses on one-hidden-layer networks and spans a range of energy levels and constraint-density regimes. Near the interpolation threshold, the low-energy manifold has a spiky topology. In the overparameterized regime, by contrast, it becomes entirely flat, forming an extended complex structure that is easy to sample. An analytical study of the training error landscape supports these numerical results, and we show numerically that the qualitative features of the loss landscape are robust across different data structures. Our study aims to provide new methodological insights for developing scalable methods for large networks.
Sector PHYS-04/A - Theoretical physics of matter, models, mathematical methods and applications
2025
https://journals.aps.org/pre/abstract/10.1103/qs48-jzyq
Article (author)
Files in this item:

2503.08266v1-2.pdf
  Access: open access
  Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)
  License: Creative Commons
  Size: 1.77 MB
  Format: Adobe PDF

qs48-jzyq.pdf
  Access: restricted access
  Type: Publisher's version/PDF
  License: No license
  Size: 3.04 MB
  Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1186037