STRUCTURE-BASED APPROACHES TO DATA-DRIVEN PROTEIN FOLDING, AGGREGATION, AND SELF-ASSEMBLY / F. Bacic Toplek ; supervisor: C. Camilloni ; coordinator: S. Ricagno. Dipartimento di Bioscienze, 2025 Nov 24. 38th cycle

STRUCTURE-BASED APPROACHES TO DATA-DRIVEN PROTEIN FOLDING, AGGREGATION, AND SELF-ASSEMBLY

F. Bacic Toplek
2025

Abstract

Predicting protein dynamics at the molecular level is central to understanding, and ultimately controlling, the biomolecular machines that govern life. Despite advances in molecular dynamics and AI-based structure prediction, the accurate and efficient simulation of complex self-assembly processes, such as protein aggregation, protein folding, and the dynamics of intrinsically disordered proteins, remains out of reach for most approaches because of system size and sampling limitations. This work presents the development and application of the multi-eGO model, a data-driven, hybrid structure-based approach designed to overcome these barriers. By combining an informative prior with high-resolution structural information or lower-resolution experimental data, the multi-eGO force field learns conformational ensembles across multiple energy minima while maintaining atomistic resolution. Applications include the folding dynamics of protein G and X11-PDZ1-PDZ2, the structural ensemble of amyloid-β42, the aggregation of transthyretin peptides, and the self-assembly of ferritin complexes. Complementary to these studies, conventional molecular dynamics simulations are used to investigate the effect of electric fields on the dynamics of amyloid-β42 fibrils, revealing the potential of electric fields to disrupt assembly and secondary nucleation. The results demonstrate that multi-eGO can reproduce equilibrium, out-of-equilibrium, and kinetic features, and can integrate heterogeneous data sources, such as SAXS and PRE NMR data, to refine the model without the need for explicit training. At the same time, limitations such as finite-size effects and kinetic trapping in intermolecular processes highlight the need for further refinement.
24 Nov 2025
Sector BIOS-08/A - Molecular biology
CAMILLONI, CARLO
RICAGNO, STEFANO
Doctoral Thesis
Files in this item:
phd_unimi_R13841.pdf
Access: open access
Description: Doctoral thesis
Type: Publisher's version/PDF
License: Creative Commons
Size: 2.73 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1198095