STRUCTURE-BASED APPROACHES TO DATA-DRIVEN PROTEIN FOLDING, AGGREGATION, AND SELF-ASSEMBLY / F. Bacic Toplek; supervisor: C. Camilloni; coordinator: S. Ricagno. Dipartimento di Bioscienze, 2025 Nov 24. 38th cycle.
STRUCTURE-BASED APPROACHES TO DATA-DRIVEN PROTEIN FOLDING, AGGREGATION, AND SELF-ASSEMBLY
F. Bacic Toplek
2025
Abstract
Predicting protein dynamics at the molecular level is central to understanding, and ultimately controlling, the biomolecular machines that govern life. Despite advances in molecular dynamics and AI-based structure prediction, the accurate and efficient simulation of complex processes such as protein aggregation, protein folding, and the dynamics of intrinsically disordered proteins remains out of reach for most approaches because of system-size and sampling limitations. This work presents the development and application of the multi-eGO model, a data-driven, hybrid structure-based approach designed to overcome these barriers. By combining an informative prior with high-resolution structural information or lower-resolution experimental data, the multi-eGO force field learns conformational ensembles across multiple energy minima while maintaining atomistic resolution. Applications include the folding dynamics of protein G and X11-PDZ1-PDZ2, the structural ensemble of amyloid-β42, the aggregation of transthyretin peptides, and the self-assembly of ferritin complexes. Complementary to these studies, conventional molecular dynamics simulations are used to investigate the effect of electric fields on the dynamics of amyloid-β42 fibrils, revealing the potential of electric fields to disrupt assembly and secondary nucleation. The results demonstrate that multi-eGO can reproduce equilibrium, out-of-equilibrium, and kinetic features, and can integrate heterogeneous data sources, such as SAXS and PRE NMR data, to refine the model without the need for explicit training. At the same time, limitations such as finite-size effects and kinetic trapping in intermolecular processes highlight the need for further refinement.

| File | Description | Type | License | Size | Format |
|---|---|---|---|---|---|
| phd_unimi_R13841.pdf (open access) | Doctoral thesis | Publisher's version/PDF | Creative Commons | 2.73 MB | Adobe PDF |




