COMPUTATIONAL MODELING OF PROTEINS: FROM STATISTICAL MECHANICS TO IMMUNOLOGY

Capelli, R.

doi:10.13130/r-capelli_phd2017-11-24

One of the biggest revolutions in physics during the second half of the 20th century was the introduction of computers in research. In particular, fast computing machines made it possible to study complex systems by simulating their dynamics, without the need to pursue analytical solutions that would otherwise be intractable. The consequences of this breakthrough were huge in the study of both equilibrium and non-equilibrium many-body problems, although strongly limited by the number of atoms involved in the calculation. The first technique used in biology-related problems was the Monte Carlo method, and some years later Molecular Dynamics (MD) was formalized. In MD, Newton's equations of motion are solved for each atom of the system, yielding a trajectory in phase space for the entire system, whose behavior can then be studied in equilibrium and non-equilibrium conditions. The constant rise in computational power has allowed scientists to study larger and larger systems, while advances in experimental techniques have made direct comparisons between wet-lab and in silico data possible at similar levels of resolution. Although Moore’s Law (i.e., the exponential growth of computing power due to transistor miniaturization) has held until now, the timescale of the events that can be simulated has an upper limit on the order of a millisecond, even on tailor-made computers, which is not enough to study all biologically relevant phenomena. Since the birth of computational chemistry, a huge number of different methods based on statistical mechanics have been developed to permit, within the limits of the available computing power, an effective and reliable use of MD simulations in biochemistry. One of the most relevant problems tackled by MD is the calculation of free energy differences, both in conformational changes and in sequence mutations of a protein.
The main difficulty in such calculations stems from the frustrated nature of interactions in proteins and from the size of these systems: together they produce a complex energy landscape that in principle requires very long sampling times to overcome all possible energy barriers. In the present thesis, we studied and improved a path-independent and system-independent free energy calculation technique, called the Simplified Confinement Method. We describe this work in Chapter 1. Although MD has been successful in most of its applications, there are still many open problems: the available parametrizations of interaction potentials (called force fields) are not completely reliable. In particular, force field parameters are chosen by comparing experimental data on a fixed set of (usually small) molecules with computed data on the same molecules. This raises a significant problem: large molecules can behave in more complex ways, so using these potentials can lead to systematic errors; furthermore, the timescale over which a force field is tested is necessarily limited. Another strong limitation of MD stems from the equilibrium experiments used for parametrization: the kinetic properties of a system are not considered. Since it is not feasible to reparametrize a general force field with non-equilibrium experimental data, we implemented a technique that uses equilibrium-based force fields and adds a potential term based on time series obtained from kinetic experiments. This approach, based on the principle of Maximum Caliber, restrains the system with an experiment-based bias, yielding more realistic behavior of the simulation in conditions where the usual force fields show their limitations. We describe this work in Chapter 2. Computational methods for the study of proteins are also proving effective in other fields of the life sciences: a current and emerging area is vaccinology.
With techniques developed by Louis Pasteur at the end of the 19th century (isolation of the pathogen, its inactivation, and subsequent inoculation in the host), various scientists developed vaccines for deadly diseases such as poliomyelitis, diphtheria, and measles. None of these vaccines was developed with approaches based on molecular biology. Almost 50 years after the birth of molecular biology, the Human Genome Project decoded human DNA and, at the same time, the genomes of the most dangerous pathogens were screened. This laid the foundation for Reverse Vaccinology (RV), in which the proteins responsible for the immune reaction can be identified from the pathogen DNA and tested directly on animal models, yielding a new vaccine candidate with little or no risk for the host, since the pathogen itself has been removed. At the beginning of the 21st century, the first vaccine against Meningococcus B, responsible for 50% of meningococcal meningitis cases, was developed using this protocol. Since then, crystallographic data have been incorporated into the RV workflow to exploit conformational information, giving rise to so-called Structural Vaccinology (SV). To enhance its efficacy, SV exploits all aspects of molecular modeling, such as computer-aided drug/protein design and MD, to integrate information that comes from experimental sources. One of the most promising techniques in this field is the grafting of an immunogenic sequence (i.e., a portion of a protein recognized by the immune system) onto a foreign protein; this approach could lead to a new vaccine component that poses no risk for the patient. To date, the grafting technique has been carried out through human-driven workflows. Motivated by this, we studied immunogenic peptides from a family of pathogens involved in respiratory diseases, exploiting Structural Vaccinology principles with both computational and experimental approaches. Furthermore, we developed and implemented an unsupervised automated tool to design grafted protein sequences.
We describe this work in Chapter 3.

COMPUTATIONAL MODELING OF PROTEINS: FROM STATISTICAL MECHANICS TO IMMUNOLOGY / R. Capelli; supervisors: G. Tiana, G. Colombo; coordinator: F. Ragusa. Dipartimento di Fisica, 2017 Nov 24. 30th cycle, Academic Year 2017. [10.13130/r-capelli_phd2017-11-24].