For about two decades, the scientific community has been discovering proteins that exhibit a self-entangled native structure. The folding of such knotted proteins is a challenging research topic of great interest in biology and biophysics. A complete understanding of how these proteins form topologically complex structures would reveal new details of the general paradigm of protein folding and, in addition, shed light on the rationale behind the folding and misfolding of a class of proteins related to crucial biophysical functions in the human body. In this field, computer simulations have greatly advanced our comprehension of the mechanisms that allow a polypeptide to form a knot during folding. However, due to limitations in methods and resources, the state-of-the-art picture of the process is still incomplete, and mostly limited to the simplest nontrivial topology, the trefoil knot. The main objective of this action was to computationally investigate the folding of entangled proteins, in particular of human Ubiquitin C-terminal Hydrolase, whose backbone forms a Gordian knot with five crossings. The chosen methods are based on a multi-scale Molecular Dynamics strategy that combines coarse-grained and all-atom models with enhanced sampling. The coarse-grained model is employed to outline a general picture of the folding and to identify the preferential pathways and intermediate states. Building on this low-resolution knowledge, a full-atom representation of the system can be built, targeting these more precise calculations to the most relevant regions of the protein's free energy landscape. To overcome the computational time limitations of such an approach, the project proposes to employ enhanced sampling techniques such as Metadynamics and Variationally Enhanced Sampling.
These advanced methodologies can drive the molecular dynamics algorithm to generate rare but extremely relevant configurations for the study of protein folding, thus accelerating the calculations. The output of the project will generalize the current picture of knotted protein folding, introducing important methodological advancements and contributing to the knowledge of a system of great biomedical interest, connected to diseases such as Parkinson's and Alzheimer's.
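To illustrate how a history-dependent bias can accelerate a rare event, the following is a minimal sketch of Metadynamics on a toy one-dimensional double well, not the setup used in the project. Everything in it is an assumption made for illustration: the potential, the overdamped Langevin integrator with unit friction, the reduced units, and all parameter values (hill height, width, deposition stride). The collective variable is simply the coordinate x, whereas in the project the biased variables would describe the folding and entanglement of the protein.

```python
import math
import random

KT = 1.0        # thermal energy in reduced units (illustrative assumption)
BARRIER = 8.0   # barrier of the toy double well, in units of KT


def potential_force(x):
    """Force -dU/dx for the toy potential U(x) = BARRIER * (x^2 - 1)^2."""
    return -4.0 * BARRIER * x * (x * x - 1.0)


def bias_force(x, centers, height, width):
    """Force -dV/dx from the sum of Gaussian hills deposited at `centers`."""
    force = 0.0
    for c in centers:
        d = x - c
        force += height * d / width ** 2 * math.exp(-0.5 * (d / width) ** 2)
    return force


def run_metadynamics(n_steps=10000, stride=50, height=0.5, width=0.2,
                     dt=1e-3, seed=1):
    """Overdamped Langevin dynamics (unit friction) with a growing bias.

    Every `stride` steps a Gaussian hill is added at the current position,
    discouraging the system from revisiting configurations already sampled.
    """
    rng = random.Random(seed)
    x = -1.0                      # start in the left well
    centers = []                  # positions of the deposited hills
    trajectory = []
    noise = math.sqrt(2.0 * KT * dt)
    for step in range(1, n_steps + 1):
        f = potential_force(x) + bias_force(x, centers, height, width)
        x += f * dt + noise * rng.gauss(0.0, 1.0)
        if step % stride == 0:
            centers.append(x)
        trajectory.append(x)
    return trajectory, centers


if __name__ == "__main__":
    traj, hills = run_metadynamics()
    print("hills deposited:", len(hills))
    print("largest x visited:", max(traj))
```

In this toy model the accumulated hills progressively fill the starting well, so the system is pushed over a barrier that unbiased dynamics would cross only rarely on the same time scale. The sum of the deposited Gaussians also provides an estimate of the underlying free energy profile, up to a sign and an additive constant.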
In the first part of the project we mostly focused on methodology development. These first results have laid the foundations for the future investigation of knotted proteins, tuning well-suited techniques to address this and other related biophysical problems. We first investigated the folding mechanism of Granulocyte-macrophage colony-stimulating factor, a glycoprotein that handles diverse functions in the human body. This protein folds into a rather common self-entangled conformation named Complex-Lasso. Understanding how a polypeptide encodes in its sequence the capability of tying itself into this kind of self-entangled structure represents a major advancement in the comprehension of protein folding. We studied this mechanism using both a well-known minimalistic model of the protein and an alternative model, specifically designed by us to highlight the preferential pathways of entangled folding. Our calculations have shown how the protein can avoid the kinetic traps related to self-entanglement, managing to fold in a reproducible and efficient way. In this work we employed a genetic optimization strategy, developed to tune the parameters of the coarse-grained protein model so as to favor the most efficient and reproducible pathways towards the native state. At the same time, we developed two suitable descriptors to classify the topological state of the protein backbone. These topological variables allowed us to monitor the evolution of the protein entanglement topology along the folding trajectories. Moreover, the definition of this kind of descriptor is crucial for the future application of enhanced sampling techniques, as these are based on the definition of a bias potential that acts on the relevant degrees of freedom of the folding dynamics. The topological variables represent the ideal degrees of freedom in the framework of self-entangled proteins.
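As an illustration of what a topological descriptor of the backbone can look like, the sketch below estimates the Gauss linking integral between a closed loop and an open tail, a quantity often used to detect whether a chain segment threads a loop. This is an assumed example and not necessarily one of the two descriptors developed in the project. The function names, the midpoint-rule discretization, and the test geometries (a planar circle and a straight tail) are all hypothetical choices made for this sketch.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _mid(a, b):
    return (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]), 0.5 * (a[2] + b[2]))


def gauss_linking(loop, tail):
    """Midpoint-rule estimate of the Gauss linking integral.

    `loop` is a list of 3D points treated as a closed polyline (the last
    point is joined back to the first); `tail` is an open polyline.
    """
    total = 0.0
    n = len(loop)
    for i in range(n):
        a0, a1 = loop[i], loop[(i + 1) % n]
        da, ma = _sub(a1, a0), _mid(a0, a1)
        for j in range(len(tail) - 1):
            b0, b1 = tail[j], tail[j + 1]
            db, mb = _sub(b1, b0), _mid(b0, b1)
            r = _sub(ma, mb)
            dist = math.sqrt(_dot(r, r))
            total += _dot(r, _cross(da, db)) / dist ** 3
    return total / (4.0 * math.pi)


def circle(n=60, radius=1.0):
    """Closed loop in the xy-plane, centered at the origin."""
    return [(radius * math.cos(2.0 * math.pi * k / n),
             radius * math.sin(2.0 * math.pi * k / n), 0.0) for k in range(n)]


def straight_tail(x=0.0, half_length=20.0, n=200):
    """Open segment parallel to the z-axis, crossing the plane z = 0 at (x, 0)."""
    return [(x, 0.0, -half_length + 2.0 * half_length * k / n)
            for k in range(n + 1)]


if __name__ == "__main__":
    print("tail through the loop:", gauss_linking(circle(), straight_tail(0.0)))
    print("tail outside the loop:", gauss_linking(circle(), straight_tail(3.0)))
```

For a long tail that pierces the loop once, the integral is close to plus or minus one; for a tail that passes outside the loop, it is close to zero. Because the quantity varies continuously with the coordinates, a descriptor of this type could in principle be monitored along a folding trajectory or used as a variable on which a bias potential acts.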
This work outlined an effective strategy for the study of further systems, which was carried out in the following part of the project. We then addressed the coarse-grained molecular modeling of Leptin, a hormone known to regulate energetic processes in human cells, which also presents a self-entangled native structure. Thanks to the developed methodologies we could study the folding of this protein, obtaining valuable information on its dynamics and preferential folding pathways. These techniques were also applied to another self-entangled molecule, namely Human Interleukin 3, a signaling protein that regulates the production, differentiation and function of human granulocytes and macrophages. In this case as well, we could obtain a detailed map of the kinetics of entangled folding. In all these cases, the obtained results could be exploited to direct molecular dynamics studies using more detailed atomistic models of the molecules. In the final work period we also directed our work towards the study of Human Ubiquitin C-terminal Hydrolase, an enzyme whose expression is highly specific to neurons and to cells of the diffuse neuroendocrine system and their tumors. We applied our well-tuned modeling techniques, starting the development of a coarse-grained representation capable of folding efficiently towards its complex native topology. The outcomes of the action were disseminated through a methodological paper, a review paper, and a number of seminars at conferences, workshops and research groups. In connection with the project and its results, a mini-course on enhanced sampling methods was also delivered to M.Sc. and PhD students at the University of Trento.
The progress beyond the state of the art associated with the project includes important methodological advancements in the development of a coarse-grained description for self-entangled and complex-lasso protein folding. A genetic parametrization technique for the coarse-grained model has been implemented, which could be employed with all the minimalistic descriptions of proteins adopted in the literature. A further advancement is the definition of proper topological descriptors for the study of self-entangled protein folding. These results have a general impact on the computational study of topologically complex proteins, both for continuing with the next steps outlined in this project and for further possible research. This progress was accompanied by insights into the folding of different self-entangled proteins, which have an impact on current knowledge about the folding and knotting pathways of these polypeptides. The action had an outstanding impact on my profile as a researcher. I have both acquired new experience in the area and fully exploited my previous expertise. The host environment has been deeply beneficial for my growth as a researcher, and the resulting networking has triggered several collaborations.
More info: http://www2.mpip-mainz.mpg.de/.