Theses

ALGORITHMS AND COMPUTATIONAL TOOLS FOR THE STUDY OF INTRINSICALLY DISORDERED PROTEINS

Alejandro Estaña Garcia 1
1 LAAS-RIS - Équipe Robotique et InteractionS
LAAS - Laboratoire d'analyse et d'architecture des systèmes
Abstract : Intrinsically Disordered Proteins (IDPs) are involved in many biological processes. Their inherent plasticity facilitates very specialized tasks in cell regulation and signalling, and their malfunction is linked to severe pathologies. Understanding the functional roles of IDPs requires their structural characterization, which is extremely challenging and calls for a tight coupling of experimental and computational methods. In contrast to structured/globular proteins, IDPs cannot be represented by a single conformation: their models must be based on ensembles of conformations representing the distribution of states that the protein adopts in solution. While purely random-coil ensembles can be reliably constructed with available bioinformatics tools, these tools fail to reproduce the conformational equilibrium present in partially structured regions. In this thesis, we propose several computational methods that, combined with experimental data, provide a better structural characterization of IDPs. These methods can be grouped into two main categories: methods to construct conformational ensemble models, and methods to simulate conformational transitions.
Contributing to the first category, we propose a new approach to generate realistic conformational ensembles that improves on existing methods by reproducing the partially structured regions of IDPs. This method exploits structural information encoded in a database of three-residue fragments (tripeptides) extracted from high-resolution, experimentally solved protein structures. We have shown that conformational ensembles generated by our method accurately reproduce structural descriptors obtained from NMR and SAXS experiments for a benchmark set of nine IDPs. Also exploiting the tripeptide database, we have developed an algorithm to predict the propensity of fragments inside IDPs to form secondary structure elements. On our benchmark set of well-characterized IDPs, this new method provides more accurate results than the most commonly used predictors.
Contributing to the second category, we have developed an original approach to model the folding mechanism of secondary structural elements. The computation of conformational transitions is formulated as a discrete path search problem using the tripeptide database. To evaluate the approach, we have applied it to two small synthetic polypeptides mimicking two common structural motifs in proteins. The folding mechanisms extracted are very similar to those obtained with traditional, computationally expensive approaches. Finally, we have developed a more general method to compute transition paths between a (possibly large) set of conformations of an IDP. This method builds on a multi-tree variant of the TRRT algorithm, developed at LAAS-CNRS, which has provided good results for small and medium-sized biomolecules. To apply this method to IDPs, we have proposed a hybrid strategy for the parallelization of the algorithm, enabling efficient execution on computer clusters.
In addition to the aforementioned methodological work, I have been actively involved in multidisciplinary work, together with biophysicists and biologists, in which I have applied these methods to the investigation of important biological systems, in particular the huntingtin protein, the causative agent of Huntington’s disease. In conclusion, the work carried out during my PhD thesis has enabled a better understanding of the relationship between sequence and structural properties of IDPs, paving the way to novel applications. For example, this deeper understanding of sequence-structure relationships will enable us to anticipate the structural perturbations caused by sequence mutations and, subsequently, to rationally design IDPs with tailored properties for biotechnological applications.
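As an illustration of the kind of transition test that T-RRT-style planners rely on, here is a minimal Python sketch. It is not taken from the thesis: the class name, parameter names and default values are assumptions, the temperature is assumed to be expressed in the same units as the cost, and the multi-tree and parallel aspects mentioned in the abstract are not modelled. It only shows the general idea of accepting downhill moves unconditionally and uphill moves with a temperature-dependent probability.

```python
import math
import random


class TransitionTest:
    """Simplified T-RRT-style transition test (hypothetical names and defaults)."""

    def __init__(self, temperature=1.0, alpha=2.0, max_failures=10, rng=None):
        self.temperature = temperature    # assumed to be in cost units
        self.alpha = alpha                # temperature adjustment factor
        self.max_failures = max_failures  # rejections tolerated before heating
        self.failures = 0
        self.rng = rng or random.Random()

    def accept(self, cost_from, cost_to):
        """Return True if the move from cost_from to cost_to is accepted."""
        # Downhill or flat moves are always accepted.
        if cost_to <= cost_from:
            return True
        # Uphill moves pass a Metropolis-like test.
        probability = math.exp(-(cost_to - cost_from) / self.temperature)
        if self.rng.random() < probability:
            # Cool down after an accepted uphill move.
            self.temperature /= self.alpha
            self.failures = 0
            return True
        # Heat up once too many consecutive uphill moves have been rejected.
        self.failures += 1
        if self.failures > self.max_failures:
            self.temperature *= self.alpha
            self.failures = 0
        return False
```

In a planner, such a test would be called each time a tree is extended towards a new conformation, with the cost typically being the potential energy of the molecule. Adapting the temperature lets the search cross energy barriers when it is stuck while otherwise favouring low-cost regions.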

Cited literature [434 references]

https://hal.laas.fr/tel-02900397
Contributor : Laas Hal-Laas
Submitted on : Thursday, July 16, 2020 - 9:47:31 AM
Last modification on : Friday, July 17, 2020 - 5:11:31 AM

File

ESTANA GARCIA Alejandro.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02900397, version 1

Citation

Alejandro Estaña Garcia. ALGORITHMS AND COMPUTATIONAL TOOLS FOR THE STUDY OF INTRINSICALLY DISORDERED PROTEINS. Bioinformatics [q-bio.QM]. Institut National des Sciences Appliquées de Toulouse, 2020. English. ⟨tel-02900397⟩
