A method for partitioning the information contained in a protein sequence between its structure and function / A. Possenti, M. Vendruscolo, C. Camilloni, G. Tiana. - In: PROTEINS. - ISSN 0887-3585. - 86:9(2018 May 23), pp. 956-964. [10.1002/prot.25527]

A method for partitioning the information contained in a protein sequence between its structure and function

C. Camilloni (penultimate author); G. Tiana (last author)
2018

Abstract

Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and the amount left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.
designed proteins; information content; intrinsically disordered proteins; protein folding/function; structure prediction
Academic discipline FIS/03 - Physics of Matter
Academic discipline FIS/07 - Applied Physics (Cultural Heritage, Environment, Biology and Medicine)
23 May 2018
https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.25527
Article (author)
Files in this item:
manuscript_v2_copy.pdf
Access: open access
Type: Post-print, accepted manuscript (version accepted by the publisher)
Size: 2.08 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/594816
Citations
  • PMC 1
  • Scopus 5
  • ISI 5