The Importance of Disordered Regions in Proteins: Detection, Prevalence, Prediction, and Biological Significance

A. K. Dunker

Abstract:
The central dogma of molecular biology is that DNA sequence determines messenger RNA sequence, which in turn determines amino acid sequence. A given amino acid sequence then determines the one specific, unique three-dimensional (3D) structure of the protein, and that final 3D structure enables the protein to carry out its function. One of the central unsolved problems in molecular biology is the code by which a given amino acid sequence determines a given 3D fold. This is called the "protein folding problem."
We recently noticed that many functional regions of proteins apparently do not fold into specific 3D structures, but rather remain in an unfolded or disordered state. Since amino acid sequence is supposed to determine protein folding, we reasoned that amino acid sequence should determine non-folding as well. We used simple data analyses and neural networks to test whether correlations can indeed be found between amino acid sequence and non-folding. Our results suggest that there are, indeed, understandable relationships between lack of folding and amino acid sequence, and further, contrary to current views, that nature is evidently very rich in non-folding sequences.
These computer studies encouraged us to consider possible roles of unfolded protein states in the realm of molecular biology. We have identified several functions that evidently require disordered protein, including: (1) protease sensitivity for controlling enzyme activity and turnover; (2) mechanical uncoupling of two or more structured domains; and (3) involvement in molecular recognition whereby the disordered regions become ordered upon complex formation.
The difference between the old view of molecular recognition ("prior folding") and this new view ("coupled folding") seems small, yet the biological implications are profound. For prior folding, affinity and specificity are linked and all molecular interactions are, to the first order of approximation, diffusion limited. Coupled folding, on the other hand, separates affinity and specificity over evolutionary time and enables complicated kinetic control of molecular interactions.
These ideas lead to a new classification scheme for molecular recognition and a proposal for a critical role of disordered regions in the evolution of complex biological networks.

  1. Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J. E., and Dunker, A. K., Identifying Disordered Regions in Proteins from Their Amino Acid Sequence. Proc. IEEE Conference on Neural Networks. 1:90-95 (1997)
  2. Dunker, A. K., Obradovic, Z., Romero, P., Kissinger, C., and Villafranca, J. E. On the Importance of Being Disordered. Protein Data Bank Newsletter 81: 3-5 (1997)
  3. Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J. E., Garner, E., Guilliot, S. and Dunker, A. K. Thousands of Proteins Likely to Have Long Disordered Regions. Pacific Symposium on Biocomputing 3: 435-446 (1998) - in press
  4. Dunker, A. K., Garner, E., Guilliot, S., Romero, P., Albrecht, K., Hart, J., Obradovic, Z., Kissinger, C., and Villafranca, J. E., Protein Disorder and the Evolution of Molecular Recognition: Theory, Predictions and Observations. Pacific Symposium on Biocomputing 3:471-482 (1998) - in press
  5. Romero, P., Obradovic, Z., and Dunker, A. K. Sequence Data Analysis for Long Disordered Regions Prediction in the Calcineurin Family. Proc. of the 8th Workshop on Genome Informatics, December, 1997, Tokyo, Japan (In Press).