Structural Biochemistry/Inherently Disordered Proteins

Many proteins, either in their entirety or in localized regions, do not fold into stable 3-D structures yet are fully functional. Instead of following the usual linear paradigm (sequence to structure to function), the functions of these unfolded proteins arise from different forms, such as structured globules, collapsed disordered ensembles, and extended disordered ensembles. Function can also arise from a disorder-to-structure transition. Understanding these proteins broadens our knowledge of how proteins function beyond the familiar globular 3-D structures.

Characteristics of Non-folding proteins
Since protein folding is directed by the amino acid sequence, researchers tested whether the failure to fold is likewise encoded in the sequence. Predictors were developed to test this hypothesis, and their accuracy was much better than expected by chance. This indicates that non-folding is most likely encoded in the amino acid sequence. Non-folding proteins are depleted in C, W, Y, F, I, V, and L residues and enriched in M, K, R, S, Q, P, and E residues; that is, they contain fewer of the residues that form the hydrophobic interiors of structured proteins and more of the residues found on their surfaces. This compositional bias helps explain why non-folding proteins do not fold into stable 3-D structures.
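The compositional bias described above can be illustrated with a short Python sketch. It simply counts what fraction of a sequence falls into each of the two residue groups named in the text. The function name and the example sequences are hypothetical, and this is not one of the disorder predictors mentioned above, only a rough illustration of the idea behind them.

```python
# Residue groups as listed in the text (one-letter amino acid codes).
ORDER_PROMOTING = set("CWYFIVL")     # depleted in non-folding proteins
DISORDER_PROMOTING = set("MKRSQPE")  # enriched in non-folding proteins


def composition_bias(sequence: str) -> dict:
    """Return the fraction of residues belonging to each group."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    order = sum(1 for aa in seq if aa in ORDER_PROMOTING)
    disorder = sum(1 for aa in seq if aa in DISORDER_PROMOTING)
    return {
        "order_fraction": order / len(seq),
        "disorder_fraction": disorder / len(seq),
    }


# Hypothetical toy sequences, not real proteins.
print(composition_bias("MKKSEEPRQSPEKR"))
print(composition_bias("CWVLIFYVLLIFWC"))
```

A sequence with a high disorder fraction and a low order fraction would be a candidate for a non-folding region, although real predictors use considerably more information than composition alone.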

Eukaryotes contain the largest fraction of disordered proteins. Archaea and eubacteria contain similar fractions to each other, both far smaller than that of eukaryotes. In addition, multicellular eukaryotes have more disordered proteins than unicellular eukaryotes.

Separating unstructured proteins into groups
Partitioning structured proteins according to their amino acid sequence or function is very useful because it gives simple access to a wide variety of proteins and makes it easy to classify newly discovered ones. Unstructured proteins and regions, however, are hard to place into distinct groups because of their diversity, their lack of a 3-D structure, and the variability of their amino acid sequences. One example of this problem is the short linker in calmodulin, which forms a helix in the crystallized form but is flexible in solution. Calmodulin can bind a wide range of target sequences because of this disordered region and because the side chains in the methionine-rich hydrophobic areas of its calcium-binding regions are flexible. Another example is the longer disordered PEVK region of titin, which ranges from about 180 to 2174 residues depending on the isoform: about 180 residues in the cardiac muscle isoform and 2174 in the soleus muscle isoform. In both isoforms the disordered region helps to maintain the appropriate length of muscle fibers. This wide range of functions and the variability in sequence illustrate how difficult it is to group disordered proteins together.

Partitioning was nevertheless accomplished by grouping the disordered proteins into homogeneous subsets. The disordered regions were first assigned to subsets at random, and a separate predictor was developed for each subset. Each disordered region was then reassigned to the subset whose predictor gave the best result for it. New predictors were constructed from the repartitioned subsets, and these steps were repeated until a new cycle produced no further changes. This approach yielded three types, or flavors, named V, C, and S. Flavor S contained a large number of protein-binding regions, flavor V was rich in ribosomal proteins, and flavor C contained a high number of protein modification sites.
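The iterative repartitioning loop described above can be sketched in Python. This is a toy illustration under strong assumptions: each disordered region is represented by a hypothetical numeric feature vector, and each "predictor" is simply the mean vector of its subset, scored by squared distance. The predictors used in the original work were not of this form; only the loop structure (random start, train, reassign, repeat until stable) follows the text.

```python
import random


def train(subset):
    """Toy 'predictor': the mean feature vector of a subset."""
    dims = len(subset[0])
    return [sum(v[d] for v in subset) / len(subset) for d in range(dims)]


def error(predictor, vector):
    """Squared distance; lower means the predictor fits the region better."""
    return sum((p - x) ** 2 for p, x in zip(predictor, vector))


def partition(regions, n_groups=3, seed=0, max_cycles=100):
    """Return one subset label per region after iterative repartitioning."""
    rng = random.Random(seed)
    # Step 1: assign the regions to subsets at random.
    labels = [rng.randrange(n_groups) for _ in regions]
    for _ in range(max_cycles):
        # Step 2: build one predictor per non-empty subset.
        predictors = {}
        for g in range(n_groups):
            members = [r for r, lab in zip(regions, labels) if lab == g]
            if members:
                predictors[g] = train(members)
        # Step 3: reassign each region to the predictor that fits it best.
        new_labels = [
            min(predictors, key=lambda g: error(predictors[g], r))
            for r in regions
        ]
        # Step 4: stop once a full cycle changes nothing.
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical two-feature vectors standing in for disordered regions.
toy_regions = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]
print(partition(toy_regions))
```

With mean vectors as predictors the loop behaves like a simple clustering procedure; swapping `train` and `error` for a real disorder predictor and its accuracy measure would bring it closer to the procedure in the text.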

Functions of disordered proteins and regions
Non-folding proteins and regions play significant roles in biology, taking part in signaling and regulatory pathways through specific protein-protein, protein-nucleic acid, and protein-ligand interactions. Their functions fall into four categories: 1) molecular recognition, 2) molecular assembly, 3) protein modification, and 4) entropic chain activities. Because non-folding proteins interact with a wide range of partners, they allow complex protein-protein networks to be organized.

The disorder-associated and structure-associated functions in Swiss-Prot, a protein database, have been identified. Of 710 functional keywords, 302 were structure-associated, 238 were disorder-associated, and 170 were structurally ambiguous. This revealed the functional diversity of disordered proteins, which complements that of structured proteins. Another analysis suggested that disordered proteins have more functions than structured proteins, with non-folding proteins involved mainly in signaling and regulatory processes and folded proteins associated with catalysis and transport.

Non-folding proteins and regions often take part in molecular interactions mediated by localized binding sites such as eukaryotic linear motifs (ELMs), short linear motifs (SLiMs), and molecular recognition features (MoRFs). ELMs and SLiMs are short sequence patterns, found in many proteins, that bind to a common target. MoRFs, on the other hand, are identified by a characteristic pattern in the output of a disorder predictor. Non-folding regions are also primary loci for alternative splicing.
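Because ELMs and SLiMs are short sequence patterns, searching for them can be sketched as a regular-expression scan. The minimal Python example below looks for the proline-rich PxxP pattern, which is often cited in connection with SH3-domain binding and is used here purely as an illustration. The function name and the sequence are hypothetical, and a match on its own does not show that a site is functional.

```python
import re


def find_motif(sequence: str, pattern: str) -> list:
    """Return (start, matched_text) for every match, including overlaps."""
    # Wrapping the pattern in a lookahead lets overlapping
    # occurrences be reported as separate hits.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


# "P..P" encodes PxxP (proline, any two residues, proline).
# The sequence is made up for this example; positions are 0-based.
for start, hit in find_motif("MSAPPLPPRTG", "P..P"):
    print(start, hit)
```

Real motif resources describe each motif with a more detailed pattern and usually combine matches with context such as predicted disorder, since short patterns occur frequently by chance.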

A summary of some protein functions associated with structural disorder:

San1 is an E3 ubiquitin ligase whose disorder serves to recognize misfolded substrates. Hsp33 is a redox chaperone whose disorder serves to bind misfolded substrates. Phd is a bacterial antitoxin whose disorder is involved in the allosteric regulation of bacterial toxins. Sic1 is a cyclin-dependent kinase inhibitor whose disorder enables "polyelectrostatic" interactions with the Cdc4 ubiquitin ligase. WASP is a regulator of actin polymerization whose disorder provides allosteric regulation. p27, like Sic1, is a cyclin-dependent kinase inhibitor, but its disorder regulates its targeted degradation. CREB-binding protein (CBP) is a general transcription co-activator whose disorder allows it to interact, through induced folding, with a large range of transcription factors. LEA proteins are stress response proteins in plants and animals whose disorder provides chaperone function under abiotic stress via disorder transfer.

Inherently Disordered Proteins in Diseases
Non-structured proteins have been implicated in human diseases, since many disease-associated proteins are either wholly disordered or contain long disordered stretches. An important malfunction caused by these disordered proteins is the aggregation of non-folding sequences into amyloid fibrils rich in β-structure, which is associated with the pathogenesis of neurodegenerative diseases such as Alzheimer's, Parkinson's, Huntington's, and prion diseases.

Oligomers or protofibrils of the disordered polypeptides appear to be the pathogenic entities in diseases such as Alzheimer's. It has been suggested that their mode of action may involve creating pores in the plasma membrane of affected cells. Several techniques have since shown that the amyloid peptides involved in different diseases form similar channels. Atomic force microscopy (AFM) showed that amyloid peptides incorporated into lipid bilayers form pore-like structures. Another example is the synuclein family, which consists of three homologous proteins: α-synuclein, β-synuclein, and γ-synuclein. All three contain roughly 130 amino acid residues and are typically intrinsically disordered. The aggregation of α-synuclein into oligomers, protofibrils, and fibrils links it closely to Parkinson's disease, Lewy body dementia, and other neurodegenerative diseases collectively known as synucleinopathies. Unlike α-synuclein, β-synuclein and γ-synuclein have a lower tendency to fibrillate and can even prevent fibril formation by α-synuclein. α-Synuclein has been shown to be structurally plastic: it can adopt several structurally unrelated conformations, depending strongly on its environment and on the availability of binding partners. For example, α-synuclein is known to adopt α-helical structure when associated with phospholipids or micelles.

Structural disorder has also been found in many other disease-related proteins, such as p53 and the cystic fibrosis transmembrane conductance regulator (CFTR). It occurs in proteins associated with cancer, neurodegenerative diseases (as described above), cardiovascular diseases, and diabetes. It has been hypothesized that structural disorder permits the cellular presence of oncogenic protein chimeras. A negative consequence of structural disorder is apparent in the dosage sensitivity of genes that are harmful when overexpressed.

Structural disorder is also critical for pathogens. For example, virus entry, replication, and budding rely on deregulating the signaling of the host cell, which is carried out through varied interactions of viral proteins with key host regulatory proteins.

Studies of intrinsically disordered proteins provide a more in-depth understanding of the cause and progression of many disease states. They may also assist in improving treatments for these conditions.

Drug Development
Intrinsically disordered proteins differ from conventional drug targets because they do not have enzymatic activity. Traditional drugs usually target the active sites or ligand-binding pockets of enzymes or receptors. Intrinsically disordered proteins instead engage in protein-protein interactions, which can be disrupted by small molecules. This partner-targeting approach has been advocated for drug development, although further work is needed to make it effective inside a cell.

Disorder Exists in vivo
In the past, many researchers were unsure whether the structural disorder observed in proteins in vitro also occurs in vivo, or whether it results from the isolation and large dilution of the protein in the test tube. Multiple studies have demonstrated that macromolecular crowding does not force intrinsically disordered proteins to fold completely in the cell. NMR experiments support the existence of disorder in vivo: the technique was applied to α-synuclein overexpressed in E. coli cells. Functional studies, such as those of chaperone function associated with structural disorder in live cells, provide further, indirect evidence.