Proteomics/Post-translational Modification/Glycosylation

= Glycosylation =

Overview
Protein glycosylation is the post-translational process by which saccharides are selectively added to specific protein residues, by two main mechanisms (N-linked and O-linked), in order to convey more structural stability or function to the native protein structure. Specifically, the enzyme-mediated, site-directed addition of sugars is necessary for a protein to anchor properly into a phospholipid bilayer or to take on a cell signaling function. However, because enzyme recognition requirements and consensus sequences are incompletely known, the specificity of these mechanisms for particular peptide sequences is largely unknown. A significant amount of work has therefore gone into defining prediction models for glycosylation sites as an aid to protein modeling as a whole. One example is the NetOGlyc 3.1 Server, developed by the Center for Biological Sequence Analysis. This utility uses neural networks to predict glycosylation sites of type O-GalNAc, drawing on experimentally verified glycosylation sites and structure homology from the PDB.

Purpose
Proteins are glycosylated for several reasons. Some glycoproteins are more stable once they have polysaccharides attached; others rely on them for cell recognition and communication. Still other proteins do not fold properly without their accompanying sugar chains. Because glycosylation is thought to be largely a function of the Golgi apparatus, in vivo experiments on the role of glycosylation in protein conformation are difficult to design.

Roles
Glycosylation plays a critical role in maintaining protein tertiary and quaternary structure, most notably in holding together the intricate structure of the Fab fragment of immunoglobulins. Immunoglobulin G in particular uses as many as thirty unique glycoprotein interactions to create and maintain a unique Fab structure during the lifespan of the serum glycoprotein. Other functions include membrane-bound recognition and adhesion, metabolism, transport, and protein folding and its mediation. In general, where a cellular process requires a diverse protein that must retain structural stability, a glycoprotein is usually involved.

Glycosylation and Proteomic Analysis
= Mechanisms =

All forms of glycosylation are enzymatic, site-specific reactions that use an activated nucleotide sugar. Beyond these three basic requirements, several specific types of glycosylation exist.

N-linked glycosylation (N-Acetylglucosamine)
N-linked glycosylation is the most common form of glycosylation. It is widely employed by Eukaryotes and Archaea, but rarely by Bacteria. It proceeds through a series of steps that begin with the assembly of an oligosaccharide chain of around 14 sugars. This chain is built on a membrane-anchored, long-chain polyisoprenoid lipid known as dolichol. An enzyme known as oligosaccharyltransferase then transfers the completed chain from dolichol, in the ER lumen, onto the nascent polypeptide being produced.

Glycosylation sequon
Glycosylation is triggered by a short sequence of amino acids known as the glycosylation sequon. In the N-linked case, this sequence is an Asparagine, followed by any residue except Proline, followed by Serine, Threonine, or (less commonly) Cysteine.
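The sequon rule lends itself to a simple pattern search. The following is a minimal sketch in Python, assuming a protein sequence given as a string of one-letter amino acid codes; the function name and the example peptide are hypothetical and chosen only for illustration. Note that a matching sequon marks a candidate site only: not every sequon in a real protein is actually glycosylated.

```python
import re

# Asn, then any residue except Pro, then Ser, Thr or Cys.
# The lookahead keeps matches zero-width after the N, so overlapping
# sequons (for example the two in "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][STC])")

def find_n_sequons(sequence):
    """Return the 1-based positions of Asn residues that begin a sequon."""
    seq = sequence.upper()
    return [match.start() + 1 for match in SEQUON.finditer(seq)]

# Hypothetical peptide, not a real protein sequence.
example = "MKNGTLLNPSAANCSQNNTS"
print(find_n_sequons(example))
```

In the example, the Asparagine followed directly by Proline is skipped, while the two adjacent Asparagines near the end each start their own sequon.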



Most N-linked oligosaccharides fall into two categories, which are thought to be determined by the sugar-modifying proteins present as the glycoprotein is processed:

Complex oligosaccharides
Complex oligosaccharides are formed by linking multiple other types of sugars together, starting from two N-Acetylglucosamine molecules. The branching "antennae" characteristic of the chain are often terminated by the addition of sialic acid.

High-Mannose oligosaccharides
High-mannose oligosaccharides are a subset of the oligosaccharides attached to secreted or membrane proteins in eukaryotic cells. They contain 5-9 mannose residues, but lack the sialic acid-terminated antennae of the so-called complex type.
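Because proteomic analysis of glycans usually relies on mass, it can help to see what the 5-9 mannose range means in daltons. The sketch below is illustrative only. It assumes each high-mannose glycan carries the two N-Acetylglucosamine residues of the common N-glycan core (Man5GlcNAc2 through Man9GlcNAc2), uses standard monoisotopic residue masses, and reports the mass of the free, released glycan; the function name is hypothetical.

```python
# Monoisotopic residue masses in daltons (sugar minus one water).
HEXOSE = 162.052824     # mannose, glucose, galactose
HEXNAC = 203.079373     # N-acetylglucosamine, N-acetylgalactosamine
WATER = 18.010565       # added back once for the free glycan

def glycan_mass(n_hexose, n_hexnac):
    """Monoisotopic mass of a free glycan built from hexose and HexNAc residues."""
    return n_hexose * HEXOSE + n_hexnac * HEXNAC + WATER

# High-mannose series, assuming a GlcNAc2 core on every glycan.
for n_man in range(5, 10):
    print(f"Man{n_man}GlcNAc2: {glycan_mass(n_man, 2):.4f} Da")
```

Each additional mannose adds one hexose residue mass, so the members of the series are spaced about 162.05 Da apart, which is the ladder pattern typically looked for in a mass spectrum of released glycans.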

O-linked glycosylation
O-linked glycosylation occurs through the addition of sugars to the hydroxyl oxygen of Serine and Threonine side chains that lie in close proximity to a Proline residue. It typically takes place in the Golgi apparatus. The principal difference from the N-linked variant of protein glycosylation is the side chain atom involved: the O-linked type attaches the sugar through an oxygen, whereas the N-linked type attaches it through the amide nitrogen of Asparagine.

O-N-acetylgalactosamine (O-GalNAc)
This specific type of O-linked glycosylation is thought to occur in the Golgi apparatus and can be characterized, in simple terms, as a further modification of the protein in late processing through the addition of N-acetylgalactosamine to Serine or Threonine (typically Threonine) residues. The reaction is catalyzed by the enzyme UDP-N-acetyl-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase, and the initial sugar is then extended with various carbohydrates to form the complete O-GalNAc structure. This form of glycosylation is critical for the formation of proteoglycans, which are associated with extracellular matrix formation and are a component of mucosal secretions.

O-N-acetylglucosamine (O-GlcNAc)
O-N-acetylglucosamine type glycosylation is the addition of O-GlcNAc to Serine or Threonine residues, and it occurs only where those residues have not already been phosphorylated by serine or threonine kinases. The reaction is carried out by O-GlcNAc transferase, and proteins glycosylated in this manner are found in both the nucleus and the cytoplasm. Because of its relationship with phosphorylation sites, the recognition of glycosylation events of this nature has significant implications for ongoing cancer research, which until now has centered on phosphorylation activity alone. Additionally, O-GlcNAc glycosylation participates in the nutrient-sensing hexosamine signaling pathway, at the terminus of the signal transduction pathway.

O-Mannose
Abundant in brain and muscle cells, this form of glycosylation involves O-mannose beta-1,2-N-acetylglucosaminyltransferase, an enzyme specific for mannose. Defects in it have significant implications in Congenital Muscular Dystrophy (CMD), causing one particular disorder, muscle-eye-brain disease.

O-fucose & O-Glucose
These forms of glycosylation take place between cysteine residues in the EGF-like repeats of the Notch signaling protein. In particular, Human Notch-1 (one of four proteins in the Notch family) contains 12 O-fucose and 17 O-glucose sites, and improper glycosylation is implicated in abnormal development, leukemia, and Cerebral Arteriopathy with Infarcts. For O-fucose glycosylation, the linkage is made by GDP-fucose protein O-fucosyltransferase 1. Another mechanism of O-fucose glycosylation involves GDP-fucose protein O-fucosyltransferase 2 acting on specific Thrombospondin repeats, which are then selectively elongated by glucose addition. Interestingly, both catalytic reactions occur in the Endoplasmic Reticulum rather than in the Golgi apparatus, where glycosyltransferases are traditionally localized.

C-mannosylation
C-mannosylation is a unique form of glycosylation characterized by the covalent bonding of an alpha-mannopyranosyl group to the second carbon (C2) of the indole ring of Tryptophan. Owing to its high specificity, this form of glycosylation was not characterized until the early 1990s, and its function in cellular processes remains poorly understood.



= GPI Anchor =

These membrane-bound proteins are prevalent in most Eukaryotic systems and serve to regulate the release of molecules from cell surfaces and the exchange of membrane molecules. Specifically, they play a critical role in the recognition of enzymatic and antigenic molecules as well as in receptor-mediated signal transduction pathways. Proteins destined for anchorage onto a membrane surface are first attached at the carboxy terminus (C-terminal region) to a phosphodiester linkage system. This consists of phosphoethanolamine attached to a core of three mannose residues and a non-acetylated Glucosamine. The Glucosamine is in turn linked to phosphatidylinositol, which is finally attached to the lipid bilayer via another phosphodiester linker. Solubilization of the membrane-bound protein is achieved by cleavage of the phosphatidylinositol bond by Phospholipase C.

Authors
Jared Carter

Aubrey Bailey

Comparative glycoproteomics: approaches and applications
Reviewer: Corey W.

Main Focus
The finding that Post-Translational Modifications (PTMs) affect many protein sequences led to a shift in the focus of proteomics from sequences to the structural and functional properties of proteins. Glycosylation adds sugars to the chain of amino acids, which affects the structure, solubility and stability of the protein. These modifications also have roles in “embryonic development, immune response and cell-to-cell interactions involving sugar-sugar- or sugar-protein-specific recognition.” Glycosylation is a PTM of growing interest because it lends itself to use as a biomarker, which has proved useful in detecting prostate and breast cancer as well as some neurodegenerative diseases. This article focuses specifically on comparative glycoproteomics and its uses.

New Terms

 * Biomarker: A biomarker, or biological marker, is in general a substance used as an indicator of a biological state. (source: http://en.wikipedia.org/wiki/Biomarker)


 * Lectin: Lectins are sugar-binding proteins that are highly specific for their sugar moieties. (source: http://en.wikipedia.org/wiki/Lectin)


 * MALDI MS: Matrix-assisted laser desorption/ionization (MALDI) is a soft ionization technique used in mass spectrometry, allowing the analysis of biomolecules (biopolymers such as proteins, peptides and sugars) and large organic molecules (such as polymers, dendrimers and other macromolecules), which tend to be fragile and fragment when ionized by more conventional ionization methods. (source: http://en.wikipedia.org/wiki/Matrix-assisted_laser_desorption/ionization)


 * TOF MS: Time-of-Flight Mass Spectrometry is a method of mass spectrometry in which ions are accelerated by an electric field of known strength. This acceleration results in an ion having the same kinetic energy as any other ion that has the same charge. The velocity of the ion depends on the mass-to-charge ratio. The time that it subsequently takes for the particle to reach a detector at a known distance is measured. This time will depend on the mass-to-charge ratio of the particle (heavier particles reach lower speeds). From this time and the known experimental parameters one can find the mass-to-charge ratio of the ion. (source: http://en.wikipedia.org/wiki/Time-of-flight_mass_spectrometry)
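The time-of-flight relationship described in the last definition can be written out numerically. The sketch below is a simplified model, not a description of any particular instrument: it assumes an ideal linear analyzer in which an ion starting at rest gains kinetic energy z·e·U from an accelerating voltage U and then drifts a field-free length L, so that t = L·sqrt(m / (2·z·e·U)). The function names and the example voltage and tube length are hypothetical.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
DALTON = 1.66053906660e-27            # kilograms

def flight_time(mz, voltage, length):
    """Drift time in seconds for an ion of the given m/z (daltons per charge).

    Assumes the ion starts at rest, so z*e*U = (1/2)*m*v**2.
    """
    return length * math.sqrt(mz * DALTON / (2 * ELEMENTARY_CHARGE * voltage))

def mz_from_time(time, voltage, length):
    """Invert flight_time: recover m/z from a measured drift time."""
    return 2 * ELEMENTARY_CHARGE * voltage * (time / length) ** 2 / DALTON

# Hypothetical settings: 20 kV acceleration, 1 m flight tube.
t = flight_time(1000.0, 20000.0, 1.0)
print(f"m/z 1000 arrives after {t * 1e6:.2f} microseconds")
print(f"recovered m/z: {mz_from_time(t, 20000.0, 1.0):.1f}")
```

Since the drift time grows with the square root of m/z, quadrupling the mass-to-charge ratio doubles the arrival time, which is why heavier ions reach the detector later.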

Comparative Glycoproteomics

 * Protein samples pose a dynamic range problem, particularly during separation: the most interesting proteins tend to be those at low concentrations. Glycoproteins, which constitute a subset of total cellular proteins, can be selectively enriched by the use of lectins. Lectins are found in plants, fungi, bacteria and animals and have a unique affinity toward carbohydrates. Lectin affinity chromatography (LAC) is based on the reversible, specific interaction of each lectin with different oligosaccharides. This method allows not only for the enrichment of glycoproteins and glycopeptides but also for discrimination between glycan structures and between glycoforms of the same protein. Using lectins immobilized on a solid agarose support, glycoproteins can be isolated in a column. Peptide-N-glycosidase (PNGase) is then used to label the N-glycosylation site with 18O. This method is called isotope-coded glycosylation-site-specific tagging (IGOT). A mass spectrometer can clearly distinguish the resulting labeled peptides from peptides that were not isotope-coded. The major weakness of this protocol is that some non-specific binding to non-glycosylated proteins can occur.
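The size of the 18O label in the mass spectrum can be estimated with a little arithmetic. The sketch below relies on background that is not spelled out in the text above: when PNGase removes an N-linked glycan, the Asparagine at the site is converted to Aspartic acid, and if the reaction is run in 18O-enriched water the new oxygen is 18O. The constants are standard monoisotopic values; the function names and the example peptide mass are hypothetical.

```python
# Monoisotopic masses in daltons.
MASS_O16 = 15.994915
MASS_O18 = 17.999160
# Asn -> Asp conversion (side-chain NH2 replaced by OH) in ordinary water.
DEAMIDATION_SHIFT = 0.984016

# Net shift per labeled site when the incoming oxygen is 18O.
IGOT_SHIFT = DEAMIDATION_SHIFT + (MASS_O18 - MASS_O16)

def labeled_mass(peptide_mass, n_sites=1):
    """Mass of a deglycosylated peptide carrying n_sites 18O-labeled Asp."""
    return peptide_mass + n_sites * IGOT_SHIFT

def mz_shift(charge, n_sites=1):
    """Observed shift on the m/z axis for an ion of the given charge."""
    return n_sites * IGOT_SHIFT / charge

print(f"shift per site: {IGOT_SHIFT:.4f} Da")
print(f"m/z shift for a 2+ ion: {mz_shift(2):.4f}")
# Hypothetical unlabeled peptide mass of 1500.0 Da.
print(f"labeled peptide: {labeled_mass(1500.0):.4f} Da")
```

A shift of roughly 3 Da per site is what separates a formerly glycosylated peptide from an unlabeled one, and from spontaneous deamidation in ordinary water, which adds only about 1 Da.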

Applications of Comparative Glycoproteomics

 * The functional groups found on these proteins are likely to reveal more about signaling pathways and may tell us more about mechanisms and pathogenesis of certain diseases. The use of separation techniques will also extend the dynamic range and allow us to target small subsets of proteins which will simplify systems and allow for discovery of more aberrantly glycosylated sites.

Cancer

 * Glycosylation profiles are known to change at the onset of oncogenesis. The author gives the example of N-acetylglucosaminyltransferase V (which is responsible for the branching of N-linked glycans), whose up-regulated activity has been linked to tumor invasion and metastasis in several cancers. From this knowledge of differential glycosylation in cancer cells, we can target some glycoproteins for biomarker discovery and diagnostics. In the case of better known biomarkers such as prostate-specific antigen (PSA), we may eventually be able to alter the aberrant sites for use in immunotherapy.

Neurodegenerative Diseases and Neurobiology

 * It has been shown that aberrant glycosylation might be part of the cause of Alzheimer's Disease. The resulting change in folding of a certain protein makes a site easier to phosphorylate, but more difficult to dephosphorylate. It has also been found that glycosylation patterns change in Reelin, a protein involved in cytoarchitectonic organization, which causes up-regulation.

Relevance to a Traditional Proteomics Course
Obtaining the results described above used to be difficult, but advances in structural glycobiology, coupled with mass spectrometry technology, are making these experiments easier to do. These new techniques allow for a more detailed view of glycosylation modifications. As more biomarkers are discovered, there is hope that these types of analyses will become part of the normal diagnostic routine. Putting these new types of proteomics into action will help elucidate fundamental biological processes. These techniques may also help explain certain diseases by helping us discover these rare subsets of proteins.