Long Noncoding RNA

Long non-coding RNAs, also known as lncRNAs, are similar in structure to mRNAs. Unlike mRNAs, however, lncRNAs do not code for proteins. Chemical probing and structural studies have been used to determine the structures of several lncRNAs, including the structure of the ribosome [1]. Approaches for measuring the secondary structure of RNA also apply to lncRNAs. Furthermore, as the name suggests, lncRNAs are long (or large) transcripts produced by mammalian genomes.

History
lncRNAs were discovered and characterized before many of the technological advances that enable the present depth of research on RNA molecules. Initially, these molecules were thought to function like other RNAs known at the time, coding for proteins. However, subsequent experimental data showed that lncRNAs lack open reading frames (ORFs), which indicated that these transcripts do not code for proteins.
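
The absence of an open reading frame can be checked computationally. The following is a minimal sketch in Python, not the method used in the original studies: it scans the three forward reading frames of a transcript for the longest stretch that begins with ATG and ends at an in-frame stop codon. The function names and the 100-codon cutoff are assumptions chosen for illustration; a transcript whose longest ORF falls below the cutoff is only flagged as a candidate non-coding RNA.

```python
# Illustrative sketch: names and the 100-codon threshold are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Length in codons (start included, stop excluded) of the longest
    ATG...stop ORF in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    best = 0
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def looks_noncoding(seq: str, min_codons: int = 100) -> bool:
    """True when no forward-frame ORF reaches the assumed cutoff."""
    return longest_orf_codons(seq) < min_codons
```

For example, looks_noncoding("AUGAAAUAG") is True because the only ORF spans two codons. The sketch ignores the reverse strand and ORFs that lack a stop codon.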

Even though it has been determined that lncRNAs do not participate in protein synthesis, the question remains as to what these transcripts do. Although much about lncRNA mechanisms is still unknown, according to Moran, Perera and Khalil’s paper, “numerous publications in the past several years have now documented important functions for lncRNAs, affecting many biological processes, including regulation of gene expression, dosage compensation, genomic imprinting, nuclear organization and compartmentalization, and nuclear-cytoplasmic trafficking.”

Functional Long Non-Coding RNAs Present in Mammals
Long non-coding RNAs fall into three categories: intronic long non-coding RNAs, natural antisense transcripts, and long intervening non-coding RNAs. Although it is confirmed that the human and other mammalian genomes produce these types of lncRNAs, neither their precise roles in biological processes nor their mechanisms of action have been established. These topics remain under debate, with many competing hypotheses.

Intronic Long Non-Coding RNAs (Intronic lncRNAs)
As the name suggests, intronic long non-coding RNAs (intronic lncRNAs) are expressed only from the introns of protein-coding genes.

Natural Antisense Transcripts (NATs)
Natural antisense transcripts, abbreviated as NATs, are transcripts that overlap with protein-coding genes and are transcribed in the antisense direction. These molecules were discovered in the early 2000s, when over 11,000 lncRNAs were identified in the mouse. On further inspection, a significant number of these lncRNAs were found to be NATs. Moran, Perera and Khalil write that in a different study, it was found that “40% of protein-coding genes in human cells also express NATs.” NATs have been studied throughout the past decade, and research has demonstrated that they have a specific function: they regulate the protein-coding genes with which they overlap.

Long Intervening Non-Coding RNAs (lincRNAs)
Recently, lncRNAs have been found to be expressed from the “intergenic” regions of the genome, stretches of DNA that have few or no genes. These lncRNAs have been named long intervening non-coding RNAs, or lincRNAs.

Before RNA sequencing technologies were available, researchers used the known fact that “actively transcribed protein-coding genes typically display a specific histone modification pattern” to identify lincRNAs. These results showed that both the human and mouse genomes produce over 3300 lincRNAs. RNA sequencing technology confirmed these results and revealed many more lincRNAs. Presently, it is projected that the human genome as a whole produces over 8000 lincRNAs, over half of which are considered to be “high confidence lincRNAs.” These high confidence lincRNAs are found in both the nucleus and the cytoplasm, and are believed to play potential roles in cell identity.
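
The three categories described above are defined by where a transcript lies relative to protein-coding genes, so they can be expressed as a simple rule on genomic coordinates. The sketch below is illustrative only: the Gene and Transcript classes, the half-open coordinate convention, the tie-break order (intronic before antisense), and the extra "other" label for sense-strand transcripts that overlap exons are all assumptions, not definitions taken from the source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """A protein-coding gene (hypothetical minimal model)."""
    chrom: str
    strand: str                    # "+" or "-"
    exons: List[Tuple[int, int]]   # half-open (start, end) intervals

    @property
    def span(self) -> Tuple[int, int]:
        return min(s for s, _ in self.exons), max(e for _, e in self.exons)

    @property
    def introns(self) -> List[Tuple[int, int]]:
        # Introns are the gaps between consecutive exons.
        ex = sorted(self.exons)
        return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(ex, ex[1:])]


@dataclass
class Transcript:
    chrom: str
    start: int
    end: int
    strand: str


def classify(tx: Transcript, genes: List[Gene]) -> str:
    labels = set()
    for g in genes:
        g_start, g_end = g.span
        if g.chrom != tx.chrom or tx.end <= g_start or g_end <= tx.start:
            continue  # no overlap with this gene
        if any(s <= tx.start and tx.end <= e for s, e in g.introns):
            labels.add("intronic lncRNA")  # entirely inside one intron
        elif tx.strand != g.strand:
            labels.add("NAT")              # overlaps gene on opposite strand
        else:
            labels.add("other")            # sense overlap with exons
    if not labels:
        return "lincRNA"                   # overlaps no protein-coding gene
    for label in ("intronic lncRNA", "NAT", "other"):
        if label in labels:
            return label
```

With a gene on the plus strand of chr1 whose exons are (1000, 1500), (3000, 3500) and (4500, 5000), a transcript at 1800 to 2500 is classified as an intronic lncRNA, a minus-strand transcript at 1200 to 3200 as a NAT, and a transcript at 8000 to 9000 as a lincRNA.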

Known Functions of Long Non-Coding RNAs
Even though only a small proportion of lncRNAs have been examined through experiments, the emerging result is that these molecules participate in many biological contexts.

Regulation of Gene Expression
Although gene regulation is complex and typically requires many different factors, recent studies reveal that lncRNAs contribute to gene regulation by various mechanisms.

A well-studied lncRNA called Xist is an example of these gene-regulating lncRNAs. Xist is responsible for initiating and spreading the inactivation of one X chromosome in female somatic cells. Although the specific mechanism by which Xist achieves this is unknown, it is widely accepted that Xist is required to silence hundreds of genes on the inactive X chromosome.

Another process through which lncRNAs regulate gene expression is genomic imprinting. Genomic imprinting requires very tight regulation, mainly because the expression of imprinted genes plays critical roles in mammalian development, and the level of expression between the two alleles of an imprinted gene can vary greatly from one gene to another. A considerable number of lncRNAs, alongside mRNAs, actively regulate the expression of protein-coding genes in cis. “Air,” one of these regulatory lncRNAs, silences three imprinted genes in cis by localizing to chromatin.

Genes can also be regulated by lncRNAs in trans. One such lncRNA, HOTAIR, which is in fact a lincRNA, regulates the expression of human HOXD genes in trans.

Maintenance of Pluripotency
Pluripotency is the ability of a stem cell to give rise to the three germ layers: mesoderm, endoderm, and ectoderm.

Previous studies showed that key transcription and chromatin factors are required to maintain pluripotency. It has since been discovered that some of these transcription factors bind to the promoters of many lincRNAs in mice. Another study indicated that two specific lincRNAs are also required to maintain pluripotency.

In-depth research on the role of lincRNAs in pluripotency maintenance led to a study demonstrating that 26 lincRNAs are required for pluripotency maintenance. Depletion of each of these lincRNAs was determined to lead to “either an exit from the pluripotent state or activation of lineage commitment programs.” It has also been suggested that lincRNAs act as sensors, detecting the environment of the stem cell and signaling the cell either to remain pluripotent or to change in response to alterations in that environment.

Nuclear Organization: Formation of Paraspeckles
The regulation of gene expression takes place at multiple levels: before, during or after transcription, or during translation. Recently, nuclear structures called paraspeckles have been suggested to participate in gene regulation after transcription has taken place. Paraspeckles are dynamic structures, however, and are absent at certain stages of development. The fact that the appearance of paraspeckles coincides with the activation of a lncRNA called nuclear-enriched autosomal transcript 1, or NEAT1, supports the conclusion that NEAT1 is required for the formation of paraspeckles. Furthermore, depletion of NEAT1 causes the loss of paraspeckles in the nucleus; likewise, overexpression of NEAT1 leads to an increase in paraspeckles. These observations suggest that NEAT1 plays an essential role in paraspeckle formation and maintenance in the nucleus.

Regulation of Alternative Splicing
Alternative splicing of pre-mRNAs generates proteomic complexity by producing several protein products with different functions from a single pre-mRNA. A lncRNA called metastasis-associated lung adenocarcinoma transcript 1, or MALAT1, has been identified as having an important role in the alternative splicing of pre-mRNAs. This is supported by the fact that MALAT1 localizes to nuclear speckles, which contain several proteins associated with alternative splicing. In particular, MALAT1 regulates the phosphorylation of specific proteins that are involved in regulating splicing and selecting splice sites in pre-mRNAs. Furthermore, it has been shown that depletion of MALAT1 alters the patterns of alternative splicing of pre-mRNAs. Despite the discovery of MALAT1, however, it has yet to be determined whether other lncRNAs participate in the regulation of alternative splicing.

How Long Non-Coding RNAs Exert Their Effects
Although the detailed mechanisms of action of lncRNAs are not yet fully understood, several mechanisms by which lncRNAs exert their effects can be described.

Long Non-Coding RNAs as Guides for Chromatin-Modifying Complexes
Chromatin-modifying complexes and DNA methyltransferases function by enzymatically modifying chromatin and DNA to activate and repress genes, but the question remains as to how these enzymes, which lack DNA-binding capacity, recognize their target genes. It has recently been suggested that some lncRNAs guide chromatin-modifying complexes and other nuclear proteins to specific locations so that these molecules can exert their respective effects.

The lncRNAs HOTAIR and Air both direct chromatin-modifying complexes to their target genes; the only difference between these two lncRNAs is that HOTAIR acts in trans, whereas Air acts in cis (this is consistent with their gene-regulating functions, which are mentioned in the “Regulation of Gene Expression” section). Recent evidence indicates that in this process, the lncRNA binds to chromatin first and then serves as a “docking station” for chromatin-modifying complexes.

Long Non-Coding RNAs as Structural Links in Ribonucleoprotein Complexes
Although the specific molecular composition of most ribonucleoprotein complexes (RNPs) that surround RNAs in the cell has yet to be established, studies have shown that lncRNAs exist in the cell as RNPs. For example, the lncRNA Xist has been shown to interact with a specific transcription factor that helps connect Xist to the inactive chromosome; Xist forms a molecular bridge of sorts between this transcription factor and a chromatin-modifying complex in order to repress genes on the inactive chromosome.

Other lncRNAs, such as NEAT1 and MALAT1, form molecular scaffolds for several proteins. In doing so, they can regulate the functions of these proteins. However, it is not known how these lncRNAs recognize their respective proteins and begin forming these molecular scaffolds in the nucleus.

Long Non-Coding RNAs as Regulators for Distinct Transcriptional Programs
Several lncRNAs have been shown to respond to specific stimuli in the cell environment. In response, they activate specific transcriptional programs that allow the cell to respond to these stimuli. In certain cases, lncRNAs act as negative regulators. For example, the lncRNA growth arrest specific 5 (GAS5) serves as a negative regulator of glucocorticoid receptors (GRs), which are a specific class of nuclear receptors. When activated, GAS5 interacts with the GRs, preventing them from binding to their specific DNA response elements and carrying out their function. Thus, lncRNAs can modulate the effect a transcription factor has on gene expression, as well as the cell’s ability to respond to stimuli from its environment.

Potential Roles of Long Non-Coding RNAs in Human Disease
Although only a few lncRNAs have been linked to human disease, recently, in some cases, scientists have observed strong associations between lncRNAs and human disease. According to Moran, Perera and Khalil’s paper, “lncRNAs have been found to be dysregulated in a wide range of human diseases and disorders, including various types of cancers.” Even though the mechanisms of most lncRNAs have not been fully discovered in its entirety, several studies have begun to bring clues as to the mechanisms of few lncRNAs. In the article "Genome Regulation by Long Noncoding RNAs." Rinn and Chang mention about the alteration of IncRNAs in human cancers, "Dozens of lncRNAs have been documented to have altered expression in human cancers and are regulated by specific oncogenic and tumor-suppressor pathways, such as p53, MYC, and NF-κB recently described a class of lncRNAs that show periodic expression during the human cell cycle, and many of these are dysregulated in expression in human cancer samples." Cancer has been the disease most studied and researched and it leaves the possibility that IncRNAs are involved in the pathogenesis of many other diseases. In future studies, it is necessary to study potential IncRNA transcripts in regions that do not contain protein-coding genes since this are strongly associated with human diseases.