Structural Biochemistry/Nucleic Acid/DNA/Replication Process

DNA replication is required for cell division, which allows organisms to grow. During replication, the two strands of the parental double helix are separated, and each serves as a template for a new complementary strand, so that each daughter molecule carries the same genetic information as the original. The site where strand separation begins is called the "origin". The double-stranded structure of DNA is what makes this copying mechanism possible: once the two strands are separated, their complementary strands are rebuilt by DNA polymerase, an enzyme that specializes in making complementary strands. It selects the correct complementary base at each position of the template and extends the new strand in the 5' to 3' direction. Because each daughter duplex preserves one of the original strands, the process is called "semiconservative replication". Replication is initiated when the double-stranded DNA at the origin of replication is separated, or melted. The melted region is then propagated and a mature replication fork forms. DNA melting and replication fork formation are coordinated by initiators, helicases, and other cellular factors. Recent structural studies of initiators and replicative helicases have focused on archaeal and eukaryotic systems. The results of these studies have provided new insight into possible mechanisms of the early stages of DNA replication.
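The base-pairing and semiconservative copying described above can be illustrated with a minimal Python sketch. It assumes ideal Watson-Crick pairing with no errors, and the function names `synthesize_complement` and `replicate` are hypothetical, chosen only for this illustration.

```python
# Watson-Crick pairing rules (A-T, G-C): a simplified stand-in for the
# base selection performed by DNA polymerase.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_complement(template_5to3):
    """Return the new strand, written 5'->3', for a template written 5'->3'.

    The template is read 3'->5' (reversed here), so the new strand
    grows in the 5'->3' direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3))


def replicate(duplex):
    """Semiconservative replication of one duplex.

    `duplex` is a pair of antiparallel strands, each written 5'->3'.
    Each daughter duplex keeps one parental strand and gains one new strand.
    """
    top, bottom = duplex
    return [
        (top, synthesize_complement(top)),        # parental top + new bottom
        (synthesize_complement(bottom), bottom),  # new top + parental bottom
    ]


parent = ("ATGCGTTA", synthesize_complement("ATGCGTTA"))
daughters = replicate(parent)
print(parent[1])      # TAACGCAT
print(daughters)      # two duplexes identical in sequence to the parent
```

Both daughter duplexes have the same sequence as the parent, and each contains exactly one of the two original strands.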

Genomic DNA replication is an essential process in all living things. Replication can be divided into initiation, elongation, and termination steps.

Initiation of DNA replication
During initiation, initiators recognize and bind the replication origin DNA and convert it to a replication fork. Initiation consists of the following steps: initiators assemble around the origin DNA, and the dsDNA origin is melted. Melting of the dsDNA produces a replication fork on each side of the origin, which allows bi-directional replication. Before this can happen, however, topological limitations must be overcome to convert the melted origin to a fork structure. The assembly of initiators at the origin induces the initial melting of origin dsDNA, which can be detected by biochemical methods. In archaeal and eukaryotic cellular systems, the duration of origin melting is still uncertain. In the SV40 system, however, origin melting has been shown to be induced by the assembly of LTag. SV40 LTag is capable of inducing origin melting and unwinding, so it is considered the initiator in the eukaryotic system and has been used as a model to study the origin recognition, assembly, and melting processes. To convert the melted dsDNA origin into an active replication fork, the assembled initiators expand the melted region and position the helicase on the fork.

The initiation step is one of three steps in DNA replication (along with elongation and termination). In initiation, replication proteins called initiators convert the origin DNA into a replication fork. The initiator proteins first assemble around the DNA, which causes melting of the dsDNA (double-stranded DNA) origin. Origin melting then produces a replication fork on each side of the melted origin, resulting in bi-directional replication. Ring-shaped helicases assist in this process. However, the mechanism by which the initiators and helicase melt and unwind the origin DNA is not well understood, because high-resolution structures of the intermediates are lacking.

In eukaryotic and archaeal cellular systems, the initiator proteins include Orc, Cdc6, Cdt1, and the MCM (mini-chromosome maintenance) helicase. MCM is one of the most important factors in the formation of the unwound fork. MCM forms hexamers that can dimerize into double-hexamers. The helicase SV40 large T antigen (LTag) is able to recognize the origin DNA and can melt and unwind it into a replication fork without the use of cofactors. SV40 LTag is considered the archetypal initiator/helicase in eukaryotic systems and is a model for studying origin recognition, assembly, and melting.

Crystal structures of the LTag hexamer reveal a central channel 13-17 Å in diameter, which is wide enough for ssDNA to pass through but not for dsDNA (about 20 Å). It is believed that melted ssDNA is encircled in the central channel of the hexameric helicase, even during assembly at the origin.
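The size argument above amounts to a simple comparison, sketched here in Python. The sketch treats DNA as a rigid cylinder, which is a deliberate simplification; the diameters are the values quoted in the text, and the function names are hypothetical.

```python
DSDNA_DIAMETER = 20.0              # angstroms, value quoted in the text
LTAG_CHANNEL_RANGE = (13.0, 17.0)  # angstroms, reported range of the LTag channel


def fits_through(channel_diameter, dna_diameter):
    """Rigid-cylinder check: the DNA passes only if the channel is at least as wide."""
    return channel_diameter >= dna_diameter


def required_widening(channel_diameter, dna_diameter):
    """Extra diameter (angstroms) the channel would need for the DNA to pass undeformed."""
    return max(0.0, dna_diameter - channel_diameter)


for width in LTAG_CHANNEL_RANGE:
    print(width,
          fits_through(width, DSDNA_DIAMETER),
          required_widening(width, DSDNA_DIAMETER))
```

Under this rigid model, dsDNA does not fit anywhere in the reported range, and the channel would have to widen by 3 to 7 Å to admit it undeformed. Either the channel expands or the DNA is deformed.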

LTag also has β-hairpins in the central channel that are configured in a planar arrangement. The β-hairpins, together with the DR/F loops, form two adjacent planar rings, which make up the narrowest part of the channel in the AAA+ domain. It remains an open question whether LTag expands to accommodate dsDNA or whether the dsDNA is deformed by the initiator/helicase to fit the narrow channel. For the latter to occur, LTag must squeeze the dsDNA, which disrupts the base pairs and melts the dsDNA. This model is often referred to as the squeeze-to-open model.

The most widely accepted model for fork unwinding is that of a ring-shaped helicase that encircles a DNA strand and migrates along it, splitting the dsDNA into ssDNA.

In prokaryotic cells, bacterial replicases contain a polymerase, polymerase III (Pol III), a β2 factor, and a DnaX complex. They are highly processive, yet they cycle rapidly during Okazaki fragment synthesis. DnaA, an origin recognition protein, can start the melting of the origin into single-stranded DNA (ssDNA). The ssDNA is the site for loading the hexameric helicase DnaB, which exists only as single hexamers. The DnaB hexamer (DnaB6) separates the two strands at the replication fork. It translocates in the 5'->3' direction and uses ATP hydrolysis to move along the strand and split the duplex. The DNA polymerase III holoenzyme (Pol III HE) makes contacts at the replication fork and functions as a dimer whose affinity for the lagging strand appears to be regulated so that it can recycle between primers during Okazaki fragment synthesis. Primase interacts with the helicase and synthesizes short RNA primers for Okazaki fragment synthesis. Pol III HE extends each RNA primer until it receives a signal to recycle to the next primer at the replication fork. Afterwards, DNA polymerase I removes the RNA primers and fills the gaps between the Okazaki fragments, and DNA ligase seals the remaining nicks. The N-terminal end of DnaB is free for docking primase, which makes it easy for the primase to capture the ssDNA emerging from the N-terminal domain during fork unwinding.

Initiating Replication in Archaea and Eukaryotes by Melting the Double Stranded DNA
Although not much is known about how replication is initiated by the melting of double-stranded DNA, recent studies have shed light on possible mechanisms. Two co-crystal structures from archaea that contain both the initiators and the origin DNA show how the initiators recognize the double-stranded origin DNA. In these Cdc6/Orc-dsDNA complexes, the double-stranded DNA is deformed and bent, but not melted. Researchers therefore believe that additional factors, such as the MCM helicase mentioned in the section above, are needed to trigger melting of the double-stranded DNA and to generate higher-order complexes at the origin of replication.



This image represents an example of the structure of a DNA replication initiator, specifically showing the Cdc21 and Cdc54 (similar to the Cdc6 described above) N-terminal domain. The initiator, Cdc6/ORC 1 (which is not depicted here but can be represented by the picture above), binds to the origin of replication and bends the DNA. Citation: http://www.ebi.ac.uk/ http://upload.wikimedia.org/wikipedia/commons/c/c6/PDB_1ltl_EBI.jpg

In eukaryotes, SV40 LTag at the origin is able to trigger the melting of the origin of replication and the subsequent unwinding of DNA, which makes it the initiator-helicase used as a model system for examining origin recognition, assembly, and the melting of the double-stranded DNA. Crystal structures of LTag hexamers that are not bound to DNA have channels that seem able to accommodate only single-stranded DNA: the channels are usually about 13 to 17 Å (angstroms) wide, while double-stranded DNA has a diameter of about 20 Å and therefore cannot fit inside. Studies of DNA translocation have generally shown that for double-stranded DNA to fit inside the channel of an LTag hexamer without changing its shape, the channel must be at least 20 Å in diameter. In addition to being too narrow, the central channel in crystal structures of LTag hexamers contains a planar arrangement of β-hairpins.



Here is an example of a β-hairpin, a component of the LTag hexamer structure. The β-strands in the β-hairpin are antiparallel, meaning that the N-terminus of one β-strand is aligned with the C-terminus of the other. In the LTag hexamer, the β-hairpins lie in the same plane in the central region of the channel. Citation: http://commons.wikimedia.org/wiki/File:Beta_hairpin.png

Recently, cryo-EM has demonstrated that LTag hexamer channels can bind double-stranded DNA, with two hexamers surrounding the double-stranded DNA. Researchers, however, are still unsure whether the double-stranded DNA changes configuration because of the initiator-helicase or whether LTag widens to allow the double-stranded DNA to bind. One model, the squeeze-to-open model, asserts that the LTag hexamer fits the origin double-stranded DNA into its narrower channel by squeezing the DNA through the channel. As a result, base pairs are disrupted and the double-stranded origin DNA melts. This model has been proposed and is in the process of being tested, because it appears to be consistent with the known data on DNA melting.

The Formation of the Replication Fork by the Squeeze-Pumping Model:

The squeeze-pumping model derives from information about the structure of LTag hexamers. The structure includes the narrow channel mentioned above, an AAA+ motor domain, a side channel through which single-stranded DNA can exit, and inter-Zn domains. In this model, the DNA is melted as in the squeeze-to-open model described above, and the melted DNA is pumped toward the Zn domain until it forms a single-stranded DNA loop, which can then leave the channel and form the replication fork.

Translocation of Single and Partially Hydrolyzed Double Stranded DNA:

Researchers have demonstrated that double-hexameric LTag and MCM can unwind DNA. In its double-hexameric form, LTag has been shown to unwind long double-stranded DNA that includes an internal origin sequence. This differs from the steric exclusion model of fork unwinding, which is the most widely accepted model. The steric exclusion model is based on evidence that a ring-shaped helicase surrounds one of the DNA strands and moves along it toward the double-stranded DNA fork, exposing the other strand as single-stranded DNA in the process.



The above image represents a ring-shaped hexameric helicase structure that surrounds and moves down the DNA strands (which are not depicted in the photo). Citation: http://www.ebi.ac.uk/Information/termsofuse.html ; http://commons.wikimedia.org/wiki/File:PDB_1g8y_EBI.jpg

Sources:

Curr Opin Struct Biol. 2010 Sep 24. [Epub ahead of print] Origin DNA melting and unwinding in DNA replication. Gai D, Chang YP, Chen XS. Molecular and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA.

Links between DNA replication and protein synthesis
For decades, DNA replication and protein synthesis were studied separately, and few scientists discussed the link between these two critical processes in living organisms. In their article “When DNA replication and protein synthesis come together”, Jonathan Berthon, Ryosuke Fujikane, and Patrick Forterre provide a detailed account of the connections between these seemingly independent fields of structural biochemistry. They suggest that unexpected but real connections between DNA replication and protein synthesis are found in the three domains of life, especially in Archaea and Eukarya, and that there are mechanisms that couple DNA and protein synthesis. Such mechanisms can be found in the activities of (p)ppGpp, a guanosine polyphosphate derivative, and of GTPases of the Obg family.
 * The stringent response links DNA replication in bacteria to the availability of amino acids. When cells are starved of amino acids, the intracellular (p)ppGpp concentration rises dramatically, which shuts down rRNA gene transcription as well as protein synthesis. The response varies among bacterial species. For instance, in Bacillus subtilis, amino acid starvation, along with inhibiting rRNA gene transcription, blocks the elongation step of DNA replication. (p)ppGpp also inhibits the DnaG primase in Bacillus subtilis and could thereby directly affect Okazaki fragment synthesis on the lagging strand. In Escherichia coli, on the other hand, the stringent response immediately interferes with the initiation of DNA replication. These findings are important evidence of a direct connection between protein synthesis and DNA replication: amino acid starvation has the potential to stop DNA replication.
 * Another connection is the Obg family. Obg is known for its ability to couple ribosome biogenesis, a critical step in protein production because proteins are synthesized on ribosomes from mRNA, with DNA replication. Scientists argue that the link between ribosome biogenesis and DNA replication originates from proteins that originally functioned in making ribosomes. These proteins participate in regulating the stringent response in bacteria as well as in stabilizing DNA replication forks. One Obg protein, ObgE, helps control the levels of (p)ppGpp. An important link between DNA replication and protein synthesis is that depletion of ObgE causes problems in chromosome segregation and cell separation. This shows that changes in certain proteins can directly affect the pattern of DNA replication and the organism’s genetic processing. For this reason, Obg proteins have been studied to establish their direct role in connecting DNA replication and protein synthesis.
 * Similarly, the NOG1 (nucleolar G-protein) family also participates in making ribosomes. Nog1p, a member of this family, belongs to a complex that contains many other proteins that take part directly in DNA replication, such as Orc6p (origin recognition complex), Mcm6p and other subunits of the MCM complex, Yph1p, and Rrb1p. Kilian stated that changes in the proteins that connect ribosome biogenesis to DNA replication can cause “chromosome instability” and “tumor formation”. He also concluded that there exists a network of proteins that directly links ribosome production and DNA replication in the Eukarya domain.
 * The studies and conclusions above do not extend to Archaea, for which no clear evidence has been found. Scientists have, however, found a cluster of genes that encodes both DNA replication and translation proteins. This cluster includes numerous genes, among them essential ones such as aIF-2, a candidate for regulating both DNA replication and protein synthesis. Phosphorylation of eIF-2 is a major component of the mechanism of protein synthesis in eukaryotic cells. Another important component is Nop10, which plays a part in rRNA development. These components suggest a close relation between protein synthesis and DNA replication. One important example is that the two ribosomal proteins L44E and S27E interfere with the DNA replication process under special conditions such as amino acid starvation, as discussed above for the stringent response.
 * In conclusion, in both Archaea and Eukarya, many experimental data confirm or suggest close connections between protein synthesis and DNA replication. The stringent response is one example of how amino acid starvation can inhibit the initiation of DNA replication.





The DNA replication process works in an "assembly line"-like fashion. The DNA double helix is pulled apart and a copy of each strand is produced. Many enzymes take part and must be present for this vital process to occur correctly.

Replication Fork
When DNA is replicated, helicase separates the two strands and creates a replication fork. The new strands synthesized on the two separated templates are called the leading strand and the lagging strand. The leading strand is synthesized continuously in the 5' to 3' direction by DNA polymerase as it follows the fork. The lagging strand, on the opposite template, grows overall in the 3' to 5' direction and is synthesized discontinuously as Okazaki fragments. Primase builds RNA primers, and DNA polymerase uses the 3' OH group of each RNA primer to extend the DNA in the 5' to 3' direction. The RNA primers are then replaced with deoxyribonucleotides, and the fragments are joined by DNA ligase to complete the chain.
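The discontinuous synthesis of the lagging strand can be sketched in Python as follows. This is a simplified illustration: it ignores RNA primers, uses an arbitrary fixed fragment length, and the function names are hypothetical. It shows only that fragments made one at a time in the 5' to 3' direction can be joined into a complete complementary strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq_5to3):
    """Copy a template (written 5'->3') into a new strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq_5to3))


def okazaki_fragments(lagging_template_5to3, fragment_len):
    """Synthesize the lagging strand as fragments, listed in order of synthesis.

    The advancing fork exposes the lagging-strand template from its 5' end
    onward, one stretch at a time. Each exposed stretch is read 3'->5',
    so the fragment copied from it grows 5'->3', away from the fork.
    """
    fragments = []
    for start in range(0, len(lagging_template_5to3), fragment_len):
        exposed = lagging_template_5to3[start:start + fragment_len]
        fragments.append(reverse_complement(exposed))
    return fragments


def ligate(fragments):
    """Join the fragments into one strand written 5'->3' (the role of DNA ligase).

    The most recently made fragment lies nearest the fork and forms the
    5' end of the finished strand, so fragments are joined in reverse order.
    """
    return "".join(reversed(fragments))


template = "ATGCGTTAGC"
fragments = okazaki_fragments(template, 4)
print(fragments)           # ['GCAT', 'TAAC', 'GC']
print(ligate(fragments))   # GCTAACGCAT
```

The ligated product equals the strand that continuous synthesis would have produced on the same template. This is why fragmentary synthesis lets the lagging strand be built by an enzyme that only works 5' to 3'.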

As the DNA unwinds, it is forced to rotate, which twists the structure. This is a problem for replication, because over-twisted DNA eventually becomes physically impossible to unwind further. To solve this problem, enzymes called DNA topoisomerases are used. Topoisomerase I cuts the backbone of one strand to allow the DNA to untwist, and topoisomerase II cuts the backbones of both strands so that another DNA segment can pass through the break, which removes tangles between DNA molecules.
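The scale of the twisting problem can be estimated with a short Python sketch. It assumes a helical repeat of about 10.5 base pairs per turn for B-form DNA, a value not given in the text, and it assumes the DNA ahead of the fork cannot rotate freely. The function names are hypothetical.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA; not stated in the text


def turns_unwound(base_pairs):
    """Helical turns removed when this many base pairs are separated."""
    return base_pairs / BP_PER_TURN


def supercoils_remaining(base_pairs_unwound, turns_relaxed=0.0):
    """Over-twisting (in turns) left ahead of the fork.

    If the DNA ends cannot rotate, every turn unwound at the fork
    reappears as over-twisting ahead of it unless a topoisomerase
    relaxes it (`turns_relaxed`).
    """
    return max(0.0, turns_unwound(base_pairs_unwound) - turns_relaxed)


print(turns_unwound(1050))                 # turns unwound for 1050 bp
print(supercoils_remaining(1050, 40.0))    # what is left after relaxing 40 turns
```

Under these assumptions, roughly one turn of over-twisting builds up for every ten base pairs unwound, so topoisomerases must act continuously as the fork advances.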

Helicase
Helicases are motor proteins that move along double-stranded nucleic acids and actively unwind the double helix. The enzyme uses the energy released by the hydrolysis of ATP to ADP to unwind and separate the strands of DNA, which it does by breaking the hydrogen bonds between the paired nucleotide bases. Helicase opening of the double strand falls into two cases: active opening and passive opening. In active opening, the helicase directly destabilizes the double-stranded nucleic acid (dsNA) to promote the separation of the two strands. In passive opening, the helicase binds to single-stranded nucleic acid (ssNA) that is exposed when thermal fluctuations transiently open part of the double strand. Active opening has been found to increase the rate of DNA unwinding seven-fold compared to passive opening. The product of helicase action is two template strands: one is the template for the leading strand and the other is the template for the lagging strand.

The leading strand is the new strand that is synthesized continuously without interruption, while the lagging strand is formed in fragments. These fragments are called Okazaki fragments. This explains how both new strands are synthesized in the 5'->3' direction despite the fact that the two template strands are antiparallel. Fragmentary synthesis allows each fragment to grow 5'->3' while the lagging strand as a whole appears to grow in the 3'->5' direction.

Single-Stranded DNA Binding Proteins
Single-stranded DNA binding proteins bind to the separated DNA templates and prevent the two strands from reannealing. These proteins keep the strands separated so that both can serve as templates for replication. This allows the remainder of the replication machinery to get into position and begin making new DNA strands.

DNA Polymerase
(see DNA Polymerase Section)

RNA Primase
RNA primase attaches to the lagging-strand template at a position adjacent to the helicase. Its function in DNA replication is to lay down short RNA primers, reading the template in the 3' to 5' direction. These RNA primers provide the starting points from which DNA polymerase adds complementary nucleotides, and each fragment ends where it meets the primer of the preceding fragment. The stretches synthesized between RNA primers are known as Okazaki fragments. Primase is needed repeatedly on the lagging strand because DNA polymerase can only add complementary bases in the 5' to 3' direction, while the lagging-strand template is exposed in the opposite direction as the fork unwinds. The leading strand, by contrast, needs only a single primer.

DNA Replicases from a Bacterial Perspective

Mitochondrial DNA Replication
Mitochondrial DNA (mtDNA) is maintained apart from nuclear DNA. Because of its small size, mtDNA carries only 37 genes, which yield 13 protein products, whereas the haploid nuclear genome encodes over 20,000 genes. Nevertheless, it can provide a model system for studying nuclear DNA replication. The circular human mtDNA genome contains approximately 16,600 base pairs. The encoded genes are necessary for making ATP by oxidative phosphorylation. mtDNA replication does not appear to be restricted to a specific phase, so it can take place repeatedly during a cell cycle.

The endosymbiont hypothesis is the idea that mitochondria descend from free-living bacteria that were engulfed by an ancestral cell, giving rise to the first eukaryote. The existence of mtDNA itself supports this hypothesis. Because mitochondria were once free-living bacteria, it might be anticipated that the mechanics of mtDNA maintenance would resemble those of prokaryotes more than those of eukaryotes.

The mechanism by which mtDNA is replicated was first described in 1972 using electron microscopy. All replicating mtDNA molecules had a single-stranded branch, which suggested that leading-strand and lagging-strand synthesis are uncoupled in mitochondria, unlike at the replication fork of nuclear DNA. Human mtDNA is typically arranged in covalently closed circles that are about one genome in length. Several modes of mtDNA replication have been described: a strand-displacement replication fork, in which leading-strand DNA synthesis occurs in the absence of lagging-strand DNA synthesis; conventional coupled leading- and lagging-strand DNA synthesis; and delayed lagging-strand DNA synthesis accompanied by incorporation of RNA on the lagging strand, termed RITOLS (RNA incorporation throughout the lagging strand).

The question of how mammals replicate their mtDNA was reopened ("mtDNA replication redux") in an attempt to test the idea that biased segregation of human pathological mtDNA variants was related to replicative advantage, as suggested for yeast mtDNA. Two-dimensional agarose gel electrophoresis (2D-AGE), a technique that had been used to define details of the replication mechanisms of nuclear, plasmid, and viral genomes, was used to resolve replication intermediates from mitochondria. Many replication intermediates from crude mitochondrial preparations were sensitive to single-strand nuclease, as predicted by the strand-displacement model (SDM), while a subset formed arcs indistinguishable from those associated with replicating nuclear and prokaryotic DNA.

However, purer preparations of mitochondria yielded RNA/DNA hybrids rather than partially single-stranded DNA. This suggested that the SDM intermediates seen in earlier studies could be explained by RNA loss during isolation and processing.

In conclusion, the mechanisms of mtDNA replication remain controversial. The strand-displacement model requires a minimum of two primer maturation events for each strand, and the same applies to RITOLS replication. The identification of Dna2 and Fen1 in mitochondria provides new tools for studying mtDNA replication: manipulating their expression and studying mutant variants that disrupt mtDNA replication might prove very informative.

Mutations, deletions, and other problematic rearrangements of mtDNA have been found to increase as mammals age. The accumulation of mutated mtDNA in single cells causes respiratory chain deficiency, which shortens the life span of the animal. When many mutations accumulate, they can also cause aging phenotypes such as weight loss and hair loss.