Principles of Biochemistry/Cell Metabolism I: DNA replication

DNA replication is a biological process that occurs in all living organisms and copies their DNA; it is the basis for biological inheritance. The process starts with one double-stranded DNA molecule and produces two identical copies of the molecule. Each strand of the original double-stranded DNA molecule serves as a template for the production of the complementary strand. Cellular proofreading and error-checking mechanisms ensure near perfect fidelity for DNA replication. In a cell, DNA replication begins at specific locations in the genome, called "origins". Unwinding of DNA at the origin, and synthesis of new strands, forms a replication fork. In addition to DNA polymerase, the enzyme that synthesizes the new DNA by adding nucleotides matched to the template strand, a number of other proteins are associated with the fork and assist in the initiation and continuation of DNA synthesis. DNA replication can also be performed in vitro (artificially, outside a cell). DNA polymerases, isolated from cells, and artificial DNA primers are used to initiate DNA synthesis at known sequences in a template molecule. The polymerase chain reaction (PCR), a common laboratory technique, employs such artificial synthesis in a cyclic manner to amplify a specific target DNA fragment from a pool of DNA.

Replication

Leading strand
The leading strand template is the template strand of the DNA double helix that is oriented in a 3' to 5' manner. All DNA synthesis occurs 5'-3'. The original DNA strand must be read 3'-5' to produce a 5'-3' nascent strand. The leading strand is formed along the leading strand template as a polymerase "reads" the template DNA and continuously adds nucleotides to the 3' end of the elongating strand. This polymerase is DNA polymerase III (DNA Pol III) in prokaryotes and presumably Pol ε in eukaryotes.
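The directionality described above can be made concrete with a short sketch. It assumes the template is given as a plain string written 5' to 3'; the function name `synthesize_nascent` and the example sequence are illustrative only, and the model ignores primers and every enzyme other than the polymerase.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_nascent(template_5to3: str) -> str:
    """Return the nascent strand (written 5'->3') for a template written 5'->3'."""
    nascent = []
    # The polymerase reads the template from its 3' end toward its 5' end,
    # so the template string is walked backwards.
    for base in reversed(template_5to3):
        # Each new nucleotide is added to the 3' end of the growing strand.
        nascent.append(COMPLEMENT[base])
    return "".join(nascent)


template = "AAGCT"  # illustrative sequence, not from the text
print(synthesize_nascent(template))
```

Copying the product once more returns the original template sequence, which is a quick way to check that the two strands are complementary and antiparallel.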

Lagging strand
The lagging strand template is the strand of the DNA double helix that is oriented in a 5' to 3' manner. The newly made lagging strand is still synthesized 5'-3'. However, because the template is oriented in a manner that does not allow continuous synthesis, only small sections can be read at a time. An RNA primer is placed on the template near the replication fork. Just as before, DNA polymerase reads 3'-5' on the original DNA to produce a 5'-3' nascent strand, moving away from the fork. The polymerase stops when it reaches the primer of the previously made section, and a new RNA primer is then placed closer to the advancing fork. The fragments of DNA produced in this way on the lagging strand are called Okazaki fragments. As a result, replication of the lagging strand is more complicated than that of the leading strand. On the lagging strand template, primase "reads" the DNA and adds RNA to it in short, separated segments. In eukaryotes, primase is intrinsic to Pol α. DNA polymerase III (in prokaryotes) or Pol δ (in eukaryotes) lengthens the primed segments, forming Okazaki fragments. In eukaryotes, Pol δ also takes part in primer removal by displacing the primer so that other enzymes can cut it away. In prokaryotes, DNA polymerase I "reads" the fragments, removes the RNA using its 5'-3' exonuclease activity, and replaces the RNA nucleotides with DNA nucleotides (this is necessary because RNA and DNA use slightly different kinds of nucleotides). DNA ligase joins the fragments together.

Okazaki fragment
An Okazaki fragment is a relatively short fragment of DNA (with an RNA primer at the 5' terminus) created on the lagging strand during DNA replication. Okazaki fragments are between 1,000 and 2,000 nucleotides long in E. coli and are generally between 100 and 200 nucleotides long in eukaryotes. They were originally discovered in 1968 by Reiji Okazaki, Tsuneko Okazaki, and their colleagues while studying replication of bacteriophage DNA in Escherichia coli.
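The fragment lengths quoted above give a rough feel for how many Okazaki fragments one round of replication involves. The sketch below is a back-of-the-envelope estimate only. It assumes an E. coli chromosome of about 4.6 million base pairs (a figure not given in the text) and that, with bidirectional replication, the lagging-strand portions add up to one chromosome length of new DNA; the function name is hypothetical.

```python
import math


def okazaki_fragment_range(lagging_nt, min_len, max_len):
    """Return (fewest, most) fragments needed to cover lagging_nt nucleotides."""
    fewest = math.ceil(lagging_nt / max_len)  # all fragments at the long end
    most = math.ceil(lagging_nt / min_len)    # all fragments at the short end
    return fewest, most


# Assumption (not from the text): E. coli chromosome of ~4.6 million base pairs.
ECOLI_GENOME_BP = 4_600_000

# Prokaryotic fragment lengths from the text: 1,000 to 2,000 nucleotides.
print(okazaki_fragment_range(ECOLI_GENOME_BP, 1_000, 2_000))

# Eukaryotic fragment lengths from the text (100 to 200 nucleotides),
# applied to an arbitrary 1,000,000-nucleotide stretch of lagging strand.
print(okazaki_fragment_range(1_000_000, 100, 200))
```

Under these assumptions a single round of E. coli replication needs a few thousand fragments, each of which requires its own primer and a ligation step.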

Classification of DNA polymerase
Based on sequence homology, DNA polymerases are subdivided into seven different families: A, B, C, D, X, Y, and RT.

1. Family A: Polymerases in this family include both replicative and repair polymerases. Replicative members include the extensively studied T7 DNA polymerase, as well as the eukaryotic mitochondrial DNA polymerase γ. Among the repair polymerases are Escherichia coli DNA pol I, Thermus aquaticus pol I, and Bacillus stearothermophilus pol I. These repair polymerases are involved in excision repair and processing of Okazaki fragments generated during lagging strand synthesis.

2. Family B: Members include Pol α, Pol δ, Pol ε, and Pol ζ (zeta). Pol ζ is a complex of the catalytic subunit REV3L with Rev7, which associates with Rev1. In XPV patients, alternative error-prone polymerases such as Pol ζ are thought to be involved in mistakes that result in the cancer predisposition of these patients. DNA polymerases belonging to the B family contain a DTDS motif.

3. Family C: Polymerases in this family are the primary bacterial chromosomal replicative enzymes. The DNA polymerase III alpha subunit from E. coli is the catalytic subunit and possesses no known nuclease activity. A separate subunit, the epsilon subunit, possesses the 3'-5' exonuclease activity used for editing during chromosomal replication. Recent research has classified Family C polymerases as a subcategory of Family X.

4. Family D: Polymerases in this family are still not very well characterized. All known examples are found in the Euryarchaeota subdomain of Archaea and are thought to be replicative polymerases.

5. Family X: Contains the well-known eukaryotic polymerase pol β, as well as other eukaryotic polymerases such as pol σ, pol λ, pol μ, and terminal deoxynucleotidyl transferase (TdT). Pol β is required for short-patch base excision repair, a DNA repair pathway that is essential for repairing abasic sites. Pol λ and Pol μ are involved in non-homologous end-joining, a mechanism for rejoining DNA double-strand breaks. TdT is expressed only in lymphoid tissue, and adds "n nucleotides" to double-strand breaks formed during V(D)J recombination to promote immunological diversity. The yeast Saccharomyces cerevisiae has only one Pol X polymerase, Pol IV, which is involved in non-homologous end-joining.

6. Family Y: Polymerases in this family differ from others in having a low fidelity on undamaged templates and in their ability to replicate through damaged DNA. Members of this family are hence called translesion synthesis (TLS) polymerases. Depending on the lesion, TLS polymerases can bypass the damage in an error-free or error-prone fashion, the latter resulting in elevated mutagenesis. Xeroderma pigmentosum variant (XPV) patients, for instance, have mutations in the gene encoding Pol η (eta), which is error-free for UV lesions. Other members in humans are Pol ι (iota), Pol κ (kappa), and Rev1 (terminal deoxycytidyl transferase). In E. coli, two TLS polymerases, Pol IV (DINB) and Pol V (UmuD'2C), are known.

7. Family RT (reverse transcriptase): The reverse transcriptase family contains examples from both retroviruses and eukaryotic polymerases. The eukaryotic polymerases are usually restricted to telomerases. These polymerases use an RNA template to synthesize the DNA strand.

DNA Replication is semiconservative
The Meselson and Stahl experiment was an experiment by Matthew Meselson and Franklin Stahl in 1958 which supported the hypothesis that DNA replication is semiconservative. Semiconservative replication means that when the double-stranded DNA helix is replicated, each of the two resulting double-stranded DNA helices consists of one strand coming from the original helix and one newly synthesized strand. It has been called "the most beautiful experiment in biology."

Three hypotheses had been previously proposed for the method of replication of DNA.

In the semiconservative hypothesis, proposed by Watson and Crick, the two strands of a DNA molecule separate during replication. Each strand then acts as a template for synthesis of a new strand.

The conservative hypothesis proposed that the entire DNA molecule acted as a template for synthesis of an entirely new one. According to this model, histone proteins bound to the DNA, distorting it in such a way as to expose both strands' bases for hydrogen bonding.

The dispersive hypothesis is exemplified by a model proposed by Max Delbrück, which attempts to solve the problem of unwinding the two strands of the double helix by a mechanism that breaks the DNA backbone every 10 nucleotides or so, untwists the molecule, and attaches the old strand to the end of the newly synthesized one. This would synthesize the DNA in short pieces alternating from one strand to the other.

Each of these three models makes a different prediction about the distribution of the "old" DNA in molecules formed after replication. In the conservative hypothesis, after replication, one molecule is the entirely conserved "old" molecule, and the other is all newly synthesized DNA. The semiconservative hypothesis predicts that each molecule after replication will contain one old and one new strand. The dispersive model predicts that each strand of each new molecule will contain a mixture of old and new DNA.

The semi-conservative theory can be tested by making use of the fact that DNA contains nitrogenous bases. Nitrogen has a heavy isotope, N15 (N14 is the most common isotope). The experiment that confirms the predictions of the semi-conservative theory makes use of this isotope and runs as follows:

Bacteria (E. coli) are grown in a medium containing heavy nitrogen (N15), which is incorporated into their DNA, making it identifiable.

The bacteria are then transferred to a medium containing N14 and left to replicate only once. The new bases will contain N14 while the originals will contain N15.

The DNA is placed in test tubes containing caesium chloride (a heavy compound) and centrifuged at 40,000 revolutions per minute. The caesium chloride sinks toward the bottom of the test tubes, creating a density gradient. The DNA molecules settle at the level matching their own density (taking into account that N15 is denser than N14).

The test tubes are observed under ultraviolet light. DNA appears as a fine layer in the test tubes at different heights according to its density.

According to the semi-conservative theory, after one replication of DNA, we should obtain 2 hybrid (part N14, part N15) molecules from each original molecule of DNA. This would appear as a single line in the test tube. This result would be the same for the dispersive theory. On the other hand, according to the conservative theory, we should obtain one original DNA molecule and a completely new one, i.e. two fine lines in the test tube, separated from each other. Up to this point, either the semi-conservative or the dispersive theory could be correct, as experimental evidence confirmed that only one line appeared after one replication. To decide between the two, the DNA had to be left to replicate again, still in a medium containing N14.

In the dispersive theory, after 2 divisions we should obtain a single line, but further up in the test tube, as the DNA molecules become less dense as N14 becomes more abundant in the molecule. According to the semi-conservative theory, 2 hybrid molecules and 2 fully N14 molecules should be produced, so two fine lines at different heights in the test tubes should be observed. Experimental evidence confirmed that two lines were observed, therefore offering compelling evidence for the semi-conservative theory.
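The band patterns predicted by the three hypotheses can be worked out mechanically. The following is a minimal sketch, not a model of the real experiment: each strand is reduced to a single number, the fraction of its nitrogen that is N15, and a band is identified with the average of a molecule's two strands. The function names and this representation are assumptions made for illustration.

```python
from fractions import Fraction

HEAVY, LIGHT = Fraction(1), Fraction(0)  # all-N15 and all-N14 strands


def replicate(molecules, model):
    """One round of replication in N14 medium under the given hypothesis."""
    daughters = []
    for s1, s2 in molecules:
        if model == "semiconservative":
            # Each old strand pairs with a newly made light strand.
            daughters += [(s1, LIGHT), (s2, LIGHT)]
        elif model == "conservative":
            # The old molecule stays intact; the copy is entirely new.
            daughters += [(s1, s2), (LIGHT, LIGHT)]
        elif model == "dispersive":
            # Old material is spread evenly over all four daughter strands.
            mixed = (s1 + s2) / 4
            daughters += [(mixed, mixed), (mixed, mixed)]
        else:
            raise ValueError(f"unknown model: {model}")
    return daughters


def bands(model, generations):
    """Distinct N15 fractions (one per band) after the given number of rounds."""
    molecules = [(HEAVY, HEAVY)]  # start from fully N15-labelled DNA
    for _ in range(generations):
        molecules = replicate(molecules, model)
    return sorted({(s1 + s2) / 2 for s1, s2 in molecules})


for model in ("semiconservative", "conservative", "dispersive"):
    for generation in (1, 2):
        print(model, generation, [str(b) for b in bands(model, generation)])
```

After one round the semiconservative and dispersive models both give a single half-heavy band, and only the second round separates them: two bands (light and half-heavy) against a single quarter-heavy band.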

Genetic evidence

Independent 'genetic' evidence for the semi-conservative theory was provided more recently by high-throughput genomic sequencing of individual mutagenized bacteria. E. coli were treated with ethyl methanesulfonate (EMS), known to induce G:C → A:T transitions due to generation of the abnormal base O-6-ethylguanine, which is misrecognized during DNA replication and paired with T instead of C. The sequenced DNA from individual colonies of EMS-mutagenized bacteria exhibited long stretches of solely G → A or C → T transitions, which in some cases spanned the entire bacterial genome. The elementary explanation of this observation is based on the semi-conservative mechanism: one should expect the two strands of the mutagenized molecule to segregate into different cells after replication, which leads to each descendant cell having exclusively G → A or C → T conversions.
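The strand-segregation argument can be illustrated with a toy simulation. This is a simplified sketch under stated assumptions: the two strands are kept in position-by-position register (their antiparallel orientation is ignored), every ethylated G mispairs with T, and the sequence, damaged positions, and function names are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Faithful position-by-position copy of a strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def copy_with_ems(template, ethylated):
    """Copy a template in which ethylated G pairs with T instead of C."""
    return "".join(
        "T" if (i in ethylated and base == "G") else COMPLEMENT[base]
        for i, base in enumerate(template)
    )


def lineages(top, ethylated_top, ethylated_bottom):
    """Top-strand sequences fixed in the two lineages after segregation."""
    bottom = complement(top)
    # Round 1: the old strands separate and each templates a new partner.
    new_partner_of_top = copy_with_ems(top, ethylated_top)
    new_partner_of_bottom = copy_with_ems(bottom, ethylated_bottom)
    # The partner of the old top strand is a bottom-type strand; its faithful
    # copy in the next round shows the change on the top strand.
    lineage_from_top = complement(new_partner_of_top)
    # The partner of the old bottom strand is already a top-type strand.
    lineage_from_bottom = new_partner_of_bottom
    return lineage_from_top, lineage_from_bottom


def substitutions(reference, observed):
    """Set of substitution types seen relative to the reference top strand."""
    return {f"{r}>{o}" for r, o in zip(reference, observed) if r != o}


top = "GCGCAT"  # illustrative reference (top strand)
from_top, from_bottom = lineages(top, ethylated_top={0, 2}, ethylated_bottom={1, 3})
print(substitutions(top, from_top), substitutions(top, from_bottom))
```

In this sketch the lineage descending from the old top strand carries only G>A changes and the lineage descending from the old bottom strand carries only C>T changes, which is the pattern the sequencing data showed.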

DNA Replication in Eukaryotes
DNA replication in eukaryotes is much more complicated than in prokaryotes, although there are many similar aspects. Eukaryotic cells can only initiate DNA replication at a specific point in the cell cycle, the beginning of S phase.

DNA replication in eukaryotes occurs only in the S phase of the cell cycle. However, pre-initiation occurs in the G1 phase. Thus, the separation of pre-initiation and activation ensures that the origin can only fire once per cell cycle. Due to the sheer size of chromosomes in eukaryotes, eukaryotic chromosomes contain multiple origins of replication. Some origins are well characterized, such as the autonomously replicating sequences (ARS) of yeast while other eukaryotic origins, particularly those in metazoa, can be found in spans of thousands of basepairs.

Eukaryotic DNA polymerase
There are at least 15 known eukaryotic DNA polymerases:

POLA1, POLA2: Pol α (also called RNA primase): forms a complex with a small catalytic (PriS) and a large noncatalytic (PriL) subunit, with the Pri subunits acting as a primase (synthesizing an RNA primer), and then with DNA Pol α elongating that primer with DNA nucleotides. After around 20 nucleotides[3] elongation is taken over by Pol ε (on the leading strand) and δ (on the lagging strand).

POLB: Pol β: Implicated in repairing DNA, in base excision repair and gap-filling synthesis.

POLG, POLG2: Pol γ: Replicates and repairs mitochondrial DNA and has proofreading 3'->5' exonuclease activity.

POLD1, POLD2, POLD3, POLD4: Pol δ: Highly processive and has proofreading 3'->5' exonuclease activity. Thought to be the main polymerase involved in lagging strand synthesis, though there is still debate about its role.

POLE, POLE2, POLE3: Pol ε: Also highly processive and has proofreading 3'->5' exonuclease activity. Highly related to pol δ, and thought to be the main polymerase involved in leading strand synthesis[5], though there is again still debate about its role.

POLH, POLI, POLK: Pol η, Pol ι, Pol κ, and Rev1 are Y-family DNA polymerases and Pol ζ is a B-family DNA polymerase. These polymerases are involved in the bypass of DNA damage.

There are also other eukaryotic polymerases known, which are not as well characterized:

POLQ: Pol θ

POLL: Pol λ

Pol φ

Pol σ

POLM: Pol μ

None of the eukaryotic polymerases can remove primers (5'->3' exonuclease activity); that function is carried out by other enzymes. Only the polymerases that deal with the elongation (γ, δ and ε) have proofreading ability (3'->5' exonuclease).

Preparation in G1 phase

The first step in DNA replication is the formation of the pre-initiation replication complex (the pre-RC). The formation of this complex occurs in two stages. The first stage requires that there is no CDK activity. This can only occur in early G1. The formation of the pre-RC is known as licensing, but a licensed pre-RC cannot initiate replication in the G1 phase. Current models hold that pre-RC formation begins with the binding of the origin recognition complex (ORC) to the origin. This complex is a hexamer of related proteins and remains bound to the origin, even after DNA replication occurs. Furthermore, ORC is the functional analogue of prokaryotic DnaA. Following the binding of ORC to the origin, Cdc6/Cdc18 and Cdt1 coordinate the loading of the MCM (Mini Chromosome Maintenance) complex to the origin by first binding to ORC and then binding to the MCM complex. The MCM complex is thought to be the major DNA helicase in eukaryotic organisms. Once binding of MCM occurs, a fully licensed pre-RC exists.

Replication takes place in S phase
Activation of the complex occurs in S-phase and requires Cdk2-Cyclin E and Ddk. The activation process begins with the addition of Mcm10 to the pre-RC, which displaces Cdt1. Following this, Ddk phosphorylates Mcm3-7, which activates the helicase. It is believed that ORC and Cdc6/18 are phosphorylated by Cdk2-Cyclin E. Ddk and the Cdk complex then recruit another protein called Cdc45, which in turn recruits all of the DNA replication proteins to the replication fork. At this stage the origin fires and DNA synthesis begins. Activation of a new round of replication is prevented through the actions of the cyclin-dependent kinases and a protein known as geminin. Geminin binds to Cdt1 and sequesters it. It is a periodic protein that first appears in S-phase and is degraded in late M-phase, possibly through the action of the anaphase promoting complex (APC). In addition, phosphorylation of Cdc6/18 prevents it from binding to the ORC (thus inhibiting loading of the MCM complex), while the role of ORC phosphorylation remains unclear. Cells in the G0 stage of the cell cycle are prevented from initiating a round of replication because the Mcm proteins are not expressed.
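The "license in G1, fire in S, no re-licensing until the next G1" logic can be captured in a toy state machine. This is a deliberately coarse sketch: the kinase and geminin states assigned to each phase are simplifying assumptions drawn from the description above, and the `Origin` class and its method names are hypothetical.

```python
class Origin:
    """Toy model of a single replication origin."""

    def __init__(self):
        self.licensed = False
        self.times_fired = 0

    def try_license(self, cdk_active, geminin_present):
        """Assemble the pre-RC; return whether the origin is licensed afterwards."""
        # Licensing needs the absence of CDK activity and free Cdt1
        # (geminin would sequester Cdt1).
        if not cdk_active and not geminin_present:
            self.licensed = True
        return self.licensed

    def try_fire(self, cdk_active, ddk_active):
        """Fire the origin if it is licensed and both kinases are active."""
        if self.licensed and cdk_active and ddk_active:
            self.licensed = False  # the license is used up when the origin fires
            self.times_fired += 1
            return True
        return False


# (phase, CDK active, Ddk active, geminin present): simplified assumptions.
CELL_CYCLE = [
    ("G1", False, False, False),
    ("S", True, True, True),
    ("G2", True, False, True),
    ("M", True, False, True),
]


def run_cycles(origin, n_cycles):
    """Step the origin through n_cycles cell cycles; return how often it fired."""
    for _ in range(n_cycles):
        for _phase, cdk, ddk, geminin in CELL_CYCLE:
            # Two attempts per phase show that a second firing is blocked.
            for _attempt in range(2):
                origin.try_license(cdk, geminin)
                origin.try_fire(cdk, ddk)
    return origin.times_fired


print(run_cycles(Origin(), 3))
```

Because licensing and firing require opposite CDK states, the origin in this sketch fires exactly once per cycle no matter how many attempts are made within a phase.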

At least three different types of eukaryotic DNA polymerases are involved in the replication of DNA in animal cells (Pol α, Pol δ and Pol ε).

Pol α forms a complex with a small catalytic (PriS) and a large noncatalytic (PriL) subunit, with the Pri subunits acting as a primase (synthesizing an RNA primer), and then with DNA Pol α elongating that primer with DNA nucleotides. After around 20 nucleotides elongation is taken over by Pol ε (on the leading strand) and δ (on the lagging strand).

Pol δ: Highly processive and has proofreading 3'->5' exonuclease activity. Thought to be the main polymerase involved in lagging strand synthesis, though there is still debate about its role.

Pol ε: Also highly processive and has proofreading 3'->5' exonuclease activity. Highly related to pol δ, and thought to be the main polymerase involved in leading strand synthesis, though there is again still debate about its role.

DNA Replication in prokaryotes
DNA replication in prokaryotes is extensively studied in E. coli. It is bi-directional and originates at a single origin of replication (OriC).

Primase
In bacteria, primase binds to the DNA helicase forming a complex called the primosome. Primase is activated by DNA helicase where it then synthesizes a short RNA primer approximately 11 ±1 nucleotides long, to which new nucleotides can be added by DNA polymerase.

Primosome
A primosome is a protein complex responsible for creating RNA primers on single-stranded DNA during DNA replication. Primosomes are nucleoprotein assemblies that activate DNA replication forks. Their primary role is to recruit the replicative helicase onto single-stranded DNA. The "replication restart" primosome, defined in Escherichia coli, is involved in the reactivation of arrested replication forks.

Assembly of the Escherichia coli primosome requires six proteins, PriA, PriB, PriC, DnaB, DnaC, and DnaT, acting at a primosome assembly site (pas) on an SSB-coated single-stranded (ss) DNA. Assembly is initiated by interactions of PriA and PriB with ssDNA and the pas. PriC, DnaB, DnaC, and DnaT then act on the PriA-PriB-DNA complex to yield the primosome.

The primosome consists of seven proteins: DnaG primase, DnaB helicase, DnaC helicase assistant, DnaT, PriA, PriB, and PriC. The primosome is utilized once on the leading strand of DNA and repeatedly, initiating each Okazaki fragment, on the lagging DNA strand. Initially the complex formed by PriA, PriB, and PriC binds to DNA. Then the DnaB-DnaC helicase complex attaches along with DnaT. This structure is referred to as the pre-primosome. Finally, DnaG binds to the pre-primosome, forming a complete primosome. The primosome attaches 1-10 RNA nucleotides to the single-stranded DNA, creating a DNA-RNA hybrid. This sequence of RNA is used as a primer to initiate synthesis by DNA polymerase III. The RNA bases are ultimately replaced with DNA bases by RNase H nuclease (eukaryotes) or DNA polymerase I nuclease (prokaryotes). DNA ligase then acts to join the two ends together.

Elongation of DNA strand
Once priming is complete, DNA polymerase III holoenzyme is loaded onto the DNA and replication starts. The catalytic mechanism of DNA polymerase III involves the use of two metal ions in the active site, and a region in the active site that can discriminate between deoxyribonucleotides and ribonucleotides. The metal ions are general divalent cations that help the 3' OH initiate a nucleophilic attack on the alpha phosphate of the deoxyribonucleotide and orient and stabilize the negatively charged triphosphate on the deoxyribonucleotide. Nucleophilic attack by the 3' OH on the alpha phosphate releases pyrophosphate, which is subsequently hydrolyzed (by inorganic pyrophosphatase) into two phosphates. This hydrolysis drives DNA synthesis to completion.

Furthermore, DNA polymerase III must be able to distinguish between correctly paired bases and incorrectly paired bases. This is accomplished by distinguishing Watson-Crick base pairs through the use of an active site pocket that is complementary in shape to the structure of correctly paired nucleotides. This pocket has a tyrosine residue that is able to form van der Waals interactions with the correctly paired nucleotide. In addition, dsDNA (double stranded DNA) in the active site has a wider and shallower minor groove that permits the formation of hydrogen bonds with the third nitrogen of purine bases and the second oxygen of pyrimidine bases. Finally, the active site makes extensive hydrogen bonds with the DNA backbone. These interactions result in the DNA polymerase III closing around a correctly paired base. If a base is inserted and incorrectly paired, these interactions could not occur due to disruptions in hydrogen bonding and van der Waals interactions.

The template DNA is read in the 3' → 5' direction; therefore, the new strand is synthesized in the 5' → 3' direction. However, the two parent strands are antiparallel: one runs 3' → 5' while the other runs 5' → 3'. To solve this, replication of the two strands occurs in opposite directions. Heading towards the replication fork, the leading strand is synthesized in a continuous fashion, only requiring one primer. On the other hand, the lagging strand, heading away from the replication fork, is synthesized in a series of short fragments known as Okazaki fragments, consequently requiring many primers. The RNA primers of Okazaki fragments are subsequently degraded by RNase H and DNA polymerase I (exonuclease), the gaps are filled with deoxyribonucleotides, and the remaining nicks are sealed by the enzyme ligase.
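The discontinuous synthesis of the lagging strand can be sketched in a few lines. The model below is an illustration under simplifying assumptions: the lagging strand template is a string written 5' to 3', the fork exposes it from its 5' end in chunks of a fixed size, and primers are not represented as RNA. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_5to3(template_5to3):
    """New strand (written 5'->3') made by reading the template 3'->5'."""
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3))


def lagging_fragments(template_5to3, fragment_len):
    """Okazaki-style fragments, listed in the order in which they are made."""
    pieces = []
    # The advancing fork exposes the lagging strand template 5' end first.
    for start in range(0, len(template_5to3), fragment_len):
        exposed = template_5to3[start:start + fragment_len]
        # Each exposed stretch is copied back, away from the fork.
        pieces.append(copy_5to3(exposed))
    return pieces


def ligate(fragments):
    """Join fragments into one strand written 5'->3'."""
    # The fragment made last lies at the 5' end of the finished strand.
    return "".join(reversed(fragments))


template = "AAGCTTGCAT"  # illustrative lagging strand template
pieces = lagging_fragments(template, 4)
print(pieces, ligate(pieces))
print("primers needed on the lagging strand:", len(pieces))
```

The ligated fragments give the same product as a single continuous copy of the template, but each fragment needs its own primer, whereas the leading strand needs only one.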

Termination
Termination of DNA replication in E. coli is completed through the use of termination sequences and the Tus protein. Tus is a sequence-specific DNA binding protein that promotes termination in prokaryotic DNA replication. In E. coli, Tus binds to ten closely related 23-base-pair binding sites encoded in the bacterial chromosome. These sites, called Ter sites, are designated TerA, TerB, ..., TerJ. The binding sites are asymmetric, such that when a Tus-Ter complex (Tus protein bound to a Ter site) is encountered by a replication fork from one direction, the complex is dissociated and replication continues (permissive). When encountered from the other direction, however, the Tus-Ter complex provides a much larger kinetic barrier and halts replication (non-permissive). The multiple Ter sites in the chromosome are oriented such that the two oppositely moving replication forks are both stalled in the desired termination region.

DNA damage and its repair
DNA damage, due to environmental factors and normal metabolic processes inside the cell, occurs at a rate of 1,000 to 1,000,000 molecular lesions per cell per day. Even at the upper end of that range, this constitutes only about 0.017% of the human genome's approximately 6 billion bases (3 billion base pairs), but unrepaired lesions in critical genes (such as tumor suppressor genes) can impede a cell's ability to carry out its function and appreciably increase the likelihood of tumor formation.
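The share of the genome affected per day follows directly from the figures in the paragraph above, and a short calculation makes the order of magnitude easy to check. The sketch assumes one lesion affects one base; the function name is illustrative.

```python
GENOME_BASES = 6_000_000_000  # approximately 6 billion bases, from the text


def lesion_percent(lesions_per_day, genome_bases=GENOME_BASES):
    """Percentage of bases hit per day, assuming one lesion affects one base."""
    return 100 * lesions_per_day / genome_bases


low = lesion_percent(1_000)        # lower end of the quoted range
high = lesion_percent(1_000_000)   # upper end of the quoted range
print(f"{low:.7f}% to {high:.4f}% of bases per cell per day")
```

Under this assumption the daily damage spans roughly 0.00002% to 0.02% of all bases, a small fraction that still matters when a lesion falls in a critical gene.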

The vast majority of DNA damage affects the primary structure of the double helix; that is, the bases themselves are chemically modified. These modifications can in turn disrupt the molecules' regular helical structure by introducing non-native chemical bonds or bulky adducts that do not fit in the standard double helix. Unlike proteins and RNA, DNA usually lacks tertiary structure and therefore damage or disturbance does not occur at that level. DNA is, however, supercoiled and wound around "packaging" proteins called histones (in eukaryotes), and both superstructures are vulnerable to the effects of DNA damage.

Types of DNA damage
There are five main types of damage to DNA due to endogenous cellular processes:

oxidation of bases [e.g. 8-oxo-7,8-dihydroguanine (8-oxoG)] and generation of DNA strand interruptions from reactive oxygen species

alkylation of bases (usually methylation), such as formation of 7-methylguanine, 1-methyladenine, and 6-O-methylguanine

hydrolysis of bases, such as deamination, depurination, and depyrimidination

bulky adduct formation (e.g., benzo[a]pyrene diol epoxide-dG adduct)

mismatch of bases, due to errors in DNA replication, in which the wrong DNA base is stitched into place in a newly forming DNA strand, or a DNA base is skipped over or mistakenly inserted

Damage caused by exogenous agents
Damage caused by exogenous agents comes in many forms. Some examples are described below.

UV-B light causes crosslinking between adjacent cytosine and thymine bases creating pyrimidine dimers. This is called direct DNA damage.

UV-A light creates mostly free radicals. The damage caused by free radicals is called indirect DNA damage.

Ionizing radiation such as that created by radioactive decay or in cosmic rays causes breaks in DNA strands. Low-level ionizing radiation may induce irreparable DNA damage (leading to the replicational and transcriptional errors needed for neoplasia, or triggering viral interactions), leading to premature aging and cancer.

Thermal disruption at elevated temperature increases the rate of depurination (loss of purine bases from the DNA backbone) and single-strand breaks. For example, hydrolytic depurination is seen in the thermophilic bacteria, which grow in hot springs at 40-80 °C. The rate of depurination (300 purine residues per genome per generation) is too high in these species to be repaired by normal repair machinery, hence a possibility of an adaptive response cannot be ruled out.

Industrial chemicals such as vinyl chloride and hydrogen peroxide also play a very important role in DNA damage, and environmental chemicals such as polycyclic hydrocarbons found in smoke, soot and tar create a huge diversity of DNA adducts: ethenobases, oxidized bases, alkylated phosphotriesters and crosslinking of DNA, just to name a few. UV damage, alkylation/methylation, X-ray damage and oxidative damage are examples of induced damage. Spontaneous damage can include the loss of a base, deamination, sugar ring puckering and tautomeric shift.

Sources of damage
DNA damage can be subdivided into two main types:

Endogenous damage such as attack by reactive oxygen species produced from normal metabolic byproducts (spontaneous mutation),

especially the process of oxidative deamination

also includes replication errors

Exogenous damage caused by external agents such as

ultraviolet [UV 200-300nm] radiation from the sun

other radiation frequencies, including x-rays and gamma rays

hydrolysis or thermal disruption

certain plant toxins

human-made mutagenic chemicals, especially aromatic compounds that act as DNA intercalating agents

cancer chemotherapy and radiotherapy

viruses

Transition
In molecular biology, a transition is a point mutation that changes a purine nucleotide to another purine (A ↔ G) or a pyrimidine nucleotide to another pyrimidine (C ↔ T). Approximately two out of three single nucleotide polymorphisms (SNPs) are transitions. Transitions can be caused by oxidative deamination and tautomerization. Although there are twice as many possible transversions, transitions appear more often in genomes, possibly due to the molecular mechanisms that generate them. 5-Methylcytosine is more prone to transition than unmethylated cytosine, due to spontaneous deamination. This mechanism is important because it dictates the rarity of CpG islands.

Transversion
In molecular biology, transversion refers to the substitution of a purine for a pyrimidine or vice versa. It can only be reverted by a spontaneous reversion. Because this type of mutation changes the chemical structure dramatically, its consequences tend to be more severe than those of transitions, and transversions are less common. Transversions can be caused by ionizing radiation and alkylating agents.
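The two definitions translate directly into a small classifier, which also confirms the statement that there are twice as many possible transversions as transitions. This is a minimal sketch for single-base DNA substitutions; the function name `classify_substitution` is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to classify")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Count both kinds over all 12 possible single-base substitutions.
counts = {"transition": 0, "transversion": 0}
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            counts[classify_substitution(ref, alt)] += 1
print(counts)
```

Of the 12 possible substitutions, 4 are transitions and 8 are transversions, so the higher observed frequency of transitions is not a matter of there being more ways to produce them.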

Defects in the nucleotide excision repair (NER) mechanism are responsible for several genetic disorders, including:

Xeroderma pigmentosum: hypersensitivity to sunlight/UV, resulting in increased skin cancer incidence and premature aging

Cockayne syndrome: hypersensitivity to UV and chemical agents

Trichothiodystrophy: sensitive skin, brittle hair and nails

Mental retardation often accompanies the latter two disorders, suggesting increased vulnerability of developmental neurons.

Other DNA repair disorders include:

Werner's syndrome: premature aging and retarded growth

Bloom's syndrome: sunlight hypersensitivity, high incidence of malignancies (especially leukemias).

Ataxia telangiectasia: sensitivity to ionizing radiation and some chemical agents

All of the above diseases are often called "segmental progerias" ("accelerated aging diseases") because their victims appear elderly and suffer from aging-related diseases at an abnormally young age, while not manifesting all the symptoms of old age.

Other diseases associated with reduced DNA repair function include Fanconi's anemia, hereditary breast cancer and hereditary colon cancer.

Polymerase chain reaction or PCR
A 1971 paper in the Journal of Molecular Biology by Kleppe and co-workers first described a method using an enzymatic assay to replicate a short DNA template with primers in vitro. However, this early manifestation of the basic PCR principle did not receive much attention, and the invention of the polymerase chain reaction in 1983 is generally credited to Kary Mullis.

At the core of the PCR method is the use of a suitable DNA polymerase able to withstand the high temperatures of >90 °C (194 °F) required for separation of the two DNA strands in the DNA double helix after each replication cycle. The DNA polymerases initially employed for in vitro experiments presaging PCR were unable to withstand these high temperatures. So the early procedures for DNA replication were very inefficient and time consuming, and required large amounts of DNA polymerase and continuous handling throughout the process. The discovery in 1976 of Taq polymerase, a DNA polymerase purified from the thermophilic bacterium Thermus aquaticus, which naturally lives in hot environments (50 to 80 °C, 122 to 176 °F) such as hot springs, paved the way for dramatic improvements of the PCR method. The DNA polymerase isolated from T. aquaticus is stable at high temperatures, remaining active even after DNA denaturation, thus obviating the need to add new DNA polymerase after each cycle. This allowed an automated thermocycler-based process for DNA amplification.

When Mullis developed the PCR in 1983, he was working in Emeryville, California for Cetus Corporation, one of the first biotechnology companies. There, he was responsible for synthesizing short chains of DNA. Mullis has written that he conceived of PCR while cruising along the Pacific Coast Highway one night in his car. He was playing in his mind with a new way of analyzing changes (mutations) in DNA when he realized that he had instead invented a method of amplifying any DNA region through repeated cycles of duplication driven by DNA polymerase.

In Scientific American, Mullis summarized the procedure: "Beginning with a single molecule of the genetic material DNA, the PCR can generate 100 billion similar molecules in an afternoon. The reaction is easy to execute. It requires no more than a test tube, a few simple reagents, and a source of heat." He was awarded the Nobel Prize in Chemistry in 1993 for his invention, seven years after he and his colleagues at Cetus first put his proposal to practice. However, some controversies have remained about the intellectual and practical contributions of other scientists to Mullis' work, and whether he had been the sole inventor of the PCR principle.

PCR

PCR is used to amplify a specific region of a DNA strand (the DNA target). Most PCR methods amplify DNA fragments of up to ~10 kilobase pairs (kb), although some techniques allow for amplification of fragments up to 40 kb in size. A basic PCR setup requires several components and reagents. These components include:

DNA template that contains the DNA region (target) to be amplified.

Two primers that are complementary to the 3' (three prime) ends of each of the sense and anti-sense strand of the DNA target.

Taq polymerase or another DNA polymerase with a temperature optimum at around 70 °C.

Deoxynucleotide triphosphates (dNTPs), the building-blocks from which the DNA polymerase synthesizes a new DNA strand.

Buffer solution, providing a suitable chemical environment for optimum activity and stability of the DNA polymerase.

Divalent cations, magnesium or manganese ions; generally Mg2+ is used, but Mn2+ can be utilized for PCR-mediated DNA mutagenesis, as higher Mn2+ concentration increases the error rate during DNA synthesis.

Monovalent cation potassium ions.

The PCR is commonly carried out in a reaction volume of 10–200 μl in small reaction tubes (0.2–0.5 ml volumes) in a thermal cycler. The thermal cycler heats and cools the reaction tubes to achieve the temperatures required at each step of the reaction (see below). Many modern thermal cyclers make use of the Peltier effect, which permits both heating and cooling of the block holding the PCR tubes simply by reversing the electric current. Thin-walled reaction tubes permit favorable thermal conductivity to allow for rapid thermal equilibration. Most thermal cyclers have heated lids to prevent condensation at the top of the reaction tube. Older thermocyclers lacking a heated lid require a layer of oil on top of the reaction mixture or a ball of wax inside the tube.

Procedure

Figure 1: Schematic drawing of the PCR cycle. (1) Denaturing at 94–96 °C. (2) Annealing at ~65 °C. (3) Elongation at 72 °C. Four cycles are shown here. The blue lines represent the DNA template to which primers (red arrows) anneal that are extended by the DNA polymerase (light green circles), to give shorter DNA products (green lines), which themselves are used as templates as PCR progresses.

Typically, PCR consists of a series of 20–40 repeated temperature changes, called cycles, with each cycle consisting of two or three discrete temperature steps, usually three. The cycling is often preceded by a single temperature step (called hold) at a high temperature (>90 °C), and followed by one hold at the end for final product extension or brief storage. The temperatures used and the length of time they are applied in each cycle depend on a variety of parameters. These include the enzyme used for DNA synthesis, the concentration of divalent ions and dNTPs in the reaction, and the melting temperature (Tm) of the primers.

Initialization step: This step consists of heating the reaction to a temperature of 94–96 °C (or 98 °C if extremely thermostable polymerases are used), which is held for 1–9 minutes. It is required only for DNA polymerases that need heat activation (hot-start PCR).

Denaturation step: This step is the first regular cycling event and consists of heating the reaction to 94–98 °C for 20–30 seconds. It causes melting of the DNA template by disrupting the hydrogen bonds between complementary bases, yielding single-stranded DNA molecules.

Annealing step: The reaction temperature is lowered to 50–65 °C for 20–40 seconds, allowing annealing of the primers to the single-stranded DNA template. Typically the annealing temperature is about 3–5 degrees Celsius below the Tm of the primers used. Stable DNA-DNA hydrogen bonds are only formed when the primer sequence very closely matches the template sequence. The polymerase binds to the primer-template hybrid and begins DNA synthesis.

Extension/elongation step: The temperature at this step depends on the DNA polymerase used; Taq polymerase has its optimum activity temperature at 75–80 °C, and commonly a temperature of 72 °C is used with this enzyme. At this step the DNA polymerase synthesizes a new DNA strand complementary to the DNA template strand by adding dNTPs that are complementary to the template in 5' to 3' direction, condensing the 5'-phosphate group of the dNTPs with the 3'-hydroxyl group at the end of the nascent (extending) DNA strand. The extension time depends both on the DNA polymerase used and on the length of the DNA fragment to be amplified. As a rule of thumb, at its optimum temperature, the DNA polymerase will polymerize a thousand bases per minute. Under optimum conditions, i.e., if there are no limitations due to limiting substrates or reagents, the amount of DNA target is doubled at each extension step, leading to exponential (geometric) amplification of the specific DNA fragment.

Final elongation: This single step is occasionally performed at a temperature of 70–74 °C for 5–15 minutes after the last PCR cycle to ensure that any remaining single-stranded DNA is fully extended.

Final hold: This step at 4–15 °C for an indefinite time may be employed for short-term storage of the reaction.
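The rules of thumb in the procedure (perfect doubling per cycle, about a thousand bases per minute of extension, annealing 3–5 °C below the primer Tm) can be turned into a small calculation. The following Python sketch is only illustrative: the function names are hypothetical, and it assumes the idealized conditions described in the text, which real reactions approach only in the early cycles.

```python
from typing import Tuple


def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies after `cycles` rounds, assuming the target doubles every cycle."""
    return initial_copies * 2 ** cycles


def cycles_needed(initial_copies: int, target_copies: int) -> int:
    """Smallest number of ideal doubling cycles that reaches `target_copies`."""
    cycles = 0
    copies = initial_copies
    while copies < target_copies:
        copies *= 2
        cycles += 1
    return cycles


def extension_seconds(amplicon_bases: int, bases_per_minute: int = 1000) -> float:
    """Extension time from the rule of thumb of ~1000 bases per minute."""
    return 60.0 * amplicon_bases / bases_per_minute


def annealing_range(primer_tm_celsius: float) -> Tuple[float, float]:
    """Annealing window of 3-5 degrees Celsius below the primer Tm."""
    return (primer_tm_celsius - 5.0, primer_tm_celsius - 3.0)


if __name__ == "__main__":
    print(ideal_copies(1, 30))         # copies from one molecule after 30 ideal cycles
    print(cycles_needed(1, 10 ** 11))  # cycles to reach 100 billion copies
    print(extension_seconds(1500))     # seconds for a 1.5 kb fragment
    print(annealing_range(60.0))       # window for a primer pair with Tm = 60 degrees
```

Under these assumptions a single template molecule needs 37 ideal cycles to exceed 100 billion copies, which falls inside the 20–40 cycle range quoted above. In practice reagents become limiting in later cycles, so actual yields fall short of the ideal figure.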

To check whether the PCR generated the anticipated DNA fragment (also sometimes referred to as the amplimer or amplicon), agarose gel electrophoresis is employed for size separation of the PCR products. The sizes of the PCR products are determined by comparison with a DNA ladder (a molecular weight marker), which contains DNA fragments of known size, run on the gel alongside the PCR products.
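Comparison with the ladder is usually done by eye, but it can also be sketched numerically. The Python example below is a minimal illustration, not a method stated in the text: it assumes that the logarithm of fragment size varies roughly linearly with migration distance between two neighbouring ladder bands (a common approximation), and the function name, ladder sizes and distances are hypothetical.

```python
import math
from typing import List, Tuple


def estimate_size_bp(distance_mm: float, ladder: List[Tuple[float, float]]) -> float:
    """Estimate fragment size from migration distance.

    `ladder` holds (size_bp, distance_mm) pairs with distinct distances.
    Assumption: log10(size) is linear in distance between adjacent bands.
    """
    # Sort by migration distance; larger fragments migrate less far.
    bands = sorted(ladder, key=lambda band: band[1])
    for (size_a, dist_a), (size_b, dist_b) in zip(bands, bands[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + fraction * (
                math.log10(size_b) - math.log10(size_a)
            )
            return 10 ** log_size
    raise ValueError("band lies outside the range covered by the ladder")


if __name__ == "__main__":
    # Hypothetical ladder: a 1000 bp band at 20 mm and a 100 bp band at 40 mm.
    example_ladder = [(1000.0, 20.0), (100.0, 40.0)]
    print(round(estimate_size_bp(30.0, example_ladder)))
```

With this made-up ladder, a product band halfway between the two markers is estimated at about 316 bp. The estimate is only as good as the log-linear assumption, so bands outside the ladder range are rejected rather than extrapolated.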