Palindromic Sequencing
A palindromic sequence is a nucleic acid sequence within a double helix of DNA or RNA that reads the same from 5’ to 3’ on one strand as it does from 5’ to 3’ on the other, complementary strand. It is also known as a palindrome or an inverted-reverse sequence.

Base pairing within the double helix is complementary: Adenine (A) pairs with Thymine (T) in DNA or Uracil (U) in RNA, while Cytosine (C) pairs with Guanine (G). If a sequence is palindromic, the nucleotide sequence of one strand is therefore identical to its reverse complement. An example of a palindromic sequence is 5’-GGATCC-3’, whose complementary strand is 3’-CCTAGG-5’. This is the sequence that the restriction endonuclease BamHI binds and cleaves at a specific site. Reading the complementary strand in its own 5’ to 3’ direction gives 5’-GGATCC-3’, which is identical to the first strand, making the sequence palindromic.

Another restriction enzyme, EcoRI, recognizes and cleaves the following palindromic sequence:

5’ – G A A T T C – 3’
3’ – C T T A A G – 5’
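
The strand comparison described above can be sketched in a few lines of Python. This is only a minimal illustration for DNA: the names `COMPLEMENT`, `reverse_complement` and `is_palindromic` are made up for this example, and the third test sequence, GGATCA, is an invented non-palindromic control rather than a sequence taken from the text.

```python
# Watson-Crick pairing for DNA bases: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, written in its own 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq):
    """True if the sequence is identical to its reverse complement."""
    return seq.upper() == reverse_complement(seq)


# BamHI site, EcoRI site, and an invented non-palindromic control.
for site in ("GGATCC", "GAATTC", "GGATCA"):
    print(site, reverse_complement(site), is_palindromic(site))
```

Both recognition sites come back unchanged from `reverse_complement`, whereas the control does not. Note that this differs from an ordinary text palindrome: GGATCC is not the same string when simply reversed, only when reversed and complemented.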

Relationship between Sequence and Protein Structure
Many researchers have studied the relationship between palindromic sequences and protein structure. In a protein, a palindromic sequence (also called a palindromic peptide) is a stretch of amino acids that reads the same from either end. Studies have shown that the frequent appearance of such sequences in proteins is not just due to chance. Scientists suggest that these sequences are important for the structure and function of different proteins, including DNA-binding proteins, ion channels, rhodopsin, metal-binding proteins, and receptors. By comparing palindromes against reference sequences from a database, scientists can try to identify the roles that palindromic sequences play.
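
As a rough sketch of how palindromic stretches might be located in a protein sequence before any database comparison, the following Python function scans a string of one-letter amino acid codes. The function name `find_palindromes`, the minimum length of 4 residues, and the example sequence are all assumptions made for illustration; they are not drawn from the studies mentioned above.

```python
def find_palindromes(protein, min_len=4):
    """Return (start_index, substring) for every palindromic substring
    that is at least min_len residues long.

    For proteins, "palindromic" means the residues read the same forwards
    and backwards; no complement is involved, unlike for DNA.
    """
    hits = []
    n = len(protein)
    for start in range(n):
        for end in range(start + min_len, n + 1):
            piece = protein[start:end]
            if piece == piece[::-1]:
                hits.append((start, piece))
    return hits


# Invented example sequence in one-letter amino acid code.
example = "MKAVLLVAKGT"
for start, piece in find_palindromes(example):
    print(start, piece)
```

This brute-force scan checks every substring, which is adequate for a single protein but would be slow across a whole database; it is meant to show the idea rather than an efficient method.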

Another question under study is whether the symmetry of a palindromic sequence affects the structure and fold of a peptide. One hypothesis is that reversing a sequence yields a fold that is the mirror image of the original fold. A competing conclusion is that, because the original and reversed proteins have identical amino acid compositions and therefore similar hydrophobic-hydrophilic patterns, the reversed sequence adopts the same fold rather than a mirror-image fold. A further hypothesis, guided by research, is that reversing a sequence can change the fold or destroy it altogether. On this view, similarity between a sequence and its reverse does not reflect structural similarity, so such sequences do not necessarily form symmetrical protein structures.

Effect on genomic instability in yeast
Palindromic sequences, which occur in both eukaryotes and prokaryotes, have been tied to genomic rearrangements in different organisms, with the effect depending on the length of the repeated sequence. Short palindromic sequences (shorter than 30 bp) are very stable, while longer sequences are unstable in vivo. Palindromic sequences also increase inter- and intrachromosomal recombination between homologous sequences. In single-stranded DNA, base pairing within a palindromic sequence can produce a hairpin structure. These structures can be substrates for structure-specific nucleases and repair enzymes, whose action can lead to a double-strand break in the DNA. This in turn can lead to loss of genomic material and to meiotic recombination.

Studies in which a 140-bp mutated palindromic sequence was inserted in yeast showed lower postmeiotic segregation and a higher rate of gene conversion, while shorter sequences had the opposite effect. Research also found that the long 140-bp palindromic sequence induces double-strand breaks during meiosis. In the long hairpin structure, the stem-loop is not entirely covered, leaving the loop exposed to a processing endonuclease that nicks it. The nick creates a gap that is repaired using the wild-type strand. This induction of double-strand breaks during meiosis is what causes the genomic instability.

Likelihood of palindromic sequences in proteins
Few studies have focused on the significance of palindromic sequences in proteins, but those that exist tell us a lot about the relationship between palindromic sequences and protein function. By understanding how these sequences form and what properties they have, researchers can tie them to functions. It has been found that decreasing the complexity of the amino acid composition increases the likelihood of a palindromic sequence. A further question is whether the likelihood of palindromic sequences in proteins is due to the frequent formation of alpha helices by palindromes.
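
The link between composition complexity and palindrome likelihood can be illustrated with a simple calculation. The sketch below assumes a deliberately simplified model that is not taken from the studies above: each residue is drawn independently from fixed amino acid frequencies. Under that assumption, a peptide of length n is a palindrome when each of its n // 2 mirrored position pairs matches, and a single pair matches with probability equal to the sum of the squared frequencies. The function name `palindrome_probability` and the two example compositions are hypothetical.

```python
def palindrome_probability(composition, length):
    """Chance that a random peptide of the given length is a palindrome.

    Assumption (not from the source): residues are drawn independently,
    with the frequencies given in `composition` (residue -> probability).
    """
    # Probability that two independently drawn residues are identical.
    match = sum(p * p for p in composition.values())
    # Each of the length // 2 mirrored pairs must match; a middle residue
    # in an odd-length peptide is unconstrained.
    return match ** (length // 2)


# Hypothetical compositions: all 20 amino acids equally likely, versus a
# low-complexity composition dominated by three residues.
uniform20 = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}
low_complexity = {"A": 0.5, "L": 0.3, "K": 0.2}

for name, comp in (("uniform", uniform20), ("low complexity", low_complexity)):
    print(name, palindrome_probability(comp, 6))
```

For a six-residue peptide, the uniform composition gives a probability of roughly 0.000125, while the low-complexity composition gives roughly 0.055, several hundred times higher. Real proteins do not follow an independent-residue model, so this only shows the direction of the effect, not its actual size.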