Wednesday, November 13, 2013

Covering All Our Bases

Biology concepts – nucleoside, tRNA, RNA editing, nonstandard bases, DNA oxidation

Specialized pieces are needed to best build special Lincoln
Log structures, like this castle. This is much like how
specialized nucleosides are needed to carry out special
functions of RNAs. Really – a log castle? Wouldn’t the
Black Knight just burn it?
Last week, we used Lincoln Logs as a model for the different nucleic acids. The small logs mean little until you put them together in an order that makes something – a cabin, for example. This week we can take the analogy a little further.

Some editions of Lincoln Logs have specialized pieces for building special buildings. These buildings have different purposes, like a sawmill or a bank, and the specialized pieces help them carry out their function of being that building.

Lo and behold, there are special building blocks for building specialized nucleic acid structures; usually these are RNAs for which the usual building blocks just won’t do. These are the exceptions to the nucleotide rules of A, C, G, and T for DNA and A, C, G, and U for RNA.

There are a few nonstandard nucleotides found in DNA molecules, but to date all of these have turned out to be damaged bases. Oxidized guanine bases have been the most commonly identified lesions, because guanine is more susceptible to oxidation than the other bases. However, a recent study has identified a putative 6-oxothymine in the placental DNA of a smoker.

More than 20 oxidized DNA bases have been found at one time or another. Their importance lies in their inability to direct correct base pairing in a replicating DNA or a transcribed RNA. In particular, 8-oxoguanosine in a DNA molecule often base pairs with A instead of C, while a free 8-oxoguanosine nucleotide (damaged before it is incorporated into a DNA) will often be put in where a T should rightfully have been placed.
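
To see why that mispairing matters, here is a minimal sketch of two rounds of copying. It is a toy model, not real replication: the letter "O" is a made-up symbol for an 8-oxoguanine, the function name is hypothetical, strand direction is ignored, and the damaged base is assumed to pair with A every time (in a cell it can also pair correctly with C).

```python
# Toy model: "O" is a made-up symbol for an 8-oxoguanine in a DNA strand.
NORMAL_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def copy_strand(template, oxo_pairs_with="A"):
    """Return the base-by-base partner strand (strand direction is ignored).

    Assumption: the damaged base always pairs with `oxo_pairs_with`.
    """
    pairing = dict(NORMAL_PAIRS, O=oxo_pairs_with)
    return "".join(pairing[base] for base in template)

damaged = "ATOC"                        # was ATGC before the G was oxidized
first_copy = copy_strand(damaged)       # the new strand gets an A opposite the damage
second_copy = copy_strand(first_copy)   # copying that strand fixes the change in place

print(first_copy)    # TAAG
print(second_copy)   # ATTC, where the original read ATGC
```

After two rounds the original G has quietly become a T, and nothing in the new duplex looks damaged anymore.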

Both of these problems would lead to mistakes in replication or transcription. Some of these mistakes could be in places that matter. If they change a codon, they might cause the wrong amino acid to be incorporated and the resulting protein might be nonfunctional. Or they could create or destroy a stop codon or a splice site. These would definitely alter the resulting protein. Mistakes like this spell disease or cancer.

The top left image shows how 8-oxoguanine is produced by
oxidative damage or radiation. The bottom left shows its
effects on DNA. There can be a mismatched base pairing
between G and A instead of G and C when the G is damaged.
One possible result is shown on the right. Huntington’s
disease may involve the mismatching of unrepaired
8-oxoguanosines with adenosines. As a result, areas of the
brain are lost and the fluid-filled ventricles are enlarged.

Oxoguanosine has been the most studied of the oxidized bases, and several diseases have been linked to this mutation. Many cancers have shown this mutation – leukemias, breast cancer, colorectal cancer, etc. But in addition, things like Parkinson’s disease, Huntington’s disease, Lou Gehrig’s disease (ALS), and cystic fibrosis have been correlated with 8-oxoguanosine.

Don’t make the mistake of assuming that an 8-oxoguanosine is the cause of any or all of these diseases; most have many potential causes. The point is that this mutation may contribute to these diseases in some cases, so the goal is to find out how to better prevent or repair it. However, your body is pretty good at doing this itself – if everything is behaving normally.

There are specific repair pathways dedicated to removing and replacing oxidized bases (base excision repair or BER) or for nucleotides that contain oxidized bases (nucleotide excision repair or NER) in DNA. In RNA, the major process to deal with 8-oxoguanosine is to destroy the damaged RNA. There are actually several overlapping and redundant repair pathways for 8-oxoguanosine, suggesting that this mutation is particularly damaging and must be dealt with for proper cell function.

It is when the body’s sensing and repair mechanisms don’t work that the problems begin. Therefore, science needs to find better ways to tell when the natural processes aren’t working and develop artificial ways to reverse the damage. A 2013 review points the way toward detecting mutated guanines in bodily fluids and tissues.

Specifically, this study looked at methods of detecting 8-oxoguanosine levels in plasma, urine, and cerebrospinal fluid and what those changes might mean. The levels found represent a balance between the production and repair of the mutations, so an increase means that more mistakes are being made, or fewer are being repaired. Either way, it means that something must be done.

This is a cartoon showing RNA processing. IT IS NOT TO BE
CONFUSED WITH RNA EDITING!! In processing of eukaryotic
mRNAs, the front end (5’ terminus) is capped so it will last
longer. Then the end is augmented with a bunch of A’s, called
the poly-A tail. Finally, the introns are removed and the
exons (the parts that code for a protein) end up in a
continuous sequence.
But what about nonstandard bases that are actually supposed to be in nucleic acids? The vast majority of these are found in the RNAs and help to point out yet another exception. You think that the RNA transcribed from DNA is the same RNA that functions or is translated to protein? Not always.

RNA editing takes place all the time, where RNA bases are changed after the RNA is transcribed from DNA. In the majority of cases, the RNA editing changes one standard nucleoside to another standard nucleoside, or adds or removes nucleotides.

Insertion/deletion edits for uracils can increase or decrease the length of the transcript. The mRNA is paired with a guide RNA (gRNA) and base-pairing takes place. For insertion, when there is a mismatch between the mRNA and the gRNA, the editosome inserts a U, so the mRNA transcript gets longer. In deletion editing, if there is an unpaired U in the mRNA, it gets cut out, so the transcript gets shorter.

This was first discovered in a parasite called Trypanosoma brucei, the causative agent of African Sleeping Sickness. There are so many positions at which these insertions/deletions take place that it has come to be known as pan-editing.

In other cases, the editing takes the form of C being replaced by a U. In some cases this results in a protein sequence different than that coded for by the DNA - on purpose!! If that isn’t an exception, I don’t know what is. Other times, the changing of a C to a U creates a stop codon.

In the human apolipoprotein B transcript, the intestinal version undergoes the C to U editing and creates a stop codon, so the resulting apolipoprotein B is only about 48% the size of the full-length protein (B48). In the liver, no editing takes place, so the protein is much larger (B100).
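
A single-letter change is all it takes to make a stop codon, and a few lines of code can show it. This is only a sketch: the function name is made up, and the choice of CAA (a glutamine codon) as the codon being edited is my assumption based on how apolipoprotein B editing is usually described, not something worked out above.

```python
# The three stop codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}

def c_to_u(codon, position):
    """Return the codon with the C at `position` (0, 1, or 2) changed to U."""
    if codon[position] != "C":
        raise ValueError("C to U editing needs a C at that position")
    return codon[:position] + "U" + codon[position + 1:]

unedited = "CAA"               # assumption: a glutamine codon in the liver transcript
edited = c_to_u(unedited, 0)   # the intestinal, edited version

print(edited, edited in STOP_CODONS)   # UAA True
```

The ribosome reading the edited message stops right there, which is why the intestinal protein comes out so much shorter.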

Here are two examples of RNA editing. The top image
shows the insertion/deletion mechanism, where a guide
RNA binds to the mRNA and where there are mismatches
a U is inserted and where there are unmatched U’s, they
are removed. The bottom example is an example where
a base is changed, and this changes the codon, so a
different amino acid is inserted when translated.
There is a lot of C to U editing in plants – I mean, a lot. So much editing goes on that there is now a 2013 database and algorithm to do nothing but predict C to U and U to C edits. Yes, there are U to C edits as well, but only in plant mitochondria and plastids. As far as is known, U to C edits work to destroy stop codons.

Then there is A to I editing. Wait you say, there’s no I in nucleic acids (well, there are actually two “i”s, but you know what I mean). “I” stands for inosine, the first specialized Lincoln Log and our first nonstandard nucleoside. Adenosine (A) is deaminated to form an inosine (I).

There are many functions for inosine editing. Changes from A to I in mRNA alter the protein made since the inosines get read as G’s. Genomically coded A’s end up being read as G’s in the mRNA, and this changes the gene product! We have many more inosine changes than other primates do. Many of these A to I edits in humans are related to brain development and may be a big reason why we are smarter than chimps.
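
Here is a small sketch of how an A to I edit changes what the ribosome reads. The function names are hypothetical, the lookup table holds only the two codons needed, and the glutamine-to-arginine codon is simply an example picked for illustration.

```python
# Tiny lookup with just the two codons used in this example.
MINI_CODE = {"CAG": "glutamine", "CGG": "arginine"}

def edit_a_to_i(mrna, index):
    """Deaminate the adenosine at `index` to inosine (written here as 'I')."""
    if mrna[index] != "A":
        raise ValueError("A to I editing needs an A at that position")
    return mrna[:index] + "I" + mrna[index + 1:]

def as_read_by_ribosome(mrna):
    """Inosine gets read as if it were guanosine."""
    return mrna.replace("I", "G")

genomic_codon = "CAG"
edited_codon = edit_a_to_i(genomic_codon, 1)    # "CIG"
decoded = as_read_by_ribosome(edited_codon)     # "CGG"

print(MINI_CODE[genomic_codon], "->", MINI_CODE[decoded])   # glutamine -> arginine
```

The DNA never changed, yet the protein gets a different amino acid at that spot.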

There is also A to I editing in regulatory RNAs called miRNAs (micro RNA). The miRNAs suppress (prevent) translation of some transcripts, but editing of the pre-miRNA makes it bind less well to protein complexes that process the pre- to mature miRNA. More editing means less binding of miRNAs, which leads to decreased regulation, more transcript translation, and increased protein. This may be one way A to I editing increases human brain power.

Micro RNA is important for controlling the amount of a
transcript that will be translated to protein. The miRNA
can be edited, which will change the amount that is
processed by the protein complex, and therefore changes
the amount that is incorporated into the complex
that will degrade mRNAs.
The search is on to discover what regulates which A’s get turned to I’s in several types of RNAs; this set of edits is called the inosome (like genome). The inosome is yet another code we haven’t figured out yet. But inosine doesn’t have to be in a nucleic acid to have an effect. Sometimes it functions just by itself.

Inosine and adenosine accumulate extracellularly during hypoxia/ischaemia (lack of oxygen or blood flow) in the brain and may act as neuroprotectants. A new study extends this protective action to the spinal cord in rats in a hypoxic environment. To characterize hypoxia-evoked A and I accumulation, they examined the effect of hypoxia on the extracellular levels of adenosine and inosine in isolated spinal cords from rats. "Isolated" means the rats and their spinal cords were not necessarily in the same room at the time - so it could be a while before this helps humans.

But perhaps the most common use for I is to alter tRNA binding to amino acids and to the target codons. A to I editing can occur in the anticodon, and change which amino acid is placed in the growing peptide. This is especially true in many organisms for the amino acid isoleucine. Many tRNAs will insert an isoleucine into the protein only when the anticodon of the tRNA has been edited to contain an I in the first position (equivalent to the wobble position of the mRNA codon).

This menacing creature is a worm that lives at the bottom
of the ocean in the Sea of Cortez. It thrives in the methane
ice on the ocean floor, making it a psychrophile. It can’t
even survive or reproduce if kept above freezing.
What is more, there are other nonstandard nucleosides that serve similar functions, usually with isoleucine or methionine amino acids. Agmatidine is present in many archaeal anticodons and codes for isoleucine. Agmatidine is also present at other points in the tRNA for isoleucine and is important for adding the isoleucine amino acid to the tRNA.

Other nonstandard (modified) nucleosides also work in tRNAs. Lysidine, dihydrouridine, and pseudouridine are some of the more common specialized Lincoln Logs – or maybe we should stick to calling them nonstandard nucleosides. They can be found in the tRNAs of organisms from each of the three domains of life (archaea, bacteria, and eukaryotes). For example, psychrophiles – organisms that grow at very low temperatures – have 70% more dihydrouridines because they help the tRNAs to flex as they need to, even at subfreezing temperatures.

Found mostly in tRNAs, but not exclusively in tRNAs, there are over 100 non-standard nucleosides. Many times they function to increase tRNA binding to transcripts via the anticodon-codon, or increase the binding of the amino acid to the tRNA. They ultimately work to increase translation efficiency. They are weird and are exceptions, but we can’t live without them.

Next week we can spend some time talking about exceptions in the realm of lipids, the last of our four biomolecules.

Paz-Yaacov N, Levanon EY, Nevo E, Kinar Y, Harmelin A, Jacob-Hirsch J, Amariglio N, Eisenberg E, & Rechavi G (2010). Adenosine-to-inosine RNA editing shapes transcriptome diversity in primates. Proceedings of the National Academy of Sciences of the United States of America, 107 (27), 12174-9 PMID: 20566853

Takahashi T, Otsuguro K, Ohta T, & Ito S (2010). Adenosine and inosine release during hypoxia in the isolated spinal cord of neonatal rats. British journal of pharmacology, 161 (8), 1806-16 PMID: 20735412

Lenz H, & Knoop V (2013). PREPACT 2.0: Predicting C-to-U and U-to-C RNA Editing in Organelle Genome Sequences with Multiple References and Curated RNA Editing Annotation. Bioinformatics and biology insights, 7, 1-19 PMID: 23362369

Poulsen HE, Nadal LL, Broedbaek K, Nielsen PE, & Weimann A (2013). Detection and interpretation of 8-oxodG and 8-oxoGua in urine, plasma and cerebrospinal fluid. Biochimica et biophysica acta PMID: 23791936

Wang P, Fisher D, Rao A, & Giese RW (2012). Nontargeted nucleotide analysis based on benzoylhistamine labeling-MALDI-TOF/TOF-MS: discovery of putative 6-oxo-thymine in DNA. Analytical chemistry, 84 (8), 3811-9 PMID: 22409256


Wednesday, November 6, 2013

Rewriting the Genetic Code

Biology concepts – DNA, RNA, tRNA, nonstandard nucleotides, codon, anticodon, genetic code, selenocysteine, isodecoder, mitochondria

Just looking at the Imperial Hotel in Tokyo doesn’t really give us an idea
of why it inspired Frank Lloyd Wright’s son to invent Lincoln Logs.
 It was the interlocking beams of the basement in which his vision was
born. They were supposed to protect the hotel from earthquake
damage. It worked. In 1923, the same year the hotel was finished,
there was a great earthquake in Tokyo and the Imperial was one of
the few buildings that survived. It also survived the bombings of
WWII. So they tore it down in 1968.
In 1916 John Lloyd Wright invented Lincoln Logs. The construction set was based on his memory of the Imperial Hotel in Tokyo, an edifice designed by his father, Frank Lloyd Wright. The construction set had specific pieces that fit together in a specific way.

The first edition of Lincoln Logs, sold in 1918, gave instructions for building Abraham Lincoln’s boyhood home and Uncle Tom’s cabin. The parts were suited to building those structures. Each set of instructions called for the small pieces to be put together in a certain order so that the resulting product conferred a meaning – this is where Lincoln grew up or this is where Tom lived.

DNA and RNA are similarly constructed. There are a few pieces (nucleotides A, C, G, T, and U) that can be used to build different structures. Each small piece can be joined with other small pieces to become part of the whole structure, a structure with meaning. In the case of DNA and mRNA, three nucleotides in a row can confer meaning for one protein building block. The entire series of nucleotides then has the meaning of an entire protein.

The three nucleotide codons relate to a certain amino acid building block to be inserted into a growing protein. This code, the genetic code, gives meaning to the string of DNA nucleotides in genes and the string of nucleotides in the mRNA transcribed from the gene. This is usually where our learning about nucleic acids ends.

The top left picture is Marshall Nirenberg, the initial decoder of the
genetic code. The right photo is Robert Holley, discoverer of tRNA.
Below is the genetic code in graphic style. The four large letters
represent the possible first bases of a codon (in mRNA). The light
yellow letters are the possible 2nd bases, and the darker yellow letters
are the final possibilities. Outside are the amino acids that are coded
by the individual codons. Note that most have more than one codon
and some have codons that begin with different letters, like serine
at 1:00 and 8:30.
The history of the genetic code is worth knowing, as is the history of about every part of science. I often use history to illustrate points in the blog. It is said that those who ignore history are doomed to repeat it, and science has its own version of this axiom, “Six months in the lab can save you a whole afternoon in the library.” Think about it. And besides saving you from repeating others' work, knowing history helps you ask better questions.

But I digress – let’s talk briefly about how we decoded the pathway of gene to protein. It begins with Watson and Crick publishing the structure of DNA in 1953. We knew how the different bases could be ordered, but we still didn’t know how they called for a specific amino acid sequence.

In 1955, Francis Crick thought he had an idea about how it might occur, but he didn’t have all the players. He called his idea the Adapter Hypothesis. What he was missing was the adapter, the piece that he said carried amino acids and put them in the correct order.

One neat trick came from George Gamow, a nuclear physicist best known for his role in theorizing the Big Bang (the birth of elements from a cosmic explosion, not the TV show). We had four nucleotides to encode information and 20 (you and I know there are 22) amino acids to be coded for. He used some “way beyond me” math to determine that the most efficient mechanism would have three nucleotides code for one amino acid.
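
The counting part of that argument is simple enough to sketch in code. To be clear, this is just the basic counting idea and is not meant to reproduce Gamow's actual math; the function name is made up for this example.

```python
def shortest_codon_length(n_letters, n_meanings):
    """Smallest word length that gives at least `n_meanings` distinct words."""
    length = 1
    while n_letters ** length < n_meanings:
        length += 1
    return length

# 4 letters: 1-letter words give 4 options, 2-letter words give 16, 3-letter words give 64.
print(shortest_codon_length(4, 20))   # 3
print(shortest_codon_length(4, 22))   # 3, so the two extra amino acids still fit
```

Two-letter codons fall short of 20, and three-letter codons leave plenty of room to spare.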

This was followed by an interesting experiment done by Marshall Nirenberg at the National Institutes of Health near Washington, DC. He made a synthetic RNA of a single nucleotide (UUUUU….). He then combined this with the innards of a bunch of cells (cell lysate) so that everything needed to make a protein would be present. He detected a peptide of phenylalanine amino acids. What is more, there were 1/3 as many amino acids as there were nucleotides!

So UUU coded for phenylalanine. This was followed by many more experiments using different sequences of nucleotides, and the code was decoded. Along with this knowledge came the discovery of tRNA by Robert Holley in 1965. This RNA combined an anticodon sequence to recognize a codon on mRNA and carried the appropriate amino acid at the other end. The tRNA was Crick’s adapter, and perhaps the code would have been discovered years earlier if the adapter had been pursued in earnest.

The process of turning an mRNA into a protein involves the ribosome and
the tRNAs. When an mRNA is bound to a ribosome, the three nucleotides
in the codon (pink letters) match with three letters of a tRNA anticodon
(blue letters). Different tRNAs will drift in and out until the right one is
bound. The tRNA has the amino acid (aa) bound to the end opposite the
anticodon. If this is the first position of the peptide, it will occur in the P
(peptidyl) site. The second tRNA will be added to the A (acceptor) site and
the ribosome will shift as it creates a peptide bond between the two aa’s.
The shift puts the first aa in the E (exit) site to release the tRNA, the 2nd aa
goes to the P site, and the A site is open for the next tRNA.
There are 64 possible codons that can be made from four nucleotides (4 x 4 x 4), but Holley found fewer than 64 tRNAs, so there could not be one for each codon. Even I know that this kind of math doesn’t work. It turned out that the genetic code was degenerate; more than one codon calls for a particular amino acid. Most amino acids have 2-4 codons assigned to them (we have talked about the exceptions to that rule).
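
The 4 x 4 x 4 count and the degeneracy can both be checked with a short sketch. The 64-letter string is the standard genetic code in one-letter amino acid abbreviations, written in the conventional U, C, A, G order with "*" for stop; the variable names are just for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order ("*" means stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
code = dict(zip(codons, AMINO_ACIDS))

print(len(codons))       # 64
print(code["UUU"])       # F (phenylalanine)

# How many codons call for each amino acid (or for stop)?
codons_per_meaning = Counter(code.values())
print(codons_per_meaning["L"], codons_per_meaning["M"], codons_per_meaning["*"])   # 6 1 3
```

Sixty-four codons share only 20 amino acids plus stop, so most amino acids have to own more than one codon.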

In most cases, codons that call for the same amino acid have the same first two nucleotides; it’s the third position (wobble position) that varies. It was discovered that the third position of the codon pairs with the anticodon very loosely, so the codon/anticodon binding is usually determined by the first two nucleotides. This allows a single tRNA to recognize more than one codon.

It turns out that there are 40-55 different tRNAs, depending on the organism. Why so many? As an example, arginine is coded for by several codons (CGG, CGA, CGC, CGU, AGA, and AGG). It is impossible for one tRNA to recognize both AGA and CGG, so there must be more than one tRNA for arginine.

Serine and leucine are like this as well, and there are most certainly some amino acids whose tRNAs can’t bind to all four possible nucleotides in the wobble position (like glycine), so they would need more than one tRNA. These are the isoacceptor tRNAs (different anticodons, but carrying the same amino acid).
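
A quick way to find the amino acids that cannot possibly get by with one tRNA is to group their codons by the first two nucleotides. This sketch uses the standard genetic code table (one-letter abbreviations, "*" for stop); the variable names are made up for this example.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order ("*" means stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
code = dict(zip(("".join(t) for t in product(BASES, repeat=3)), AMINO_ACIDS))

# For each amino acid, collect the distinct "first two nucleotides" of its codons.
families = defaultdict(set)
for codon, amino_acid in code.items():
    families[amino_acid].add(codon[:2])

# Amino acids whose codons differ in more than just the wobble position.
split_amino_acids = sorted(aa for aa, starts in families.items()
                           if len(starts) > 1 and aa != "*")

print(split_amino_acids)                                   # ['L', 'R', 'S']
print(sorted(c for c, aa in code.items() if aa == "R"))    # the six arginine codons
```

Leucine, arginine, and serine are the three whose codons start in two different ways, so wobble alone can never cover them.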

There are also different isodecoder tRNA genes, which have different sequences outside the anticodon but code for the same amino acid. Humans have about 274 genes for our 55 different tRNAs. This implies that the different sequences might have some functions other than just helping to add the right amino acid to a growing peptide sequence.

A 2010 minireview talked about those possible tRNA functions. In one discussed study, a cleaved tRNA is shown to have increased expression when cells are proliferating. Reducing the levels of this cleavage product reduced the rate of cell division. In another study, a tRNA cleavage product silenced the expression of a specific gene. I’ve said it before: nature abhors a unitasker.

UGA, UAA, and UAG are the most common stop codons (see the text for the
exceptions). When the stop codon ends up in the A site, no tRNA fits properly,
but a release factor (RF) can be bound. There are at least two RFs: RF-1
recognizes UAA and UAG; RF-2 recognizes UAA and UGA. When bound they
cause the ribosome to fall apart.

There are also three codons that don’t code for an amino acid. These are the stop codons that tell the ribosome to stop making the protein and release it.

So we have coding codons and noncoding codons. Experiments in other organisms in the 1960s and 1970s indicated that all life uses the same genetic code, making it the universal genetic code. And here begin the exceptions.

The genetic code is almost universal. Considering how many genes from how many organisms there are, the number of exceptions is relatively low. But they are still too numerous for us to talk about them all. That doesn’t mean we shouldn’t talk about a few of the most interesting.

Mitochondria are the source of many of the exceptions. The endosymbiotic theory states that a bacterium was engulfed by an archaeon and they agreed to allow each other to do what they do best. These engulfed bacteria became mitochondria and chloroplasts. But they didn’t always follow the same path.

We are finding that tRNAs can have multiple functions. 1- This is the
usual route, the tRNA codes for an amino acid in a growing peptide.
2- Some tRNAs code for the same amino acid, but have
differences in structure. The change in structure means they don’t
bind the amino acid, so they are free to do other things. 3 and 4- These
non-aa bound tRNAs may be used for regulating expression of specific
genes, usually at the end of the gene. 5- There are probably functions
we don’t know yet.
Remember that mitochondria have their own genomes and machinery for transcribing DNA to mRNA, and translating mRNA to protein. This includes their own set of tRNAs. Since they are all packaged in a closed system, there is no demand for mitochondria to use the same genetic code as nuclear genes. And in many cases, they don’t.

In animal and protist mitochondria, but not plants, the stop codon UGA instead codes for the amino acid tryptophan. You’d think that this would leave them with just two possible stop codons, and some do. But in vertebrates, the codons AGA and AGG (which usually code for arginine) have been converted to stop codons. So we actually have four mitochondrial stop codons.

Furthermore, animal mitochondria have switched up another codon; AUA codes for methionine instead of isoleucine. In yeast mitochondria, all the CU_ codons code for threonine instead of leucine. Again I ask… why? I ask that a lot. Not so much why the genetic code has changed in mitochondria, but why it hasn’t in plants.  You tackle that one on your own.
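
To see what those reassignments do to a message, here is a sketch that translates the same RNA with the standard code and with the vertebrate mitochondrial changes mentioned above (UGA to tryptophan, AGA and AGG to stop, AUA to methionine). The `translate` function and the short message are made up for this example, and the translation simply starts at the first letter.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order ("*" means stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = dict(zip(("".join(t) for t in product(BASES, repeat=3)), AMINO_ACIDS))

# Vertebrate mitochondrial differences described in the text.
VERTEBRATE_MITO = dict(STANDARD, UGA="W", AGA="*", AGG="*", AUA="M")

def translate(mrna, code):
    """Read codons from the first letter and stop at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

message = "AUGAUAUGAAGAUUU"   # hypothetical message: AUG AUA UGA AGA UUU

print(translate(message, STANDARD))          # MI
print(translate(message, VERTEBRATE_MITO))   # MMW
```

Same letters, two different proteins, which is why a mitochondrial gene would not read correctly out in the cytoplasm.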

Nuclear genes have far fewer exceptions to the universality of the genetic code. A protist or two have converted two stop codons to code for glutamine, and the bacterium Mycoplasma capricolum has converted the stop codon UGA to a tryptophan codon. Beyond that, we have a couple of exceptions we have already discussed a bit, selenocysteine and pyrrolysine.

The interesting story is selenocysteine (SeC). We said that it is coded for by a stop codon plus a special stem/loop structure downstream called the SECIS structure. This makes it the 21st amino acid. If it is coded for, even indirectly, it’s going to need a tRNA. In this case, a serine tRNA is modified in a two-step process to carry a SeC.

These are two Euplotes crassus organisms, marine ciliate protists,
undergoing sexual reproduction. They are
interesting for many reasons, but one is that they use a slight
variation of the genetic code, and the other reason has to do with
something called a frameshift. The codons are read in 3’s, but some
genes in E. crassus require a shift in the reading frame to produce
the correct protein. This means that they go along a 3, 3, 3, 3, then
the ribosome has to move 1 nucleotide over, and then it starts
reading 3, 3, 3, 3 again. The one nucleotide doesn’t code for
anything, but must be there to change the reading frame.
A recent paper identified that the stop codon UGA in Euplotes crassus codes for both Sec and cysteine. Which one gets put into the growing peptide is based on how far the site is from the SECIS structure.

The same group has a new paper that says humans can also end up with cysteine in the Sec site (originally a UGA stop codon). How can these two examples of cysteine in a Sec site take place, especially since the cysteine and SeC tRNAs are completely different?!

It turns out that it's the levels of selenium and of a molecule called thiophosphate (SPO3) that are important for converting other amino acids to cysteine. In some cases, the serine tRNA can be made into a cysteine tRNA instead of a SeC tRNA. So here we have a case of a UGA stop codon converted to a Sec codon then converted to a cysteine codon. Exceptional.

Next week, we can finish up nucleic acid exceptions. Do you think A, G, C, T, and U are it when describing nucleotides? Not even close.

Xu XM, Turanov AA, Carlson BA, Yoo MH, Everley RA, Nandakumar R, Sorokina I, Gygi SP, Gladyshev VN, & Hatfield DL (2010). Targeted insertion of cysteine by decoding UGA codons with mammalian selenocysteine machinery. Proceedings of the National Academy of Sciences of the United States of America, 107 (50), 21430-4 PMID: 21115847

Thoru Pederson (2010). Regulatory RNAs derived from transfer RNA? RNA DOI: 10.1261/rna.2266510
