Tests aimed at pushing the limits of single-cell DNA sequencing have turned up something unexpected. A microbe living in a pond in Oxford University Parks appears to read the genetic code in a way scientists have never seen before.
Dr. Jamie McGowan, a postdoctoral researcher at the Earlham Institute, was studying the genomes of freshwater protists. The goal was practical: the researchers wanted to test a DNA sequencing pipeline that could process very small amounts of DNA, including DNA from single cells.
Instead, the team discovered an unexpected genetic outlier. An organism identified as Oligohymenophorea sp. PL0344 turned out to be a previously unknown species with a rare change in the way it reads the instructions in its DNA and builds proteins. The study, published in PLOS Genetics, reported that two codons that normally act as stop signals at the end of a gene had been reassigned to two different amino acids, a combination not previously reported, the researchers said.
“It was sheer luck that we chose this protist to test our sequencing pipeline. This just shows what’s out there and highlights how little we know about protist genetics.”
A tiny creature with a big genetic surprise
Protists are so diverse that they are difficult to define clearly. Many are microscopic, single-celled organisms such as amoebas, algae, and diatoms. Others are larger and multicellular, such as kelp, slime molds, and red algae.
“Protists are loosely defined and are essentially eukaryotes that are not animals, plants, or fungi,” McGowan says. “This is obviously very general, because protists are a very variable group.
“Some are more like animals, others more like plants. There are hunters and prey, parasites and hosts, swimmers and sitters, some have varied diets, and some photosynthesize. Basically, there are very few generalizations possible.”
Oligohymenophorea sp. PL0344 belongs to a group called ciliates. These swimming protists are microscopic and can be found in many aquatic environments. Ciliates are of particular interest to geneticists because they are known to be hot spots for changes in the genetic code, including changes involving stop codons.
When the meaning of genetic stop signs changes
In most organisms, three stop codons tell the cell where a gene ends: TAA, TAG, and TGA. These act like punctuation marks in the genetic instructions, telling the cell when to stop building a protein.
The genetic code is usually described as nearly universal because most organisms use the same basic rules. Variations do occur, but they are rare. In the few known genetic code variants, TAA and TAG usually vary together and end up with the same meaning. This pattern suggests that the two codons are evolutionarily related.
“In almost every other case that we know of, TAA and TAG change in tandem,” Dr. McGowan explained. “When they are not stop codons, they each specify the same amino acid.”
This organism did something different. In Oligohymenophorea sp. PL0344, only TGA appears to function as a stop codon. The other two signals have been repurposed: TAA specifies lysine and TAG specifies glutamic acid. The researchers also found more TGA codons than expected, which may help compensate for the loss of the other two stop signals. The PLOS Genetics paper reports that UGA codons (the RNA form of TGA) are enriched immediately after the coding region, suggesting that they may help prevent harmful readthrough if translation runs past the intended stop.
“This is highly unusual,” Dr. McGowan said. “We don’t know of any other examples where these stop codons are linked to two different amino acids. This breaks some of the rules we thought we knew about gene translation. These two codons were thought to be coupled.”
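To make the effect of the reassignment concrete, here is a minimal Python sketch. It builds the standard codon table (NCBI translation table 1), then applies the two changes reported for this ciliate. The function name `translate`, the table names, and the short DNA sequence are illustrative assumptions, not taken from the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 DNA codons to a one-letter amino acid code.
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Variant code as described in the article: TAA -> lysine (K),
# TAG -> glutamic acid (E), leaving TGA as the only stop codon.
VARIANT_CODE = {**STANDARD_CODE, "TAA": "K", "TAG": "E"}


def translate(dna, code):
    """Translate a coding sequence codon by codon until the first stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = code[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up sequence for illustration: ATG GCT TAA GGT TAG TGA
toy_gene = "ATGGCTTAAGGTTAGTGA"
print(translate(toy_gene, STANDARD_CODE))  # stops at TAA
print(translate(toy_gene, VARIANT_CODE))   # reads through TAA and TAG, stops at TGA
```

Under the standard table the toy sequence yields a two-residue product, while under the variant table the same DNA is read through to the final TGA. Real genome annotation is far more involved; this only shows why using the wrong table would truncate every predicted protein.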
“Scientists are creating new genetic codes, but they also exist in nature. There are some fascinating things we can discover if we look for them.
“Or in this case, when we’re not looking.”
How cells read DNA instructions
DNA can be thought of as a set of instructions, but the instructions must be copied and interpreted before they can have any effect. First, the gene is transcribed into RNA. That RNA copy is then translated into amino acids, which are combined to form proteins and other functional molecules.
Translation begins at a start codon (ATG in DNA, AUG in RNA) and normally ends at a stop codon (usually TAA, TAG, or TGA). In this ciliate, that familiar ending system has been rearranged. The discovery shows that even one of biology’s most conserved systems may be more flexible than expected.
The researchers’ genome and transcriptome analyses also identified suppressor tRNA genes that match the reassigned codons, supporting the conclusion that the organism does indeed read these former stop signals as amino acids. In RNA notation, the study found that UAA encodes lysine and UAG encodes glutamic acid.
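The start-to-stop reading described above can be sketched in a few lines of Python. This is a simplified illustration that works on the coding strand only. The helper names `find_orf` and `transcribe` and the toy sequence are assumptions made for the example, not part of the researchers' pipeline.

```python
STANDARD_STOPS = {"TAA", "TAG", "TGA"}
CILIATE_STOPS = {"TGA"}  # only TGA still ends a gene in the reported variant


def find_orf(dna, stops):
    """Return the stretch from the first ATG through the first in-frame stop.

    Returns None if there is no start codon or no in-frame stop codon.
    """
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in stops:
            return dna[start:i + 3]
    return None


def transcribe(dna):
    """Coding-strand shortcut for transcription: replace T with U."""
    return dna.replace("T", "U")


# Made-up sequence for illustration.
toy_dna = "CCATGGCTTAAGGTTAGTGACC"
print(find_orf(toy_dna, STANDARD_STOPS))             # ends at the first TAA
print(find_orf(toy_dna, CILIATE_STOPS))              # runs on to the TGA
print(transcribe(find_orf(toy_dna, CILIATE_STOPS)))  # same stretch in RNA letters
```

The `transcribe` step also shows why the same codon appears as TGA in DNA notation and UGA in RNA notation.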
Subsequent research shows ciliates keep breaking genetic rules
Follow-up studies have reinforced the idea that ciliates are an unusually rich source of genetic code surprises. In a 2024 PLOS Genetics study, researchers reported multiple independent reassignments of the UAG stop codon in phyllopharyngean ciliates. Some uncultivated ciliates in the TARA Oceans dataset appear to use UAG to code for leucine, while Hartmannula sinica and Trochilia petrani were found to use UAG to code for glutamine.
The follow-up work also found that while UAA remains the preferred stop codon in phyllopharyngean ciliates, UAG has repeatedly shifted to a protein-coding role. This finding points to repeated changes in the genetic code of poorly studied microbial eukaryotes and supports the idea that ciliates are one of the strongest exceptions to the standard genetic code.
Taken together, these findings suggest that the genetic code is not as fixed as once thought. For most organisms, the rules are remarkably stable. But in overlooked microorganisms, especially ciliates, evolution has repeatedly found ways to edit the instructions.
Funding and publication
The original study was published in PLOS Genetics. It was funded by the Wellcome Trust as part of the Darwin Tree of Life project and supported by Earlham Institute core funding from the Biotechnology and Biological Sciences Research Council (BBSRC), part of UKRI. The publication reported that sequence data and genome assembly resources were deposited in public repositories.

