Selenocysteine (Sec) and pyrrolysine (Pyl) are rare, genetically encoded amino acids that occur far less widely than the 20 canonical amino acids.
Most people are well aware that there are 20 amino acids that build our proteins, but what if I told you that was actually false, and in fact, there are a few additional amino acids that we seldom talk about?
A total of 22 amino acids are known to be encoded directly by the genetic code. Selenocysteine (Sec) and pyrrolysine (Pyl) are the 21st and 22nd amino acids, respectively. They are referred to as rare amino acids because they are far less prevalent in nature than the other 20.
Why was this finding so shocking?
Four nucleotide bases, symbolized by the letters A, G, T and C, make up the entirety of our DNA. When the genetic code was deciphered in the 1960s, scientists found that it was read three letters at a time. Each three-letter group is known as a codon. Codons are read from messenger RNA (mRNA), where the base U (uracil) takes the place of T, which is why codons are written with the letters A, G, U and C.
Each codon was considered to have a single function: either denoting one of the 20 amino acids or denoting the beginning (via start codon) or ending point (via stop codon) of the protein-generating translation machinery.
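The triplet reading described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of the real translation machinery: the function names `split_codons` and `translate` are made up for this example, the table is the standard genetic code written in its usual compact form (with `*` marking a stop codon), and translation is assumed to begin at the first AUG found in the sequence.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated in the
# order UUU, UUC, UUA, UUG, UCU, ... and paired with one-letter amino
# acid codes. "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(mrna):
    """Split an mRNA string into complete three-letter codons."""
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(mrna):
    """Translate from the first AUG (start codon) to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for codon in split_codons(mrna[start:]):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon: translation ends here
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(split_codons("AUGUGCUGGUAAUU"))  # ['AUG', 'UGC', 'UGG', 'UAA']
print(translate("GGAUGUGCUGGUAAUU"))   # MCW
```

Three bases at four possibilities each give 4 × 4 × 4 = 64 codons. In this classical picture, 61 of them specify amino acids and 3 are stop codons.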
Though these 20 amino acids are the building blocks of every protein, some proteins contain non-standard amino acids. Most of these, scientists found, were derived from the original 20, with their structure changed after the polypeptide chain had formed at the end of translation. These changes are known as post-translational modifications, and they are often required for the protein to function properly.
Even though these amino acids are uncommon, they still have important functions. For example, 5-hydroxylysine and 4-hydroxyproline are derivatives of lysine and proline, respectively, and are found in collagen (a protein found in connective tissue).
Therefore, when selenocysteine (Sec) and pyrrolysine (Pyl) were first discovered in proteins, they were thought to result from such post-translational modifications to cysteine and lysine, respectively. However, in 1986, two important discoveries revealed that selenocysteine was actually encoded by the stop codon UGA.
By the mid-1960s, it was well known that the start codon (AUG) coded for the amino acid methionine, whereas the stop codons (UAG, UAA, and UGA) were not believed to code for any amino acid and were thought simply to terminate translation. In fact, they have even been referred to as nonsense codons, since they do not specify an amino acid.
Therefore, the discovery that UGA can encode selenocysteine was groundbreaking, as it attributed a new role to the stop codons. Additionally, the two 1986 discoveries were made separately, in two distinct organisms (E. coli and mice), indicating that such unusual amino acids were present across a wide variety of species.
Sixteen years later, in 2002, pyrrolysine was identified as being encoded by another stop codon, UAG!
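To make the idea of a stop codon taking on a second meaning concrete, here is a minimal Python sketch. It is a deliberate oversimplification: the `recode` argument is a hypothetical switch invented for this example, whereas real cells rely on dedicated machinery to decide when UGA or UAG is read as an amino acid rather than as a stop signal. The one-letter symbols U (selenocysteine) and O (pyrrolysine) are the standard codes for these two amino acids.

```python
from itertools import product

# Standard genetic code in compact form; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_with_recoding(mrna, recode=None):
    """Translate from the first AUG, optionally reassigning stop codons.

    `recode` is a hypothetical mapping such as {"UGA": "U"}. Codons listed
    there are read as the given amino acid instead of ending translation.
    """
    recode = recode or {}
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = recode.get(codon, CODON_TABLE[codon])
        if residue == "*":  # an ordinary stop codon still ends translation
            break
        peptide.append(residue)
    return "".join(peptide)


mrna = "AUGUGCUGAGGUUAGUAA"  # codons: AUG UGC UGA GGU UAG UAA

print(translate_with_recoding(mrna))                            # MC
print(translate_with_recoding(mrna, {"UGA": "U"}))              # MCUG
print(translate_with_recoding(mrna, {"UGA": "U", "UAG": "O"}))  # MCUGO
```

With no recoding, the chain ends at UGA. Reading UGA as selenocysteine carries translation on to the next stop codon, and reading UAG as pyrrolysine extends it further still, until the remaining stop codon UAA is reached.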
Why are they rare?
Even though selenocysteine (Sec) and pyrrolysine (Pyl) are coded for in the DNA, unlike standard amino acids, they require a special mechanism to be incorporated into a protein. In fact, two mechanisms are involved: although both rare amino acids are encoded by stop codons, each is incorporated in a completely different way.
The presence of pyrrolysine is restricted to a tiny proportion of proteins in a very limited number of organisms. So far, of the approximately 1,000 organisms for which full genomic data are available, only 11 encode pyrrolysine.
On the other hand, selenocysteine is present in many organisms across all three domains of life (Archaea, Bacteria, Eukarya). It is believed to be synthesized by nearly a quarter of sequenced bacteria. Interestingly, however, just 17 mammalian proteins have been found to contain selenocysteine.
Are they important?
Scientists already knew that many non-standard amino acids, even if they are never incorporated into proteins, are crucial intermediates in numerous metabolic processes. Since selenocysteine and pyrrolysine are actually coded for in the DNA, they were even more likely to be essential. However, due to their rarity, the importance of these two amino acids was overlooked until recently.
Selenocysteine is structurally similar to cysteine, but contains selenium, an essential micronutrient, in place of the sulphur atom found in cysteine. It is a crucial amino acid found in selenoproteins and is associated with a number of metabolic and cellular processes. A deficit of selenium in the brain has been found to cause neurological abnormalities like seizures. Selenium deficiency has also been linked with a number of other diseases, in addition to neurodegenerative disorders. However, researchers have yet to identify its exact role in these disease mechanisms.
Pyrrolysine was identified in a methanogen, Methanosarcina barkeri, which can be found in the stomach of cows! Until recently, proteins containing pyrrolysine had only been found in methanogens, suggesting that it plays a role in methane production (methanogenesis). It is structurally similar to lysine and aids methyltransferase enzymes during methanogenesis. Further studies are required to determine whether pyrrolysine plays a role in any other processes.
Why and when during the evolution of life these two amino acids were added to the genomes of a few organisms remains a mystery. While they are currently categorized as rare, further research might prove us wrong. After all, in the words of Albert Einstein, ‘We still do not know one-thousandth of one percent of what nature has revealed to us’.