Yale Scientists Study Genes Misidentified as ‘Non-Protein Coding’

The Aw112010 protein (green) in a cell. The nucleus is shown in blue and the outer wall of the cell in red.

The human genome contains regions that “code” for proteins, meaning they carry instructions for making protein molecules with specific functions in the body. But Yale researchers have discovered several protein-coding genes that were misidentified as non-protein-coding, including one that plays a key role in the immune system.

The research team suspected that the way genes are annotated, or categorized, within the genome has limited the identification of genes with the potential to code for proteins. To test their theory, the researchers used mouse models to study the interaction between RNA and ribosomes, the minute structures that translate RNA into proteins. They also used a technique called ribosome profiling to investigate how ribosomes relate to RNA. With the combined techniques, the researchers found a number of genes previously identified as non-protein-coding that were actively making proteins.

In further experiments, the researchers focused on one such gene, Aw112010, that was activated when mice were infected with bacteria. Using CRISPR gene editing, they confirmed that immune cells were robustly synthesizing the protein made from the Aw112010 gene in response to Salmonella infection.

The findings suggest many more protein-coding genes and functions may be discovered. “A large portion of important protein-coding genes have been missed by virtue of their annotation,” said first author Ruaidhri Jackson. Without vetting and identifying these genes, “we can’t fully understand the protein-coding genome or adequately screen genes for health and disease purposes.”