Researchers Sequence the Genome’s Elusive Centromere

Amanda Larracuente, assistant professor of biology, in her lab in Hutchison Hall. Larracuente and colleagues from Rochester and the University of Connecticut are the first to completely sequence all the centromeres in the genome of a single plant or animal species, in this case, the fruit fly. (University of Rochester photo / J. Adam Fenster)

Though much of the human genome has been sequenced and assembled, scientists have hit roadblocks trying to map unassembled regions of DNA that consist mostly of repetitive sequences. One of these regions, found in every cell, is the centromere.

Researchers from the University of Rochester, along with their colleagues at the University of Connecticut, have now discovered the centromeres of the model genetic organism Drosophila melanogaster (fruit fly), sequencing the most repetitive parts of the genome and unlocking one of the “last frontiers of genome assembly,” says Amanda Larracuente, an assistant professor of biology at Rochester and co-lead author of the study. The research, published in the journal PLOS Biology, sheds light on a fundamental aspect of biology and shows that selfish genetic elements may play a larger role in centromere function than researchers previously thought.

What is a centromere?

Chromosomes are found in the nucleus of cells and carry tightly wrapped strands of DNA. One of the most essential structures of a chromosome is a specialized region of DNA called the centromere, which is vital for cell division: during cell division, proteins attach to the centromere and coordinate the process of packaging DNA and pulling the chromosome apart to opposite sides of the dividing cell. If centromeres don’t function properly, cells may divide with too few or too many chromosomes, which can result in aneuploidy disorders like Down syndrome.

‘Notoriously difficult to sequence’

Centromeres in most plants and animals, including humans, are often found near the center of the chromosome, embedded in blocks of repetitive DNA known as satellite DNA. Satellite DNA and, in turn, centromeres are challenging to sequence because of their repetitive nature: when mapping a genome, traditional sequencing methods chop up strands of DNA and read (or sequence) the fragments, then try to infer the order of those fragments and reassemble them. But the pieces of repetitive DNA all look the same, so assembling them is like trying to put together a puzzle with identical pieces.
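The puzzle problem can be illustrated with a toy example. The sketch below is not the researchers' method. It assumes idealized, error-free reads and made-up flanking sequences, and it ignores how many times each read occurs. Under those assumptions, a short repeat array and a much longer one yield exactly the same set of short reads, while a read long enough to span an entire array tells them apart.

```python
def reads(seq, k):
    """Return the set of all length-k substrings (idealized, error-free reads)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

# Illustrative values only: a 5-base repeat unit and invented flanking sequences.
unit = "AATAT"
flank_left, flank_right = "GCGTTC", "CTAGGA"

# Two hypothetical regions that differ only in how many times the unit repeats.
short_array = flank_left + unit * 10 + flank_right
long_array = flank_left + unit * 50 + flank_right

# Short reads: both regions produce the identical set of 8-base pieces,
# so the pieces alone cannot reveal how long the repeat array is.
same_with_short_reads = reads(short_array, 8) == reads(long_array, 8)

# Long reads: a read as long as the whole short region contains both flanks,
# and no read of that length from the long region does.
span = len(short_array)
same_with_long_reads = reads(short_array, span) == reads(long_array, span)

print(same_with_short_reads, same_with_long_reads)
```

In this toy setting, only reads long enough to reach the unique sequence on both sides of a repeat block pin down its size.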

“Even the best-studied model organisms, with small, relatively simple genomes, are missing some of their sequence due to repetitive DNA,” says Ching-Ho Chang, a PhD student in Larracuente’s lab and co-lead author of the paper. “Centromeres are hiding in these repeat-dense regions, and the organization of centromeres at the DNA level is only known for a few centromeres in a few organisms. No plant or animal has all its centromeres completely sequenced.”

Until now. Larracuente, Chang, and their colleagues used new sequencing technology and genome assembly methods to sequence the repetitive regions of the fruit fly genome, including its centromeres. This is the first time researchers have sequenced all the centromeres in any multicellular organism.

“We break through these barriers and leverage the power of single molecule long-read sequencing and chromatin fiber imaging to discover the detailed organization of the centromeres,” says Barbara Mellone, an associate professor of molecular and cell biology at the University of Connecticut and co-lead author on the study.

Fluorescent green spots indicate the location of centromeres on chromosomes (shown in blue) from Drosophila melanogaster. (University of Rochester image / Ching-Ho Chang)

Transposable elements play an important role

When researchers assemble a portion of the genome, they are also able to discern more information about the sequences themselves. This research not only overcomes the enormous technical hurdle of sequencing centromeres but provides new insight into centromeres in general. For decades, scientists assumed centromeres corresponded to the satellite DNA in which they are embedded. Within the sea of satellite DNA, however, there are “islands” rich in transposable elements. Transposable elements are typically thought of as selfish genetic elements—sequences that can jump around and spread in genomes and in populations despite often being harmful to their host. The researchers found that centromeres sit directly over the islands of transposable elements, which form the core of the centromere itself. This could mean that genomes are repurposing selfish genetic elements to build essential regions of chromosomes.

And this is likely occurring in organisms beyond just flies, Larracuente says. “Transposable elements are typically thought of as selfish genome parasites. However, more and more studies show that sometimes transposable elements can have functions in plants, animals, and fungi. We show that in flies they aren’t just near centromeres, they form the functional core of centromeres, leading us to suspect that transposable elements are an important part of centromere biology.”