Scientists decoded the genome of rice in 2002. They completed the soybean genome in 2008. They mapped the maize genome in 2009. But only now has the long-awaited wheat genome been fully sequenced. That delay says nothing about wheat’s importance. It is arguably the most critical crop in the world. It’s grown on more land than anything else. It provides humanity with a fifth of our calories. But it also has one of the most complex genomes known to science.
For a start, wheat’s genome is monstrously big. While the genome of Arabidopsis—the first plant to be sequenced—contains 135 million DNA letters, and the human genome contains 3 billion, bread wheat has 16 billion. Just one of wheat’s chromosomes—3B—is bigger than the entire soybean genome.
To make things worse, the bread-wheat genome is really three genomes in one. About 500,000 years ago, before humans even existed, two species of wild grass hybridized with each other to create what we now know as emmer wheat. After humans domesticated this plant and planted it in their fields, a third grass species inadvertently joined the mix. This convoluted history has left modern bread wheat with three pairs of every chromosome, one pair from each of the three ancestral grasses. In technical lingo, that’s a hexaploid genome. In simpler terms, it’s a gigantic pain in the ass.
Typically, geneticists sequence genomes by breaking DNA into small segments, reading them separately, and assembling the pieces back together. But if each chromosome occurs six times, how do you know where to put any given piece?
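To see why that question is hard, here is a minimal toy sketch in Python. It is not how any real assembler works, and the three short sequences are invented for illustration: they stand in for the same stretch of chromosome in each of wheat's three ancestral genomes, differing by a single letter each. The labels and the helper name `find_placements` are assumptions made for this example.

```python
# Toy stand-ins for one chromosome region in each ancestral genome.
# The sequences are made up; each differs from the others by one letter.
SUBGENOMES = {
    "A": "ACGTTGCAAGGCTTACCGTA",
    "B": "ACGTTGCAAGGCTTACGGTA",
    "D": "ACGTTGCATGGCTTACCGTA",
}


def find_placements(read, subgenomes):
    """Return every (genome label, position) where the read matches exactly."""
    placements = []
    for name, sequence in subgenomes.items():
        start = sequence.find(read)
        while start != -1:
            placements.append((name, start))
            # Keep searching after this hit so repeated matches are not missed.
            start = sequence.find(read, start + 1)
    return placements


# A short read from a stretch that is identical in all three copies
# fits equally well in three places.
ambiguous = find_placements("ACGTTG", SUBGENOMES)

# A longer read that happens to span the differing letters fits in only one.
unique = find_placements("AGGCTTACC", SUBGENOMES)

print(ambiguous)
print(unique)
```

In this toy case the short read lands in all three copies, while the read that covers the differing letters lands in only one. Real wheat DNA has far fewer such distinguishing letters relative to its size, which is what makes the pieces so hard to place.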
By Ed Yong