The result of 13 years of collaborative international research, the reference genome of the bread wheat variety Chinese Spring is the highest quality wheat genome assembly produced to date. Sequencing the bread wheat genome was long considered an impossible task because of its enormous size (five times larger than the human genome) and its complexity: bread wheat has three sub-genomes, and more than 85% of the genome is composed of repeated elements.
Genome annotation is the process of identifying functional DNA sequences. Locating coding regions is a vital starting point for interpreting the genome and assigning functions to genes. Researchers in the consortium applied a variety of tools and bioinformatic approaches to define the structure of genes, but in a large, complicated genome such as wheat, annotation is challenging and the gene models produced by different methods can look very different.
Using their recently published tool Mikado, the Swarbreck Group at the Earlham Institute (EI) integrated the independently generated gene models to deliver the most comprehensive annotation of the 21 wheat chromosomes to date.
Dr David Swarbreck, Group Leader at the Earlham Institute and recipient of this year’s International Wheat Genome Sequencing Consortium (IWGSC) Leadership Award, said: “My Group had a special role in assessing the accuracy of the predicted gene structures. We developed metrics to examine how well supported a gene model was by the different types of evidence (proteins, RNA-Seq, etc.). Using these metrics, we were able to identify potential errors in the annotation, allowing us to cherry-pick the best, most accurate gene models from across a large pool of alternative annotations provided by other groups.”
Mikado was developed to support earlier wheat annotation projects at EI, enabling our researchers to combine different methods for assembling the transcripts of genes. The tool was refined for the IWGSC project and used as a framework for integrating alternative gene models. As such, Mikado played an important part in ensuring a high-quality final annotation.
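The idea of scoring gene models by their supporting evidence and keeping the best one at each locus can be illustrated with a small sketch. This is not Mikado's actual scoring scheme or API: the `GeneModel` fields, the equal weights and the 0.3 threshold are all hypothetical, chosen only to show the principle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneModel:
    """A candidate gene model (all fields are illustrative assumptions)."""
    model_id: str
    locus: str
    protein_support: float  # fraction of the model backed by protein alignments, 0-1
    rnaseq_support: float   # fraction of the model backed by RNA-Seq data, 0-1


# Hypothetical weights: both evidence types count equally.
WEIGHTS = {"protein_support": 0.5, "rnaseq_support": 0.5}


def support_score(model):
    """Weighted sum of the evidence metrics for one model."""
    return sum(weight * getattr(model, name) for name, weight in WEIGHTS.items())


def pick_best_per_locus(models):
    """Keep the highest-scoring candidate model at each locus."""
    best = {}
    for model in models:
        current = best.get(model.locus)
        if current is None or support_score(model) > support_score(current):
            best[model.locus] = model
    return best


def flag_poorly_supported(models, threshold=0.3):
    """Return models whose score falls below an (assumed) threshold."""
    return [m for m in models if support_score(m) < threshold]


if __name__ == "__main__":
    # Made-up candidates from two imaginary annotation pipelines.
    candidates = [
        GeneModel("a1", "locusA", protein_support=0.9, rnaseq_support=0.8),
        GeneModel("a2", "locusA", protein_support=0.4, rnaseq_support=0.95),
        GeneModel("b1", "locusB", protein_support=0.1, rnaseq_support=0.2),
        GeneModel("b2", "locusB", protein_support=0.6, rnaseq_support=0.5),
    ]
    for locus, model in pick_best_per_locus(candidates).items():
        print(locus, model.model_id, round(support_score(model), 3))
    print("flagged:", [m.model_id for m in flag_poorly_supported(candidates)])
```

In practice an annotation pipeline would draw on many more metrics than two, but the selection step follows the same pattern: score every alternative, compare within a locus, and flag models with little supporting evidence as potential errors.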
Paving the way for the production of wheat varieties better adapted to climate challenges, with higher yields, enhanced nutritional quality and improved durability, the IWGSC research project involved more than 200 scientists from 73 research institutions in 20 countries.
With the reference genome sequence now completed, breeders have new tools at their disposal to address these challenges. They will be able to identify more rapidly the genes and regulatory elements underlying complex agronomic traits such as yield, grain quality, resistance to fungal diseases, and tolerance to abiotic stress – and to produce hardier wheat varieties.
The high-quality reference genome sequence is predicted to accelerate wheat improvement over the coming decades, with benefits similar to those seen following the release of the maize and rice reference genomes.
“The publication of the wheat reference genome is the culmination of the work of many individuals who came together under the banner of the IWGSC to do what was considered impossible,” explained Kellye Eversole, Executive Director of the IWGSC. “The method of producing the reference sequence and the principles and policies of the consortium provide a model for sequencing large, complex plant genomes and reaffirm the importance of international collaborations for advancing food security.”
“The tools and experience gained from assembling and annotating the Chinese Spring wheat cultivar can now be applied to our efforts to do the same for other wheat cultivars. It is only by sequencing multiple wheat genomes that we can identify the full complement of wheat genes and provide the best resource for wheat researchers and breeders to continue to improve wheat quality and production. Mikado and related tools will go on to help our future wheat project, which uses a pan-genomic approach exploring multiple wheat genomes – focusing on ten different wheat varieties,” added Dr Swarbreck.
The Science article is entitled "Shifting the limits in wheat research and breeding using a fully annotated reference genome".