Hot processor speeds up UK genome analysis

28 October 2015

The Earlham Institute (EI) is the first Institute in the UK to deploy a new bioinformatics processor called DRAGEN™, which dramatically reduces genomic pipeline run times from hours to minutes.

This collaboration between Edico Genome and EI resulted in the first adaptation of the DRAGEN technology for the analysis of non-human genomes as part of the Institute’s endeavours to sequence the DNA of plant, animal and microbial species to promote a sustainable bioeconomy.

EI’s high-performance computing (HPC) infrastructure will benefit from the addition of Edico Genome’s DRAGEN™, the world’s first processor designed specifically for sequencing data analysis tasks. DRAGEN will be used to accelerate EI’s next-generation sequencing workflows.

Initial evaluations of DRAGEN showed that mapping against the ash tree genome was 177 times faster per processing core than on EI’s local HPC systems, requiring only seven minutes instead of three hours on one of the larger datasets. Alignment runs on the rice genome that take approximately two hours on EI’s HPC servers took just three minutes using DRAGEN.
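As a rough check on the run times quoted above, the wall-clock speedup is simply the baseline time divided by the accelerated time. The sketch below uses only the minutes stated in this release; the function and variable names are illustrative. Note that the 177-fold figure is a per-core comparison, which also depends on the number of cores used in each run. Those core counts are not given here, so the sketch does not attempt to reproduce it.

```python
def wall_clock_speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """Return how many times faster the accelerated run finished."""
    if accelerated_minutes <= 0:
        raise ValueError("accelerated_minutes must be positive")
    return baseline_minutes / accelerated_minutes


# Run times as quoted in the release, converted to minutes.
ash_speedup = wall_clock_speedup(3 * 60, 7)   # ash tree: three hours vs seven minutes
rice_speedup = wall_clock_speedup(2 * 60, 3)  # rice: about two hours vs three minutes

print(f"Ash tree mapping: about {ash_speedup:.1f}x faster in wall-clock time")
print(f"Rice alignment:   about {rice_speedup:.1f}x faster in wall-clock time")
```

On these figures the wall-clock gains are roughly 26-fold for ash and 40-fold for rice. The larger per-core number reflects a normalisation by core count that is not detailed in this release.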

Project Lead Dr Tim Stitt, Head of Scientific Computing at EI, said: “We are really excited to be Edico Genome’s first DRAGEN customer in the UK, and we hotly anticipate utilising this ground-breaking technology to advance our mission to promote a sustainable bioeconomy and maintain the UK’s food security.

“In particular, we are really interested to see how DRAGEN handles the wheat genome, which is five times bigger than the human genome and much more complex. Wheat is the staple diet for over 35 percent of the world’s population, and that population is predicted to increase to nine billion people by 2050.

“By understanding the genomic building blocks of wheat, and its diversity, we can better inform breeders on how to improve their yields, particularly in areas where wheat is prone to disease and drought. Obviously the sooner we do this the better and DRAGEN can greatly help us in this mission.

“Alignment against reference genomes is a fundamental task undertaken daily by EI researchers. Thanks to our partnership with Edico Genome, our DRAGEN system will contain highly optimised analysis pipelines for both genomes and transcriptomes.

“EI is proud to be a leader in bringing new and disruptive technologies into the hands of the bioscience community and our collaboration with Edico Genome continues to illustrate our leadership in this area.”

The DRAGEN Bio-IT Processor is integrated on a PCIe card and available in a pre-configured server, enabling seamless integration into bioinformatics workflows. DRAGEN is highly reconfigurable, using a field-programmable gate array (FPGA) to provide hardware-accelerated implementations of BCL conversion, compression, mapping, alignment, sorting, duplicate marking, haplotype variant calling and joint genotyping.

The DRAGEN system is therefore much faster than traditional approaches that run the same algorithms in software. In a recent study published in Genome Medicine, DRAGEN sped up analysis of a whole genome from 22.5 hours to 41 minutes, while also achieving sensitivity and specificity of 99.5 percent. Similar efficiency gains could have an enormous impact at EI, given the high throughput of genomic data processed there and the importance of sequence alignment to many of its sequencing projects.
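For readers unfamiliar with the accuracy terms above, the sketch below shows the standard definitions of sensitivity and specificity. The counts are hypothetical, chosen only so that both measures come out at 99.5 percent. They are not taken from the Genome Medicine study, and how that study counted true and false variant calls is not described in this release.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real variants that were correctly called."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-variant sites that were correctly left uncalled."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only (not data from the study).
example_sensitivity = sensitivity(true_positives=995, false_negatives=5)
example_specificity = specificity(true_negatives=1990, false_positives=10)

print(f"Sensitivity: {example_sensitivity:.1%}")
print(f"Specificity: {example_specificity:.1%}")
```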

“Our collaboration with EI, a powerhouse in genomics that is home to one of the largest computing hardware facilities in Europe, is a great example of the benefits DRAGEN holds for sequencing centres,” said Pieter van Rooyen, PhD, Chief Executive Officer of Edico Genome. “We look forward to continuing to work with researchers and clinicians around the world with a need to analyse next-generation sequencing data rapidly and cost-effectively without compromising accuracy.”

EI is strategically funded by BBSRC and operates a National Capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.

Notes to editors

1. The hardware modifications were carried out by Edico Genome engineers based on in-house testing using the datasets provided by EI. This testing allowed adaptation of the pipelines to handle these non-human datasets.

2. The DRAGEN technology has been shown to accurately analyse over 50 whole human genomes (from FASTQ to VCF) in less than a day. EI plans to incorporate the system into the existing HPC platform as a resource within the batch submission system.

For more information, please contact:

Hayley London

Marketing & Communications Officer, Earlham Institute (EI)

+44 (0)1603 450 107

hayley.london@earlham.ac.uk

The Earlham Institute (EI) is a world-leading research institute focusing on the development of genomics and computational biology. EI is based within the Norwich Research Park and is one of eight institutes that receive strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC), amounting to £6.45M in 2015/2016, as well as support from other research funders. EI operates a National Capability to promote the application of genomics and bioinformatics to advance bioscience research and innovation.

EI offers a state-of-the-art DNA sequencing facility, unique in its operation of multiple complementary technologies for data generation. The Institute is a UK hub for innovative bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms that give multiple academic and industrial users access to computational tools and processing capacity, and in promoting applications of computational bioscience. Additionally, the Institute offers a training programme through courses and workshops, and an outreach programme that targets key stakeholders and wider public audiences through dialogue and science communication activities.

www.earlham.ac.uk

Edico Genome has created DRAGEN™, the world’s first bioinformatics processor designed to analyse next-generation sequencing data. The use of next-generation sequencing is growing at an unprecedented pace, creating a need for a technology that can process this big data rapidly and accurately.

Edico Genome’s computing platform has been shown to speed up whole-genome data analysis from hours to minutes, while maintaining high accuracy and reducing costs, enabling clinicians and researchers to reveal answers more quickly. For more information, visit www.EdicoGenome.com or follow @EdicoGenome.