
Project GENESYS: Genetic Search System

Energy-efficient DNA sequence searches using a revolutionary optical processor.

Project summary.

Led by: Scientific Computing Group

Start date: 1 March 2015

End date: 30 April 2017

Duration: 24 months

Value: £500,000

Biomedical and bioscience researchers worldwide routinely query DNA sequences against large databases of already sequenced genomes. Comparing DNA from individuals of the same species can give many valuable insights, for example into cancer genes in humans or disease immunity in vital agricultural crops. But advances in sequencing devices have resulted in output doubling every 18 months or less: a single device can sequence up to 1 trillion bases per run. Analysing this volume of data in reasonable time requires large high-performance computing (HPC) systems, which can be prohibitively expensive because of their initial capital cost and ongoing power and cooling costs. At the Earlham Institute these computers consume more than 130 kW of power, while PetaFLOP systems at the largest research labs can draw more than 1 MW, costing millions of dollars.

In this project, Earlham Institute's leading genomics research expertise and Optalysys's radical optical processing technology (potential ExaFLOP processing from a standard mains supply) are combined to build the basis of a new system for large-scale, energy-efficient DNA sequence searching. The system could reduce energy costs by over 95% while significantly reducing the environmental impact of running traditional HPC.

Details.

The technical aim of the project is to demonstrate that Optalysys's optical processing technology can produce output data comparable to that of the current HPC-based software processes, and to provide a benchmark comparison showing the power efficiency and time savings of the optical approach. The results of the project will form the basis for the development of integrated optical processors within existing HPC architectures, and of revolutionary desktop-based systems that will widen the field by allowing genetic sequence analysis to be performed locally, without the prohibitive running and build costs of HPC systems.

The test scenario uses a "BLAST"-type process with 500k reads (input samples) from the Human Microbiome Project Mock Community, each 301 base pairs long, and compares these against a sequence database of 20 types of gut bacteria (64 million base pairs). The task currently runs on a single Earlham Institute (EI) HPC node (16 cores and 128 GB RAM), taking 28 hours and consuming 11.2 kWh of energy. Thousands of these BLAST-type tasks can be run at EI (or other genomics institutes and labs) in a given year. A match between a read and a sequence in the database is determined by direct comparison (and some statistical scoring), allowing each read to be associated with one or more locations in the bacteria types.
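The figures quoted for the test scenario imply a few derived quantities, which the following Python sketch works through. The 28 hours, 11.2 kWh, 500k reads and 301 base pairs come from the description above. The function names, `tasks_per_year` and `reduction` are illustrative assumptions. The 95% value used in the example is the project's stated target for energy costs, applied here to energy use for simplicity.

```python
# Figures quoted in the project description.
RUNTIME_HOURS = 28.0      # wall-clock time for one BLAST-type task
ENERGY_KWH = 11.2         # energy consumed by that task on one HPC node
READS = 500_000           # number of input reads
READ_LENGTH_BP = 301      # base pairs per read


def average_power_kw(energy_kwh=ENERGY_KWH, hours=RUNTIME_HOURS):
    """Average power drawn by the node over the run (energy / time)."""
    return energy_kwh / hours


def total_query_bases(reads=READS, read_length=READ_LENGTH_BP):
    """Total number of bases across all query reads."""
    return reads * read_length


def annual_energy_kwh(tasks_per_year, reduction=0.0):
    """Energy for a year's worth of tasks.

    tasks_per_year and reduction are hypothetical inputs, not project data;
    reduction is the assumed fractional energy saving (0.95 means 95%).
    """
    return tasks_per_year * ENERGY_KWH * (1.0 - reduction)


if __name__ == "__main__":
    print(f"Average node power: {average_power_kw():.2f} kW")
    print(f"Total query bases:  {total_query_bases():,}")
    print(f"1,000 tasks, no saving:  {annual_energy_kwh(1000):.0f} kWh")
    print(f"1,000 tasks, 95% saving: {annual_energy_kwh(1000, 0.95):.0f} kWh")
```

On these figures the node draws about 0.4 kW on average during the run, and the query set totals 150.5 million bases.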

This process will be adapted to run on a custom-built version of the Optalysys optical processing technology, based on the well-known principles of the established Matched Filter correlator architecture but using proprietary Optalysys designs. The systems created in the project will initially be tested in a standalone, off-line process and, following successful trials, will be interfaced with the existing HPC architecture at EI.
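To illustrate the general matched-filter principle in software, the sketch below locates a short read in a reference sequence by correlation in the Fourier domain. It is a numerical analogue only, not the Optalysys design: an optical correlator performs the Fourier transforms with lenses, whereas this example uses a small pure-Python FFT. The one-hot encoding of bases, the function names and the toy sequences are all assumptions made for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def match_scores(reference, read):
    """Count matching bases at every offset of `read` within `reference`.

    Each base (A, C, G, T) is encoded as a 0/1 indicator signal. For each
    base, the reference spectrum is multiplied by the complex conjugate of
    the read spectrum (the matched filter) and transformed back, giving the
    cross-correlation. Summing over the four bases gives the match count.
    """
    # Zero-pad to a power of two long enough to avoid wrap-around.
    size = 1
    while size < len(reference) + len(read):
        size *= 2
    n_offsets = len(reference) - len(read) + 1
    scores = [0.0] * n_offsets
    for base in "ACGT":
        ref = [1.0 if c == base else 0.0 for c in reference]
        qry = [1.0 if c == base else 0.0 for c in read]
        ref += [0.0] * (size - len(ref))
        qry += [0.0] * (size - len(qry))
        spectrum = [a * b.conjugate() for a, b in zip(fft(ref), fft(qry))]
        corr = fft(spectrum, inverse=True)
        for s in range(n_offsets):
            scores[s] += corr[s].real / size  # normalise the inverse FFT
    return [round(s) for s in scores]


# Toy data, invented for this example.
reference = "ACGTTGCAAGGCTTACGATCGA"
read = "GGCTTAC"
scores = match_scores(reference, read)
best_offset = scores.index(max(scores))
```

For this toy input the correlation peak is at offset 9 with a score of 7, meaning all seven bases of the read match there. A real BLAST-type search also applies statistical scoring, which this sketch omits.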

Collaborators.

Optalysys

Optalysys are the industrial lead partner and bring revolutionary optical computing technology and project management expertise to the project.

Impact statement.

Environmental - HPC systems consume vast amounts of power and generate significant heat: for example, Tianhe-2, the world's fastest supercomputer, uses 24 MW of power and costs $21m/year to run. A comparable optical "supercomputer" based on Optalysys technology will run from a standard mains supply, drawing four orders of magnitude less power, because it is based on low-power liquid crystal and laser devices, while providing, by 2020, processing levels beyond those expected of future electronic systems.

Sequence searches (such as BLAST) are used in numerous projects that provide an early response to plant and tree pathogens. We have been heavily involved in recent national efforts to respond quickly to the dieback threat that is severely damaging UK ash trees. A successful project will result in an energy-efficient BLAST-like searching technology that opens up analysis to many more researchers who do not necessarily have access to large HPC resources or datacentres. This could have an enormous effect on the environment if such pathogens can be identified and mitigated more rapidly.

Following a successful project and subsequent product development and commercial launch, Optalysys expects to create significant UK job opportunities by 2020.