Understanding complexity in living systems.
This programme seeks to identify and understand the functional roles of alleles in biological systems.
This research programme has now concluded. Please visit Our Research Programmes to find out about our current research.
Our science needs new and innovative ways of processing the huge and varied datasets that are produced every day by our researchers. We’re building better data handling systems, more efficient and greener algorithms to make the best use of our large supercomputers, and exciting technology to put data together in new ways to answer complex biological questions.
State-of-the-art technologies are generating unprecedented amounts of complex data, from genomes to proteomes and transcriptomes, spanning mechanistic and functional diversity. Handling, interpreting and integrating these large-scale data into descriptive models that interpret molecular functions at a system level requires continued development of algorithms, robust computational models, and interoperable analytical frameworks. Supported by our core capability, we will contribute to the newest developments in the data sciences and facilitate the extraction of meaningful signals from often noisy data.
This effort is fundamental to the success of our associated research programmes into ‘Understanding genome evolution to drive trait improvement’, ‘Understanding complexity in living systems’ and ‘Designing Future Wheat’, as the analyses developed in these programmes are intrinsically data-rich and therefore exposed to multiple challenges of computational complexity. This programme specifically addresses the challenges that arise from managing large-scale datasets and their associated metadata, improving existing algorithms to maximise their efficiency on state-of-the-art computing architectures, and integrating the heterogeneous datasets generated in the other programmes, thereby enabling and enhancing their interpretation.
We will carry out fundamental research into software engineering methods to manage, share, visualise and integrate large and complex datasets. We will also develop research data management and dissemination layers, underpinned by community standards, that make EI’s large-scale and diverse data outputs granular and searchable. A key part of this programme will be the integration of the statistical, machine-learning, and network-based models described in the other programmes. We will exploit our algorithm optimisation expertise to drive computational advances in accuracy and efficiency across our research into assembly, variant calling, annotation and network analysis. These efforts will put platforms in place to consistently collect datasets and rapidly feed them into downstream integrative analyses, enabling the extensive and complex data interrogation required to bring together multiple heterogeneous datasets.
It is now an absolute requirement of data-intensive integrative biology that access to relevant multi-scale diverse datasets is fast and intuitive, and analyses can handle these complex datasets to maintain statistical power. Key objectives of this topic include:
The data-intensive challenges of sequence assembly, annotation, gene expression, real-time image analysis, and knowledge mining require investigation and implementation of state-of-the-art computing technology. Key objectives in this area include:
A key goal of genomic data analyses utilising extensive whole-genome sequencing, transcriptomic, and methylomic datasets is the ability to extrapolate effective models that predict phenotypic traits and outcomes. Key objectives for this topic include:
The development of new bioinformatics tools, resources and algorithms will help researchers across the biotechnology and biomedical sectors work with data more effectively, and make that data easily shareable and usable by others. Our continued optimisation of computational architecture and software will increase resource efficiency for researchers. Our work on in-field technology will bring many benefits to the agritech sector and give agronomists and farmers better data quality and access. Multi-omics researchers will benefit from our holistic and open approach to data management, which allows for more robust scientific interrogation.
Coordinating expertise across BBSRC institutes with complementary university programmes to make sure researchers are well equipped to support the development of this crucial food crop.
Understanding genome evolution to drive trait improvement.