COPO
Collaborative OPen Omics (COPO) is a portal for brokering data and metadata to public repositories.
We are proponents of open science and actively contribute to the development of tools and standards that ensure scientific data is FAIR.
Open science describes a set of principles and practices aimed at making all scientific research, including publications, data, and tools, freely accessible to everyone.
The Earlham Institute is a strong advocate of open science, both as a publicly-funded organisation and in recognition of the huge benefits that come from unimpeded access. These benefits include increased efficiency, greater quality and integrity of science, improved transfer of knowledge, innovation across sectors, and the opportunity for everyone to be engaged with science.
Wherever possible, we publish in open-access journals, make our software and tools available as open source on platforms such as GitHub, and use standardised metadata to improve both reproducibility and clarity for other researchers wanting to use our datasets.
The Institute is a leader in data-intensive bioscience. An ethos of open science makes us transparent about the data we generate or collect, the methodology used, and the analysis of the results.
Modern science involves the generation of vast amounts of data. The ability to collect, store, and interrogate more and more data is empowering us to ask new research questions and make discoveries that would have previously been unimaginable.
There are inherent challenges, however, with scientists around the world generating, labelling, storing, and sharing all of this data in different ways - not to mention the sheer quantity of data itself.
Without a universal approach, data can’t be reliably accessed, understood, or integrated by others who might benefit from it.
FAIR approaches provide a way to manage this by making data Findable, Accessible, Interoperable, and Reusable.
The Earlham Institute develops FAIR-compliant metadata standards and tools for the community to adopt as a way of maximising the potential use and impact of the data we generate.
Our experts also play an active role in the community, contributing to discussion and debate.
For example, the Earlham Institute is the coordinating hub of ELIXIR-UK, the UK node of ELIXIR - an international organisation helping researchers to manage and analyse increasing volumes of data.
The pan-European project provides the infrastructure for integrating life sciences data across the continent with the aim of facilitating the linking of data worldwide.
Member organisations collectively provide platforms and guidance for research data management, reproducible data analysis, FAIR data and software management, and related services and standards.
The UK Node leads many ELIXIR communities and focus groups, and participates in European consortia and international standards organisations.
The Earlham Institute has developed a range of freely accessible services and open-source software tools for the bioscience community, including Collaborative OPen Omics (COPO) and Grassroots, as well as managing CyVerse UK and contributing code and tools to the Galaxy project.
COPO validates community-defined metadata to ensure the FAIRness of its associated data. The metadata provides essential context for the data, making it searchable, interoperable, and reusable.
Data description - often called metadata - is critical to increasing the value of the data itself, allowing scientists and online search tools to better understand its relevance.
Publications, data, images, and other ‘research objects’ can all be submitted through COPO to remote long-term storage repositories.
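To illustrate what validating metadata against a community-defined standard can involve, here is a minimal sketch in Python. It is not COPO's implementation or API: the field names, the `REQUIRED_FIELDS` checklist, and the `validate_record` function are all hypothetical, chosen only to show the general idea of checking a record before it is brokered to a repository.

```python
from datetime import date

# Hypothetical checklist: these field names and types are illustrative
# assumptions, not COPO's actual schema.
REQUIRED_FIELDS = {
    "sample_id": str,
    "organism": str,
    "collection_date": str,  # expected as an ISO 8601 date, e.g. "2023-05-14"
}


def validate_record(record):
    """Return a list of problems found; an empty list means the record passes."""
    problems = []

    # Every required field must be present, non-empty, and of the expected type.
    for field, expected_type in REQUIRED_FIELDS.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"missing required field: {field}")
        elif not isinstance(value, expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")

    # Dates must be machine-readable so that others can search and integrate them.
    collected = record.get("collection_date")
    if isinstance(collected, str) and collected:
        try:
            date.fromisoformat(collected)
        except ValueError:
            problems.append("collection_date is not an ISO 8601 date (YYYY-MM-DD)")

    return problems


if __name__ == "__main__":
    record = {"sample_id": "S2", "collection_date": "14/05/2023"}
    for problem in validate_record(record):
        print(problem)
```

A record that passes every check returns an empty list. Collecting all problems, rather than stopping at the first, lets a submitter fix a record in one pass.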
Grassroots Infrastructure wraps bespoke scientific and industry-standard software tools into a user-friendly portal, and also allows access from software programs and workflows.
There is a strong emphasis on FAIR data, and all of the data, metadata, and software tools can be searched using a simple search form similar to Google.
Grassroots is designed to be a platform that can be easily installed by other institutions, groups, and individual users. These installations can be connected together, if desired, to form an interconnected cloud-based resource.
CyVerse UK is a cloud computing infrastructure that provides access to bioinformatics services, tools, compute resources, and system administration.
CyVerse UK is the first node of CyVerse outside the United States, and it is funded independently of the parent project.
Galaxy is an open, web-based platform for accessible, reproducible, and transparent data-intensive research.
The Earlham Institute Galaxy server is dedicated to the eight BBSRC-supported institutes and offers state-of-the-art tools and resources for NGS quality control, sequence alignment, RNA-Seq, de novo assembly, phylogenetics, variant analysis, and genome annotation.