Open source data management: Sherlock open for all
EI provides a new, easy, customisable, open source data management platform for the bioinformatics community
The Earlham Institute provides a new, easy, customisable method of data management for smaller research groups who don’t have the resources to develop a modern, big data solution.
In line with our open access approach, Máté Szalay-Bekő of the Tamas Korcsmáros group has open sourced Sherlock to help the bioinformatics community.
Borrowing open source tools from big players such as Amazon and Facebook, Sherlock is a platform for analysing bioinformatics data with modern big data technologies.
The tool will be especially useful for small research groups that don’t necessarily have the resources to develop their own big data management solution, and they are free to customise the software to suit their needs.
Máté tells us, “We want to encourage everyone to use Sherlock, and also to customise it for themselves. The code is open and we hope that the documentation and examples we provided can help in their daily work. We are open to any suggestions for further development, and also excited for the feedback.”
Using Sherlock can improve data quality and reduce the complexity of data management in any bioinformatics lab.
Understanding big data concepts and technologies takes some initial investment, but it pays back in time saved. Instead of downloading six different public databases and writing a dozen scripts, which might take a whole day or longer, you can run a query and have the answer to your research question in 15 minutes.
On future plans, Máté explains, “Sherlock is not a finished project. On the contrary, it has just started. As we are using it and integrating more and more data into this platform in our research group, we will always share these new scripts / examples. Of course, the best outcome would be to have many people joining us, resulting in real, community-driven open source development.”