What is the workshop about?
The course will provide an introduction to de novo assembly, beginning with a hands-on overview followed by in-depth analysis of the key steps in the process. It covers the initial setup of a de novo genome sequencing project, quality control and preprocessing of datasets, generation and evaluation of first-pass assemblies, and assembly improvement.
Practical exercises will be performed on small-scale real and simulated datasets. We will present best practices and provide tips based on the experience of EI's faculty.
The course will consist of a mixture of conceptual lectures, methodological lectures and hands-on sessions, as well as group activities and discussions. Participants will gain first-hand experience and understanding of NGS assembly by working with the assistance of the faculty, troubleshooting small problems, and reviewing the results.
This year, we will be including more in-depth content on working with mixed datasets and assembly graphs.
“The de novo assembly course was a great way to learn about genome assembly! I found the theory about de novo assembly particularly useful, but we were also given helpful advice on how to assess the quality of de novo assemblies, and some practical sessions where we could test popular assemblers for ourselves. The trainers were friendly and knowledgeable, and allowed time for us to ask them about problems we had encountered in our own work.”
“Aside from hands-on experience with tools, the course explored how to think about assembly and understand it at a deeper level; from initial considerations regarding experimental design through to quality assessment. This course made clear the immense information you can glean from your assemblies (that may otherwise remain overlooked) by adopting a kmer-based approach.”
~ De Novo Assembly Training Course attendees, February 2019
What will I learn?
- Understand the strategic setup of a de novo genome sequencing project, combining different types of data in a coherent approach
- Acquire the means to define goals for an assembly project and monitor its progress
- Learn to assess the quality of sequencing datasets effectively
- Learn how input data, algorithms and parameters affect assembly results
- Learn how to progress from a first-pass to a draft sequence release version
- Review existing QC metrics for assembly projects and their significance
- Explore graph-based approaches for assembly
Prerequisites:
Familiarity with Linux is essential so that participants can work through the hands-on sessions and concentrate on the course topics rather than on the basics of the command line.
Participants also need a basic working understanding of Python. To reach the right level, participants should complete the Software Carpentry lesson titled 'Programming with Python', at least up to step 6.
Useful background reading on the concepts behind Next Generation Sequencing technologies can be found here.
Essential pre-reading for this course provides an introduction to k-mers. Familiarity with the concepts and language will ensure we can cover greater depth in this course.
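As a rough illustration of the k-mer concept mentioned above, the following minimal Python sketch counts every overlapping substring of length k in a short sequence. It is not taken from the course materials: the function name `count_kmers` and the example sequence are assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k (k-mer) in a sequence.

    Hypothetical helper for illustration; not part of the course materials.
    """
    sequence = sequence.upper()
    # A sequence of length n contains n - k + 1 overlapping k-mers.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Made-up example sequence, chosen so that some 3-mers repeat.
counts = count_kmers("ATGATGA", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

Tools designed for real sequencing data use far more memory-efficient data structures, but the idea being counted is the same.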
Target audience:
This course is aimed at post-doctoral researchers and advanced PhD students who are already involved in, or embarking on, de novo sequencing projects.