Inside Earlham Institute: purchasing snazzy kit for the Genomics Pipelines lab
Harbans Marway explains why quality control is key in Genomics Pipelines, and how the right equipment makes life easier in a next generation genome sequencing lab.
It’s an exciting time for Genomics Pipelines. It might not sound it at first, but bear with me and you’ll see why the purchasing of an instrument matters so much (to me at least).
What is the instrument in question? It’s called the FEMTO Pulse, and it will help us resolve the sizes of the DNA fragments in our lab, an important quality control step in the genome sequencing pipeline.
How exactly? Let me start by explaining a bit about what goes on in the early stages of a next generation genome sequencing laboratory, and why this machine is so important.
One of the first, and most important, steps in the DNA sequencing process is to identify fragments of DNA that are the correct size for the sequencing experiment we want to carry out, and the way to do this is gel electrophoresis.
This is a DNA separation technique that has been known to the scientific community since the 1960s (and is pretty standard practice in many laboratories today). It involves applying an electric field across a gel matrix, which causes charged strands of DNA to move through the gel.
Charged DNA, you ask?
DNA itself is charged; the phosphate groups of its deoxyribose-phosphate backbone carry a negative charge. When an electric field is applied, the DNA migrates from the negative electrode towards the positive one and, crucially for what we need to see, the smaller fragments of DNA travel faster than the larger ones, which allows us to separate the DNA based on size.
The separation works because the gel matrix is made up of tiny pores that the DNA has to thread its way through; longer fragments get snagged in these pores more often, so they generally move through the gel more slowly than shorter ones.
You’ll hear wet-lab scientists refer to ‘running a gel’, which can be time-consuming and looks a bit like this:
You first have to make the gel, usually with agarose – a type of sugar which is extracted from seaweed. Normally, you heat the agarose up with some buffer in a microwave, then pour it into a tray and wait for it to set into a gel before finally connecting it up to the power supply to ‘run the gel’ and separate your DNA.
Once the run has finished, usually after 40 minutes or so, you will often see a ‘smear’ of DNA on the gel as it spreads out based on size.
To know how large (or small) your DNA is, you can buy ‘ladders’ from plenty of companies to compare your DNA against. The ladders are just DNA fragments of known size, which you run alongside your own DNA to give you an idea of how big or small your own DNA fragments are.
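Incidentally, the maths behind reading a ladder is simple enough to sketch. Within a gel’s resolving range, the distance a band travels is roughly linear in the logarithm of its fragment size, so you can estimate an unknown band from a straight-line fit through the ladder. Here is a minimal Python sketch of that idea; the ladder sizes and migration distances below are invented for illustration, and in practice your analysis software does this for you.

```python
import numpy as np

# Hypothetical ladder: fragment sizes (bp) and measured migration
# distances (mm). Real values depend on your gel, buffer and run time.
ladder_sizes = np.array([10000, 5000, 2000, 1000, 500, 250])
ladder_distances = np.array([12.0, 18.5, 27.0, 33.5, 40.0, 46.5])

# Within the resolving range, migration distance is roughly linear in
# log10(fragment size), so fit a straight line through the ladder...
slope, intercept = np.polyfit(ladder_distances, np.log10(ladder_sizes), 1)

def estimate_size(distance_mm):
    """Estimate a fragment's size (bp) from how far its band travelled."""
    return 10 ** (slope * distance_mm + intercept)

# ...then read off the size of an unknown band from its distance.
print(f"Band at 30 mm is roughly {estimate_size(30.0):.0f} bp")
```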
Now, we need to introduce a curveball.
I said that smaller DNA fragments travel faster through the gel than larger fragments. This is almost true. Large DNA fragments do indeed travel more slowly than small ones; however, above a certain size, all the larger fragments travel at roughly the same speed.
This isn’t a concern if you only want to see the small fragments, but if you’re interested in the larger ones (and we are, we’ll come back to this later), then this is no good.
This is where Pulsed-Field Gel Electrophoresis (PFGE) comes in. Rather than applying a constant, linear electric field, PFGE periodically switches the voltage between directions at an angle (usually 60 degrees) to the central flow. This still results in a net forward migration; what is important to note, however, is that the smaller (of the large) pieces of DNA react faster to each change in field direction.
This ‘breaks up’ the band of large DNA and allows you to resolve the individual sizes with much greater clarity.
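If you like to see the behaviour in numbers, here is a deliberately crude sketch. Both formulas below are toy models with made-up constants, not real electrophoresis physics; they exist only to show the qualitative picture of mobilities converging for big fragments under a constant field, then separating again once the field pulses.

```python
# Toy model: in a constant field, mobility falls off with length but
# flattens onto a shared floor (mu_min), so very large fragments all
# travel at about the same speed. All numbers are invented.
def constant_field_mobility(length_kb, mu_min=1.0, scale=200.0):
    return mu_min + scale / length_kb

# Toy pulsed-field model: each field switch costs a reorientation time
# that grows with length, so bigger fragments spend less of each pulse
# actually moving, restoring separation among the giants.
def pulsed_field_mobility(length_kb, pulse_s=60.0, reorient_s_per_kb=0.05):
    moving_fraction = max(0.0, 1.0 - reorient_s_per_kb * length_kb / pulse_s)
    return constant_field_mobility(length_kb) * moving_fraction

for length in [10, 50, 500, 1000]:  # fragment lengths in kb
    print(f"{length:>5} kb: constant {constant_field_mobility(length):6.2f}, "
          f"pulsed {pulsed_field_mobility(length):6.2f}")
```

Run it and you’ll see the 500 kb and 1,000 kb fragments are nearly indistinguishable in the constant-field column, but pull well apart in the pulsed one.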
PFGE can be a pain though. It uses up a lot of material, and for what is often a quality-control step in a process, this is unacceptable (due to cost, for starters). Also, it can take ages – the last one I ran took 16 hours!
Recently, though, many companies have been manufacturing easier alternatives to traditional gels. (It is worth noting that I don’t mean alternatives to PFGE here, just to ‘normal’ gels for small DNA fragments.)
These solutions often involve loading small amounts of material onto a chip and can be safer (traditional gels use potentially hazardous chemicals that bind to DNA so we can visualise it under UV light), quicker, and generally come with software to interrogate the results. All in all – an improvement!
However, we had yet to see a solution like this for PFGE - until now.
This is where the FEMTO Pulse comes in. In theory it ought to allow us to see large fragments of DNA using only a tiny amount of input (hence ‘femto’, which is three orders of magnitude smaller than pico)!
Additionally, we’re a high-throughput lab processing many samples a day and this platform should allow us to process 11 samples an hour – a step up from cumbersome PFGE.
At EI, we have some incredible long-read sequencing technologies at our disposal, including platforms from PacBio and Oxford Nanopore. These are important because genomes can be tricky to read.
Let’s imagine a genome is a book.
Our short-read sequencers can read one sentence at a time with very few spelling errors. However, the long-read sequencers can read a whole paragraph or two, perhaps even a page – admittedly with quite a few more spelling errors. This can be useful when you want to assemble the genome in its entirety.
To stretch this analogy further: imagine you don’t know the book and you want to stitch these sentences and paragraphs together into something legible. It helps a great deal if you can lay out the paragraphs as a scaffold and map the well-spelt sentences on top of them.
Equally, some genomes have repetitive sequences, which means that many of the sentences read on the short-read sequencers, accurate as they may be, get lost in translation.
Without the ‘scaffold’ of the paragraphs and pages to give some context, the computers that do the maths and piece the sentences together sometimes get rid of them entirely, mistake them for similar sentences, or simply don’t know where to put them.
You can see why having longer reads can help with knowing where in the book you are.
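As a tiny, concrete illustration of the repeat problem, here is a Python sketch of placing an accurate ‘sentence’ (short read) onto a ‘paragraph’ (long read). The sequences are invented and real assemblers use far more sophisticated, error-tolerant alignment, but the ambiguity is the point:

```python
# Toy version of the book analogy: an error-free "sentence" (short read)
# can appear in several places, and only the surrounding "paragraph"
# (long read) tells you which copy of the repeat you are looking at.
long_read = "ATTGCACGTACGTTAGGCCACGTACGTTAACC"  # noisy in real life
short_read = "ACGTACGTT"                        # accurate, but repetitive

# Find every exact placement of the short read on the long read.
positions = []
start = long_read.find(short_read)
while start != -1:
    positions.append(start)
    start = long_read.find(short_read, start + 1)

print(f"Short read maps at positions {positions}")
# Two hits: without the long read's context, an assembler could not
# tell these repeat copies apart.
```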
Plant genomes are large, often have repetitive sequences and sometimes carry more than two copies of each chromosome (bread wheat, for example, is hexaploid, with six sets). Imagine trying to stick War and Peace together from fragments of sentences and paragraphs – quite the undertaking.
We recently helped publish the wheat genome, which I assure you is a big deal in the scientific world. Reading the abstract alone should give you a good understanding of the difficulties faced trying to read these large and complex genomes!
The most complete version of the wheat genome relied heavily on combining accurate short-read sequencing with the more expansive scaffolding afforded by the long-read platforms (as well as some great quality control using the K-mer analysis tool, KAT).
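For the curious, the idea at the heart of k-mer analysis is counting every overlapping substring of length k. The Python sketch below shows only that basic counting step on a made-up sequence; it is not KAT itself, which does this at genome scale with far more sophistication.

```python
from collections import Counter

def kmer_spectrum(sequence, k=4):
    """Count every overlapping k-length substring in a sequence."""
    kmers = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return Counter(kmers)

# Invented sequence with a built-in repeat to make the counts interesting.
reads = "ACGTACGTACGTTTGACGTA"
for kmer, count in kmer_spectrum(reads).most_common(3):
    print(f"{kmer}: seen {count} times")
```

K-mers that turn up suspiciously often are one hint of repetitive content, exactly the sort of thing you want flagged before attempting an assembly.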
We have a lovely article about just what is contained in a genome here.
So where does all this DNA come from in the first place? Extraction. There are plenty of good kits out there that make extraction easy; however, you don’t usually yield a lot of material from your sample, and what you do get isn’t always made up of the largest fragments. Many scientists have therefore moved back to the old-school techniques.
These techniques can yield plenty of large DNA; however, they require optimising for each tissue and can involve strong chemicals, which often get carried through the extraction. This can be a problem in the lab, as some of those chemicals can interfere with our processes and damage the DNA over time.
If you’re a customer of ours, or would like to find out more, feel free to pop us an email at projects@earlham.ac.uk and we’ll be more than happy to discuss your needs.
We need good quality, large DNA fragments to get our long reads; much like cooking or whisky-distilling, what you put in is what you get out.
Knowing what you’re working with is essential to good lab practice. Our quality control for this sort of material is three-fold: we measure the concentration, check the sample is clean and finally assess the size of the DNA.
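The first two of those checks boil down to simple spectrophotometer arithmetic, sketched here in Python with invented readings. The rule-of-thumb conversion (an A260 of 1.0 is roughly 50 ng/µL for double-stranded DNA) and the target purity ratios are standard lab lore; a real instrument reports all of this for you.

```python
# Back-of-the-envelope spectrophotometer maths for the first two checks.
# Absorbance readings and dilution factor below are hypothetical.
A260, A280, A230 = 0.75, 0.40, 0.35
dilution_factor = 10

# For double-stranded DNA, an A260 of 1.0 corresponds to ~50 ng/uL.
concentration = A260 * 50 * dilution_factor
print(f"Concentration: {concentration:.0f} ng/uL")

# Purity ratios: an A260/A280 of ~1.8 suggests clean DNA; markedly lower
# hints at protein carry-over. An A260/A230 well below ~2.0 can indicate
# leftover salts or phenol from old-school extractions.
print(f"A260/A280 = {A260 / A280:.2f}")
print(f"A260/A230 = {A260 / A230:.2f}")
```

The third check, sizing, is exactly where gels, and soon the FEMTO Pulse, come in.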
Hopefully, once we’ve rolled out the FEMTO Pulse into our production pipelines, we’ll see the benefits: it can resolve larger fragment sizes and uses less of our precious material, reducing costs and increasing quality.
Good material in = good data out, and we’re in the business of producing good data.