February 17, 2022
With all of the recent advances in genomic data analysis, it is worth examining where the raw sequence data actually comes from. Researchers have a few choices of sequencing platforms, from traditional, low-throughput Sanger sequencing to innovative next-generation sequencing technologies. Platform choice will depend on the scale of the project, the cost of sequencing, and the ultimate research question being answered by downstream analysis.
The Major Players
The dominant company in the next-generation sequencing space is Illumina. Illumina sequencing is primarily a short-read, sequencing-by-synthesis platform, and has become incredibly popular for high-throughput sequencing.
Newer to the space, and sometimes called “third-generation” sequencing, is nanopore sequencing, with Oxford Nanopore Technologies leading the charge. Nanopore sequencing is a long-read, direct sequencing method that relies on completely different principles from traditional sequencing technologies.
Principles and Accuracy of Sequencing Platforms
Illumina sequencing-by-synthesis uses a proprietary platform to amplify fragments of the genome being sequenced. It then uses fluorescently tagged bases to read which base is added at each step as a complementary strand is synthesized. Sequencing-by-synthesis is a well-established technology, and Illumina’s adjustments have produced a high-throughput version of it with over 99% accuracy and a capacity of over 1 million Gigabases per year on the NovaSeq 6000.
Nanopore sequencing, in contrast, is a much newer technology. A strand of DNA is passed through a protein nanopore while an electric current flows through the pore. Each of the four bases in DNA causes a characteristic disruption in that current, which can be measured and translated into the corresponding base. Nanopore sequencing is beginning to be widely used, but its accuracy ranges from 87% to 98%, which makes it challenging to use for tasks such as identifying rare SNPs.
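To put those accuracy figures in perspective, the short Python sketch below converts a per-base accuracy into the standard Phred quality score and into an expected number of errors per 1,000 bases. The accuracy values are the ones quoted above. The function names are illustrative, and the calculation assumes errors are spread evenly across bases, which real reads do not strictly obey.

```python
import math


def phred_quality(accuracy: float) -> float:
    """Phred score Q = -10 * log10(p_error), where p_error = 1 - accuracy."""
    p_error = 1.0 - accuracy
    if not 0.0 < p_error < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    return -10.0 * math.log10(p_error)


def errors_per_kilobase(accuracy: float) -> float:
    """Expected number of miscalled bases per 1,000 sequenced bases.

    Assumes a uniform per-base error rate (a simplification).
    """
    return (1.0 - accuracy) * 1000.0


if __name__ == "__main__":
    # Accuracy figures quoted in the article
    platforms = [
        ("Illumina (99%)", 0.99),
        ("Nanopore, upper end (98%)", 0.98),
        ("Nanopore, lower end (87%)", 0.87),
    ]
    for label, accuracy in platforms:
        print(
            f"{label}: Q{phred_quality(accuracy):.1f}, "
            f"~{errors_per_kilobase(accuracy):.0f} errors per kb"
        )
```

At 99% accuracy this works out to Q20, or roughly 10 errors per kilobase, while 87% accuracy means roughly 130 errors per kilobase. That gap is why a rare single-base variant is hard to tell apart from a sequencing error at the lower end of the range.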
Cost is also an important consideration in the choice of sequencing platform, and pricing depends on the specific sequencer being used. Sequencing on the Illumina NextSeq 550 platform costs $40-63 per Gigabase (Gb), while the NovaSeq 6000 platform is cheaper at $10-35 per Gb.
Oxford Nanopore has higher average prices as things currently stand: the MinION and GridION platforms range from $50-2000 per Gb, considerably more than Illumina sequencing. The larger-scale platform offered by Oxford Nanopore, PromethION, costs $21-42 per Gb, which is comparable to Illumina.
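As a rough illustration of what these per-Gb prices mean for a project, the sketch below multiplies genome size by sequencing depth to get the total Gb required, then applies each price range quoted above. The 3.1 Gb genome size and 30x coverage are assumptions chosen for the example, not figures from this article. The estimate covers sequencing output only and ignores library preparation, labor, and analysis.

```python
# Price ranges in USD per Gigabase, as quoted in the article
PRICE_PER_GB = {
    "Illumina NextSeq 550": (40, 63),
    "Illumina NovaSeq 6000": (10, 35),
    "Oxford Nanopore MinION/GridION": (50, 2000),
    "Oxford Nanopore PromethION": (21, 42),
}


def sequencing_cost_range(genome_size_gb, coverage, price_range):
    """Return (low, high) sequencing cost in USD.

    Total output needed = genome size (Gb) x coverage depth.
    """
    low_price, high_price = price_range
    total_gb = genome_size_gb * coverage
    return total_gb * low_price, total_gb * high_price


if __name__ == "__main__":
    GENOME_SIZE_GB = 3.1  # assumption: approximate human genome size
    COVERAGE = 30         # assumption: a commonly used whole-genome depth
    for platform, prices in PRICE_PER_GB.items():
        low, high = sequencing_cost_range(GENOME_SIZE_GB, COVERAGE, prices)
        print(f"{platform}: ${low:,.0f} to ${high:,.0f}")
```

Changing `GENOME_SIZE_GB` and `COVERAGE` to match a specific organism and study design gives a quick first-pass comparison before requesting real quotes.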
Long-Read vs Short-Read Technology
Though the sequencing principles and costs vary between these two sequencing approaches, the main difference between Illumina and Oxford Nanopore is in the raw sequence data produced by these processes.
Illumina sequencing primarily sequences small fragments of DNA, producing reads of 50-300 base pairs (bp) that are then assembled into a whole genome sequence using bioinformatics pipelines and reference genomes. This is called short-read technology, and it has been incredibly useful in genomics thus far. However, assembling these short reads correctly is very time- and labor-intensive. Assembly becomes even more challenging and less accurate when the organism lacks a high-quality reference genome or when its genome contains many repeat sequences or rare variants.
Nanopore sequencing is a long-read technology, often producing reads of 10,000-30,000 base pairs in length, with a record single read length of 2,300,000 base pairs. The advantages of longer reads are easier genome assembly, more accurate identification of rare variants, and clearer resolution of repeating sequences.
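One way to see why read length matters for assembly is to count how many reads must be pieced together. The sketch below uses the standard coverage relationship (coverage = number of reads x read length / genome size) to estimate read counts at the short-read and long-read lengths mentioned above. The 3.1 billion bp genome size and 30x coverage are assumptions for illustration, and the calculation treats every read as having the same length.

```python
import math


def reads_needed(genome_size_bp: int, coverage: float, read_length_bp: int) -> int:
    """Number of reads required to reach a target average coverage.

    Rearranges coverage = (reads * read_length) / genome_size.
    """
    if read_length_bp <= 0:
        raise ValueError("read_length_bp must be positive")
    return math.ceil(genome_size_bp * coverage / read_length_bp)


if __name__ == "__main__":
    GENOME_SIZE_BP = 3_100_000_000  # assumption: approximate human genome size
    COVERAGE = 30                   # assumption: target average depth
    # Read lengths taken from the ranges quoted in the article
    for label, length in [
        ("short read, 150 bp", 150),
        ("short read, 300 bp", 300),
        ("long read, 10,000 bp", 10_000),
        ("long read, 30,000 bp", 30_000),
    ]:
        count = reads_needed(GENOME_SIZE_BP, COVERAGE, length)
        print(f"{label}: {count:,} reads")
```

The same depth of coverage takes hundreds of millions of 150 bp reads but only millions of 10,000 bp reads, and each long read can span repeats that a short read cannot, which leaves fewer ambiguous joins for the assembler to resolve.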
It is worth noting that Illumina is breaking into the long-read space with its new Infinity assay. Infinity uses existing Illumina sequencing with different sample preparation steps to produce reads up to 10,000 base pairs in length.
Outsourcing Downstream Bioinformatic Analysis
The raw genomic data produced by these sequencing platforms has enormous potential to provide us with biological and health-related insights but requires significant downstream processing and analysis to extract this valuable information.
Working with service providers like Bridge Informatics is a great option. We support your data storage, analysis, and pipeline development needs to eliminate common challenges associated with these downstream analysis tasks. Book a free discovery call with us if you’re interested in outsourcing your bioinformatic needs to Bridge Informatics.
Jane Cook, Journalist & Content Writer, Bridge Informatics
Jane is a Content Writer at Bridge Informatics, a professional services firm that helps biotech customers implement advanced techniques in the management and analysis of genomic data. Bridge Informatics focuses on data mining, machine learning, and various bioinformatic techniques to discover biomarkers and companion diagnostics. If you’re interested in reaching out, please email [email protected] or [email protected].