Adapters are an essential part of library preparation. Similar to NGS technologies, they are constantly evolving. Feeling overwhelmed with the mind-boggling variety of adapters available on the market? Worry not! Let’s walk through everything you need to know about adapters.
Adapters are nucleotide sequences added to both ends of DNA fragments during library preparation to enable compatibility with the sequencer. The figure below shows a complete dual-indexed library, with the adapters denoted in the blue boxes.
Figure 1. Library Structure (source: Illumina).
Adapters can be divided into several categories:
① By index position: single-indexed vs. dual-indexed adapters
Single-indexed adapters: Contain an index sequence at the P7 end only.
Figure 2. Schematic of a Single-indexed Adapter
Dual-indexed adapters: Contain index sequences at both the P5 and P7 ends.
Figure 3. Schematic of a Dual-indexed Adapter
Typical index sequences are 6 nt or 8 nt long. For 8-nt indexes, there are, in theory, 4^8 (65,536) single-index and 4^16 (about 4.3 billion) dual-index combinations. In practice, because index combinations must maintain base and color balance during sequencing, the number of usable indexes is far smaller. Nevertheless, there are far more dual-indexed than single-indexed adapters available. You can choose between them based on the number of samples you need to pool.
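The arithmetic above, and the idea of checking a pool of indexes for color balance, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the balance rule assumes four-channel chemistry in which A and C are read in one channel and G and T in the other. Two-channel instruments follow different rules, so check the guidelines for your sequencer.

```python
from collections import Counter

INDEX_LENGTH = 8

# Theoretical number of distinct indexes: 4 possible bases per position.
single_index_combinations = 4 ** INDEX_LENGTH        # 65,536
dual_index_combinations = 4 ** (2 * INDEX_LENGTH)    # 4,294,967,296


def position_base_counts(indexes):
    """Count the bases seen at each index cycle across a pool of indexes."""
    return [Counter(column) for column in zip(*indexes)]


def is_color_balanced(indexes):
    """Return True if every cycle has signal in both channels.

    Assumption (four-channel chemistry): A/C share one channel, G/T the other.
    """
    for counts in position_base_counts(indexes):
        a_or_c = counts["A"] + counts["C"]
        g_or_t = counts["G"] + counts["T"]
        if a_or_c == 0 or g_or_t == 0:
            return False
    return True


# Made-up index pools, for illustration only.
balanced_pool = ["ACGTACGT", "GTACGTAC"]
unbalanced_pool = ["ACACACAC", "CACACACA"]
```

Here `is_color_balanced(balanced_pool)` is true, while `unbalanced_pool` fails because every cycle contains only A or C. Constraints of this kind are one reason the usable index set is much smaller than the theoretical count.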
② By compatibility with PCR-free library preparation: full vs. partial adapters
Figure 4. Amplification Methods for Full vs. Partial Adapters
Full adapters can be added to the ends of DNA fragments by TA ligation. With sufficient library yields, these libraries can be used directly for sequencing without PCR amplification (PCR-free).
After partial adapters are TA-ligated to the ends of DNA fragments, they must be extended into full-length adapters by PCR before the libraries are ready for sequencing. This amplification must use the index-containing primers provided in the adapter kit, which are complementary to the short adapter sequences, rather than universal primers that do not contain index sequences.
③ UDI & UMI
To mitigate index hopping, which is more prominent on instruments with patterned flow cells (e.g., HiSeq 3000/4000/X and NovaSeq), Illumina recommends preparing dual-indexed libraries with unique indexes. With unique dual indexes (UDIs), each sample is assigned a P5 index and a P7 index that are not shared with any other sample in the pool. Both indexes can then be cross-checked during demultiplexing, so a read carrying an unexpected index pair is discarded rather than assigned to the wrong sample.
Figure 5. UDI Resolves Index Hopping and Misassignment (Source: Burning Rock)
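The cross-checking of both indexes can be illustrated with a minimal demultiplexing sketch. The index sequences, sample names, and function names below are made up for illustration, and the sketch uses exact matching only, whereas real demultiplexing software typically tolerates a limited number of mismatches.

```python
# Hypothetical sample sheet: each (i7, i5) pair is unique, and no index
# is reused by another sample (the defining property of UDIs).
SAMPLE_SHEET = {
    ("AACCGGTT", "TTGGCCAA"): "sample_A",
    ("GGTTAACC", "CCAATTGG"): "sample_B",
}


def assign_sample(i7, i5, sheet=SAMPLE_SHEET):
    """Return the sample for an index pair, or None if the pair is unexpected."""
    return sheet.get((i7, i5))


def demultiplex(reads, sheet=SAMPLE_SHEET):
    """Split (read_id, i7, i5) tuples into per-sample lists and unassigned reads."""
    assigned = {name: [] for name in sheet.values()}
    unassigned = []
    for read_id, i7, i5 in reads:
        sample = assign_sample(i7, i5, sheet)
        if sample is None:
            unassigned.append(read_id)   # e.g. an index-hopped read
        else:
            assigned[sample].append(read_id)
    return assigned, unassigned


reads = [
    ("read1", "AACCGGTT", "TTGGCCAA"),  # valid pair for sample_A
    ("read2", "GGTTAACC", "CCAATTGG"),  # valid pair for sample_B
    ("read3", "AACCGGTT", "CCAATTGG"),  # i7 of A with i5 of B: hopped
]
assigned, unassigned = demultiplex(reads)
```

In this example `read3` ends up in `unassigned`. With combinatorial (non-unique) dual indexes, the same hopped pair could coincide with a legitimate sample and be misassigned without any warning.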
Unique molecular identifiers (UMIs) are short sequences that tag the original DNA fragments before amplification. They are used, for example, to identify low-frequency variants in cfDNA while avoiding false positives. Because every copy of a given original fragment carries the same UMI, the sequencing results can be checked for consistency across those copies, which eliminates false positives and improves low-frequency variant calling.
Figure 6. UMI Facilitates Detection of Low-Frequency Variants
As shown in Figure 6-(1), some base changes are introduced during library preparation or sequencing (denoted in red) and would otherwise appear as false positives. UMI correction removes these two false-positive calls. In Figure 6-(2), a true low-frequency mutation (rightmost T) and false positives (bases in red) coexist. UMI correction retains the true mutation (rightmost T) while removing the false positives (bases in red), thereby improving the accuracy of low-frequency mutation detection. In Figure 6-(3), which does not include UMIs, the true mutation (rightmost T) and the false positives (bases in red) cannot be distinguished. Many candidate variants are identified, but they do not reflect the true characteristics of the sample.
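The logic of UMI correction described above can be sketched as a per-family majority vote. This is a simplified illustration, not the method of any particular tool: the function name, the thresholds, and the toy reads are assumptions, and the sketch presumes that reads sharing a UMI have the same length and cover the same position. Real pipelines usually group reads by UMI together with mapping coordinates.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2, min_agreement=0.6):
    """Collapse (umi, sequence) reads into one consensus sequence per UMI.

    Assumptions: all reads in a UMI family are equal-length and aligned.
    Families smaller than min_family_size are dropped; positions where the
    most common base falls below min_agreement are reported as 'N'.
    """
    families = defaultdict(list)
    for umi, sequence in reads:
        families[umi].append(sequence)

    consensus = {}
    for umi, sequences in families.items():
        if len(sequences) < min_family_size:
            continue
        bases = []
        for column in zip(*sequences):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(sequences) >= min_agreement else "N")
        consensus[umi] = "".join(bases)
    return consensus


# Toy data: the reference at this locus is taken to be ACGTAC.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACTTAC"),  # error in one copy only: removed by the vote
    ("CCTA", "ACGTAT"),
    ("CCTA", "ACGTAT"),
    ("CCTA", "ACGTAT"),  # final T present in every copy: retained
]
result = umi_consensus(reads)
```

The error seen in a single copy of family `AAGT` disappears from its consensus, while the final T shared by all copies of family `CCTA` survives, mirroring cases (1) and (2) in Figure 6.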