Overlap-layout-consensus
The Overlap-Layout-Consensus (OLC) method is a fundamental genome assembly technique that proceeds in three steps:
- Overlap: The initial step involves identifying overlaps among all the reads. This is akin to finding common sections among different fragments of a puzzle.
- Layout: Once overlaps are established, the OLC algorithm arranges the reads in the order that best fits their overlaps, typically by building an overlap graph and finding paths through it.
- Consensus: The final step derives a consensus sequence from the multiple sequence alignment (MSA) of the laid-out reads. The consensus represents the most likely base at each position of the assembled sequence.
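The three steps above can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not how production assemblers work: overlaps are exact suffix-prefix matches on the forward strand only, the layout is a simple greedy chain, and the consensus is a gapless column-wise majority vote rather than a true MSA. All function names (`overlap_length`, `find_overlaps`, `greedy_layout`, `consensus`, `assemble`) and the `min_len` parameter are hypothetical and chosen for this example.

```python
from collections import Counter
from itertools import permutations


def overlap_length(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def find_overlaps(reads, min_len=3):
    """Overlap step: score every ordered pair of distinct reads."""
    overlaps = {}
    for i, j in permutations(range(len(reads)), 2):
        k = overlap_length(reads[i], reads[j], min_len)
        if k:
            overlaps[(i, j)] = k
    return overlaps


def greedy_layout(reads, overlaps):
    """Layout step: chain reads along the largest overlaps first.

    Returns one layout per contig; each layout is a list of
    (read_index, offset) pairs, offset being the read's start position.
    """
    successor, has_pred = {}, set()
    for (i, j), k in sorted(overlaps.items(), key=lambda e: -e[1]):
        # Each read gets at most one successor and one predecessor.
        if i in successor or j in has_pred:
            continue
        # Skip the edge if it would close a cycle (chain from j ends at i).
        end = j
        while end in successor:
            end = successor[end][0]
        if end == i:
            continue
        successor[i] = (j, k)
        has_pred.add(j)

    contigs = []
    for start in range(len(reads)):
        if start in has_pred:
            continue  # not the first read of a chain
        layout, node, offset = [], start, 0
        while True:
            layout.append((node, offset))
            if node not in successor:
                break
            nxt, k = successor[node]
            offset += len(reads[node]) - k
            node = nxt
        contigs.append(layout)
    return contigs


def consensus(reads, layout):
    """Consensus step: majority vote over each column of the pileup."""
    columns = {}
    for idx, offset in layout:
        for pos, base in enumerate(reads[idx], start=offset):
            columns.setdefault(pos, Counter())[base] += 1
    return "".join(columns[p].most_common(1)[0][0] for p in sorted(columns))


def assemble(reads, min_len=3):
    """Run overlap, layout and consensus; return one sequence per contig."""
    overlaps = find_overlaps(reads, min_len)
    return [consensus(reads, layout) for layout in greedy_layout(reads, overlaps)]


# Three overlapping reads, deliberately given out of order.
reads = ["GCAATCCGA", "ATGGCGTG", "GCGTGCAAT"]
print(assemble(reads))
```

The all-against-all comparison in `find_overlaps` grows quadratically with the number of reads, so it is only practical for tiny inputs like this one.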
Note that the OLC approach is better suited to low-coverage long reads, whereas other methods such as the de Bruijn graph (DBG) approach are better suited to high-coverage short reads, especially for large genome assembly. The choice of method depends on the specific requirements and constraints of the genome assembly project.
- Jung, H., Ventura, T., Chung, J. S., Kim, W. J., Nam, B. H., Kong, H. J., ... & Eyun, S. I. (2020). Twelve quick steps for genome assembly and annotation in the classroom. PLoS computational biology, 16(11), e1008325. doi: 10.1371/journal.pcbi.1008325