Problem 3a
Textbook Question
When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. Why were there so many more contigs than scaffolds?

Understand the terms: A 'contig' is a continuous sequence of DNA assembled from overlapping reads, while a 'scaffold' is a larger structure formed by linking contigs together using additional information, such as paired-end reads or known physical distances.
Recognize that contigs are the initial building blocks of genome assembly. They are created by aligning and merging overlapping DNA sequence reads, but they do not necessarily span the entire genome due to gaps or repetitive sequences.
Scaffolds are formed by connecting contigs using information like paired-end reads, which provide spatial relationships between contigs. This allows scaffolds to span gaps that contigs cannot bridge, resulting in fewer scaffolds than contigs.
Consider the limitations of sequencing technology: Gaps between contigs often arise due to repetitive sequences, low coverage regions, or sequencing errors. These gaps prevent contigs from being merged directly, leading to a higher number of contigs compared to scaffolds.
Reflect on the assembly process: The assembler first builds contigs, then orders and orients them into scaffolds wherever linking information exists. The gaps between linked contigs remain unsequenced, so each scaffold contains several contigs rather than one. Here, 1636 contigs in 134 scaffolds averages about 12 contigs per scaffold.
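The counts in the question can be checked with a little arithmetic. The sketch below is illustrative only: the function name `scaffold_stats` is hypothetical, and it assumes each scaffold is a simple linear chain of contigs, so that a scaffold with k contigs contains k - 1 gaps.

```python
def scaffold_stats(n_contigs, n_scaffolds):
    """Return (mean contigs per scaffold, number of gaps inside scaffolds).

    Assumes every scaffold is a linear chain of contigs, so the number of
    gaps bridged by linking information is contigs minus scaffolds.
    """
    mean_contigs = n_contigs / n_scaffolds
    internal_gaps = n_contigs - n_scaffolds
    return mean_contigs, internal_gaps


# Numbers taken from the Drosophila question above.
mean_contigs, internal_gaps = scaffold_stats(1636, 134)
print(f"about {mean_contigs:.1f} contigs per scaffold")
print(f"{internal_gaps} gaps bridged inside scaffolds")
```

Under that assumption, the assembly still contains roughly 1500 unsequenced gaps, which is why the contig count is so much larger than the scaffold count.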

Key Concepts
Here are the essential concepts you must grasp in order to answer the question correctly.
Contigs and Scaffolds
Contigs are contiguous sequences of DNA assembled from overlapping sequence reads, each representing a continuous, gap-free stretch of the genome. Scaffolds are larger structures that consist of multiple contigs placed in order and orientation, with gaps of unknown sequence between them. Because each scaffold groups several contigs, an assembly always has at least as many contigs as scaffolds, and usually many more.
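One way to picture the relationship is to treat contigs as nodes and paired-end links as edges, and to count each connected group as one scaffold. The sketch below is a deliberate simplification, not how any particular assembler works: real scaffolders also order and orient contigs and estimate gap sizes. The function name `count_scaffolds` and the toy link list are hypothetical.

```python
def count_scaffolds(n_contigs, links):
    """Count groups of contigs joined by links, using union-find.

    n_contigs: number of contigs, labelled 0 .. n_contigs - 1
    links: iterable of (contig_a, contig_b) pairs, e.g. from paired-end reads
    """
    parent = list(range(n_contigs))

    def find(x):
        # Follow parent pointers to the group representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    return len({find(i) for i in range(n_contigs)})


# Six contigs: 0-1-2 are linked, 3-4 are linked, 5 has no links.
print(count_scaffolds(6, [(0, 1), (1, 2), (3, 4)]))  # prints 3
```

Every link reduces the number of groups by at most one, so the scaffold count can never exceed the contig count.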
Genome Assembly
Genome assembly is the process of reconstructing the complete DNA sequence of an organism's genome from short DNA fragments obtained through sequencing. This process involves aligning and merging overlapping sequences to form longer contiguous sequences (contigs) and then organizing these into scaffolds. The complexity of the genome and the quality of the sequencing data can significantly affect the number of contigs and scaffolds produced.
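The "align and merge overlapping sequences" step can be illustrated with a toy greedy assembler. This is a minimal sketch under strong assumptions: error-free reads, a single strand, and a hypothetical minimum overlap of 3 bases. The names `overlap` and `greedy_assemble` and the example reads are invented for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three reads overlap one another; the fourth overlaps nothing.
reads = ["ATGCGT", "GCGTAC", "GTACGA", "TTTTCC"]
print(sorted(greedy_assemble(reads)))  # prints ['ATGCGTACGA', 'TTTTCC']
```

The read with no overlap stays as its own contig, which mirrors how low-coverage regions leave an assembly split into many contigs.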
Sequencing Technology Limitations
The limitations of sequencing technologies can lead to the generation of numerous short reads that may not overlap perfectly, resulting in many contigs. Factors such as repetitive regions in the genome, sequencing errors, and the inherent difficulty in assembling complex genomic regions contribute to the formation of more contigs than scaffolds. These challenges necessitate advanced computational methods to accurately assemble the genome.
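To see why repeats break contigs, consider a small k-mer graph in which each k-mer connects its first k - 1 bases to its last k - 1 bases. A repeated sequence followed by different bases in different places produces a node with more than one possible successor, and an assembler cannot tell which path to follow from overlaps alone. The sketch below uses a made-up 13-base sequence with the repeat `CCG`; the function name `branching_nodes` and the choice of k = 4 are assumptions for illustration.

```python
from collections import defaultdict


def branching_nodes(reads, k):
    """Return the (k-1)-mers that are followed by more than one distinct (k-1)-mer."""
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            successors[kmer[:-1]].add(kmer[1:])
    return {node for node, nxt in successors.items() if len(nxt) > 1}


# Toy "genome" in which CCG occurs twice, followed by T once and by A once.
genome = "ATCCGTAGCCGAT"
# Error-free reads of length 6 starting at every position.
reads = [genome[i:i + 6] for i in range(len(genome) - 5)]
print(branching_nodes(reads, 4))  # prints {'CCG'}
```

At the branching node the contig has to stop, so each repeat of this kind tends to split the assembly into additional contigs that only linking information can reconnect.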