15. Genomes and Genomics
Sequencing the Genome
Problem 3b
Textbook Question
When the whole-genome shotgun sequence of the Drosophila genome was assembled, it comprised 134 scaffolds made up of 1636 contigs. What is the difference between physical and sequence gaps?

Understand the concept of a 'scaffold' in genome assembly: A scaffold is a set of contigs (contiguous stretches of sequence assembled from overlapping reads) that have been ordered and oriented relative to one another using linking information, such as paired-end reads from the two ends of the same clone.
Define a 'physical gap': A physical gap is a gap between scaffolds. No sequenced clone spans it, so the missing DNA is absent from the available clones (for example, because the region could not be cloned), and the assembly alone cannot show how the flanking scaffolds connect.
Define a 'sequence gap': A sequence gap is a gap between two adjacent contigs within the same scaffold. Paired-end reads show that the two contigs are neighbors and that a clone spans the gap, so the DNA is known to exist but its sequence has not been determined, often because the region is repetitive or poorly covered. These gaps are often represented by runs of 'N' in the assembled sequence.
Compare physical and sequence gaps: In a sequence gap the spanning clone is already in hand, so the gap can usually be closed by sequencing that clone further. In a physical gap no spanning clone exists, so closing it requires new material or independent information, such as a different clone library, PCR across the gap, or mapping data that orders the scaffolds.
Relate the concepts to the Drosophila genome assembly: The 1636 contigs were grouped into 134 scaffolds, so both kinds of gap were present. The breaks between contigs inside each scaffold are sequence gaps, and the breaks between the 134 scaffolds are physical gaps. A contig itself contains no gaps, because it is continuous sequence by definition.
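The gap counts implied by these numbers can be worked out directly. The sketch below is illustrative only: the function name `count_gaps` and the `molecules` parameter are hypothetical. It assumes that a scaffold containing n contigs has n - 1 sequence gaps, and that the scaffolds tile a given number of linear DNA molecules end to end, so the physical-gap figure for a single molecule is an upper bound for a genome that really has several chromosomes.

```python
def count_gaps(contigs, scaffolds, molecules=1):
    """Return (sequence_gaps, physical_gaps) for an assembly.

    Assumptions (not from the textbook problem):
    - each scaffold with n contigs contains n - 1 sequence gaps;
    - the scaffolds tile `molecules` linear chromosomes end to end,
      so each molecule with k scaffolds contains k - 1 physical gaps.
    """
    if not (1 <= molecules <= scaffolds <= contigs):
        raise ValueError("expected 1 <= molecules <= scaffolds <= contigs")
    sequence_gaps = contigs - scaffolds
    physical_gaps = scaffolds - molecules
    return sequence_gaps, physical_gaps


# Drosophila whole-genome shotgun assembly from the question
sequence_gaps, physical_gaps = count_gaps(contigs=1636, scaffolds=134)
print(sequence_gaps)  # 1502
print(physical_gaps)  # 133, if all scaffolds lay on one linear molecule
```

Most of the breaks in this assembly are therefore sequence gaps, which are the easier kind to close.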

Key Concepts
Here are the essential concepts you must grasp in order to answer the question correctly.
Physical Gaps
Physical gaps are breaks between scaffolds in a genome assembly. No sequenced clone spans them, so the corresponding DNA is missing from the available clones and the order of the flanking scaffolds cannot be determined from the sequence data alone. Closing them requires new material or independent mapping information. Understanding physical gaps is crucial for assessing the completeness of a genome assembly.
Sequence Gaps
Sequence gaps are breaks between adjacent contigs within a scaffold. The DNA is known to exist, because a clone with paired-end reads spans the gap, but its sequence has not been determined; such gaps are often indicated by runs of 'N' in the sequence data. They commonly arise in repetitive regions that are difficult to sequence and assemble accurately, or in regions with too few reads. Identifying sequence gaps is important for evaluating the quality and accuracy of the assembled genome.
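As a rough illustration of how runs of 'N' can be located in an assembled sequence, here is a minimal Python sketch. The function name `find_n_runs` and the example sequence are made up for illustration, and treating every run of 'N' as a gap is a simplification.

```python
import re


def find_n_runs(sequence):
    """Return (start, end) positions of each run of N in the sequence.

    Positions are 0-based and the end is exclusive, as in Python slices.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


# Toy sequence, not real data
example = "ACGTNNNNACGNNA"
for start, end in find_n_runs(example):
    print(start, end, end - start)  # start, end, length of the N run
```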
Contigs and Scaffolds
Contigs are contiguous sequences of DNA that have been assembled from overlapping fragments, each representing an unbroken stretch of the genome. Scaffolds are larger structures that consist of multiple contigs placed in a known order and orientation, with sequence gaps between them. Understanding the relationship between contigs and scaffolds is essential for interpreting genome assemblies and the gaps they contain.
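The relationship between a scaffold and its contigs can be sketched in a few lines of Python. This assumes the common convention that gaps inside a scaffold are written as runs of 'N'; the function name `split_scaffold` and the toy sequence are hypothetical.

```python
import re


def split_scaffold(scaffold):
    """Split a scaffold into its contigs at runs of N.

    Empty pieces (from leading or trailing N runs) are dropped.
    """
    return [piece for piece in re.split(r"[Nn]+", scaffold) if piece]


# Toy scaffold, not real data
contigs = split_scaffold("ACGTNNNNGGCCNNTTA")
print(contigs)           # ['ACGT', 'GGCC', 'TTA']
print(len(contigs) - 1)  # 2 gaps between the 3 contigs
```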