PubMed · 2026-04-22
Standard genome sequencing tools have been systematically misreading the gene-duplication clusters that give plants their disease resistance, collapsing multiple distinct genes into one blurry copy. New long-read pangenome technology can finally reconstruct these regions accurately, revealing hidden genetic variation that matters for crop breeding.
Short-read sequencing routinely collapses multi-gene tandem arrays into a single consensus sequence, causing systematic undercounting of disease-resistance and specialized-metabolism genes
Annotation pipelines compound the problem by either fusing distinct array members into one oversized gene model or dropping array members entirely, even when the underlying assembly preserved them
Graph pangenomes built from long-read, haplotype-resolved assemblies restore individual array members with distinct coordinates, enabling accurate copy-number variant genotyping and per-gene expression analysis
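To make the copy-number point concrete, here is a minimal Python sketch of how per-gene coordinates allow array members to be counted. It is an illustration under stated assumptions, not the method used in the study: the `Gene` record, the `tandem_arrays` function, the 50 kb `max_gap` threshold, and the example coordinates are all hypothetical, and the grouping rule (same family, same chromosome, neighbours within `max_gap`) is a simplification that ignores intervening genes from other families.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical annotation record: one gene model with its family label."""
    chrom: str
    start: int
    end: int
    family: str


def tandem_arrays(genes, max_gap=50_000):
    """Group same-family genes on one chromosome into tandem arrays.

    Two consecutive family members join the same array when the gap between
    them is at most max_gap bases (an assumed, illustrative threshold).
    Only groups with two or more members are reported as arrays.
    """
    by_key = defaultdict(list)
    for gene in genes:
        by_key[(gene.chrom, gene.family)].append(gene)

    arrays = []
    for members in by_key.values():
        members.sort(key=lambda g: g.start)
        current = [members[0]]
        for gene in members[1:]:
            if gene.start - current[-1].end <= max_gap:
                current.append(gene)
            else:
                if len(current) > 1:
                    arrays.append(current)
                current = [gene]
        if len(current) > 1:
            arrays.append(current)
    return arrays


# Made-up coordinates: a collapsed assembly keeps one copy of the family,
# while a resolved haplotype keeps three adjacent copies plus a distant one.
collapsed = [Gene("chr1", 1_000, 4_000, "NLR")]
resolved = [
    Gene("chr1", 1_000, 4_000, "NLR"),
    Gene("chr1", 6_000, 9_000, "NLR"),
    Gene("chr1", 11_000, 14_000, "NLR"),
    Gene("chr1", 900_000, 903_000, "NLR"),
]

collapsed_sizes = [len(a) for a in tandem_arrays(collapsed)]
resolved_sizes = [len(a) for a in tandem_arrays(resolved)]
print("collapsed array sizes:", collapsed_sizes)
print("resolved array sizes:", resolved_sizes)
```

In the collapsed input no array is detectable at all, because a single consensus copy cannot form a group. In the resolved input the three adjacent members are counted as one array, and the distant fourth copy is left out because it lies beyond the assumed gap threshold.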