Sequence alignment is essential for genomic research and clinical diagnostics, yet detecting complex rearrangements such as inversions, duplications, and gene conversions remains challenging due to allele complexity and limitations of current methods. We introduce VACmap, a non-linear mapping approach to enhance the detection and representation of all genetic variations. VACmap improves duplication detection from 20% to 90% in the Challenging Medically-Relevant Genes (CMRG) benchmark and improves characterization of complex inversions in repetitive regions and of gene conversion events. It also improves the resolution of clinically significant loci, including the LPA gene (whose repetitive KIV-2 units are linked to coronary heart disease) and the GBA1 and STRC genes (risk factors for Parkinson's disease and hearing loss, respectively, which are affected by recombination with the pseudogenes GBAP1 and STRCP1). Here, we show that VACmap delivers better alignment accuracy and SV detection, providing a robust tool for genomic analysis and clinical insights, with potential to advance understanding of genetic diversity and disease mechanisms.
Genomic structural variants (SVs) are a major source of genetic diversity in humans. Here, through long-read sequencing of 945 Han Chinese genomes, we identify 111,288 SVs, including 24.56% unreported variants, many with predicted functional importance. By integrating human population-level phenotypic and multi-omics data as well as two humanized mouse models, we demonstrate the causal roles of two SVs: one in GSDMD, which emerged in the common ancestor of modern humans, Neanderthals, and Denisovans and affects bone mineral density, and one modern-human-specific SV in WWP2, which affects height, weight, fat, craniofacial phenotypes, and immunity. Our results suggest that the GSDMD SV could serve as a rapid and cost-effective biomarker for assessing the risk of cisplatin-induced acute kidney injury. The functional conservation from human to mouse and widespread signals of positive natural selection suggest that both SVs likely influence local adaptation, phenotypic diversity, and disease susceptibility across diverse human populations.