Oncogenic fusions formed through chromosomal rearrangements are hallmarks of childhood cancer that define cancer subtype, predict outcome, persist through treatment, and can be ideal therapeutic targets. However, mechanistic understanding of the etiology of oncogenic fusions remains elusive. Here we report a comprehensive detection of 272 oncogenic fusion gene pairs by using tumor transcriptome sequencing data from 5190 childhood cancer patients. We identify diverse factors, including translation frame, protein domain, splicing, and gene length, that shape the formation of oncogenic fusions. Our mathematical modeling reveals a strong link between differential selection pressure and clinical outcome in CBFB-MYH11. We discover 4 oncogenic fusions, including RUNX1-RUNX1T1, TCF3-PBX1, CBFA2T3-GLIS2, and KMT2A-AFDN, with promoter-hijacking-like features that may offer alternative strategies for therapeutic targeting. We uncover extensive alternative splicing in oncogenic fusions including KMT2A-MLLT3, KMT2A-MLLT10, C11orf95-RELA, NUP98-NSD1, KMT2A-AFDN and ETV6-RUNX1. We discover neo splice sites in 18 oncogenic fusion gene pairs and demonstrate that such splice sites confer therapeutic vulnerability for etiology-based genome editing. Our study reveals general principles on the etiology of oncogenic fusions in childhood cancer and suggests profound clinical implications including etiology-based risk stratification and genome-editing-based therapeutics.
Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, noncoding variants from which pinpointing causal genes remains challenging. Here we combined data from 718,734 individuals to discover rare and low-frequency (minor allele frequency (MAF) < 5%) coding variants associated with BMI. We identified 14 coding variants in 13 genes, of which 8 variants were in genes (ZBTB7B, ACHE, RAPGEF3, RAB21, ZFHX3, ENTPD6, ZFR2 and ZNF169) newly implicated in human obesity, 2 variants were in genes (MC4R and KSR2) previously observed to be mutated in extreme obesity and 2 variants were in GIPR. The effect sizes of rare variants are ~10 times larger than those of common variants, with the largest effect observed in carriers of an MC4R mutation introducing a stop codon (p.Tyr35Ter, MAF = 0.01%), who weighed ~7 kg more than non-carriers. Pathway analyses based on the variants associated with BMI confirm enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically supported therapeutic targets in obesity.
Epithelial ovarian cancer (EOC) is the fifth leading cause of cancer mortality in American women. Normal ovarian physiology is intricately connected to small GTP binding proteins of the Ras superfamily (Ras, Rho, Rab, Arf, and Ran) which govern processes such as signal transduction, cell proliferation, cell motility, and vesicle transport. We hypothesized that common germline variation in genes encoding small GTPases is associated with EOC risk. We investigated 322 variants in 88 small GTPase genes in germline DNA of 18,736 EOC patients and 26,138 controls of European ancestry using a custom genotype array and logistic regression fitting log-additive models. Functional annotation was used to identify biofeatures and expression quantitative trait loci that intersect with risk variants. One variant, ARHGEF10L (Rho guanine nucleotide exchange factor 10 like) rs2256787, was associated with increased endometrioid EOC risk (OR = 1.33, p = 4.46 × 10⁻⁶). Other variants of interest included another in ARHGEF10L, rs10788679, which was associated with invasive serous EOC risk (OR = 1.07, p = 0.00026) and two variants in AKAP6 (A-kinase anchoring protein 6) which were associated with risk of invasive EOC (rs1955513, OR = 0.90, p = 0.00033; rs927062, OR = 0.94, p = 0.00059). Functional annotation revealed that the two ARHGEF10L variants were located in super-enhancer regions and that AKAP6 rs927062 was associated with expression of GTPase gene ARHGAP5 (Rho GTPase activating protein 5). Inherited variants in ARHGEF10L and AKAP6, with potential transcriptional regulatory function and association with EOC risk, warrant investigation in independent EOC study populations.
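As a reading aid for the log-additive model mentioned in the abstract above: under that model the log-odds of disease increase linearly with the number of risk alleles, so a per-allele odds ratio compounds multiplicatively across genotypes. The sketch below is illustrative only; the function name `genotype_odds_ratio` is hypothetical, and the assumption that the reported OR of 1.33 is a per-allele estimate follows from the stated model rather than from any detail given in the abstract.

```python
import math


def genotype_odds_ratio(per_allele_or: float, allele_count: int) -> float:
    """Odds ratio relative to zero risk alleles under a log-additive model.

    Hypothetical helper for illustration; not taken from the study.
    """
    if allele_count not in (0, 1, 2):
        raise ValueError("allele_count must be 0, 1, or 2 for a biallelic variant")
    if per_allele_or <= 0:
        raise ValueError("per_allele_or must be positive")
    # Log-odds are linear in allele count, so odds ratios multiply per allele.
    return math.exp(allele_count * math.log(per_allele_or))


if __name__ == "__main__":
    # Assumption: OR = 1.33 (rs2256787, endometrioid EOC) is a per-allele estimate.
    for copies in range(3):
        print(copies, round(genotype_odds_ratio(1.33, copies), 4))
```

Under these assumptions, carrying two copies of the allele corresponds to an odds ratio of 1.33 squared relative to carrying none.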
We conducted a meta-analysis of three endometrial cancer genome-wide association studies (GWAS) and two follow-up phases totaling 7,737 endometrial cancer cases and 37,144 controls of European ancestry. Genome-wide imputation and meta-analysis identified five new risk loci of genome-wide significance at likely regulatory regions on chromosomes 13q22.1 (rs11841589, near KLF5), 6q22.31 (rs13328298, in LOC643623 and near HEY2 and NCOA7), 8q24.21 (rs4733613, telomeric to MYC), 15q15.1 (rs937213, in EIF2AK4, near BMF) and 14q32.33 (rs2498796, in AKT1, near SIVA1). We also found a second independent 8q24.21 signal (rs17232730). Functional studies of the 13q22.1 locus showed that rs9600103 (pairwise r² = 0.98 with rs11841589) is located in a region of active chromatin that interacts with the KLF5 promoter region. The rs9600103[T] allele that is protective in endometrial cancer suppressed gene expression in vitro, suggesting that regulation of the expression of KLF5, a gene linked to uterine development, is implicated in tumorigenesis. These findings provide enhanced insight into the genetic and biological basis of endometrial cancer.
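The abstract above does not say how per-study estimates were combined. A common choice for GWAS meta-analysis is fixed-effect inverse-variance weighting, and the sketch below shows that technique as an assumption, not as the authors' method. The function name `inverse_variance_meta` and all input numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float]:
    """Pool per-study log odds ratios with fixed-effect inverse-variance weights.

    Returns the pooled estimate and its standard error.
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


if __name__ == "__main__":
    # Hypothetical per-study log odds ratios and standard errors (not study data).
    beta, se = inverse_variance_meta([0.10, 0.14, 0.12], [0.05, 0.05, 0.05])
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

With equal standard errors the pooled estimate reduces to the simple mean of the per-study estimates, which makes the weighting easy to check by hand.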
Pediatric osteosarcoma is characterized by multiple somatic chromosomal lesions, including structural variations (SVs) and copy number alterations (CNAs). To define the landscape of somatic mutations in pediatric osteosarcoma, we performed whole-genome sequencing of DNA from 20 osteosarcoma tumor samples and matched normal tissue in a discovery cohort, as well as 14 samples in a validation cohort. Single-nucleotide variations (SNVs) exhibited a pattern of localized hypermutation called kataegis in 50% of the tumors. We identified p53 pathway lesions in all tumors in the discovery cohort, nine of which were translocations in the first intron of the TP53 gene. Beyond TP53, the RB1, ATRX, and DLG2 genes showed recurrent somatic alterations in 29%-53% of the tumors. These data highlight the power of whole-genome sequencing for identifying recurrent somatic alterations in cancer genomes that may be missed using other methods.
To identify loci for age at menarche, we performed a meta-analysis of 32 genome-wide association studies in 87,802 women of European descent, with replication in up to 14,731 women. In addition to the known loci at LIN28B (P = 5.4 × 10⁻⁶⁰) and 9q31.2 (P = 2.2 × 10⁻³³), we identified 30 new menarche loci (all P < 5 × 10⁻⁸) and found suggestive evidence for a further 10 loci (P < 1.9 × 10⁻⁶). The new loci included four previously associated with body mass index (in or near FTO, SEC16B, TRA2B and TMEM18), three in or near other genes implicated in energy homeostasis (BSX, CRTC1 and MCHR2) and three in or near genes implicated in hormonal regulation (INHBA, PCSK2 and RXRG). Ingenuity and gene-set enrichment pathway analyses identified coenzyme A and fatty acid biosynthesis as biological processes related to menarche timing.