Directed evolution is a powerful approach for engineering proteins with enhanced affinity or specificity for a ligand of interest, but it typically requires many rounds of screening and library mutagenesis to obtain mutants with the desired properties. Furthermore, mutant libraries generally cover only a small fraction of the available sequence space. Here, for the first time, we use ordinal regression to model protein sequence data generated through successive rounds of sorting and amplification of a protein-ligand system. We show that an ordinal regression model trained on only two sorts successfully predicts chromodomain CBX1 mutants with stronger binding affinity for the H3K9me3 peptide. Furthermore, we extract the predictive features using contextual regression, a method for interpreting nonlinear models, which successfully guides the identification of strong binders not present in the original library. We demonstrate the power of this approach by experimentally confirming that it achieves the same improvement in binding affinity previously obtained through a more laborious directed evolution process. This study presents an approach that reduces the number of rounds of selection required to isolate strong binders and facilitates the identification of strong binders not present in the original library.
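To make the idea of ordinal regression concrete, the following is a minimal sketch of the classic cumulative-logit (proportional odds) form, in which a single score per sequence is compared against ordered thresholds. The abstract does not specify the model used in the study, so the function names, the three ordered classes, and the threshold values here are illustrative assumptions only.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordinal_probs(score, thresholds):
    """Class probabilities under a cumulative-logit ordinal model.

    Assumes P(y <= k) = sigmoid(thresholds[k] - score), with thresholds
    sorted in increasing order. Returns P(y = k) for k = 0..len(thresholds).
    """
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(y = k) = P(y <= k) - P(y <= k-1)
        previous = c
    return probs

# Hypothetical setup: three ordered classes (for example, sequences seen
# only in the starting library, after the first sort, after the second sort).
thresholds = [0.0, 2.0]                      # assumed values, not from the study
weak = ordinal_probs(-1.0, thresholds)       # low score: mass on the lowest class
strong = ordinal_probs(3.0, thresholds)      # high score: mass on the highest class
print(weak, strong)
```

In a real application the score would come from a learned function of the sequence features, and the thresholds would be fitted jointly with it; this sketch only shows how ordered class probabilities follow from one score.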
Uterine leiomyomas, in contrast to sarcomas, tend to cease growth following menopause. In the setting of a rapidly enlarging uterine mass in a postmenopausal patient, clinical distinction of uterine leiomyoma from sarcoma is difficult and requires pathologic examination. A 74-year-old woman presented with postmenopausal bleeding and acute blood loss requiring transfusion. She was found to have a rapidly enlarging uterine mass clinically suspicious for sarcoma. An abdominal hysterectomy and bilateral salpingo-oophorectomy were performed. A 15.5 cm partially necrotic intramural mass was identified in the uterine corpus. The tumor was classified as a cellular leiomyoma. RNA sequencing identified a KAT6B-KANSL1 fusion that was confirmed by RT-PCR and Sanger sequencing. After 6 months of follow-up, the patient remains asymptomatic without evidence of disease. Prior studies have identified KAT6B (previously MORF) rearrangements in uterine leiomyomas, but this case is the first to identify a KAT6B-KANSL1 gene fusion in a uterine leiomyoma. While alterations of MED12 and HMGA2 are most common in uterine leiomyomas, a range of other genetic pathways have been described. Our case contributes to the evolving molecular landscape of uterine leiomyomas.
Genetic variants within the fatty acid desaturase ( [...] DNA methylation at six CpG sites spanning [...] We observed significant ASM between rs174537 and DNA methylation at key regulatory regions in the [...]
Genetic variants near and within the fatty acid desaturase (FADS) cluster are associated with polyunsaturated fatty acid (PUFA) biosynthesis, levels of several disease biomarkers and risk of human disease. However, determining the functional mechanisms by which these genetic variants impact PUFA levels remains a challenge. Utilizing an Illumina 450K array, we previously reported strong allele-specific methylation (ASM) associations (p = 2.69 × 10⁻²⁹) between the single nucleotide polymorphism (SNP) rs174537 and DNA methylation of CpG sites located in the putative enhancer region between FADS1 and FADS2 in human liver tissue. However, this array featured only 20 CpG sites within this 12 kb region. To better understand the methylation landscape within this region, we conducted bisulfite sequencing of the region between FADS1 and FADS2. Liver tissues from 50 male subjects (27 European Americans, 23 African Americans) were obtained from the Pathobiological Determinants of Atherosclerosis in Youth (PDAY) study and used to ascertain the genotype at rs174537 and the methylation status across the region of interest. Associations between rs174537 genotype and methylation status of 136 CpG sites were determined. Age-adjusted linear regressions were used to assess ASM associations with rs174537 genotype. The majority of CpG sites (117 out of 136, 86%) exhibited high levels of methylation, with the greatest variability observed at three key regulatory regions: the promoter regions for FADS1 and FADS2 and a putative enhancer site between the two genes. Eight CpG sites within the putative enhancer region displayed significant (FDR p < 0.05) ASM associations with rs174537. These data support the concept that both genetic and epigenetic factors regulate PUFA biosynthesis, and raise fundamental questions as to how genetic variants such as rs174537 impact DNA methylation in distant regulatory regions and, ultimately, the capacity of tissues to synthesize PUFAs.
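The multiple-testing step behind an "FDR p < 0.05" threshold can be sketched as follows. The abstract does not name the FDR procedure used, so the Benjamini-Hochberg adjustment shown here is an assumption, and the p-values are made-up placeholders rather than results from the study. The per-site age-adjusted regressions that would produce the raw p-values are not shown.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    For the p-value with rank r (1 = smallest) among m tests, the adjusted
    value is min over ranks >= r of p * m / rank, capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum
    # enforces monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Placeholder raw p-values for four hypothetical CpG sites.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

With 136 CpG sites tested, as in the abstract, the same function would be applied to the 136 regression p-values, and sites with an adjusted value below 0.05 would be reported.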
Levels of omega-6 (n-6) and omega-3 (n-3) long-chain polyunsaturated fatty acids (LcPUFAs) such as arachidonic acid (AA; 20:4, n-6), eicosapentaenoic acid (EPA; 20:5, n-3) and docosahexaenoic acid (DHA; 22:6, n-3) impact a wide range of biological activities, including immune signaling, inflammation, and brain development and function. Two desaturase steps (Δ6, encoded by FADS2, and Δ5, encoded by FADS1) are rate-limiting in the conversion of dietary essential 18-carbon PUFAs (18C-PUFAs): linoleic acid (LA; 18:2, n-6) to AA, and α-linolenic acid (ALA; 18:3, n-3) to EPA and DHA. GWAS and candidate gene studies have consistently identified genetic variants within FADS1 and FADS2 as determinants of desaturase efficiencies and levels of LcPUFAs in circulating, cellular and breast milk lipids. Importantly, these same variants are documented determinants of important cardiovascular disease risk factors (total, LDL, and HDL cholesterol, triglycerides, CRP and proinflammatory eicosanoids). FADS1 and FADS2 lie head-to-head (5' to 5') in a cluster configuration on chromosome 11 (11q12.2). There is considerable linkage disequilibrium (LD) in this region, where multiple SNPs display association with LcPUFA levels. For instance, rs174537, located ~15 kb downstream of FADS1, is associated with both FADS1 desaturase activity and circulating AA levels (p-value for AA levels = 5.95 × 10⁻⁴⁶) in humans. To determine whether DNA methylation variation impacts FADS activities, we performed genome-wide allele-specific methylation (ASM) analysis with rs174537 in 144 human liver samples. This approach identified highly significant ASM with CpG sites between FADS1 and FADS2 in a putative enhancer signature region, leading to the hypothesis that the phenotypic associations of rs174537 are likely due to methylation differences. In support of this hypothesis, methylation levels of the most significant probe were strongly associated with FADS1 and, to a lesser degree, FADS2 activities.
Over the past 50 years, increases in dietary n-6 PUFA, such as linoleic acid, have been hypothesised to cause or exacerbate chronic inflammatory diseases. The present study examines an individual's innate capacity to synthesise n-6 long-chain PUFA (LC-PUFA) with respect to the fatty acid desaturase (FADS) locus in Americans of African and European descent with diabetes or the metabolic syndrome. Compared with European Americans (EAm), African Americans (AfAm) exhibited markedly higher serum levels of arachidonic acid (AA) (EAm 7·9 (sd 2·1), AfAm 9·8 (sd 1·9) % of total fatty acids; P < 2·29 × 10⁻⁹) and a higher AA:n-6-precursor fatty acid ratio, which estimates FADS1 activity (EAm 5·4 (sd 2·2), AfAm 6·9 (sd 2·2); P = 1·44 × 10⁻⁵). In all, seven SNP mapping to the FADS locus revealed strong association with AA, EPA and dihomo-γ-linolenic acid (DGLA) in the EAm. Importantly, EAm homozygous for the minor allele (T) had significantly lower AA levels (TT 6·3 (sd 1·0); GG 8·5 (sd 2·1); P = 3·0 × 10⁻⁵) and AA:DGLA ratios (TT 3·4 (sd 0·8), GG 6·5 (sd 2·3); P = 2·2 × 10⁻⁷) but higher DGLA levels (TT 1·9 (sd 0·4), GG 1·4 (sd 0·4); P = 3·3 × 10⁻⁷) compared with those homozygous for the major allele (GG). Allele frequency patterns suggest that the frequency of the GG genotype at rs174537 (associated with higher circulating levels of AA) is much higher in AfAm (0·81) than in EAm (0·46). Similarly, marked differences in rs174537 genotypic frequencies were observed in HapMap populations. These data suggest that there are probably important differences in the capacity of different populations to synthesise LC-PUFA. These differences may provide a genetic mechanism contributing to health disparities between populations of African and European descent.
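The product-to-precursor ratio used above as an estimate of FADS1 activity can be illustrated with a short sketch. The function name is hypothetical, and the inputs are the group means quoted in the abstract, not individual-level data. A ratio of group means is not the same as the mean of per-subject ratios, so the values computed here are expected to differ somewhat from the reported AA:DGLA ratios (TT 3·4, GG 6·5).

```python
def product_precursor_ratio(product_pct, precursor_pct):
    """Ratio of product to precursor fatty acid (both in % of total fatty acids).

    Used here as a rough proxy for desaturase activity: AA is the product of
    the FADS1 (delta-5) step and DGLA is its immediate precursor.
    """
    if precursor_pct <= 0:
        raise ValueError("precursor level must be positive")
    return product_pct / precursor_pct

# Group means from the abstract (EAm, by rs174537 genotype).
tt_ratio = product_precursor_ratio(6.3, 1.9)   # TT: mean AA, mean DGLA
gg_ratio = product_precursor_ratio(8.5, 1.4)   # GG: mean AA, mean DGLA
print(round(tt_ratio, 2), round(gg_ratio, 2))
```

The direction of the difference matches the abstract: the GG group shows the higher ratio, consistent with greater estimated FADS1 activity.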