Genome-wide association studies (GWASs) have implicated ∼380 genetic loci for plasma lipid regulation. However, these loci only explain 17-27% of the trait variance, and a comprehensive understanding of the molecular mechanisms has not been achieved. In this study, we used an integrative genomics approach leveraging diverse genomic data from human populations to investigate whether genetic variants associated with various plasma lipid traits from GWASs, namely total cholesterol, high- and low-density lipoprotein cholesterol (HDL and LDL), and triglycerides, were concentrated on specific parts of tissue-specific gene regulatory networks. In addition to the expected lipid metabolism pathways, gene subnetworks involved in "interferon signaling," "autoimmune/immune activation," "visual transduction," and "protein catabolism" were significantly associated with all lipid traits. We also detected trait-specific subnetworks, including cadherin-associated subnetworks for LDL; glutathione metabolism for HDL; valine, leucine, and isoleucine biosynthesis for total cholesterol; and insulin signaling and complement pathways for triglycerides. Finally, by using gene-gene relations revealed by tissue-specific gene regulatory networks, we detected both known (e.g., APOH, APOA4, and ABCA1) and novel (e.g., F2 in adipose tissue) key regulator genes in these lipid-associated subnetworks. Knockdown of the F2 gene (coagulation factor II, thrombin) in 3T3-L1 and C3H10T1/2 adipocytes altered gene expression of Abcb11, Apoa5, Apof, Fabp1, Lipc, and Cd36; reduced intracellular adipocyte lipid content; and increased extracellular lipid content, supporting a link between adipose thrombin and lipid regulation. Our results shed light on the complex mechanisms underlying lipid metabolism and highlight potential novel targets for lipid regulation and lipid-associated diseases.
Hepatocellular carcinoma (HCC) is a heterogeneous disease with a high mortality rate. Recent genomic studies have identified TP53, AXIN1, and CTNNB1 as the most frequently mutated genes. Lower frequency mutations have been reported in ARID1A, ARID2 and JAK1. In addition, hepatitis B virus (HBV) integrations into the human genome have been associated with HCC. Here, we deep-sequence 42 HCC patients with a combination of whole genome, exome and transcriptome sequencing to identify the mutational landscape of HCC using a reasonably large discovery cohort. We find frequent mutations in TP53, CTNNB1 and AXIN1, and rare but likely functional mutations in BAP1 and IDH1. Besides frequent HBV integrations at TERT, we identify translocations at the boundaries of TERT. A novel deletion is identified in CTNNB1 in a region that is heavily mutated in multiple cancers. We also find multiple high-allelic frequency mutations in the extracellular matrix protein LAMA2. Lower expression levels of LAMA2 correlate with a proliferative signature, and predict poor survival and a higher chance of cancer recurrence in HCC patients, suggesting an important role of the extracellular matrix and cell adhesion in tumor progression of a subgroup of HCC patients. HCC features diverse modes of genomic alteration. In addition to common point mutations, structural variations and methylation changes, there are several virus-associated changes, including gene disruption or activation, formation of chimeric viral-human transcripts, and DNA copy number changes. Such a multitude of genomic events likely contributes to the heterogeneous nature of HCC.
Multi-causality and heterogeneity of phenotypes and genotypes characterize complex diseases. In a database with a comprehensive collection of phenotypes and genotypes, we compared the performance of common machine learning methods to generate mathematical models to predict diabetic kidney disease (DKD). In a prospective cohort of type 2 diabetic patients, we selected 119 subjects with DKD and 554 without DKD at enrolment and after a median follow-up period of 7.8 years for model training, testing and validation using seven machine learning methods (partial least squares regression, the classification and regression tree, the C5.0 decision tree, random forest, naïve Bayes classification, neural network and support vector machine). We used 17 clinical attributes and 70 single nucleotide polymorphisms (SNPs) of 54 candidate genes to build different models. The top attributes selected by the best-performing models were then used to build models with performance comparable to those using the entire dataset. Age, age of diagnosis, systolic blood pressure and genetic polymorphisms of uteroglobin and lipid metabolism were selected by most methods. Models generated by support vector machine (svmRadial) and random forest (cforest) had the best prediction accuracy, whereas models derived from the naïve Bayes classifier and partial least squares regression had the poorest performance. Using 10 clinical attributes (systolic and diastolic blood pressure, age, age of diagnosis, triglyceride, white blood cell count, total cholesterol, waist to hip ratio, LDL cholesterol, and alcohol intake) and 5 genetic attributes (UGB G38A, LIPC -514C > T, APOB Thr71Ile, APOC3 3206T > G and APOC3 1100C > T), selected most often by SVM and cforest, we were able to build high-performance models. Amongst different machine learning methods, svmRadial and cforest had the best performance.
Genetic polymorphisms related to inflammation and lipid metabolism warrant further investigation for their associations with DKD.
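As a reading aid for the DKD cohort sizes reported above (119 with DKD, 554 without), the following minimal Python sketch computes the accuracy that a trivial always-predict-the-majority classifier would reach, which is the floor any reported prediction accuracy should be compared against. It also shows how the two classes could be spread over stratified folds. The fold count of 10 and the function names are assumptions made for illustration; the abstract does not state the validation scheme that was actually used.

```python
# Cohort sizes as reported in the abstract.
N_DKD = 119
N_NO_DKD = 554


def majority_baseline_accuracy(n_pos: int, n_neg: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(n_pos, n_neg) / (n_pos + n_neg)


def split_evenly(n: int, k: int) -> list[int]:
    """Split n items into k groups whose sizes differ by at most one."""
    base, remainder = divmod(n, k)
    # The first `remainder` groups each take one extra item.
    return [base + 1 if i < remainder else base for i in range(k)]


def stratified_fold_sizes(n_pos: int, n_neg: int, k: int) -> list[tuple[int, int]]:
    """Per-fold (positive, negative) counts that preserve the class ratio."""
    return list(zip(split_evenly(n_pos, k), split_evenly(n_neg, k)))


if __name__ == "__main__":
    baseline = majority_baseline_accuracy(N_DKD, N_NO_DKD)
    print(f"Majority-class baseline accuracy: {baseline:.3f}")

    # k = 10 is a hypothetical choice, not taken from the study.
    for i, (pos, neg) in enumerate(stratified_fold_sizes(N_DKD, N_NO_DKD, 10), 1):
        print(f"fold {i:2d}: {pos} DKD, {neg} non-DKD")
```

With a class split this uneven, a model has to exceed roughly 82% accuracy before it is doing better than ignoring the attributes entirely, which is why measures such as sensitivity and specificity are usually read alongside accuracy.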