Frederick J.1, Brian S. Yandell1, Karl W. Broman2
1Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
2Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
Testing close linkage vs. pleiotropy in multiparental populations
Multiparental populations, such as the Diversity Outbred mouse population, are a new resource for systems genetics studies. Distinguishing close linkage of distinct quantitative trait loci from one pleiotropic locus that associates with multiple traits has implications in biomedical research, plant and animal breeding, population genetics, and other genetics applications. Presence of two distinct loci potentially enables selective modification of one locus at a time. In the case of a single pleiotropic locus, it may be difficult or impossible to modify the locus without influencing both traits. We extend methods of Jiang and Zeng to develop a likelihood ratio test for the alternative hypothesis of close linkage of two loci against the null hypothesis of pleiotropy for a pair of traits that map to a single genomic region. Unlike previous tests of these competing hypotheses, our test incorporates polygenic random effects to account for complex patterns of relatedness among subjects. Additionally, our test accommodates more than two founder alleles. We use a parametric bootstrap to determine statistical significance of likelihood ratio test statistics. We characterize our test’s type I error rate and power to detect close linkage of two loci in simulation studies, where we find that it is slightly conservative and has reasonable power when the univariate LOD peaks are strong. To demonstrate its practical utility, we apply our test to data from a study of 261 Diversity Outbred mice. We perform pairwise analyses of three traits that map to a single region on chromosome 8. We find evidence that there are two distinct QTL present in the region. We share our methods in a freely available software package (https://github.com/fboehm/qtl2pleio) for the R statistical computing environment.
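The parametric bootstrap step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the qtl2pleio implementation: the function names, the add-one correction, and the toy one-sample normal model that stands in for the pleiotropy and close-linkage models are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_pvalue(observed_stat, simulate_null_stat, n_boot=1000, seed=0):
    """Parametric-bootstrap p-value for a likelihood ratio test statistic.

    simulate_null_stat(rng) is a user-supplied (hypothetical) callable that
    draws one data set from the fitted null model, refits both hypotheses,
    and returns the resulting test statistic.
    """
    rng = np.random.default_rng(seed)
    null_stats = np.array([simulate_null_stat(rng) for _ in range(n_boot)])
    # Add-one correction (an assumption of this sketch) keeps p above zero.
    return (1 + np.sum(null_stats >= observed_stat)) / (n_boot + 1)

# Toy stand-in for the real models: H0 "mean = 0" vs. H1 "mean free" for
# unit-variance normal data, where the LRT statistic is n * mean(y)**2.
def lrt_stat(y):
    return len(y) * np.mean(y) ** 2

def simulate_null(rng, n=50):
    return lrt_stat(rng.standard_normal(n))

rng = np.random.default_rng(1)
y_obs = rng.standard_normal(50) + 0.5  # simulated data with a true shift
p_value = bootstrap_pvalue(lrt_stat(y_obs), simulate_null, n_boot=999)
print(p_value)
```

In a real analysis, the callable would simulate a pair of traits from the fitted pleiotropy model and recompute the test statistic over the genomic region; only the counting step would carry over unchanged.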
1European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD
Ensembl tools for analysis of variants in complex traits: a worked example
The Ensembl genome browser (www.ensembl.org)1 provides visualisation and analysis of integrated genomic data, including genes, variants, comparative genomics and gene regulation, for over 100 species. This workshop provides guidance for scientists who have not yet discovered the power and depth of this resource.
Ensembl can be used to analyse variation data, such as from whole genome variant calling, to evaluate likely candidate genes and variants involved in complex traits.
A brief introduction to Ensembl will be followed by hands-on demonstrations and exercises:
Participants will learn basic workflows which can be adapted to their own research questions. They will be able to learn about the wide range of data in Ensembl, and have the opportunity to think about how this data might be informative for their own research. Importantly, participants will be able to engage with the Ensembl community, finding sources of help and documentation.
Workshop materials, including slides, demonstration screenshots, exercises and solutions will be made available before the workshop and will remain permanently online at our training portal: https://training.ensembl.org.
Michelle N Perry1, Cynthia L Smith1, Carol J Bult1, and MGI Curation Group1
1Jackson Laboratory, Bar Harbor, Maine, USA, 04684
Refinement of QTL candidate genes using Mouse Genome Informatics
While QTL may span megabases of genomic sequence involving many loci, the comparison of QTL-associated phenotypes with those of known engineered and induced mutations within the genomic region offers a means for identifying candidate genes. Further refinement of candidate gene lists and sequencing targets is possible by examining strain-specific SNP likely to contribute to disruptions in gene function.
Modern sequencing capabilities combined with the breadth of available inbred strains and characterized outbred stocks make translating phenotype to its causative sequence variation easier than ever with the use of computational tools. In addition to serving as the authoritative source for mouse gene, allele, and strain nomenclature, Mouse Genome Informatics (MGI, www.informatics.jax.org) offers an integrated database encompassing: engineered alleles and spontaneous mutations, QTL, SNP, embryonic expression patterns, Mammalian Phenotype Ontology annotations describing phenotypes, and Disease Ontology (DO) annotations for relating human diseases/syndromes to genotypes.
As of April 2018, MGI has curated over 6,564 QTL with 12,414 strain-specific variants, nearly all of which are associated with a chromosome and many to a specific genomic sequence range. The MGI website offers a variety of query forms and quick search options to access QTL details such as mapping data and phenotypic descriptions. Bulk data access is available from downloadable reports and from MouseMine.
There are at least 26 seizure susceptibility QTL that differ between C57BL/6J and DBA/2J. Of these, audiogenic seizure prone 3 (Asp3) was mapped to Chromosome 7: 27606373-37142942 bp (GRCm38). A search in MGI for genes in this interval associated with a seizure phenotype returns four genes (Mag, Scn1b, Slc7a10 and Usf2). Of the 5,657 SNPs mapped to the Asp3 interval, one is potentially disruptive to one of these four candidate genes: rs32072976 in Slc7a10 of DBA/2J results in a missense mutation of arginine to histidine.
The ability to use MGI to connect QTL mapping intervals with phenotypic alleles and SNPs makes it a valuable resource to identify candidate genes and regions of interest that differ between strains. MGI is currently developing strain-specific pages to summarize the known mutations, QTL, phenotypes and diseases that are characteristic of specific strains. New visualization tools are also being developed to compare genomic intervals between strains and improve identification of candidate genes based on genomic location and curated phenotype.
Jan Silhavy1, Ondrej Kuda1, Marie Brezinova1, Vladimir Landa1, Vaclav Zidek1, Chandra Dodia2, Franziska Kreuchwig3, Laurence Balas4, Thierry Durand4, Norbert Hübner3, Aron B. Fisher2, Jan Kopecky1, Michal Pravenec1
1Institute of Physiology of the Czech Academy of Sciences, Prague, Czech Republic
2Institute for Environmental Medicine of the Department of Physiology, University of Pennsylvania, Philadelphia, United States
3Max-Delbrück-Center for Molecular Medicine, Berlin, Germany
4Institut des Biomolécules Max Mousseron, CNRS, Université Montpellier, Montpellier, France
Nrf2-mediated Antioxidant Defense and Peroxiredoxin 6 are Linked to FAHFA biosynthetic pathway
Fatty acid esters of hydroxy fatty acids (FAHFAs) are lipid mediators with anti-diabetic and anti-inflammatory properties that are produced in white adipose tissue (WAT) via de novo lipogenesis, but their biosynthetic enzymes are unknown. Using a combination of lipidomics in WAT, QTL mapping and correlation analyses in rat BXH/HXB recombinant inbred strains, and response to oxidative stress in murine models, we elucidated the potential pathway of biosynthesis of several FAHFAs. Analysis of WAT samples identified ~160 regioisomers documenting the complexity of this lipid class. The linkage analysis highlighted several members of Nuclear factor, erythroid 2-like 2 (Nrf2)-mediated antioxidant defense system (Prdx6, Mgst1, Mgst3, Gpx7), lipid-handling proteins (Cd36, Scd6, Acnat1, Acnat2, Baat) and family of flavin containing monooxygenase (Fmo) as the positional candidate genes. Transgenic expression of Nrf2 and deletion of Prdx6 genes resulted in reduction of palmitic acid ester of 9-hydroxystearic acid (9-PAHSA) and 11-PAHSA levels, while oxidative stress induced by an inhibitor of glutathione synthesis increased PAHSA levels nonspecifically. Our results indicate that the synthesis of FAHFAs via carbohydrate-responsive element-binding protein (ChREBP)-driven de novo lipogenesis depends on the adaptive antioxidant system and suggest that FAHFAs may link activity of this system with insulin sensitivity in peripheral tissues.
Petr Mlejnek1, Jan Silhavy1, Miroslava Simakova1, Jaroslava Trnovska2, Hana Malinska2, Ludmila Kazdova2, Michal Pravenec1
1Institute of Physiology of the Czech Academy of Sciences, Prague, Czech Republic
2Institute for Clinical and Experimental Medicine, Prague, Czech Republic
CD36 regulates glucose and lipid metabolism in brown adipose tissue via insulin signaling in spontaneously hypertensive rats
Brown adipose tissue (BAT) plays an important role in lipid and glucose metabolism in rodents and possibly also in humans. Identification of genes responsible for BAT function would shed light on underlying pathophysiological mechanisms. Recently, using weighted gene co-expression network analysis (WGCNA) in the BAT from BXH/HXB recombinant inbred strains, derived from SHR (spontaneously hypertensive rat) and BN (Brown Norway) progenitors, we identified Cd36 as the hub gene of a co-expression module associated with BAT relative weight. In the current study, we performed functional experiments to validate Cd36 as a quantitative trait gene responsible for BAT weight and function. SHR-Cd36 transgenic rats with the wild type Cd36 gene, when compared to the SHR that harbors a deletion variant of Cd36, exhibited reduced BAT weight. SHR-Cd36 BAT incubated ex vivo with glucose showed significantly increased glucose oxidation and incorporation into BAT lipids (lipogenesis) when compared to the SHR, but the difference in lipogenesis was not observed when BAT was incubated in glucose together with palmitate. No significant differences between strains were detected in palmitate oxidation and incorporation into BAT lipids. Next we tested whether the increased lipogenesis in BAT is due to the effects of Cd36 on insulin signaling, which is suppressed by palmitate. This possibility is supported by finding correlations between lipogenesis and expression of Irs1, Akt1, Pik3r1 and Tbc1d4 (AS160) genes involved in insulin signaling in SHR-Cd36 but not in SHR rats. These findings demonstrate that palmitate suppresses insulin signaling via Cd36. Correlation analyses in RI strains showed that increased activity of BAT was associated with amelioration of insulin resistance and hypertension.
In summary, these results demonstrate an important role of the Cd36 gene in regulating BAT weight and function and consequently lipid and glucose metabolism.
Michal Pravenec1, Jan Silhavy1, Vaclav Zidek1, Vladimir Landa1, Petr Mlejnek1, Miroslava Simakova1, Jaroslava Trnovska2, Hana Malinska2, Martina Huttl2, Ludmila Kazdova2, Tomas Mracek1, Jan Kopecky1, Josef Houstek1
1Institute of Physiology of the Czech Academy of Sciences, Prague, Czech Republic
2Institute for Clinical and Experimental Medicine, Prague, Czech Republic
Mutant Wars2 gene in spontaneously hypertensive rats impairs brown adipose tissue function and predisposes to visceral obesity
Brown adipose tissue (BAT) plays an important role in lipid and glucose metabolism in rodents and possibly also in humans. Identification of genes responsible for BAT function would shed light on underlying pathophysiological mechanisms of metabolic disturbances. Recent linkage analysis in the BXH/HXB recombinant inbred (RI) strains, derived from Brown Norway (BN) and spontaneously hypertensive rats (SHR), identified 2 closely linked quantitative trait loci (QTL) associated with glucose oxidation and glucose incorporation into BAT lipids in the vicinity of the Wars2 (tryptophanyl tRNA synthetase 2 (mitochondrial)) gene on chromosome 2. The SHR harbors the L53F WARS2 protein variant that was associated with reduced angiogenesis, and Wars2 thus represents a prominent positional candidate gene. In the current study, we validated this candidate as a quantitative trait gene (QTG) using a transgenic rescue experiment. SHR-Wars2 transgenic rats with the wild type Wars2 gene, when compared to SHR, showed more efficient mitochondrial proteosynthesis and increased mitochondrial respiration, which was associated with increased glucose oxidation and incorporation into BAT lipids, and with reduced weight of visceral fat. Correlation analyses in RI strains showed that increased activity of BAT was associated with amelioration of insulin resistance in muscle and white adipose tissue. In summary, these results demonstrate an important role of the Wars2 gene in regulating BAT function and consequently lipid and glucose metabolism.
Thom Saunders1,2, Wanda E. Filipiak1, Galina B Gavrilina1, Anna K. LaForest1, Corey E. Ziebell1, Michael G. Zeidler1, Elizabeth D. Hughes1
1Biomedical Research Core Facilities, Transgenic Animal Model Core
2Department of Internal Medicine, Division of Molecular Medicine and Genetics, University of Michigan Medical School, Ann Arbor, USA
CRISPR/Cas9 rat and mouse genome editing
CRISPR/Cas9 is an RNA-guided nuclease that produces double-strand breaks in DNA. Rat and mouse zygotes repair chromosome breaks by non-homologous end joining (NHEJ) or homology-directed repair (HDR). The resulting founder animals are typically genetic mosaics that carry multiple edits of the targeted allele. NHEJ repair can produce deletions in critical regions to knock out gene expression or remove coding exons from transcripts. HDR is used to introduce new information into the genome so as to cause the expression of proteins with mutant amino acid codons or to introduce reporter proteins that are expressed with exacting specificity and developmental timing from endogenous genes. Guide RNAs (gRNAs) predicted to be active are tested in zygotes or cells. gRNAs that target Cas9 to the gene of interest and induce chromosome breaks are used to engineer genes in rat or mouse zygotes to produce genetically engineered animals.
CRISPR/Cas9 was used to produce rat gene knockout models, point mutations, reporter knockins, and conditional (floxed) genes. The same range of mutations was produced in mouse models; additionally, deletions of up to 100 kb were produced. Results show that enhanced-specificity Cas9 protein is highly effective in producing chromosome breaks to stimulate repair by NHEJ and HDR. The efficiency of oligonucleotide knockins is lower than that of simple indels, and the introduction of reporters is more efficient than the generation of floxed genes. Compared to preceding embryonic stem cell techniques and other genome editing tools (zinc finger nucleases and transcription activator-like effector nucleases), we find that CRISPR/Cas9 significantly increases access to mouse and rat genomes for the generation of biomedical research models.
Byron C. Jones1, James P. O’Callaghan2, Diane B. Miller2, Wenyuan Zhao1, Lu Lu1, Robert W. Williams1
1Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis TN, 38163, USA
2Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Morgantown, WV, 26505, USA
Strain differences in proinflammatory gene expression in prefrontal cortex following high glucocorticoid-organophosphate exposure in BXD mice
During the USA-led First Gulf War of 1990-1991, 25-30% of returning military personnel developed a chronic multi-system disorder that persists to this day in many individuals. The cause of this disease, termed Gulf War illness, was suspected to be exposure to the various chemicals present in theatre. Among those were airborne particulate matter from open fires, depleted uranium, and three kinds of organophosphorus (OP) compounds, i.e., insecticides such as malathion, pyridostigmine as prophylaxis against nerve agents, and even small amounts of sarin emanating from ammunition dumps destroyed by allied forces. Exposure to OPs was proposed to produce inflammation in the periphery and central nervous system, the latter being the source of sickness behavior. O’Callaghan and Miller proposed that exposure to OPs might not be sufficient and asserted that high circulating glucocorticoids (cortisol) would be a contributing factor. Accordingly, they developed a mouse model that combined exposure to OPs with high circulating glucocorticoids (corticosterone) to mimic the stressful conditions that would be experienced in a combat zone. In this work they showed a synergistic action of the combined exposures in the C57BL/6J (B6) male mouse. Because 25-30% of the military personnel became sick, the question raised was: what about the others in theatre who did not become sick? To address this question, we first tested both sexes of the other founder strain for the BXD mice, the DBA/2J (D2). We found that the D2 strain and females of both strains were much less responsive than the B6 males, and we then tested both sexes of 20+ BXD strains. As expected, we found wide variation among strains in the effect of combined corticosterone and diisopropylfluorophosphate (our OP surrogate for nerve agents) on expression of IL-1β, TNFα, and IL-6 in the prefrontal cortex. We also observed substantial sex differences in response and in QTL mapping. Supported in part by DOD grant GWI160086.
O’Callaghan JP, Kelly KA, Locker AR, Miller DB, Lasley SM: Corticosterone primes the neuroinflammatory response to DFP in mice: potential animal model of Gulf War illness. J Neurochem 2015, 133:708-721
Robert W. Corty1, Lisa M. Tarantino1, Joseph S. Takahashi3, William Valdar1
1University of North Carolina at Chapel Hill, Department of Genetics
2The Jackson Laboratory
3Howard Hughes Medical Institute, Department of Neuroscience, University of Texas Southwestern Medical Center, Dallas, TX 75390
Mean-variance QTL mapping: discovering QTL that standard approaches cannot, debunking QTL that standard approaches credit
Most QTL mapping approaches seek to identify "mean QTL", genetic loci that influence the phenotype mean, after assuming that all individuals in the mapping population have equal residual variance. Recent work has broadened the scope of QTL mapping to identify genetic loci that influence phenotype variance, termed "variance QTL", or some combination of mean and variance, which we term "mean-variance QTL".
We describe an approach for detecting QTL affecting the mean and/or variance. This "mean-variance QTL mapping" uses the double generalized linear model, an elaboration of the standard linear model, to capture structure in the residual variance. Its potential for use in QTL mapping has been described previously, but it remained underdeveloped and underutilized, with certain key advantages undemonstrated until now.
Most straightforwardly, it allows detection of variance QTL. Less obvious but more generally applicable, it increases power to detect mean QTL when the phenotype exhibits variance heterogeneity induced either by an experimental covariate or by the QTL genotype itself.
We demonstrate these properties through three case studies: 1) an intercross of C57BL/6J and C58/J in which mean-variance QTL mapping detects a variance QTL for rearing behavior; 2) a backcross of CAST/Ei into M16i, in which we identify a mean QTL for bodyweight at three weeks of age, after accommodating variance heterogeneity induced by sire; and 3) a reduced complexity cross of C57BL/6J and C57BL/6N, where we detect a mean QTL for circadian wheel running, after accommodating variance heterogeneity induced by the QTL itself.
In each case we explain how standard analyses left the QTL signal overlooked and how mean-variance QTL mapping detected it. Additionally, we illustrate the ability of mean-variance QTL mapping to explain away false QTL that standard analyses erroneously classed as significant. Last, we provide a software package, R/vqtl, compatible with the popular R/qtl suite, for researchers to use the method to analyze (and reanalyze) cross data themselves, our aim being to make mean-variance QTL mapping an easily-applied extension to the standard QTL mapping toolkit.
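The idea of a variance QTL described above can be made concrete with a small sketch. It covers only a closed-form special case (a single categorical genotype, no covariates), not the double generalized linear model fitting done by R/vqtl; the function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def variance_qtl_lrt(phenotype, genotype):
    """Likelihood ratio statistic for genotype-specific residual variance.

    Null: genotype-specific means with one shared variance.
    Alternative: genotype-specific means and genotype-specific variances.
    Returns (statistic, degrees_of_freedom).
    """
    y = np.asarray(phenotype, dtype=float)
    g = np.asarray(genotype)
    groups = [y[g == level] for level in np.unique(g)]
    # Residuals around each genotype mean (the ML mean fit under both models).
    resid = [grp - grp.mean() for grp in groups]
    n = y.size
    pooled_var = sum((r ** 2).sum() for r in resid) / n  # ML estimate, null
    group_vars = [(r ** 2).mean() for r in resid]        # ML estimates, alt.
    stat = n * np.log(pooled_var) - sum(
        r.size * np.log(v) for r, v in zip(resid, group_vars)
    )
    return stat, len(groups) - 1

rng = np.random.default_rng(0)
genotype = np.repeat(["AA", "AB", "BB"], 100)
sd_by_genotype = {"AA": 1.0, "AB": 1.0, "BB": 2.0}  # simulated variance QTL
phenotype = rng.normal(0.0, [sd_by_genotype[x] for x in genotype])
stat, df = variance_qtl_lrt(phenotype, genotype)
print(f"LRT = {stat:.1f} on {df} df")
```

A standard mean-QTL scan would see nothing in these simulated data, because all three genotype means are equal. The statistic could be referred to a chi-square distribution with the returned degrees of freedom or calibrated by permutation; which of these is appropriate for a given cross is left open here.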
Corty RW, Valdar W (2018+) vqtl: An R package for Mean-Variance QTL Mapping bioRxiv https://doi.org/10.1101/149377
Corty RW, Valdar W (2018+) Mean-Variance QTL Mapping on a Background of Variance Heterogeneity bioRxiv https://doi.org/10.1101/276980
Corty RW, Kumar V, Tarantino LM, Takahashi JS, Valdar W (2018+) Mean-Variance QTL Mapping Identifies Novel QTL for Circadian Activity and Exploratory Behavior in Mice bioRxiv https://doi.org/10.1101/276972
Neza Alfazema1,2, Marjorie Barrier1,2, Sophie Marion de Procé1, Robert I. Menzies2, Roderick Carter2, Ana Garcia Diaz4, Ben Moyon5, Zoe Webster5, Christopher O.C. Bellamy6, Mark J. Arends6, Roland H. Stimson2, Nicholas M. Morton2, Timothy J. Aitman1,2, Philip M. Coan1,2
1Centre for Genomic and Experimental Medicine, MRC Institute for Genetics and Molecular Medicine, Edinburgh, EH4 2XU, UK.
2Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, EH16 4TJ, UK.
3Royal (Dick) School of Veterinary Studies, University of Edinburgh, EH25 9RG, UK.
4Department of Medicine, Imperial College London, London, SW7 2AZ, UK.
5Embryonic Stem Cell and Transgenics Facility, MRC Clinical Sciences Centre, Imperial College London, London W12 0NN, UK.
6Division of Pathology & Centre for Comparative Pathology, Edinburgh CRUK Cancer Centre, Edinburgh, EH4 2XR, UK.
Camk2n1 is a negative regulator of hypertension, insulin sensitivity, and adipogenesis
Metabolic syndrome (MetS) is a cause of cardiovascular disease (CVD) and type 2 diabetes. The spontaneously hypertensive rat (SHR) is an established model of MetS. We previously found in SHR that Camk2n1 expression is regulated in cis as an expression quantitative trait locus (cis-eQTL) in left ventricle (LV) and epididymal adipose tissue (EAT) and positively correlated with fat mass and adipocyte volume. Camk2n1 negatively regulates CaMKII, a kinase that controls hypertensive and hypertrophic responses to stress stimuli. However, the function of Camk2n1 in MetS has not previously been investigated. We knocked out Camk2n1 in SHR to investigate its role in regulation of cardiometabolic traits. Camk2n1-/- rats had lower blood pressure (-Δ12/10 mmHg) than SHR controls, associated with enhanced nitric oxide bioavailability. LV mass was reduced 9% in Camk2n1-/- rats, which were also protected from isoproterenol-induced increases in the rate pressure product and the hypertrophic marker Acta. Camk2n1 deficiency improved insulin sensitivity and reduced visceral fat mass in vivo and diminished adipogenic capacity in vitro. Weighted gene co-expression network analysis in LV and EAT showed that Camk2n1 deletion significantly altered networks controlling cell cycle, oxidative phosphorylation, classical complement and antigen presentation. In human visceral fat, cis-eQTLs that increase CAMK2N1 expression significantly associate with increased CVD risk, hyperglycemia, and visceral adiposity. Further, CAMK2N1 expression in visceral fat from obese non-diabetic and obese type 2 diabetic subjects was elevated compared to expression in lean subjects and correlated significantly with fat mass. We conclude that CAMK2N1 is an important regulator of networks that control MetS traits and merits further investigation as a therapeutic target in humans.
Martha Imprialou1, Paula Kover2, Richard Mott3
1Wellcome Trust Centre for Human Genetics, University of Oxford, OX3 7BN, UK
2Department of Biology and Biochemistry, University of Bath, BA2 7AY, UK
3Genetics Institute, University College London, WC1 6BT, UK
Genomic Rearrangements in Arabidopsis Considered as Quantitative Traits
To understand the population genetics of structural variants and their effects on phenotypes, we developed an approach to mapping structural variants that segregate in a population sequenced at low coverage. We avoid calling structural variants directly. Instead, the evidence for a potential structural variant at a locus is indicated by variation in the counts of short reads that map anomalously to that locus. These structural variant traits are treated as quantitative traits and mapped genetically, analogously to a gene expression study. Association between a structural variant trait at one locus and genotypes at a distant locus indicates the origin and target of a transposition. Using ultra-low-coverage (0.3×) population sequence data from 488 recombinant inbred Arabidopsis thaliana genomes, we identified 6502 segregating structural variants. Remarkably, 25% of these were transpositions. While many structural variants cannot be delineated precisely, we validated 83% of 44 predicted transposition breakpoints by polymerase chain reaction. We show that specific structural variants may be causative for quantitative trait loci for germination and resistance to infection by the fungus Albugo laibachii, isolate Nc14. Further, we show that the phenotypic heritability attributable to read-mapping anomalies differs from, and, in the case of time to germination and bolting, exceeds that due to standard genetic variation. Genes within structural variants are also more likely to be silenced or dysregulated. This approach complements the prevalent strategy of structural variant discovery in fewer individuals sequenced at high coverage. It is generally applicable to large populations sequenced at low coverage, and is particularly suited to mapping transpositions. The work is described in more detail in reference 1.
1Imprialou et al (2017) Genetics 205:1425-1441.
MA Bogue1, SC Grubb1, DO Walton1, VM Philip1, M Dunn1, GI Kolishovski1, T Stearns1, V Sinha1, B Kadakkuzha1, G TeHennepe1, G Kunde-Ramamoorthy1, EJ Chesler1
1The Jackson Laboratory, Bar Harbor, Maine, USA
Mouse Phenome Database 2.0: New tools and data resources for curated and integrated primary mouse phenotype data
The Mouse Phenome Database (MPD; phenome.jax.org) is a widely used resource that provides access to primary experimental data, protocols and analysis tools for mouse phenotyping studies. Data are contributed by investigators around the world and represent a broad scope of phenotyping endpoints and disease-related characteristics in naïve mice and those exposed to drugs, environmental agents or other treatments. MPD was recently re-engineered using Web 2.0 technologies to facilitate interactive data exploration and quantitative analysis. Rigorous statistical analyses are implemented in R, and dynamic D3 visualizations are available. New tools and analyses are under continual development and include multivariate outlier detection, replicability analysis, and multi-trait, multi-population mapping. We are curating data from inbred strains and other reproducible strains, including KOMP mice, Collaborative Cross (CC), CC-RIX, and founder strains. We are also collecting primary data from mapping populations, including historic mapping crosses and advanced high-diversity mouse populations such as Diversity Outbred mice. We are developing a new data submission interface for data contributors so that they, as domain experts, may annotate their own data with relevant ontology terms and provide detailed information for protocols and animal environmental conditions as required by the ARRIVE guidelines. These data are exposed to analysis tools within MPD and other systems through an API. Tools and features from the re-engineered system will be presented. Funding provided by NIH DA028420, DA045401.
Thu Hong Le1, Michael Scott1, Leah C Solberg Woods2, Richard Mott1
1Genetics Institute, University College London, Gower St, London WC1E 6BT UK
2Department of Internal Medicine, School of Medicine, Wake Forest University, Winston Salem, North Carolina, USA.
Imputation in Multiparental Populations from Ultra-low Coverage Sequence
We show that it is possible to impute near-complete catalogues of genomic variation and haplotype mosaics in multiparental populations sequenced at low-coverage (<1x), using the STITCH algorithm1. This algorithm uses a FAST-PHASE type approach, optimised for low-coverage sequence data. The advantage of this approach is that it is cost-competitive with genotyping arrays whilst imputing at all mappable genomic regions.
We first present data from a rat heterogeneous stock (HS), in which we sequenced 1586 rats at mean coverage 0.24x. The eight founders were sequenced at higher coverage (40x) to provide a haplotype reference panel. More than 2.4 million imputed SNPs were retained after quality control (imputation info score > 0.4 and Hardy-Weinberg Equilibrium p-value > 10⁻⁶). Imputed genotypes were then compared with 989 rats genotyped with a 10K SNP array2. Concordance with 3253 SNPs in common was 0.98 and the correlation R² was 0.91. We investigated imputation with and without using the reference panel and obtained similar SNP imputation accuracy. However, an advantage of using a reference panel is that haplotype mosaic reconstructions based on founder haplotypes are generated automatically.
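The two accuracy measures reported above can be illustrated with a minimal sketch. It assumes that concordance is the fraction of matching hard genotype calls and that R² is the squared Pearson correlation between imputed dosages and array genotypes; the abstract does not spell out either definition, and the six genotypes below are invented purely for illustration.

```python
import numpy as np

def concordance_and_r2(imputed_dosage, array_genotype):
    """Compare imputed allele dosages (0..2) with array genotypes (0, 1, 2)."""
    d = np.asarray(imputed_dosage, dtype=float)
    a = np.asarray(array_genotype, dtype=float)
    hard_calls = np.rint(d)                 # nearest of 0, 1, 2
    concordance = np.mean(hard_calls == a)  # fraction of matching calls
    r2 = np.corrcoef(d, a)[0, 1] ** 2       # squared Pearson correlation
    return concordance, r2

# Invented example values, not data from the study.
imputed = [0.05, 0.98, 1.90, 1.10, 0.02, 1.60]
array = [0, 1, 2, 1, 0, 1]
conc, r2 = concordance_and_r2(imputed, array)
print(conc, r2)
```

The two measures can diverge: concordance counts only whether the rounded call matches, whereas R² also penalises dosages that are uncertain but round to the right genotype.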
In a second example we imputed 488 Arabidopsis MAGIC (Multiparent Advanced Generation Inter-Cross) lines sequenced at approximately 0.5x. These are descended from 19 founders3, which were sequenced at over 20x4. We found that it was possible to impute the genomes with and without using a reference panel, to a comparable accuracy of 0.98 concordance with 782 GoldenGate SNP genotypes, using STITCH or a simpler Viterbi-type Hidden Markov Model that used the founder reference panel5.
These results suggest that, in future, sequencing the founders of a multiparental population may not be necessary for accurate imputation, and moreover that such populations can be analysed within the same framework used for natural populations such as humans. Given the attractiveness of HS and MAGIC populations in model organism studies and for crop improvement, this may widen their utility even further.
Harry A Smith1, Paula L. Hoffman2, Spencer Mahaffey3, Lauren Vanderlinden1, Boris Tabakoff3, Laura Saba1
1Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
2Department of Pharmacology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
3Department of Pharmaceutical Sciences, University of Colorado Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, Colorado, 80045, USA
Exploring the influence of genetic background on isoform diversity in rat liver using the Hybrid Rat Diversity Panel
Background Even though the rat has become a widely-used model organism for studying complex disease traits like hypertension and alcohol use disorders, the rat transcriptome remains under-annotated. The human Ensembl transcriptome has over 200,000 annotated transcripts, yet there are only just over 41,000 annotated transcripts in the rat Ensembl transcriptome. Even the mouse Ensembl transcriptome has over 132,000 transcripts, so it is unlikely that the difference in the number of transcripts is due to biology. The goal of this project is to utilize high throughput total RNA sequencing data from a subset of the genetically diverse Hybrid Rat Diversity Panel (HRDP) to reconstruct the rat liver transcriptome.
Methods We sequenced whole liver ribosomal RNA-depleted total RNA from 43 inbred strains of rats including 30 recombinant inbred (HXB/BXH) strains and 13 classic inbred strains from the HRDP. The RNA-Seq reads were aligned to strain-specific genomes, and these genome-aligned reads were used to reconstruct the rat liver transcriptome. Transcripts were classified into one of three categories: 1) known transcript of a known (i.e., Ensembl annotated) gene, 2) novel transcript of a known gene, and 3) novel transcript of a novel gene. We explored various background expression thresholds and compared the power to identify novel transcripts between a larger panel with limited genetic diversity (recombinant inbred) and a smaller panel with more inherent genetic diversity (classic inbred). We estimated the number of inbred strains needed to capture the full transcriptome.
Results The liver transcriptome generated by combining information from all 43 strains contained 328,096 transcripts. Using a conservative estimate of background expression, 114,982 of these transcripts were present in at least one of the 43 strains. Of the “present” transcripts, 15,621 (14%) were known transcripts of known genes and included 14,862 (95% of known transcripts) transcripts designated as protein-coding transcripts by Ensembl. 73,855 (64%) were novel transcripts of known genes and 25,506 (22%) were novel transcripts of novel genes. One of the goals of this experiment was to investigate the impact of combining the RI panel with the panel of classic inbred strains. As we expected, most transcripts were present in both panels. Interestingly, 40% of transcripts were only “present” in one of the two panels. Furthermore, the transcripts that were unique to each panel were more likely to be unannotated when compared with the transcripts present across both panels.
Conclusions When using RNA-Seq to quantitate RNA expression levels in rat, transcriptome reconstruction is essential for capturing the diversity in splicing and alternative transcription start and stop sites. Furthermore, genetic background influences this isoform diversity and these results support the analysis of individual isoforms rather than aggregating RNA expression at the gene level. Supported by NIAAA R24AA013162 and NIDA P30DA044223.
Joshua T Yuan1, Daniel M Gatti2, Vivek M Philip2, Steven Kasparek3, Andrew M Kreuzman4, Benjamin Mansky4, Kayvon Sharif4, Dominik Taterra4, Walter M Taylor4, Mary Thomas4, Jeremy O Ward5, Andrew Holmes6, Elissa J Chesler2, Clarissa C Parker3,4
1Department of Computer Science, Program in Molecular Biology & Biochemistry, Middlebury College, Middlebury VT 05753, USA
2The Jackson Laboratory, 610 Main Street, Bar Harbor, ME 04609, USA
3Department of Psychology, Middlebury College, Middlebury VT 05753, USA
4Program in Neuroscience, Middlebury College, Middlebury VT 05753, USA
5Department of Biology, Program in Molecular Biology & Biochemistry, Middlebury College, Middlebury VT 05753, USA
6Laboratory of Behavioral and Genomic Neuroscience, National Institute on Alcohol Abuse and Alcoholism (NIAAA), US National Institutes of Health (NIH), Bethesda, Maryland, USA
Genome-wide association for testis weight in the diversity outbred mouse population
Testis weight is a complex trait associated with male fertility across numerous species. Phenotypic variation in complex traits such as testis weight is attributed to the combined effect of multiple genes, which may be identified via quantitative trait locus (QTL) mapping. We sought to evaluate the genetically diverse, highly recombinant Diversity Outbred (DO) mouse population as a tool to identify and map QTLs associated with testis weight. We recorded paired testis weights for 502 male DO mice and genotyped subjects on the GigaMUGA array at ~143,000 SNPs. We performed a genome-wide association analysis and identified one significant and two suggestive QTLs associated with testis weight on chromosomes 4, 11, and 12. The use of the highly recombinant DO population greatly improved our ability to refine the size of QTLs from broad chromosomal regions to narrow intervals spanning between 3.44 and 7.55 Mb. Using bioinformatic approaches, we developed a list of prioritized candidate genes and identified those with known roles in testicular size and development. Candidates of particular interest include the RNA demethylase gene Alkbh5, the cyclin-dependent kinase inhibitor gene Cdkn2c, the dynein axonemal heavy chain gene Dnah11, the phospholipase D gene Pld6, the trans-acting transcription factor gene Sp4, and the spermatogenesis-associated gene Spata6, each of which has a human ortholog. Our results demonstrate the utility of DO mice in high-resolution genetic mapping of complex traits, enabling us to identify developmentally important genes in adult mice. Understanding how genetic variation in these genes influences testis weight could aid in the understanding of mechanisms of mammalian reproductive function.
Olivia L Sabik1,2, Gina M Calabrese1, Cheryl L Ackert-Bicknell3, Charles R Farber1,2,4
1Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908
2Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908
3Center for Musculoskeletal Research, University of Rochester Medical Center, University of Rochester, Rochester, NY 14624
4Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908
Using co-expression network analysis to inform genome-wide association studies for bone mineral density
Osteoporosis is a disease characterized by reduced bone mineral density (BMD) and increased bone fragility. About half of the variation in BMD across the human population can be explained by genetic variation, and genome-wide association studies (GWASs) have identified over 200 regions of the human genome containing genetic variants influencing BMD. Despite the wealth of genetic signals identified, the genes and mechanisms through which these loci impact bone remain unknown. Here, we hypothesized that we could identify the causal genes underlying GWAS loci for BMD by analyzing GWAS data in the context of a co-expression network. Our approach was based on the idea that groups of causal genes likely influence BMD through the same biological process (e.g. osteoblast-mediated bone formation) and genes involved in specific biological processes are often co-expressed. We applied weighted gene co-expression network analysis (WGCNA) to genome-wide gene expression profiles from mouse osteoblasts and identified 65 modules of co-expressed genes. Of these, one module was significantly enriched for genes located within BMD GWAS loci (odds ratio (OR) = 3.4, P = 4 x 10^-9) and its expression was significantly correlated with in vitro levels of mineralization (r = 0.49; FDR = 0.012). This module was also enriched for skeletal development genes (OR = 8.58; P < 2.2 x 10^-16) and genes that result in a bone phenotype when knocked out (OR = 4.06; P = 2.14 x 10^-9). Using independent gene expression profiles measured throughout osteoblast differentiation, we observed that genes in this module were expressed in two distinct patterns, either early or late in osteoblast differentiation, and the enrichments listed above were increased in the late differentiation group.
Additionally, we observed that the mean connectivity of GWAS-implicated genes was significantly higher than that of non-GWAS genes, leading us to prioritize genes based on their degree of connectivity within the module. Of the 30 most highly connected genes in this module, 13 had not been previously linked to bone mass, and three, Beta-1,4-N-Acetyl-Galactosaminyltransferase 3 (B4galnt3), Cell Adhesion Molecule 1 (Cadm1), and Solute Carrier Family 8 Member A3 (Slc8a3), were located within GWAS loci for BMD. For all three genes, their expression in humans was influenced by expression quantitative trait loci (eQTL) that colocalized with the BMD associations, and mouse knockouts demonstrated altered BMD. These data have identified the genes responsible for three BMD GWAS associations and provided a resource for understanding the complex network of genes directing bone mass.
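The module enrichment reported in this abstract (an odds ratio with a P value) is the kind of result a 2 x 2 contingency test produces. The abstract does not state which test was used; the minimal sketch below assumes a one-sided Fisher's exact test, and the function name and counts are invented purely for illustration.

```python
from math import comb


def enrichment(in_module_in_loci, in_module_not_loci,
               not_module_in_loci, not_module_not_loci):
    """Odds ratio and one-sided Fisher's exact P value for a 2 x 2 table.

    Assumption: rows are module membership, columns are GWAS-locus membership.
    """
    a, b, c, d = (in_module_in_loci, in_module_not_loci,
                  not_module_in_loci, not_module_not_loci)
    odds_ratio = (a * d) / (b * c)  # undefined when b or c is zero

    n_total = a + b + c + d
    module_size = a + b
    loci_genes = a + c

    # Upper tail of the hypergeometric distribution: probability of seeing
    # at least `a` locus genes in a module of this size by chance.
    max_overlap = min(module_size, loci_genes)
    tail = sum(
        comb(loci_genes, x) * comb(n_total - loci_genes, module_size - x)
        for x in range(a, max_overlap + 1)
    )
    p_value = tail / comb(n_total, module_size)
    return odds_ratio, p_value


# Hypothetical counts, not taken from the study.
odds_ratio, p_value = enrichment(3, 1, 1, 3)
print(f"OR = {odds_ratio:.2f}, one-sided P = {p_value:.4f}")
```

With real data the four counts would come from intersecting the module gene list with the genes located in GWAS loci, against the background of all genes in the network.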
Michael F Scott1, Nick Fradgley2, Alison Bentley2, Keith Gardner2, Phil Howell2, Ian Mackay2, James Cockram2, Richard Mott1
1UCL Genetics Institute, Department of Genetics, Evolution and Environment, Division of Biosciences University College London, Gower Street, London, WC1E 6BT, UK
2NIAB, Genetics and Breeding department, Huntingdon Road, Cambridge, CB3 0LE, UK
Diverse MAGIC wheat: a 16-parent population for QTL mapping
We report on the development of resources for mapping complex traits in a 16-parent MAGIC (Multiparent Advanced Generation Inter-Cross) population of hexaploid bread wheat (Triticum aestivum). The 16 founders include both historical and elite varieties (released between 1935 and 2003) from several Northern European countries. These founders are diverse; according to genotyping array results, they capture 90% of the genetic diversity found in a panel of 519 UK wheat varieties released over the last 40 years. After inter-crossing and six generations of selfing, 595 RILs (Recombinant Inbred Lines) have been produced and genotyped using the Affymetrix 35k Axiom wheat breeders’ SNP array. Due to the presence of three homeologous genomes in this polyploid species, genotyping array results cannot be straightforwardly assigned to genomic positions. We used software specific to 16-parent MAGIC populations to estimate the recombination rates between markers, which allowed us to create linkage groups and then assign chromosomal positions. We then performed QTL mapping for 30 traits, including height, leaf size, seed size, resistance to yellow rust, yield, and phenological traits, which were measured during replicated trials in 2017. Another year of phenotyping will be performed in replicated trial plots in 2018, which are available for community phenotyping at NIAB-Cambridge. The population germplasm and genotypes are also available upon request. Our results demonstrate successful identification of QTL in this population. In addition, we identify linkage groups that experience apparent segregation distortion and/or inheritance as large haplotypes, indicating putative genomic re-arrangements or introgressions. These features are expected to be particularly prevalent in this diverse population.
In some cases, we are able to validate that these features correspond to known introgressions introduced by breeders from rye and T. timopheevii.
David G. Ashbrook1, Robert W. Williams1
1Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
Sequencing the BXD family, a cohort for experimental systems genetics and precision medicine
The BXD mouse genetic reference population is the most deeply phenotyped mammalian model system, with ~6000 phenotypes in GeneNetwork.org, the repository for BXD family data. GeneNetwork allows examination of complex interactions between gene variants, phenotypes from different biological levels, and environmental factors. The family consists of 152 inbred strains, each of which is a unique mosaic of alleles from the C57BL/6J and DBA/2J parents, and it segregates for ~4.5 million known sequence variants. Using the current genotype data, it is possible to achieve mapping precision of under ±2.0 Mb over most of the genome.
We have carried out 40X sequencing of all BXD strains using a Chromium linked-read barcoding strategy. This deep sequencing of ~40 kb DNA fragments has several uses including: identification of structural variants not reliably detected using short read shotgun sequencing; identification of variants unique to each ‘epoch’ of BXD, derived in the last four decades; identification of rare spontaneous mutations; and production of the first ‘infinite marker maps’, allowing even higher precision mapping of phenotypes.
We have confirmed ~4.5 million variants between the DBA/2J and C57BL/6J parents. We have aligned sequences for >50 samples and identified haplotype blocks with greater precision than was possible with microarray-based genotyping. Candidates have been identified for a phenome-wide association study.
This family is an excellent resource for testing networks of causal and mechanistic relations among clinical phenotypes and millions of molecular and organismal traits, including metabolic syndrome, infection, addiction, neurodegeneration, and longevity. Full sequencing of all lines will only increase its usefulness.
Richard A. Radcliffe1,2, Robin Dowell3,4,5, Aaron T. Odell3, Phillip Richmond3, Beth Bennett1, Colin Larson1, Katerina J. Kechris6, Laura M. Saba1, Pratyaydipta Rudra6, Wen J. Shi7
1Dept. of Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO 80045 USA
2Institute for Behavioral Genetics, University of Colorado, Boulder CO 80309 USA
3BioFrontiers Institute, University of Colorado, Boulder, CO 80309 USA
4Dept. of Molecular, Cellular, & Developmental Biology, University of Colorado, Boulder, CO 80309 USA
5Dept. of Computer Science, University of Colorado, Boulder, CO 80309 USA
6Dept. of Biostatistics & Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045 USA
7Dept. of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045 USA
*Corresponding author: firstname.lastname@example.org
G(ene) X E(thanol) interactions on the brain transcriptome of the LXS recombinant inbred mice: Relationship to acute ethanol tolerance
A well-established genetic risk factor for alcohol use disorder (AUD) is blunted sensitivity to acute ethanol, of which an important component is acute functional tolerance (AFT). We have been investigating the genetic basis of AFT and its relationship to drinking behavior with the use of the LXS recombinant inbred mouse strains. Previous LXS studies found a significant genetic correlation between AFT and binge drinking and several QTLs for AFT. Here we present results of full transcript RNA-seq analysis of the brain transcriptome of LXS mice 8 hours after administration of either saline or ethanol (5 g/kg), the same dosing procedure used to test AFT in the LXS. Out of the 14,184 genes expressed above background, 1,989 and 2,016 cis-eQTLs were mapped for the saline and ethanol groups, respectively, with 203 unique cis-eQTLs in the saline group and 284 in the ethanol group. We also mapped 1,081 significant trans-eQTLs for the saline group and 1,328 for the ethanol group; 220 were in common. Over 800 of the trans-eQTLs mapped to “master trans-regulators”, i.e., loci at which more than 4 trans-eQTLs mapped. There were 22 such loci in the saline group and 26 in the ethanol group; up to 34 trans-eQTLs mapped to a given locus. We used a multi-step statistical approach on the full set of 14,184 genes to identify 1,182 “ethanol-responsive” (ER) genes. Enrichment analysis of the ER genes indicated several functional categories including RNA processing, chromatin remodeling, protein processing, and kinase-related signaling. AFT values were correlated to expression abundance of the 14,184 genes to identify 88 genes significantly correlated to AFT in the saline group and 301 in the ethanol group (FDR<0.10).
Enrichment analysis revealed little about these genes; however, a query of the L1000 gene expression signature library indicated that genes correlated to AFT had a profile that overlapped with the profiles of 10 different HDAC inhibitors. This occurred exclusively for AFT from the ethanol group. This was an interesting finding since Hdac1 is cis-regulated and is located at the peak of a significant AFT QTL that is specific to the ethanol group. Hdac1 is also at the locus of a master trans-regulator. The results provide insight into the genetic and mechanistic basis of AFT and, more broadly, contribute to our understanding of ethanol-related GxE interactions on gene expression in the context of AUD. Funding from NIH grants R01AA016957 and R01AA021131.
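The definition of a "master trans-regulator" given in this abstract (a locus at which more than 4 trans-eQTLs map) can be sketched as a simple count per locus. The abstract does not describe how neighbouring eQTL peaks were grouped into loci, so the sketch assumes each trans-eQTL has already been assigned a locus label; the function name, gene names, and locus labels are hypothetical.

```python
from collections import Counter


def master_trans_regulators(trans_eqtls, more_than=4):
    """Return {locus: count} for loci with more than `more_than` trans-eQTLs.

    trans_eqtls: iterable of (gene, locus) pairs, one pair per trans-eQTL.
    Assumption: locus labels are already harmonised (binning is not shown).
    """
    counts = Counter(locus for _gene, locus in trans_eqtls)
    return {locus: n for locus, n in counts.items() if n > more_than}


# Toy data: five genes map to locus_A, two genes map to locus_B.
toy = [(f"gene{i}", "locus_A") for i in range(5)]
toy += [("geneX", "locus_B"), ("geneY", "locus_B")]

hubs = master_trans_regulators(toy)
print(hubs)  # {'locus_A': 5}
```

Running the same function separately on the saline and ethanol eQTL lists would give one set of hub loci per treatment group, which could then be compared.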
Allen W. Cowley, Jr., Alex Dayton, Eric C. Exner, John D. Bukowy, Timothy J. Stodola, Theresa Kurth, Meredith Skelton, Andrew S. Greene, and Allen W. Cowley, Jr.
Department of Physiology, Medical College of Wisconsin
Analysis of phenotypic variability between males and females in 50 inbred rat strains – a legacy of the Program for Genomic Applications (PhysGen)
Despite the striking differences between male and female physiology, female physiology is understudied and there has been an overwhelming male bias in pre-clinical scientific research, with the ratio of male to female animals used in pharmacological research nearly 6:1. One of the oft-cited reasons for male preference by investigators is the common belief that females exhibit greater phenotypic variability due to the estrous cycle. It is reasonable to suppose that varying hormone levels during the estrous cycle would confound the study of many phenotypes in female organisms by increasing the variance in uncontrolled female experimental groups. If present, this increased variance would mean that greater numbers of female animals than male would be required to detect true experimental differences in the magnitude of a phenotype. Given the availability of phenotypic data obtained from the many rat strains studied within our PhysGen arm of the Program for Genomic Applications (pga.mcw.edu), we have compared male and female variance and coefficient of variation in 142 heart, lung, vascular, kidney and blood phenotypes, each measured in hundreds to thousands of individual rats from over 50 inbred strains. Differences in variance between males and females were assessed on a phenotype-by-phenotype basis using the Brown-Forsythe test, a one-way ANOVA on the set of absolute deviations from the median. Differences in magnitude were assessed on a phenotype-by-phenotype basis using a heteroscedastic t-test. Those phenotypes which exhibited a difference in both magnitude and variance, or a difference in variance alone, were then subjected to a retrospective power analysis to determine the sample size necessary to detect the observed differences. We found that 74 of 142 phenotypes from this dataset demonstrated a sex difference in variance.
However, 59% of these phenotypes exhibited greater variance in male rats rather than female. A retrospective power analysis demonstrated that only 16 of 74 differentially variable phenotypes would be detected when using an experimental cohort large enough to detect a difference in magnitude. No overall difference in coefficient of variation between male and female rats was detected when analyzing these 142 phenotypes. We conclude that variability of 142 traits in male and female rats is similar, suggesting that differential treatment of males and females for the purposes of experimental design is unnecessary until proven otherwise, rather than the other way around.
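The Brown-Forsythe test named in this abstract is described there as a one-way ANOVA on absolute deviations from the median. A minimal sketch of that F statistic follows, using only the standard library; the function name and the toy measurements are illustrative assumptions, and the P value (which requires the F distribution) is left out.

```python
from statistics import mean, median


def brown_forsythe_f(*groups):
    """Return (F, df_between, df_within) for the Brown-Forsythe test.

    Each group is a sequence of measurements (e.g. one sex, one phenotype).
    Raises ZeroDivisionError if all deviations within every group are equal.
    """
    # Step 1: absolute deviation of each value from its own group median.
    z = []
    for g in groups:
        centre = median(g)
        z.append([abs(x - centre) for x in g])

    # Step 2: ordinary one-way ANOVA on those deviations.
    n_total = sum(len(zi) for zi in z)
    grand_mean = sum(sum(zi) for zi in z) / n_total
    group_means = [mean(zi) for zi in z]

    ss_between = sum(len(zi) * (m - grand_mean) ** 2
                     for zi, m in zip(z, group_means))
    ss_within = sum((v - m) ** 2
                    for zi, m in zip(z, group_means) for v in zi)

    df_between = len(z) - 1
    df_within = n_total - len(z)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy measurements, not from the PhysGen data.
males = [1, 2, 3, 4, 5]
females = [2, 4, 6, 8, 10]
f_stat, df1, df2 = brown_forsythe_f(males, females)
print(f"F({df1}, {df2}) = {f_stat:.3f}")
```

The statistic is compared against an F distribution with (df_between, df_within) degrees of freedom; a statistics library would normally be used for that final step.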
Hyeonju Kim1, Saunak Sen1, John Lovell2, Thomas Jeunger3
1Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
2Genome Sequencing Center, HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA
3Department of Integrative Biology, University of Texas at Austin, TX, 78712, USA
Multivariate linear mixed models for detecting GxE
We develop a multivariate linear mixed model for detecting gene-environment interactions when there are many environments and we have information annotating the environments. Our prototype example datasets are on segregating plant populations grown in multiple sites in multiple years. We will have information on the weather in each year as well as site-specific information such as latitude. The goal is to find QTLs that depend on latitude, accounting for weather patterns that vary by year. We formulate a linear mixed model where traits can be correlated due to genomewide similarities (genetic kinship) and due to weather similarities ('climate kinship') between environments. We implement an efficient algorithm that uses an Expectation Conditional Maximization (ECM) algorithm in conjunction with an acceleration step. We tested the method for detecting gene by gene interactions in Arabidopsis recombinant inbred lines grown in Sweden and Italy for 3 consecutive years. In this dataset, our performance is comparable to univariate linear mixed models (FaST-LMM). Further evaluation of the method in a switchgrass dataset grown in 10 locations is in progress. A Julia package implementing the methods is in development.
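One way to write a model with both a genetic and a climate kinship, offered here as an illustrative assumption rather than as the authors' exact formulation, treats the trait matrix Y as n lines by m environments:

```latex
Y = X B Z^{\top} + R + E, \qquad
\operatorname{vec}(R) \sim N\!\left(0,\; \tau^{2}\, K_{C} \otimes K_{G}\right), \qquad
\operatorname{vec}(E) \sim N\!\left(0,\; \Sigma \otimes I_{n}\right)
```

In this sketch X (n x p) holds genotype and other line-level covariates, Z (m x q) holds environment-level covariates such as latitude, B (p x q) holds the fixed effects, K_G (n x n) is the genetic kinship matrix, K_C (m x m) is the climate kinship matrix, and Sigma (m x m) is the residual covariance among environments. All symbols are introduced for illustration and do not appear in the abstract.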
Elena Skolnikova1,2, Nicole Chambers1, Karel Chalupsky1, Peter Makovicky1, Blanka Chylikova2, Josef Vcelak3, Bela Bendlova3, Ondrej Seda2, Radislav Sedlacek2, Lucie Sedova1,2
1Czech Centre for Phenogenomics, Institute of Molecular Genetics, Vestec, Czech Republic
2Institute of Biology and Medical Genetics, The First Faculty of Medicine, Charles University, Prague, Czech Republic
3Department of Molecular Endocrinology, Institute of Endocrinology, Prague, Czech Republic
CRISPR/Cas9-targeting of Nme7 gene affects the metabolic syndrome-related parameters and transcriptomic profiles of Sprague Dawley rats
We have previously identified an association of genetic variants in the NME7 (non-metastatic cells 7, nucleoside diphosphate kinase 7) gene with indices of insulin sensitivity and dyslipidemia in two independent human cohorts. We aimed to investigate the role of Nme7 in metabolic syndrome using a targeted rat model.
The CRISPR/Cas9 nuclease system was used to generate Sprague Dawley Nme7 knock-out rats by targeting exon 4 of the Nme7 gene. As rats homozygous for the targeted Nme7 allele were not viable, mostly due to prominent and severe hydrocephalus, we compared metabolic profiles of heterozygous SDNme7+/- male and female adult rats with their wild type littermates (SD).
Body weights of SDNme7+/- males compared to SD males started to be significantly higher at week 9 and this difference was preserved till sacrifice at week 22, but body weight of SDNme7+/- and SD females remained comparable. Significant differences were observed in glucose tolerance on standard diet, where both SDNme7+/- male and female rats had significantly higher glucose levels during intraperitoneal glucose tolerance test resulting in larger area under the glycemic curve. At sacrifice, biochemical measurements revealed that urea, serum creatinine and potassium levels significantly differed due to genotype: urea and creatinine levels were significantly higher only in SDNme7+/- male rats compared to SD males (but no differences were observed in female rats). We observed a genotype-sex interaction for potassium levels where SDNme7+/- male rats had significantly higher potassium levels compared to male wild type littermates and in females the SDNme7+/- had significantly lower potassium levels compared to wild type littermates. There were no differences in organ weights. We performed liver and fat pad transcriptomic analyses of SDNme7+/- and SD males, identifying several dysregulated metabolic and signalling pathways in SDNme7+/- including nodes related to cilia such as Ift140, Ift57, Ttc26, Cep290, Bbs5 genes.
Lauren A. Vanderlinden1, Paula L. Hoffman2, Spencer Mahaffey3, Harry Smith1, Michal Pravenec4, Laura M. Saba3, Boris Tabakoff3
1Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
2Department of Pharmacology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
3Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
4Department of Model Disease, Institute of Physiology of the Czech Academy of Sciences, Prague, Czech Republic.
Genetic Predisposition for Hypertension
Hypertension has a large genetic component, with heritability estimated at around 40% in humans. Many underlying genetic determinants of this complex trait and its associated biological pathways remain unknown. The goal of this analysis is to identify modules of co-expressed genes that predispose to differences in systolic and diastolic blood pressure (SBP/DBP).
Blood pressure, RNA expression levels, and DNA variant information from the HXB/BXH recombinant inbred (RI) panel were combined using weighted gene co-expression network analysis (WGCNA) and a well-established genetical genomic/phenomic approach. This approach identifies candidate modules whose expression correlates with the phenotype and whose expression is associated with the same region of the genome that is associated with the phenotype, i.e., overlap of the physiological quantitative trait loci (QTL) and the module eigengene QTL. The phenotype for this analysis was telemetric daytime blood pressure (both SBP and DBP). The DNA variants were a genetic marker set (SNPs) from the STAR consortium. RNA expression levels were derived from sequencing of polyA-selected RNA from left ventricles of the HXB/BXH panel, originally collected by the EURATRANS Consortium. RNA-Seq reads were quantitated at the Ensembl gene level, normalized to remove unwanted variance, and transformed in preparation for WGCNA.
The DBP and SBP both exhibited relatively high heritability in the HXB/BXH panel (35.8% and 56.1%, respectively). Three suggestive/significant QTLs for SBP were identified and 2 QTLs for DBP. The 15,161 Ensembl genes expressed in the left ventricle of the RI panel were clustered into 1,316 co-expression modules. Using our integrative systems genetics approach, a candidate module was identified for SBP. The candidate module contains 8 genes located in a variety of regions on the genome but the expression of the module and SBP is controlled from a region on chromosome 1. Il22ra1 is the most connected gene within the module and codes for part of a known cytokine receptor. Other genes in the module include Itpka, whose protein product is involved in calcium signaling, Casq1, which codes for a calcium sensor in cardiac muscle, and Bambi which codes for a pseudoreceptor limiting the TGF-beta signaling pathway.
Using a systems genetics approach, we were able to identify a set of co-expressed genes associated with SBP. Overall, the functional overlap of genes within this module suggests calcium signaling in the left ventricle as an important mechanism involved in the predisposition to hypertension.
Supported by NIAAA R24AA013162 and NIDA P30DA044223.
Qian Zhang1, Gina Calabrese1, Angela Verado2, Larry D. Mesner2, Thomas L. Clemens1, Charles R. Farber2,3
1Orthopedic Surgery Department, Johns Hopkins University, Baltimore, MD
2Center for Public Health Genomics, University of Virginia, Charlottesville, VA
3Departments of Public Health Sciences and Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA
MARK3 is the causal gene for a pleiotropic GWAS locus affecting osteoporosis and obesity
Genome-wide association studies (GWASs) have identified thousands of loci associated with complex diseases; however, little progress has been made towards identifying the responsible genes. A recent large-scale human GWAS of bone mineral density (BMD), the single strongest predictor of osteoporotic fracture, identified a locus on Chr14q32.32. To identify the gene responsible, we used a network-based strategy to predict that MAP/microtubule affinity-regulating kinase 3 (MARK3) was causal and regulated BMD through a role in bone-forming osteoblasts. MARK3 is a conserved serine/threonine kinase known to regulate cellular bioenergetics but not previously implicated in the regulation of bone mass. Our prediction was supported by MARK3 expression quantitative trait loci (eQTL) in multiple human tissues that colocalized with the BMD locus. From the direction of the eQTL effect, we predicted that decreased MARK3 would be associated with increased BMD. To validate these predictions, we characterized the phenotypes of mice lacking Mark3 globally (Mark3-/-) or selectively in osteoblasts (Mark3oc-cre/-). Male 12-week-old mice from both mutant cohorts had identical femoral morphometry characterized by progressively increased BMD exclusively in the cortical compartment. Biomechanical measurements also indicated that femurs from knockout mice were stronger than controls. It has been shown previously that Mark3-/- mice are lean and glucose tolerant when challenged with a high-fat diet. To determine if this is also true in humans, we used GWAS data on body mass index (BMI) and determined that the same variants associated with BMD also influenced BMI. Consistent with the mouse data, alleles associated with increased BMD were associated with decreased BMI.
Interestingly, when Mark3oc-cre/- mice were challenged with a high-fat diet, mutant mice gained less weight than controls and exhibited increased glucose tolerance and insulin sensitivity. These data strongly suggest that the metabolic phenotype was secondary to an intrinsic function of Mark3 in the osteoblast. These findings identify MARK3 as the gene responsible for a pleiotropic locus affecting BMD and BMI in humans and highlight the importance of osteoblasts in the regulation of both bone mass and metabolism. These data also suggest that therapeutic inhibition of MARK3 could have positive impacts on both bone fragility and obesity.
Laura M. Saba1, Paula L. Hoffman1,2, Boris Tabakoff1
1Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
2Department of Pharmacology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA
Systems genetics and the ‘linc’ to alcohol consumption
We have used an integrative systems genetics approach that takes into account genetic variation and the effect of variation on phenotype and gene expression levels to elucidate the genetic basis for the complex trait of alcohol consumption. With this approach, we identified a co-expression network associated with predisposition to voluntary alcohol consumption in the two-bottle choice paradigm using a subset of the Hybrid Rat Diversity Panel (HRDP) and several pairs of rat selected lines. Based on common functionalities among genes, this network provided insight into the biological mechanisms important for differences in alcohol consumption. At the center of this network was an unannotated, likely non-coding, transcript. Further statistical interrogation of this unannotated transcript indicated that much of the co-expression among other genes in the candidate network was due to shared regulation by the unannotated transcript, which we refer to as LRAP (locus regulating alcohol preference) and which resides within a behavioral QTL for alcohol consumption. The integral role of this transcript in alcohol consumption was validated by developing a CRISPR/Cas9 “knockout” of the transcript and assessing its effect on alcohol consumption. The reduction of LRAP expression exerted a dominant effect on alcohol consumption, i.e., the knockout genotype and the heterozygous genotype exhibited a similar increase in consumption compared to the wildtype rats. qRT-PCR assessment of the three other tightly connected genes from the original network demonstrated a similar dominant effect on P2rx4, purinergic receptor P2X 4. Assessment of the entire brain transcriptome in the knockout rat model via RNA-Seq validated predicted changes in expression of other transcripts derived from the original HRDP data and provided insight into the influence of the non-coding transcript on transcription of protein-coding genes.
The combination of the HRDP and the systems genetics techniques represents a powerful tool for modeling the complex relationships between DNA, RNA, and alcohol-related phenotypes. Supported by NIAAA R24AA013162, NIDA P30DA044223 and the Banbury Fund.
Tristan V de Jong1, Victor Guryev1, Yury M. Moshkin2
*Contact e-mails: email@example.com; firstname.lastname@example.org
1European Research Institute for the Biology of Ageing, University of Groningen, Groningen, 9713AV, The Netherlands
2Department of Genetic Resources of Experimental Animals, Institute of Cytology and Genetics, Novosibirsk, 630090, Russia
Murine models implicate DNA sequence context as main determinant of robust gene expression
Variation of gene expression significantly deviates from a stochastic Poisson process. Here we show that dispersions of RNA synthesis and processing rates are correlated and largely depend on DNA-sequence context.
We developed models based on nucleotide sequences to predict the level of expression robustness for mouse and rat genes with unexpectedly high accuracy. We observed that high GC content downstream of the transcription start site, that may require increased energy for DNA melting, is a typical feature of non-robust gene expression. This, in turn, might be responsible for non-productive accumulation of RNA-polymerase II on such genes followed by its premature displacement.
Promoter architecture, DNA methylation, nucleosome phasing and their post-translational modifications are also predictive of relative gene robustness, albeit to a lesser extent. Metabolic cues and ageing are capable of modulating the variability of expression. We show that, in murine models, a high-fat diet scales up the dispersions of RNA splicing/degradation rates, but not synthesis rates. Ageing tends to increase overdispersion for most genes.
Thus, we conclude that relative robustness of gene expression is largely intrinsically encoded, while its average level can be tuned extrinsically.
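The departure from a Poisson process mentioned above can be illustrated with a simple variance-to-mean ratio (Fano factor), which equals 1 for Poisson counts and exceeds 1 under overdispersion. The following is a minimal sketch on simulated counts, not the authors' model; the function name `fano_factor` and the simulation parameters are assumptions made for illustration only.

```python
import numpy as np

def fano_factor(counts):
    """Variance-to-mean ratio of a vector of counts.

    A value near 1.0 is what a Poisson process would give;
    values well above 1.0 indicate overdispersion.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    if mean == 0:
        raise ValueError("mean count is zero; the Fano factor is undefined")
    return counts.var(ddof=1) / mean

# Simulated data (assumption): two hypothetical genes with the same mean count.
rng = np.random.default_rng(0)
poisson_counts = rng.poisson(lam=50, size=5000)
# Negative binomial with mean 50 and extra variance (theoretical Fano factor 11).
overdispersed_counts = rng.negative_binomial(n=5, p=5 / (5 + 50), size=5000)

print("Poisson-like gene:", round(fano_factor(poisson_counts), 2))
print("Overdispersed gene:", round(fano_factor(overdispersed_counts), 2))
```

In this sketch a gene whose ratio stays close to 1 would be called robust, and a gene with a much larger ratio non-robust; any real analysis would also need to account for technical noise and sequencing depth.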
Zachary Sloan1, Pjotr Prins1, Frederick Muriithi2, Lei Yan2
1Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee
Recent GeneNetwork 2 improvements and the path forward
The web service GeneNetwork was originally developed as an easy-to-use collection of data sets and analysis tools for the scientific community. GeneNetwork 2 both expands upon the features of GeneNetwork and provides a more robust platform for the integration of stand-alone tools (for example, the genome analysis software GEMMA). This makes future collaboration with others far easier, as third-party software can be run outside of the web server and GeneNetwork 2 only needs to be concerned with providing a user interface and displaying the output.
Other recent improvements include updating our basic statistics figures (histogram, probability plot, etc.) and correlation scatterplots to be interactive and exportable. User information and trait collections are also now indexed using ElasticSearch, a RESTful search and analytics engine. In the near future we plan to add several new features, such as partial correlations and a more interactive mapping results figure. We also plan to add an interface for data submission and a more customizable user experience (for example, allowing a user to save settings for mapping or other features).
This talk will showcase and elaborate upon this progress and our future plans.
Aron M Geurts1,2,3, Rebecca Schilling1,2,3, Michael Grzybowski1,2,3, Anne Temple1,2,3, Allison Zappa1,2,3, Lynn Lazcares1,2,3, Jessica Niebuhr1,2,3, Shawn Kalloway2, Jamie Foeckler2, Akiko Takizawa1,2,3, Andrew S. Greene3, Allen W Cowley, Jr.3, Mingyu Liang3, David Mattson3, Melinda R Dwinell1,2,3
1Genome Editing Rat Resource Center, Medical College of Wisconsin, Milwaukee, WI 53226, U.S.A.
2Genome Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI, 53226, U.S.A.
3Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, U.S.A.
Mechanisms of blood pressure control: impact of rat genomic and transgenic model resources
The field of rat genomics and its application to physiological investigations of complex traits such as blood pressure control has a productive history. The fine mapping of quantitative traits to single genes often leads investigators down new and challenging research avenues where they never thought they would find themselves. The advent of new transgenic and gene editing technologies, along with the thoughtful building of rat model resources, arrived at the right time to facilitate the validation of candidate genes. Often, we have seen the correlated orthologous human phenotype-genotype associations come in the years that follow. The speed and elegance of gene editing, particularly in rodent models, has also allowed the reverse approach (mapping in humans and validation in the rat model) to improve our understanding of human genetic risk while uncovering complex genetic interactions. This presentation will navigate the impact of, and examples where, classic and modern genomic approaches have led a highly collaborative group of investigators toward narrowing in on translational mechanisms in blood pressure regulation.
Pjotr Prins1, Christian Fischer1, Prasun Anand1
1University of Tennessee Health Science Center
Linear Mixed Models: GEMMA advances in Genome-wide association mapping
Linear mixed models (LMMs) are typically applied for genome-wide association (GWA) mapping in outbred populations and are also applied for fine mapping in heterogeneous stock (HS) rats and mice. Even with small inbred populations, LMMs can be applied for exploratory searches when SNP data are available.
In the context of GeneNetwork (http://gn2.genenetwork.org/), a free software framework for web-based genetics, we have been developing tooling for LMM fine mapping. We have been improving the GEMMA code base by fixing bugs and improving diagnostics. We have added new features, such as support for leave-one-chromosome-out (LOCO) analysis and estimating significance levels. We have a new GEMMA extension written in the performant D programming language so computations can be scaled up. We are also creating a dynamic GWA browser which allows for interactive visualisation of GWA data and can be run standalone, i.e., within a published paper or website.
In this talk I'll discuss LMM advances, scalability and reproducibility challenges, and how it applies to mouse and rat research.
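The leave-one-chromosome-out idea can be sketched as follows: when testing markers on a given chromosome, the kinship matrix used in the LMM is computed from markers on all other chromosomes, so the tested marker does not also contribute to the random effect. This is a minimal NumPy illustration under assumed inputs, not GEMMA's implementation; the function name `loco_kinship` and the centered-genotype kinship formula are choices made here for illustration.

```python
import numpy as np

def loco_kinship(genotypes, chromosomes):
    """Return {chromosome: kinship matrix built from all OTHER chromosomes}.

    genotypes   -- array of shape (n_individuals, n_markers), e.g. allele dosages
    chromosomes -- sequence of length n_markers giving each marker's chromosome
    """
    genotypes = np.asarray(genotypes, dtype=float)
    chromosomes = np.asarray(chromosomes)
    if genotypes.shape[1] != chromosomes.shape[0]:
        raise ValueError("one chromosome label is required per marker")
    # Center each marker so the kinship reflects covariance between individuals.
    centered = genotypes - genotypes.mean(axis=0)
    kinships = {}
    for chrom in sorted(set(chromosomes.tolist())):
        keep = chromosomes != chrom          # drop markers on the tested chromosome
        x = centered[:, keep]
        kinships[chrom] = x @ x.T / x.shape[1]
    return kinships

# Toy example (assumption): 3 individuals, 4 markers on chromosomes 1 and 2.
toy_genotypes = [[0, 1, 2, 2],
                 [1, 1, 0, 2],
                 [2, 0, 1, 0]]
toy_chromosomes = [1, 1, 2, 2]
toy_kinships = loco_kinship(toy_genotypes, toy_chromosomes)
print(toy_kinships[1])  # kinship for testing chromosome 1, built from chromosome 2 markers
```

Each returned matrix would then be supplied to the mixed model when scanning the corresponding chromosome.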
Wei-Zen Wei1, Heather M. Gibson1, Richard F. Jones2, Joyce Reyes2, Gregory Dyson2, Dan Gatti2, Kuang Wei and Claire McCarthy1,2
*Corresponding author: email@example.com
1Karmanos Cancer Institute, Department of Oncology, Wayne State University, Detroit, MI, USA
2The Jackson Laboratory, Bar Harbor, Maine, USA
Progression of HER2/Neu induced spontaneous tumors and immune response to HER2 vaccination in Diversity Outbred (DO) F1 mice
Breast cancer remains the second leading cause of cancer death in women, with ~25% of the tumors expressing elevated levels of HER2, an oncogene. Regulation of HER2 induced spontaneous tumorigenesis is investigated in F1(DOxNeuT) transgenic (Tg) mice. NeuT is a transforming mutant of rat Neu, a homologue of human HER2. All parental NeuT males develop a salivary tumor and 0-2 mammary tumors. In F1(DOxNeuT) males, however, only 2 of 22 mice develop salivary tumors, but all mice develop mammary tumors (1-5 tumors per mouse). Most male mammary tumors appear in the thoracic, not the inguinal fat pads. In female NeuT (n=38) and F1(DOxNeuT) (n=48) mice, mammary tumors appear in all thoracic and inguinal glands. Six of the F1 females show mammary tumors between 10-14 wks of age while parental NeuT females do not develop mammary tumors before 14 wks. Therefore, tissue microenvironment may be altered in F1 mice to accelerate tumorigenesis in the fat pad, while preventing tumorigenesis in the salivary gland. The impact of genetic background on tumor progression revealed in DO F1 mice warrants the identification of regulatory genes. Additional mice are being generated to enable QTL analysis.
To identify the genetic determinants of cancer vaccine response, F1(DOxhuman HER2) Tg mice are vaccinated with Adv/HER2. Anti-HER2 IgG levels vary greatly in F1 compared to parental HER2 Tg mice, but the response levels are consistently boosted by prior depletion of regulatory T cells (Tregs), implicating Treg-associated genes as key modulators. GigaMUGA QTL analysis of ~100 mice showed suggestive, but not definitive, association with some candidate genes. More mice are entering the study to further the linkage analysis. Meanwhile, from the reconstructed haplotype map and TaqMan analysis, the Treg-associated gene TIM3 of the CAST/EiJ founder genotype shows an association with the vaccine response. Analysis of 84 vaccinated DOxHER2 Tg mice using the tag SNP rs28230613 shows that mice with homozygous CC produce HER2 Ab at 26.9±28.9 µg/mL, and those with heterozygous CA (CAST/EiJ genotype) at 12.8±19.4 µg/mL (p=0.0075). The TIM3 promoter from the CAST strain shows altered activity by luciferase expression assay. Given the analytical capability of DO mice, regulators of tumor progression and vaccine response may be identified by QTL or targeted gene analysis to help guide the design and administration of cancer therapy. Supported by US NIH CA76340, Helen Kay Trust Fund and Herrick Endowment.
1University of Michigan
Genetics of Novelty Seeking and Propensity for Drug Abuse in Outbred Rats
Understanding individual differences in vulnerability to substance abuse and addiction constitutes a long-standing challenge for clinicians and researchers. We are studying the genetic and functional basis of novelty-seeking behavior in two rat lines that offer a uniquely powerful model for understanding neural mechanisms of drug seeking, addiction, and relapse. After 37 generations of selection for high and low propensity to explore a mildly stressful novel environment, the bred High Responders (bHRs) and Low Responders (bLRs) show contrasting, heritable behaviors. Compared to bLRs and outbred rats, bHRs exhibit higher novelty-seeking and impulsive behaviors, lower anxiety, greater propensity to psychostimulant sensitization, and lower thresholds for drug- and cue-induced relapse. The bLRs exhibit anxious and depressive behaviors and are more responsive to psychosocial stress, which triggers drug-seeking behavior. The two lines exemplify extremes of emotional reactivity that map onto human temperamental differences and underlie two paths to drug abuse: novelty seeking and reactivity to psychosocial stress. We hypothesize that functional DNA variants in some genes, initially derived from outbred Sprague-Dawley (SD) founders, account for the current molecular and behavioral divergence of the two lines. By identifying these causal genes through mapping of quantitative trait loci in an F2 cohort already collected and SD animals representing the founders, we expect to find functional alleles at multiple genes that existed in the SDs and have evolved further apart in the two lines. Many of these genes may be directly relevant to the corresponding human phenotypes or at least provide clues to important pathways that could explain or predict the differential vulnerability to addiction and relapse in humans.
Integrating the genotyping data with RNAseq gene expression results in specific brain regions will enable us to relate the genetic, neural, and behavioral facets that contribute to addiction liability and translate this knowledge to more precise and effective treatment for patients.
Hao Chen1, Victor Guryev2, Megan K. Mulligan3, David Ashbrook3, Eva Redei4, Robert W. Williams3
1Department of Pharmacology, University of Tennessee Health Science Center, Memphis, TN, USA
2European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, Groningen 9713AD, The Netherlands
3Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
4Department of Psychiatry and Behavioral Sciences, Feinberg School of Medicine, Northwestern University Chicago, IL, USA
De novo assembly of the genome of a genetic rat model of depression and its control strain
Major depressive disorder (MDD) is a leading cause of disability worldwide. Heritability of MDD is estimated to be between 28–44%, although the causal DNA variants underlying depression remain elusive. The WKY rat strain is a well-established model of depression. Dr. Redei obtained nearly inbred WKY stock from Harlan Laboratories in the mid-1990s and selectively bred animals based on behavioral differences during the forced swim test. She generated two very closely related Wistar substrains characterized by More or Less Immobility during the test (the WMI and WLI lines). Both lines are now fully inbred (>35 F generations). Behaviors of WMI resemble facets of human MDD and anxiety, including depressed mood, disturbed sleep, appetite, etc. We previously described the full genome sequences of these two substrains and identified ~4,400 segregating SNPs and small indels between them. Most are located in noncoding regions but there are a small number of intriguing variants located in exons (e.g., Pclo) or splice sites (e.g., Rab1a, Slc01a2, Ryr3, Lyg1, and Nap1l1). We validated these results by randomly selecting 70 variants for Sanger re-sequencing. PCR products were obtained for 51 regions and 49 polymorphic variants were confirmed. Using a set of high molecular weight 10x Chromium sequencing data (~30x each strain), we identified ~40 large structural variants between each strain and the reference genome (rn6). Given the minimal differences between these two genomes, we combined their 10x Chromium data for de novo assembly using the Supernova tool. We obtained 8.1 K scaffolds greater than 10 kb and the draft assembly was 2.42 Gb. N50 was 35.6 Kb for the contigs and 1.6 Mb for the scaffolds. The majority of the reference genome was covered, including some gaps in rn6 assembly. We are exploiting these scaffolds for opportunities to improve the reference genome.
A detailed comparison between these two genomes provides a unique opportunity to identify causal genes and potentially new mechanisms that modulate depression and related behavioral traits.
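For readers unfamiliar with the N50 statistic reported above, it is the length L such that sequences of length L or longer contain at least half of the total assembled bases. The sketch below shows the standard calculation on made-up lengths; the function name `n50` and the example values are illustrative assumptions and are not taken from the assembly described here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length

# Made-up lengths (assumption), in arbitrary units.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))  # 8: the two longest sequences already hold half of the 32 units
```

Because N50 depends only on the length distribution, it summarizes contiguity and says nothing about whether the sequences are assembled correctly.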
Jennifer R. Smith1, Stanley J. Laulederkind1, G. Thomas Hayman1, Shur-Jen Wang1, Matthew J. Hoffman1, Elizabeth R. Bolton1, Monika Tutaj1, Jyothi Thota1, Marek A. Tutaj1, Jeffrey L. De Pons1, Melinda R. Dwinell1, Mary E. Shimoyama1
1Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
PhenoMiner: a multi-species platform for quantitative phenotype data
Phenotype is defined as any one trait, or any group of traits, which contribute to the physical, biochemical, and physiological makeup of an individual as determined by both genetic composition and environmental influences. As such, the information needed to fully describe a measurement of a phenotype or trait should include both an indication of the genetics of the organism being assessed and information about environmental influences that might affect the measurement. The Rat Genome Database (RGD, http://rgd.mcw.edu) has developed a system to standardize and fully describe quantitative phenotype measurements using a suite of controlled vocabularies. These quantitative phenotype records, available through the PhenoMiner tool (http://rgd.mcw.edu/phenotypes/), include information on the trait being assessed (Vertebrate Trait Ontology), the exact measurement that was made (Clinical Measurement Ontology), the method used to make that measurement (Measurement Method Ontology) and the condition(s) under which the measurement was made (Experimental Condition Ontology). For rat data, the sample measured (and, by extension, its associated genetics) is presented as information on rat strain (Rat Strain Ontology) and the number of individuals included in the measurement with their sex and ages. In addition to rat data, and in keeping with RGD's expansion to include additional model species, PhenoMiner has been expanded to store and display data for quantitative phenotypes in chinchilla. As a proof of concept, two initial datasets have been loaded: ear infection data from the Kerschner group at the Medical College of Wisconsin, and urinalysis data from Dr. Christoph Mans at the School of Veterinary Medicine, University of Wisconsin, Madison. Chinchillas used for research are not inbred and are not generally bred in-house by the researchers using them. Rather, they are purchased from commercial breeders or chinchilla "farms" where the animals are generally maintained by outbreeding within colonies of limited size. As such, a controlled vocabulary of chinchilla sources was developed to indicate measurements from animals that would be more or less related. Since the underlying genetics of the animals would be divergent, data are presented for single animals, not grouped or averaged across animals. Plans for future developments include the incorporation of data for additional species, as well as the use of PhenoMiner's underlying structure for other types of data such as cell line measurements and expression data.
M. Hodúlová, L. Šedová, B. Chylíková, M. Krupková, D. Křenová, V. Křen, O. Šeda
Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
Genomic, transcriptomic and miRNomic analysis of PXO recombinant inbred rat models of metabolic syndrome
We have previously established a genetically designed set of recombinant inbred rat strains, PXO, as a model of metabolic syndrome. The aim of this study was to compare PXO substrains showing contrasting metabolic syndrome parameters at the genomic, transcriptomic and miRNomic levels.
At the genomic level, we compared > 20,000 SNPs between PXO3-1 and PXO3-2. Both RNA and miRNA were isolated from liver, visceral adipose tissue and muscle (m. soleus) of standard diet-fed adult males of both strains, and their integrity was checked by Agilent 2000 BioAnalyzer. The transcriptomic and miRNomic assays were run using Affymetrix® Rat Gene 2.1 ST Array Strip and Affymetrix® miRNA 4.1 Array Strip. The resulting data were subjected to systems biology-level analyses using Partek Genomics Suite and Ingenuity Pathways Analysis.
PXO3-2 showed impaired glucose tolerance, higher TG and HDL-cholesterol, and lower adiposity compared to PXO3-1. At the genomic level, we identified polymorphic regions on chromosomes 1, 3, 5, 8, 12, 16 and 19, totalling 3.2% of the genome. After correction for multiple comparisons, there were 1133, 236 and 29 differentially expressed transcripts between PXO3-1 and PXO3-2 in liver, visceral adipose tissue and muscle, respectively. Integrative pathway analysis revealed networks likely to underlie the observed metabolic differences, with the Ppara, Ppard, Insig2, Por and Srebf2 genes as their major nodes.
Using an integrative approach of genomic, transcriptomic and miRNomic analysis, we have identified major biological networks contributing to the pathophysiology of several aspects of metabolic syndrome in the recombinant inbred model set.