OPAR - OSGA publications

OPAR - List of publications

Index	Year	Title	Authors	Citation	PMID	Software link	Abstract
01	biorxiv	SVJAM: Joint Analysis of Structural Variants Using Linked Read Sequencing Data	Mustafa Hakan Gunturkun, Flavia Villani, Vincenza Colonna, David Ashbrook, Robert W. Williams, Hao Chen	bioRxiv 2021.11.02.467006	bioRxiv 2021.11.02.467006		Linked-read whole genome sequencing methods, such as the 10x Chromium, attach a unique molecular barcode to each high molecular weight DNA molecule. The samples are then sequenced using short-read technology. During analysis, sequence reads sharing the same barcode are aligned to adjacent genomic locations. The pattern of barcode sharing between genomic regions allows the discovery of large structural variants (SVs) in the range of 1 Kb to a few Mb. Most SV calling methods for these data, such as LongRanger, analyze one sample at a time and often produce inconsistent results for the same genomic location across multiple samples. We developed a method, SVJAM, for joint calling of SVs, using data from 152 members of the BXD family of recombinant inbred strains of mice. Our method first collects candidate SV regions from single sample analysis, such as those produced by LongRanger. We then retrieve barcode overlapping data from all samples for each region. These data are organized as a high-dimensional matrix. The dimension of this matrix is then reduced using principal component analysis. Samples projected onto a two-dimensional space formed by the first two principal components form two or three clusters based on their genotype, representing the reference, alternative, or heterozygotic alleles. We developed a novel distance measure for hierarchical clustering and rotating the axes to find the optimal clustering results. We also developed an algorithm to decide whether the pattern of sample distribution is best fitted with one, two, or three genotypes. For each sample, we calculate its membership score for each genotype.
We compared results produced by SVJAM with LongRanger and a few methods that rely on PacBio or Oxford Nanopore data. In a comparison of SVJAM with SVs detected using long-read sequencing data for the DBA/2J strain, we found that our results recovered many SVs missed by LongRanger. We also found that many SVs called by LongRanger were assigned an incorrect SV type. Our algorithm also consistently identified heterozygotic regions.
02	biorxiv	Flexible multivariate linear mixed models for structured multiple traits	Hyeonju Kim, Gregory Farage, John T. Lovell, John K. Mckay, Thomas E. Juenger, Śaunak Sen	bioRxiv 2020.03.27.012690	bioRxiv 2020.03.27.012690	Software link	Many genetic studies collect structured multivariate traits containing rich information across traits. We present a flexible multivariate linear mixed model for quantitative trait loci mapping (FlxQTL) for multiple correlated traits that adjusts for genetic relatedness and that models information on multiple environments or multiple timepoints using trait covariates. FlxQTL handles genetic mapping of multivariate traits faster with greater flexibility compared to previous implementations.
03	biorxiv	Opiate responses are controlled by interactions of Oprm1 and Fgf12 loci in the murine BXD family: Correspondence to human GWAS findings	Paige M. Lemen, Alexander S. Hatoum, Price E. Dickson, Guy Mittleman, Arpana Agrawal, Benjamin C. Reiner, Wade Berrettini, David G. Ashbrook, Mustafa Hakan Gunturkun, Megan K. Mulligan, Robert W Williams, Hao Chen	bioRxiv 2022.03.11.483993	bioRxiv 2022.03.11.483993		We analyzed time-dependent behavioral responses to morphine and naloxone obtained from a large family of young adult BXD mice (n = 63–64 strains, including C57BL/6J and DBA/2J parents, 4–9 cases per strain) using the latest whole genome sequencing (WGS)-based genetic markers. These data include quantitative locomotor and behavior responses measured three hours after an acute morphine injection (50 mg/kg i.p.), followed by naloxone-induced withdrawal, obtained by Philip et al. (2010). Locomotor data were analyzed in 15 min bins and mapped jointly for both sexes or independently for males and females. We confirmed a highly significant association between locomotor response and a genomic region that overlaps with Oprm1 on Chr 10 at 6.8 Mb (LOD maximum of 11.4) between 15–105 min, with a peak at 75 min. Effects were modestly dependent on sex. Strains that were B homozygotes ran 76 meters farther than those that were D homozygotes after a morphine injection. We discovered a novel association between a locus on Chr 16 and a late phase locomotor response (after 150 min) in both sexes. This locus had a significant but transient epistatic interaction with the Oprm1 locus between 45–90 min, well before the main effect was detectable. The Chr 16 locus includes one compelling candidate: fibroblast growth factor 12 (Fgf12). Null mutation of Fgf12 has been shown to cause locomotor deficits (e.g., ataxia).
Analysis of genes correlated with both OPRM1 and FGF12 in human GWAS data of six brain regions (GTEx, v8) demonstrated an enrichment of genetic signals associated with SUD phenotypes, and a modest corroboration of variants in the FGF12 loci on Chr 3q28. To the best of our knowledge this is the first demonstration of a transient time-dependent epistatic interaction modulating drug response in mammals—a finding with interesting mechanistic implications. Finally, this work demonstrates how high-quality FAIR+ data can be used with newly acquired data sets to yield striking results, and how joint mouse and human neurogenomic and mapping data can be merged at gene and network levels for bidirectional validation of potential SUD variants and molecular networks.
04	biorxiv	Computational approaches towards reducing contamination in single-cell RNA-seq data	Siamak Yousefi, Hao Chen, Jesse Ingels, Arthur G. Centeno, Sumana Chintalapudi, Megan K. Mulligan, Bryan Jones, Pete A. Williams, Simon WM John, Felix L. Struebing, Eldon E. Geisert, Monica M. Jablonski, Lu Lu, Robert W. Williams	bioRxiv 2020.07.15.205062	bioRxiv 2020.07.15.205062		Single cell RNA sequencing has enabled quantification of single cells and identification of different cell types and subtypes as well as cell functions in different tissues. Single cell RNA sequence analyses assume acquired RNAs correspond to cells; however, RNAs from contamination within the input data are also captured by these assays. The sequencing of background contamination as well as unwanted cells making their way to the final assay potentially confound the correct biological interpretation of single cell transcriptomic data. Here we demonstrate two approaches to deal with background contamination as well as profiling of unwanted cells in the assays. We use three real-life datasets of whole-cell capture and nucleotide single-cell captures generated by Fluidigm and 10x technologies and show that these methods reduce the effect of contamination, strengthen clustering of cells and improve biological interpretation.
05	arXiv	Sparse matrix linear models for structured high-throughput data	Liang JW, Sen S	arXiv:1712.05767 [stat.CO]	arXiv:1712.05767 [stat.CO]	Software link	Recent technological advancements have led to the rapid generation of high-throughput biological data, which can be used to address novel scientific questions in broad areas of research. These data can be thought of as a large matrix with covariates annotating both rows and columns of this matrix. Matrix linear models provide a convenient way for modeling such data. In many situations, sparse estimation of these models is desired. We present fast, general methods for fitting sparse matrix linear models to structured high-throughput data. We induce model sparsity using an L1 penalty and consider the case when the response matrix and the covariate matrices are large. Due to data size, standard methods for estimation of these penalized regression models fail if the problem is converted to the corresponding univariate regression scenario. By leveraging matrix properties in the structure of our model, we develop several fast estimation algorithms (coordinate descent, FISTA, and ADMM) and discuss their trade-offs. We evaluate our method's performance on simulated data, E. coli chemical genetic screening data, and two Arabidopsis genetic datasets with multivariate responses. Our algorithms have been implemented in the Julia programming language and are available at this https URL.
06	2023	RCFGL: Rapid Condition adaptive Fused Graphical Lasso and application to modeling brain region co-expression networks.	Seal S, Li Q, Basner EB, Saba LM, Kechris K.	PLoS Comput Biol. 2023 Jan 6;19(1):e1010758.	PMID: 36607897	Software link	Inferring gene co-expression networks is a useful process for understanding gene regulation and pathway activity. The networks are usually undirected graphs where genes are represented as nodes and an edge represents a significant co-expression relationship. When expression data of multiple (p) genes in multiple (K) conditions (e.g., treatments, tissues, strains) are available, joint estimation of networks harnessing shared information across them can significantly increase the power of analysis. In addition, examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. Condition adaptive fused graphical lasso (CFGL) is an existing method that incorporates condition specificity in a fused graphical lasso (FGL) model for estimating multiple co-expression networks. However, with computational complexity of O(p²K log K), the current implementation of CFGL is prohibitively slow even for a moderate number of genes and can only be used for a maximum of three conditions. In this paper, we propose a faster alternative of CFGL named rapid condition adaptive fused graphical lasso (RCFGL). In RCFGL, we incorporate the condition specificity into another popular model for joint network estimation, known as fused multiple graphical lasso (FMGL). We use a more efficient algorithm in the iterative steps compared to CFGL, enabling faster computation with complexity of O(p²K) and making it easily generalizable for more than three conditions. We also present a novel screening rule to determine if the full network estimation problem can be broken down into estimation of smaller disjoint sub-networks, thereby reducing the complexity further.
We demonstrate the computational advantage and superior performance of our method compared to two non-condition adaptive methods, FGL and FMGL, and one condition adaptive method, CFGL in both simulation study and real data analysis. We used RCFGL to jointly estimate the gene co-expression networks in different brain regions (conditions) using a cohort of heterogeneous stock rats. We also provide an accommodating C and Python based package that implements RCFGL. Keywords: co-expression networks; software; RNA sequencing, RNA-Seq
07	2022	Hippocampal RNA sequencing in mice selectively bred for high and low activity	Booher WC, Vanderlinden LA, Hall LA, Thomas AL, Evans LM, Saba LM, Ehringer MA	Genes Brain Behav. 2022 Dec 13:e12832.	PMID: 36514243		High and Low Activity strains of mice were bidirectionally selected for differences in open-field activity (DeFries et al., 1978, Behavior Genetics, 8: 3-13) and subsequently inbred to use as a genetic model for studying anxiety-like behaviors (Booher et al., 2021, Genes, Brain and Behavior, 20: e12730). Hippocampal RNA-sequencing of the High and Low Activity mice identified 3901 differentially expressed protein-coding genes, with both sex-dependent and sex-independent effects. Functional enrichment analysis (PANTHER) highlighted 15 gene ontology terms, which allowed us to create a narrow list of 264 top candidate genes. Of the top candidate genes, 46 encoded four Complexes (I, II, IV and V) and two electron carriers (cytochrome c and ubiquinone) of the mitochondrial oxidative phosphorylation process. The most striking results were in the female high anxiety, Low Activity mice, where 39/46 genes relating to oxidative phosphorylation were upregulated. In addition, comparison of our top candidate genes with two previously curated High and Low Activity gene lists highlights 24 overlapping genes, where Ndufa13, which encodes the supernumerary subunit A13 of complex I, was the only gene to be included in all three lists. Mitochondrial dysfunction has recently been implicated as both a cause and effect of anxiety-related disorders and thus should be further explored as a possible novel pharmaceutical treatment for anxiety disorders. Keywords: PANTHER; RNA-sequencing; anxiety; behavioral battery; enrichment analysis; hippocampus; mitochondria; open-field activity (OFA); quantitative trait loci (QTL); wheel running; selective breeding
08	2022	GeneCup: mining PubMed and GWAS catalog for gene-keyword relationships	Gunturkun MH, Flashner E, Wang T, Mulligan MK, Williams RW, Prins P, Chen H	G3 (Bethesda). 2022 May 6;12(5):jkac059.	PMID: 35285473	Software link	Interpreting and integrating results from omics studies typically requires a comprehensive and time-consuming survey of extant literature. GeneCup is a literature mining web service that retrieves sentences containing user-provided gene symbols and keywords from PubMed abstracts. The keywords are organized into an ontology and can be extended to include results from human genome-wide association studies. We provide a drug addiction keyword ontology that contains over 300 keywords as an example. The literature search is conducted by querying the PubMed server using a programming interface, which is followed by retrieving abstracts from a local copy of the PubMed archive. The main results presented to the user are sentences where gene symbol and keywords co-occur. These sentences are presented through an interactive graphical interface or as tables. All results are linked to the original abstract in PubMed. In addition, a convolutional neural network is employed to distinguish sentences describing systemic stress from those describing cellular stress. The automated and comprehensive search strategy provided by GeneCup facilitates the integration of new discoveries from omic studies with existing literature. GeneCup is free and open source software. The source code of GeneCup and the link to a running instance are available at https://github.com/hakangunturkun/GeneCup. Keywords: PubMed; addiction; custom ontology; literature mining; web service.
09	2022	An Approach to Biomarker Discovery of Cannabis Use Utilizing Proteomic, Metabolomic, and Lipidomic Analyses	Hinckley JD, Saba L, Raymond K, Bartels K, Klawitter J, Christians U, Hopfer C	Cannabis Cannabinoid Res. 2022 Feb;7(1):65–77.	PMID: 33998853		Introduction: Relatively little is known about the molecular pathways influenced by cannabis use in humans. We used a multi-omics approach to examine protein, metabolomic, and lipid markers in plasma differentiating between cannabis users and nonusers to understand markers associated with cannabis use. Methods: Eight discordant twin pairs and four concordant twin pairs for cannabis use completed a blood draw, urine and plasma toxicology testing, and provided information about their past 30-day cannabis use and other substance use patterns. The 24 twins were all non-Hispanic whites. Sixty-six percent were female. Median age was 30 years. Fifteen participants reported that they had used cannabis in the last 30 days, including eight participants that used every day or almost every day (29-30 of 30 days). Of these 15 participants, plasma 11-nor-9-carboxy-Δ9-tetrahydrocannabinol (THC-COOH) and total tetrahydrocannabinol (THC) concentrations were detectable in 12 participants. Among the eight "heavy users", the amount of total THC (sum of THC and its metabolites) and plasma THC-COOH concentrations varied widely, with ranges of 13.1-1713 ng/mL and 2.7-284 ng/mL, respectively. A validated liquid chromatography-tandem mass spectrometry (LC-MS/MS) assay measured plasma THC-COOH, THC, and other cannabinoids and metabolites. Plasma THC-COOH was used as the primary measure. Expression levels of 1305 proteins were measured using SOMAScan assay, and 34 lipid mediators and 314 metabolites were measured with LC-MS/MS. Analyses examined associations between markers and THC-COOH levels with and without taking genetic relatedness into account.
Results: Thirteen proteins, three metabolites, and two lipids were identified as associated with THC-COOH levels. Myc proto-oncogene was identified as associated with THC-COOH levels in both molecular insight and potential marker analyses. Five pathways (interleukin-6 production, T lymphocyte regulation, apoptosis, kinase signaling pathways, and nuclear factor kappa-light-chain-enhancer of activated B cells) were linked with molecules identified in these analyses. Conclusions: THC-COOH levels are associated with immune system-related pathways. This study presents a feasible approach to identify additional molecular markers associated with THC-COOH levels. Keywords: Myc proto-oncogene; cannabis; markers; proteomics; metabolomics; lipidomics
10	2022	Beyond Genes: Inclusion of Alternative Splicing and Alternative Polyadenylation to Assess the Genetic Architecture of Predisposition to Voluntary Alcohol Consumption in Brain of the HXB/BXH Recombinant Inbred Rat Panel	Lusk R, Hoffman PL, Mahaffey S, Rosean S, Smith H, Silhavy J, Pravenec M, Tabakoff B, Saba LM	Front Genet. 2022;13:821026.	PMID: 35368676		Post transcriptional modifications of RNA are powerful mechanisms by which eukaryotes expand their genetic diversity. For instance, researchers estimate that most transcripts in humans undergo alternative splicing and alternative polyadenylation. These splicing events produce distinct RNA molecules, which in turn yield distinct protein isoforms and/or influence RNA stability, translation, nuclear export, and RNA/protein cellular localization. Due to their pervasiveness and impact, we hypothesized that alternative splicing and alternative polyadenylation in brain can contribute to a predisposition for voluntary alcohol consumption. Using the HXB/BXH recombinant inbred rat panel (a subset of the Hybrid Rat Diversity Panel), we generated over one terabyte of brain RNA sequencing data (total RNA) and identified novel splice variants (via StringTie) and alternative polyadenylation sites (via aptardi) to determine the transcriptional landscape in the brains of these animals. After establishing an analysis pipeline to ascertain high quality transcripts, we quantitated transcripts and integrated genotype data to identify candidate transcript coexpression networks and individual candidate transcripts associated with predisposition to voluntary alcohol consumption in the two-bottle choice paradigm. For genes that were previously associated with this trait (e.g., Lrap, Ift81, and P2rx4) (Saba et al., Febs. J., 282, 3556-3578, Saba et al., Genes. Brain. Behav., 20, e12698), we were able to distinguish between transcript variants to provide further information about the specific isoforms related to the trait. 
We also identified additional candidate transcripts associated with the trait of voluntary alcohol consumption (i.e., isoforms of Mapkapk5, Aldh1a7, and Map3k7). Consistent with our previous work, our results indicate that transcripts and networks related to inflammation and the immune system in brain can be linked to voluntary alcohol consumption. Overall, we have established a pipeline for including the quantitation of alternative splicing and alternative polyadenylation variants in the transcriptome in the analysis of the relationship between the transcriptome and complex traits. Keywords: HXB/BXH recombinant inbred panel; RNA-seq—RNA sequencing; alternative polyadenylation; alternative splicing; isoform; transcriptome; voluntary alcohol consumption; weighted gene co expression network analysis.
11	2022	Detecting Retinal Neural and Stromal Cell Classes and Ganglion Cell Subtypes Based on Transcriptome Data with Deep Transfer Learning	Madadi Y, Sun J, Chen H, Williams R, Yousefi S	Bioinformatics. 2022 Sep 15;38(18):4321-4329.	PMID: 35876552	Software link	Motivation: To develop and assess the accuracy of deep learning models that identify different retinal cell types, as well as different retinal ganglion cell (RGC) subtypes, based on patterns of single-cell RNA sequencing (scRNA-seq) in multiple datasets. Results: Deep domain adaptation models were developed and tested using three different datasets. The first dataset included 44 808 single retinal cells from mice (39 cell types) with 24 658 genes, the second dataset included 6225 single RGCs from mice (41 subtypes) with 13 616 genes and the third dataset included 35 699 single RGCs from mice (45 subtypes) with 18 222 genes. We used four loss functions in the learning process to align the source and target distributions, reduce misclassification errors and maximize robustness. Models were evaluated based on classification accuracy and confusion matrix. The accuracy of the model for correctly classifying 39 different retinal cell types in the first dataset was ∼92%. Accuracy in the second and third datasets reached ∼97% and 97% in correctly classifying 40 and 45 different RGCs subtypes, respectively. Across a range of seven different batches in the first dataset, the accuracy of the lead model ranged from 74% to nearly 100%. The lead model provided high accuracy in identifying retinal cell types and RGC subtypes based on scRNA-seq data. The performance was reasonable based on data from different batches as well. The validated model could be readily applied to scRNA-seq data to identify different retinal cell types and subtypes.
12	2022	The regulatory landscape of multiple brain regions in outbred heterogeneous stock rats	Munro D, Wang T, Chitre AS, Polesskaya O, Ehsan N, Gao J, Gusev A, Woods LCS, Saba LM, Chen H, Palmer AA, Mohammadi P	Nucleic Acids Res. 2022 Oct 28;50(19):10882-10895.	PMID: 36263809	Software link	Heterogeneous Stock (HS) rats are a genetically diverse outbred rat population that is widely used for studying genetics of behavioral and physiological traits. Mapping Quantitative Trait Loci (QTL) associated with transcriptional changes would help to identify mechanisms underlying these traits. We generated genotype and transcriptome data for five brain regions from 88 HS rats. We identified 21 392 cis-QTLs associated with expression and splicing changes across all five brain regions and validated their effects using allele specific expression data. We identified 80 cases where eQTLs were colocalized with genome-wide association study (GWAS) results from nine physiological traits. Comparing our dataset to human data from the Genotype-Tissue Expression (GTEx) project, we found that the HS rat data yields twice as many significant eQTLs as a similarly sized human dataset. We also identified a modest but highly significant correlation between genetic regulatory variation among orthologous genes. Surprisingly, we found less genetic variation in gene regulation in HS rats relative to humans, though we still found eQTLs for the orthologs of many human genes for which eQTLs had not been found. These data are available from the RatGTEx data portal (RatGTEx.org) and will enable new discoveries of the genetic influences of complex traits. Keywords: RNA sequencing; GWAS; eQTL; resource
13	2022	Evaluation and characterization of expression quantitative trait analysis methods in the Hybrid Rat Diversity Panel	Pattee J, Vanderlinden LA, Mahaffey S, Hoffman P, Tabakoff B, Saba LM	Front Genet. 2022 Sep 14;13:947423.	PMID: 36186443		The Hybrid Rat Diversity Panel (HRDP) is a stable and well-characterized set of more than 90 inbred rat strains that can be leveraged for systems genetics approaches to understanding the genetic and genomic variation associated with complex disease. The HRDP exhibits substantial between-strain diversity while retaining substantial within-strain isogenicity, allowing for the precise mapping of genetic variation associated with complex phenotypes and providing statistical power to identify associated variants. In order to robustly identify associated genetic variants, it is important to account for the population structure induced by inbreeding. To this end, we investigate the performance of four plausible approaches towards modeling quantitative traits in the HRDP and quantify their operating characteristics. In particular, we investigate three approaches based on genome-wide mixed model analysis, and one approach based on ordinary least squares linear regression. Towards facilitating study planning and design, we conduct extensive simulations to investigate the power of genetic association analyses in the HRDP, and characterize the impressive attained power. In simulation of eQTL data in the HRDP, we find that a mixed model approach that leverages leave-one-chromosome-out kinship estimation attains the highest power while controlling type I error. Keywords: Hybrid Rat Diversity Panel; expression quantitative trait loci; study design; systems genetics; transcriptome.
14	2022	A natural mutator allele shapes mutation spectrum variation in mice	Sasani TA, Ashbrook DG, Beichman AC, Lu L, Palmer AA, Williams RW, Pritchard JK, Harris K	Nature. 2022 May;605(7910):497–502.	PMID: 35545679		Although germline mutation rates and spectra can vary within and between species, common genetic modifiers of the mutation rate have not been identified in nature. Here we searched for loci that influence germline mutagenesis using a uniquely powerful resource: a panel of recombinant inbred mouse lines known as the BXD, descended from the laboratory strains C57BL/6J (B haplotype) and DBA/2J (D haplotype). Each BXD lineage has been maintained by brother-sister mating in the near absence of natural selection, accumulating de novo mutations for up to 50 years on a known genetic background that is a unique linear mosaic of B and D haplotypes. We show that mice inheriting D haplotypes at a quantitative trait locus on chromosome 4 accumulate C>A germline mutations at a 50% higher rate than those inheriting B haplotypes, primarily owing to the activity of a C>A-dominated mutational signature known as SBS18. The B and D quantitative trait locus haplotypes encode different alleles of Mutyh, a DNA repair gene that underlies the heritable cancer predisposition syndrome that causes colorectal tumors with a high SBS18 mutation load. Both B and D Mutyh alleles are present in wild populations of Mus musculus domesticus, providing evidence that common genetic variation modulates germline mutagenesis in a model mammalian species.
15	2022	Systems genetics in the rat HXB/BXH family identifies Tti2 as a pleiotropic quantitative trait gene for adult hippocampal neurogenesis and serum glucose	Senko AN, Overall RW, Silhavy J, Mlejnek P, Malínská H, Hüttl M, Marková I, Fabel KS, Lu L, Stuchlik A, Williams RW, Pravenec M, Kempermann G	PLoS Genet. 2022 Apr;18(4):e1009638.	PMID: 35377872		Neurogenesis in the adult hippocampus contributes to learning and memory in the healthy brain but is dysregulated in metabolic and neurodegenerative diseases. The molecular relationships between neural stem cell activity, adult neurogenesis, and global metabolism are largely unknown. Here we applied unbiased systems genetics methods to quantify genetic covariation among adult neurogenesis and metabolic phenotypes in peripheral tissues of a genetically diverse family of rat strains, derived from a cross between the spontaneously hypertensive (SHR/OlaIpcv) strain and Brown Norway (BN-Lx/Cub). The HXB/BXH family is a very well established model to dissect genetic variants that modulate metabolic and cardiovascular diseases and we have accumulated deep phenome and transcriptome data in a FAIR-compliant resource for systematic and integrative analyses. Here we measured rates of precursor cell proliferation, survival of new neurons, and gene expression in the hippocampus of the entire HXB/BXH family, including both parents. These data were combined with published metabolic phenotypes to detect a neurometabolic quantitative trait locus (QTL) for serum glucose and neuronal survival on Chromosome 16: 62.1-66.3 Mb. We subsequently fine-mapped the key phenotype to a locus that includes the Telo2-interacting protein 2 gene (Tti2), a chaperone that modulates the activity and stability of PIKK kinases. To verify the hypothesis that differences in neurogenesis and glucose levels are caused by a polymorphism in Tti2, we generated a targeted frameshift mutation on the SHR/OlaIpcv background.
Heterozygous SHR-Tti2+/- mutants had lower rates of hippocampal neurogenesis and hallmarks of dysglycemia compared to wild-type littermates. Our findings highlight Tti2 as a causal genetic link between glucose metabolism and structural brain plasticity. In humans, more than 800 genomic variants are linked to TTI2 expression, seven of which have associations to protein and blood stem cell factor concentrations, blood pressure and frontotemporal dementia.
16	2021	Highlights from the Era of Open Source Web-Based Tools	Anderson KR, Harris JA, Ng L, Prins P, Memar S, Ljungquist B, Fürth D, Williams RW, Ascoli GA, Dumitriu D	J Neurosci. 2021 Feb 3;41(5):927–936.	PMID: 33472826		High digital connectivity and a focus on reproducibility are contributing to an open science revolution in neuroscience. Repositories and platforms have emerged across the whole spectrum of subdisciplines, paving the way for a paradigm shift in the way we share, analyze, and reuse vast amounts of data collected across many laboratories. Here, we describe how open access web-based tools are changing the landscape and culture of neuroscience, highlighting six free resources that span subdisciplines from behavior to whole-brain mapping, circuits, neurons, and gene variants. Keywords: neuroscience; online repositories; open access; open science; open source; web-based tools.
17	2021	Comparing Statistical Tests for Differential Network Analysis of Gene Modules	Arbet J, Zhuang Y, Litkowski E, Saba L, Kechris K	Front Genet. 2021;12:630215.	PMID: 34093641	Software link	Genes often work together to perform complex biological processes, and "networks" provide a versatile framework for representing the interactions between multiple genes. Differential network analysis (DiNA) quantifies how this network structure differs between two or more groups/phenotypes (e.g., disease subjects and healthy controls), with the goal of determining whether differences in network structure can help explain differences between phenotypes. In this paper, we focus on gene co-expression networks, although in principle, the methods studied can be used for DiNA for other types of features (e.g., metabolome, epigenome, microbiome, proteome, etc.). Three common applications of DiNA involve (1) testing whether the connections to a single gene differ between groups, (2) testing whether the connection between a pair of genes differs between groups, or (3) testing whether the connections within a "module" (a subset of 3 or more genes) differ between groups. This article focuses on the latter, as there is a lack of studies comparing statistical methods for identifying differentially co-expressed modules (DCMs). Through extensive simulations, we compare several previously proposed test statistics and a new p-norm difference test (PND). We demonstrate that the true positive rate of the proposed PND test is competitive with and often higher than the other methods, while controlling the false positive rate. The R package discoMod (differentially co-expressed modules) implements the proposed method and provides a full pipeline for identifying DCMs: clustering tools to derive gene modules, tests to identify DCMs, and methods for visualizing the results.
Keywords: differential network analysis; differentially co-expressed modules; gene co-expression networks; networks; statistical inference.
18	2021	A platform for experimental precision medicine: The extended BXD mouse family	Ashbrook DG, Arends D, Prins P, Mulligan MK, Roy S, Williams EG, Lutz CM, Valenzuela A, Bohl CJ, Ingels JF, McCarty MS, Centeno AG, Hager R, Auwerx J, Lu L, Williams RW	Cell Syst. 2021 Mar 17;12(3):235-247.e9.	PMID: 33472028		The challenge of precision medicine is to model complex interactions among DNA variants, phenotypes, development, environments, and treatments. We address this challenge by expanding the BXD family of mice to 140 fully isogenic strains, creating a uniquely powerful model for precision medicine. This family segregates for 6 million common DNA variants-a level that exceeds many human populations. Because each member can be replicated, heritable traits can be mapped with high power and precision. Current BXD phenomes are unsurpassed in coverage and include much omics data and thousands of quantitative traits. BXDs can be extended by a single-generation cross to as many as 19,460 isogenic F1 progeny, and this extended BXD family is an effective platform for testing causal modeling and for predictive validation. BXDs are a unique core resource for the field of experimental precision medicine. Keywords: GXE; complex trait; gene mapping; personalized medicine; power calculation; recombinant inbred strains; systems biology; systems genetics.
19	2021	Whole genome sequencing of nearly isogenic WMI and WLI inbred rats identifies genes potentially involved in depression and stress reactivity	de Jong TV, Kim P, Guryev V, Mulligan MK, Williams RW, Redei EE, Chen H	Sci Rep. 2021 Jul 20;11(1):14774.	PMID: 34285244		The WMI and WLI inbred rats were generated from the stress-prone, and not yet fully inbred, Wistar Kyoto (WKY) strain. These were selected using bi-directional selection for immobility in the forced swim test and were then sib-mated for over 38 generations. Despite the low level of genetic diversity among WKY progenitors, the WMI substrain is significantly more vulnerable to stress relative to the counter-selected WLI strain. Here we quantify numbers and classes of genomic sequence variants distinguishing these substrains with the long term goal of uncovering functional and behavioral polymorphism that modulate sensitivity to stress and depression-like phenotypes. DNA from WLI and WMI was sequenced using Illumina xTen, IonTorrent, and 10X Chromium linked-read platforms to obtain a combined coverage of ~ 100X for each strain. We identified 4,296 high quality homozygous SNPs and indels between the WMI and WLI. We detected high impact variants in genes previously implicated in depression (e.g. Gnat2), depression-like behavior (e.g. Prlr, Nlrp1a), other psychiatric disease (e.g. Pou6f2, Kdm5a, Reep3, Wdfy3), and responses to psychological stressors (e.g. Pigr). High coverage sequencing data confirm that the two substrains are nearly coisogenic. Nonetheless, the small number of sequence variants contributes to numerous well characterized differences including depression-like behavior, stress reactivity, and addiction related phenotypes. These selected substrains are an ideal resource for forward and reverse genetic studies using a reduced complexity cross.
20	2021	A quantitative trait variant in Gabra2 underlies increased methamphetamine stimulant sensitivity	Goldberg LR, Yao EJ, Kelliher JC, Reed ER, Wu Cox J, Parks C, Kirkpatrick SL, Beierle JA, Chen MM, Johnson WE, Homanics GE, Williams RW, Bryant CD, Mulligan MK	Genes Brain Behav. 2021 Nov;20(8):e12774.	PMID: 34677900		Psychostimulant (methamphetamine, cocaine) use disorders have a genetic component that remains mostly unknown. We conducted genome-wide quantitative trait locus (QTL) analysis of methamphetamine stimulant sensitivity. To facilitate gene identification, we employed a Reduced Complexity Cross between closely related C57BL/6 mouse substrains and examined maximum speed and distance traveled over 30 min following methamphetamine (2 mg/kg, i.p.). For maximum methamphetamine-induced speed following the second and third administration, we identified a single genome-wide significant QTL on chromosome 11 that peaked near the Cyfip2 locus (LOD = 3.5, 4.2; peak = 21 cM [36 Mb]). For methamphetamine-induced distance traveled following the first and second administration, we identified a genome-wide significant QTL on chromosome 5 that peaked near a functional intronic indel in Gabra2 coding for the alpha-2 subunit of the GABA-A receptor (LOD = 3.6-5.2; peak = 34-35 cM [66-67 Mb]). Striatal cis-expression QTL mapping corroborated Gabra2 as a functional candidate gene underlying methamphetamine-induced distance traveled. CRISPR/Cas9-mediated correction of the mutant intronic deletion on the C57BL/6J background to the wild-type C57BL/6NJ allele was sufficient to reduce methamphetamine-induced locomotor activity toward the wild-type C57BL/6NJ-like level, thus validating the quantitative trait variant (QTV). These studies show the power and efficiency of Reduced Complexity Crosses in identifying causal variants underlying complex traits. Functionally restoring Gabra2 expression decreased methamphetamine stimulant sensitivity and supports preclinical and human genetic studies implicating the GABA-A receptor in psychostimulant addiction-relevant traits. Importantly, our findings have major implications for studying psychostimulants in the C57BL/6J strain-the gold standard strain in biomedical research. Keywords: addiction; amphetamine; cocaine; eQTL; methylphenidate; psychostimulant; quantitative trait gene; quantitative trait nucleotide; stimulant disorders.
21	2021	The genome sequence of the Norway rat, Rattus norvegicus Berkenhout 1769	Howe K, Dwinell M, Shimoyama M, Corton C, Betteridge E, Dove A, Quail MA, Smith M, Saba L, Williams RW, Chen H, Kwitek AE, McCarthy SA, Uliano-Silva M, Chow W, Tracey A, Torrance J, Sims Y, Challis R, Threlfall J, Blaxter M	Wellcome Open Res. 2021;6:118.	PMID: 34660910		We present a genome assembly from an individual male Rattus norvegicus (the Norway rat; Chordata; Mammalia; Rodentia; Muridae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled. This genome assembly, mRatBN7.2, represents the new reference genome for R. norvegicus and has been adopted by the Genome Reference Consortium. Keywords: Norway rat; Rattus norvegicus; chromosomal; genome sequence; reference genome.
22	2021	Spontaneously Hypertensive Rat substrains show differences in model traits for addiction risk and cocaine self-administration: Implications for a novel rat reduced complexity cross	Kantak KM, Stots C, Mathieson E, Bryant CD	Behav Brain Res. 2021 Aug 6;411:113406.	PMID: 34097899		Forward genetic mapping of F2 crosses between closely related substrains of inbred rodents - referred to as a reduced complexity cross (RCC) - is a relatively new strategy for accelerating the pace of gene discovery for complex traits, such as drug addiction. RCCs to date were generated in mice, but rats are thought to be optimal for addiction genetic studies. Based on past literature, one inbred Spontaneously Hypertensive Rat substrain, SHR/NCrl, is predicted to exhibit a distinct behavioral profile as it relates to cocaine self-administration traits relative to another substrain, SHR/NHsd. Direct substrain comparisons are a necessary first step before implementing an RCC. We evaluated model traits for cocaine addiction risk and cocaine self-administration behaviors using a longitudinal within-subjects design. Impulsive-like and compulsive-like traits were greater in SHR/NCrl than SHR/NHsd, as were reactivity to sucrose reward, sensitivity to acute psychostimulant effects of cocaine, and cocaine use studied under fixed-ratio and tandem schedules of cocaine self-administration. Compulsive-like behavior correlated with the acute psychostimulant effects of cocaine, which in turn correlated with cocaine taking under the tandem schedule. Compulsive-like behavior also was the best predictor of cocaine seeking responses. Heritability estimates indicated that 22%-40% of the variances for the above phenotypes can be explained by additive genetic factors, providing sufficient genetic variance to conduct genetic mapping in F2 crosses of SHR/NCrl and SHR/NHsd. These results provide compelling support for using an RCC approach in SHR substrains to uncover candidate genes and variants that are of relevance to cocaine use disorders. Keywords: Addiction vulnerability traits; Cocaine; SHR/NCrl substrain; SHR/NHsd substrain.
23	2021	Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence	Lusk R, Stene E, Banaei-Kashani F, Tabakoff B, Kechris K, Saba LM	Nat Commun. 2021 Mar 12;12(1):1652.	PMID: 33712618	Software link	Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3'-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model-trained using the Human Brain Reference RNA commercial standard-performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi's input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
24	2021	Integration of evidence across human and model organism studies: A meeting report	Palmer RHC, Johnson EC, Won H, Polimanti R, Kapoor M, Chitre A, Bogue MA, Benca-Bachman CE, Parker CC, Verma A, Reynolds T, Ernst J, Bray M, Kwon SB, Lai D, Quach BC, Gaddis NC, Saba L, Chen H, Hawrylycz M, Zhang S, Zhou Y, Mahaffey S, Fischer C, Sanchez-Roige S, Bandrowski A, Lu Q, Shen L, Philip V, Gelernter J, Bierut LJ, Hancock DB, Edenberg HJ, Johnson EO, Nestler EJ, Barr PB, Prins P, Smith DJ, Akbarian S, Thorgeirsson T, Walton D, Baker E, Jacobson D, Palmer AA, Miles M, Chesler EJ, Emerson J, Agrawal A, Martone M, Williams RW	Genes Brain Behav. 2021 Apr 23;e12738.	PMID: 33893716		The National Institute on Drug Abuse and Joint Institute for Biological Sciences at the Oak Ridge National Laboratory hosted a meeting attended by a diverse group of scientists with expertise in substance use disorders (SUDs), computational biology, and FAIR (Findability, Accessibility, Interoperability, and Reusability) data sharing. The meeting's objective was to discuss and evaluate better strategies to integrate genetic, epigenetic, and 'omics data across human and model organisms to achieve deeper mechanistic insight into SUDs. Specific topics were to (a) evaluate the current state of substance use genetics and genomics research and fundamental gaps, (b) identify opportunities and challenges of integration and sharing across species and data types, (c) identify current tools and resources for integration of genetic, epigenetic, and phenotypic data, (d) discuss steps and impediments related to data integration, and (e) outline future steps to support more effective collaboration, particularly between animal model research communities and human genetics and clinical research teams. This review summarizes key facets of this catalytic discussion with a focus on new opportunities and gaps in resources and knowledge on SUDs. Keywords: GWAS; cross-species; data integration; drug abuse; genomics; model organisms; multi-omic; substance use disorders; working group.
25	2021	Genetic Modulation of Initial Sensitivity to Δ9-Tetrahydrocannabinol (THC) Among the BXD Family of Mice	Parks C, Rogers CM, Prins P, Williams RW, Chen H, Jones BC, Moore BM, Mulligan MK	Front Genet. 2021;12:659012.	PMID: 34367237		Cannabinoid receptor 1 activation by the major psychoactive component in cannabis, Δ9-tetrahydrocannabinol (THC), produces motor impairments, hypothermia, and analgesia upon acute exposure. In previous work, we demonstrated significant sex and strain differences in acute responses to THC following administration of a single dose (10 mg/kg, i.p.) in C57BL/6J (B6) and DBA/2J (D2) inbred mice. To determine the extent to which these differences are heritable, we quantified acute responses to a single dose of THC (10 mg/kg, i.p.) in males and females from 20 members of the BXD family of inbred strains derived by crossing and inbreeding B6 and D2 mice. Acute THC responses (initial sensitivity) were quantified as changes from baseline for: 1. spontaneous activity in the open field (mobility), 2. body temperature (hypothermia), and 3. tail withdrawal latency to a thermal stimulus (antinociception). Initial sensitivity to the immobilizing, hypothermic, and antinociceptive effects of THC varied substantially across the BXD family. Heritability was highest for mobility and hypothermia traits, indicating that segregating genetic variants modulate initial sensitivity to THC. We identified genomic loci and candidate genes, including Ndufs2, Scp2, Rps6kb1 or P70S6K, Pde4d, and Pten, that may control variation in THC initial sensitivity. We also detected strong correlations between initial responses to THC and legacy phenotypes related to intake or response to other drugs of abuse (cocaine, ethanol, and morphine). Our study demonstrates the feasibility of mapping genes and variants modulating THC responses in the BXDs to systematically define biological processes and liabilities associated with drug use and abuse. Keywords: BXD family; C57BL/6; DBA/2; QTL; THC; addiction; cannabis; drug response.
26	2021	A breeding strategy to identify modifiers of high genetic risk for methamphetamine intake	Reed C, Stafford AM, Mootz JRK, Baba H, Erk J, Phillips TJ	Genes Brain Behav. 2021 Feb;20(2):e12667.	PMID: 32424970		Trace amine-associated receptor 1 (Taar1) impacts methamphetamine (MA) intake. A mutant allele (Taar1m1J) derived from the DBA/2J mouse strain codes for a non-functional receptor, and Taar1m1J/m1J mice consume more MA than mice possessing the reference Taar1+ allele. To study the impact of this mutation in a genetically diverse population, heterogeneous stock-collaborative cross (HS-CC) mice, the product of an eight-way cross of standard and wild-derived strains, were tested for MA intake. HS-CC had low MA intake, so an HS-CC by DBA/2J strain F2 intercross was created to transfer the mutant allele onto the diverse background, and used for selective breeding. To study residual variation in MA intake existing in Taar1m1J/m1J mice, selective breeding for higher (MAH) vs lower (MAL) MA intake was initiated from Taar1m1J/m1J F2 individuals; a control line of Taar1+/+ individuals (MAC) was retained. The lines were also examined for MA-induced locomotor and thermal responses, and fluid and tastant consumption. Taar1m1J/m1J F2 mice consumed significantly more MA than Taar1+/+ F2 mice. Response to selection was significant by generation 2 and there were corresponding differences in fluid consumed. Fluid consumption was not different in non-MA drinking studies. Taar1m1J/m1J genotype (MAL or MAH vs MAC mice) was associated with heightened MA locomotor and reduced hypothermic responses. MAL mice exhibited greater sensitization than MAH mice, but the selected lines did not consistently differ for thermal or tastant phenotypes. Residual variation among high-risk Taar1m1J/m1J mice appears to involve mechanisms associated with neuroadaptation to MA, but not sensitivity to hypothermic effects of MA. Keywords: hypothermia; psychostimulant; quinine; saccharin; selective breeding; self-administration; sensitization; trace amine-associated receptor 1; two-bottle choice.
27	2021	Gene-by-environment modulation of lifespan and weight gain in the murine BXD family	Roy S, Sleiman MB, Jha P, Ingels JF, Chapman CJ, McCarty MS, Ziebarth JD, Hook M, Sun A, Zhao W, Huang J, Neuner SM, Wilmott LA, Shapaker TM, Centeno AG, Ashbrook DG, Mulligan MK, Kaczorowski CC, Makowski L, Cui Y, Read RW, Miller RA, Mozhui K, Williams EG, Sen S, Lu L, Auwerx J, Williams RW	Nat Metab. 2021 Sep;3(9):1217–1227.	PMID: 34552269		How lifespan and body weight vary as a function of diet and genetic differences is not well understood. Here we quantify the impact of differences in diet on lifespan in a genetically diverse family of female mice, split into matched isogenic cohorts fed a low-fat chow diet (CD, n = 663) or a high-fat diet (HFD, n = 685). We further generate key metabolic data in a parallel cohort euthanized at four time points. HFD feeding shortens lifespan by 12%: equivalent to a decade in humans. Initial body weight and early weight gains account for longevity differences of roughly 4-6 days per gram. At 500 days, animals on a HFD typically gain four times as much weight as control, but variation in weight gain does not correlate with lifespan. Classic serum metabolites, often regarded as health biomarkers, are not necessarily strong predictors of longevity. Our data indicate that responses to a HFD are substantially modulated by gene-by-environment interactions, highlighting the importance of genetic variation in making accurate individualized dietary recommendations.
28	2021	A long non-coding RNA (Lrap) modulates brain gene expression and levels of alcohol consumption in rats	Saba LM, Hoffman PL, Homanics GE, Mahaffey S, Daulatabad SV, Janga SC, Tabakoff B	Genes Brain Behav. 2021 Feb;20(2):e12698.	PMID: 32893479		LncRNAs are important regulators of quantitative and qualitative features of the transcriptome. We have used QTL and other statistical analyses to identify a gene coexpression module associated with alcohol consumption. The "hub gene" of this module, Lrap (Long non-coding RNA for alcohol preference), was an unannotated transcript resembling a lncRNA. We used partial correlation analyses to establish that Lrap is a major contributor to the integrity of the coexpression module. Using CRISPR/Cas9 technology, we disrupted an exon of Lrap in Wistar rats. Measures of alcohol consumption in wild type, heterozygous and knockout rats showed that disruption of Lrap produced increases in alcohol consumption/alcohol preference. The disruption of Lrap also produced changes in expression of over 700 other transcripts. Furthermore, it became apparent that Lrap may have a function in alternative splicing of the affected transcripts. The GO category of "Response to Ethanol" emerged as one of the top candidates in an enrichment analysis of the differentially expressed transcripts. We validate the role of Lrap as a mediator of alcohol consumption by rats, and also implicate Lrap as a modifier of the expression and splicing of a large number of brain transcripts. A defined subset of these transcripts significantly impacts alcohol consumption by rats (and possibly humans). Our work shows the pleiotropic nature of non-coding elements of the genome, the power of network analysis in identifying the critical elements influencing phenotypes, and the fact that not all changes produced by genetic editing are critical for the concomitant changes in phenotype. Keywords: CRISPR/Cas; brain RNA expression networks; genetic modification; long non-coding RNA; predisposing factors; quantitative genetics; recombinant inbred rat strains; systems genetics; transcriptome; voluntary alcohol consumption.
29	2021	Paraquat Toxicogenetics: Strain-Related Reduction of Tyrosine Hydroxylase Staining in Substantia Nigra in Mice	Torres-Rojas C, Zhao W, Zhuang D, O’Callaghan JP, Lu L, Mulligan MK, Williams RW, Jones BC	Front Toxicol. 2021;3:722518.	PMID: 35295113		Paraquat (PQ) is a putative risk factor for the development of sporadic Parkinson's disease. To model a possible genetic basis for individual differences in susceptibility to exposure to PQ, we recently examined the effects of paraquat on tyrosine hydroxylase (TH)-containing neurons in the substantia nigra pars compacta (SNc) of six members of the BXD family of mice (n = 2-6 per strain). We injected males with 5 mg/kg paraquat weekly three times. The density of TH+ neurons counted by immunocytochemistry at 200x in eight or more sections through the SNc is reduced in five of the six strains relative to control (N = 4 ± 2 mice per strain). TH+ loss ranged from 0 to 20% with an SEM of 1%. The heritability was estimated using standard ANOVA and jackknife resampling and is 0.37 ± 0.05 in untreated animals and 0.47 ± 0.04 in treated animals. These results demonstrate genetic modulation and GxE variation in susceptibility to PQ exposure and the loss of TH staining in the substantia nigra. Keywords: BXD mice; forward genetic analysis; sporadic Parkinson’s disease; stereology; tyrosine hydroxylase.
30	2021	Speeding up eQTL scans in the BXD population using GPUs	Trotter C, Kim H, Farage G, Prins P, Williams RW, Broman KW, Sen S	G3 (Bethesda). 2021 Dec 8;11(12):jkab254.	PMID: 34499130	Software link	The BXD family of mouse strains is an important reference population for systems biology and genetics that has been fully sequenced and deeply phenotyped. To facilitate interactive use of genotype-phenotype relations using many massive omics data sets for this and other segregating populations, we have developed new algorithms and code that enable near-real-time whole-genome quantitative trait locus (QTL) scans for up to one million traits. By using easily parallelizable operations including matrix multiplication, vectorized operations, and element-wise operations, our method is more than 700 times faster than an R/qtl linear model genome scan using 16 threads. We used parallelization of different CPU threads as well as GPUs. We found that the speed advantage of GPUs is dependent on problem size and shape (the number of cases, number of genotypes, and number of traits). Our approach is ideal for interactive web services, such as GeneNetwork.org, that need to display results in real-time. Our implementation is available as the Julia language package LiteQTL at https://github.com/senresearch/LiteQTL.jl. Keywords: BXD; GPU; genome scan; linear model.
31	2021	Sex Differences in the Brain Transcriptome Related to Alcohol Effects and Alcohol Use Disorder	Hitzemann R, Bergeson SE, Berman AE, Bubier JA, Chesler EJ, Finn DA, Hein M, Hoffman P, Holmes A, Kisby BR, Lockwood D, Lodowski KH, McManus M, Owen JA, Ozburn AR, Panthagani P, Ponomarev I, Saba L, Tabakoff B, Walchale A, Williams RW, Phillips TJ	Biol Psychiatry. 2022 Jan 1;91(1):43–52.	PMID: 34274109		There is compelling evidence that sex and gender have crucial roles in excessive alcohol (ethanol) consumption. Here, we review some of the data from the perspective of brain transcriptional differences between males and females, focusing on rodent animal models. A key emerging transcriptional feature is the role of neuroimmune processes. Microglia are the resident neuroimmune cells in the brain and exhibit substantial functional differences between males and females. Selective breeding for binge ethanol consumption and the impacts of chronic ethanol consumption and withdrawal from chronic ethanol exposure all demonstrate sex-dependent neuroimmune signatures. A focus is on resolving sex-dependent differences in transcriptional responses to ethanol at the neurocircuitry level. Sex-dependent transcriptional differences are found in the extended amygdala and the nucleus accumbens. Telescoping of ethanol consumption is found in some, but not all, studies to be more prevalent in females. Recent transcriptional studies suggest that some sex differences may be due to female-dependent remodeling of the primary cilium. An interesting theme appears to be developing: at least from the animal model perspective, even when males and females are phenotypically similar, they differ significantly at the level of the transcriptome. Keywords: Binge; Ethanol; Functional genomics; Neural circuitry; Neuroimmune function; Therapeutics.
32	2020	Facilitating Complex Trait Analysis via Reduced Complexity Crosses	Bryant CD, Smith DJ, Kantak KM, Nowak TS, Williams RW, Damaj MI, Redei EE, Chen H, Mulligan MK	Trends Genet. 2020 Aug;36(8):549–562.	PMID: 32482413		Genetically diverse inbred strains are frequently used in quantitative trait mapping to identify sequence variants underlying trait variation. Poor locus resolution and high genetic complexity impede variant discovery. As a solution, we explore reduced complexity crosses (RCCs) between phenotypically divergent, yet genetically similar, rodent substrains. RCCs accelerate functional variant discovery by decreasing the number of segregating variants by orders of magnitude. The simplified genetic architecture of RCCs often permits immediate identification of causal variants or rapid fine-mapping of broad loci to smaller intervals. Whole-genome sequences of substrains make RCCs possible by supporting the development of array- and targeted sequencing-based genotyping platforms, coupled with rapid genome editing for variant validation. In summary, RCCs enhance discovery-based genetics of complex traits. Keywords: GWAS; functional variant; positional cloning; rat genetics; substrain, QTL.
33	2020	Mouse Systems Genetics as a Prelude to Precision Medicine	Li H, Auwerx J	Trends Genet. 2020 Apr;36(4):259–272.	PMID: 32037011		Mouse models have been instrumental in understanding human disease biology and proposing possible new treatments. The precise control of the environment and genetic composition of mice allows more rigorous observations, but limits the generalizability and translatability of the results into human applications. In the era of precision medicine, strategies using mouse models have to be revisited to effectively emulate human populations. Systems genetics is one promising paradigm that may promote the transition to novel precision medicine strategies. Here, we review the state-of-the-art resources and discuss how mouse systems genetics helps to understand human diseases and to advance the development of precision medicine, with an emphasis on the existing resources and strategies. Keywords: genetic reference population; mouse models; precision medicine; systems biology; systems genetics.
34	2020	Alcoholic-Hepatitis, Links to Brain and Microbiome: Mechanisms, Clinical and Experimental Research	Neuman MG, Seitz HK, French SW, Malnick S, Tsukamoto H, Cohen LB, Hoffman P, Tabakoff B, Fasullo M, Nagy LE, Tuma PL, Schnabl B, Mueller S, Groebner JL, Barbara FA, Yue J, Nikko A, Alejandro M, Brittany T, Edward V, Harrall K, Saba L, Mihai O	Biomedicines. 2020 Mar 18;8(3):E63.	PMID: 32197424		The following review article presents clinical and experimental features of alcohol-induced liver disease (ALD). Basic aspects of alcohol metabolism leading to the development of liver hepatotoxicity are discussed. ALD includes fatty liver, acute alcoholic hepatitis with or without liver failure, alcoholic steatohepatitis (ASH) leading to fibrosis and cirrhosis, and hepatocellular cancer (HCC). ALD is fully attributable to alcohol consumption. However, only 10-20% of heavy drinkers (persons consuming more than 40 g of ethanol/day) develop clinical ALD. Moreover, there is a link between behaviour and environmental factors that determine the amount of alcohol misuse and their liver disease. The range of clinical presentation varies from reversible alcoholic hepatic steatosis to cirrhosis, hepatic failure, and hepatocellular carcinoma. We aimed to (1) describe the clinico-pathology of ALD, (2) examine the role of immune responses in the development of alcoholic hepatitis (ASH), (3) propose diagnostic markers of ASH, (4) analyze the experimental models of ALD, (5) study the role of alcohol in changing the microbiota, and (6) articulate how findings in the liver and/or intestine influence the brain (and/or vice versa) on ASH; (7) identify pathways in alcohol-induced organ damage and (8) to target new innovative experimental concepts modeling the experimental approaches. The present review includes evidence recognizing the key toxic role of alcohol in ALD severity. Cytochrome p450 CYP2E1 activation may change the severity of ASH. The microbiota is a key element in immune responses, being an inducer of proinflammatory T helper 17 cells and regulatory T cells in the intestine. Alcohol consumption changes the intestinal microbiota and influences liver steatosis and liver inflammation. Knowing how to exploit the microbiome to modulate the immune system might lead to a new form of personalized medicine in ALF and ASH. Keywords: CYP 1A2; CYP2E1; acetaldehyde dehydrogenase (ALDH), alcohol dehydrogenase (ADH), CYP 1A1; alcoholic hepatitis; hepato-carcinogenesis; hepatocytotoxicity; his3-Δ3′ and his3-Δ5′; laboratory markers; microsomal ethanol oxidizing system (MEOS), immunohistochemistry; mitochondrion.
35	2020	Bioinformatics identification and pharmacological validation of Kcnn3/KCa2 channels as a mediator of negative affective behaviors and excessive alcohol drinking in mice	Padula AE, Rinker JA, Lopez MF, Mulligan MK, Williams RW, Becker HC, Mulholland PJ	Transl Psychiatry. 2020 Nov 27;10(1):414.	PMID: 33247097		Mood disorders are often comorbid with alcohol use disorder (AUD) and play a considerable role in the development and maintenance of alcohol dependence and relapse. Because of this high comorbidity, it is necessary to determine shared and unique genetic factors driving heavy drinking and negative affective behaviors. In order to identify novel pharmacogenetic targets, a bioinformatics analysis was used to quantify the expression of amygdala K+ channel genes that covary with anxiety-related phenotypes in the well-phenotyped and fully sequenced family of BXD strains. We used a model of stress-induced escalation of drinking in alcohol-dependent mice to measure negative affective behaviors during abstinence. A pharmacological approach was used to validate the key bioinformatics findings in alcohol-dependent, stressed mice. Amygdalar expression of Kcnn3 correlated significantly with 40 anxiety-associated phenotypes. Further examination of Kcnn3 expression revealed a strong eigentrait for anxiety-like behaviors and negative correlations with binge-like and voluntary alcohol drinking. Mice treated with chronic intermittent alcohol exposure and repeated swim stress consumed more alcohol in their home cages and showed hypophagia on the novelty-suppressed feeding test during abstinence. Pharmacologically targeting Kcnn gene products with the KCa2 (SK) channel-positive modulator 1-EBIO decreased drinking and reduced feeding latency in alcohol-dependent, stressed mice. Collectively, these validation studies provide central nervous system links into the covariance of stress, negative affective behaviors, and AUD in the BXD strains. Further, the bioinformatics discovery tool is effective in identifying promising targets (i.e., KCa2 channels) for treating alcohol dependence exacerbated by comorbid mood disorders.
36	2020	Alcohol Sensitivity as an Endophenotype of Alcohol Use Disorder: Exploring Its Translational Utility between Rodents and Humans	Parker CC, Lusk R, Saba LM	Brain Sci. 2020 Oct 13;10(10):E725.	PMID: 33066036		Alcohol use disorder (AUD) is a complex, chronic, relapsing disorder with multiple interacting genetic and environmental influences. Numerous studies have verified the influence of genetics on AUD, yet the underlying biological pathways remain unknown. One strategy to interrogate complex diseases is the use of endophenotypes, which deconstruct current diagnostic categories into component traits that may be more amenable to genetic research. In this review, we explore how an endophenotype such as sensitivity to alcohol can be used in conjunction with rodent models to provide mechanistic insights into AUD. We evaluate three alcohol sensitivity endophenotypes (stimulation, intoxication, and aversion) for their translatability across human and rodent research by examining the underlying neurobiology and its relationship to consumption and AUD. We show examples in which results gleaned from rodents are successfully integrated with information from human studies to gain insight in the genetic underpinnings of AUD and AUD-related endophenotypes. Finally, we identify areas for future translational research that could greatly expand our knowledge of the biological and molecular aspects of the transition to AUD with the broad hope of finding better ways to treat this devastating disorder. Keywords: alcohol dependence; alcohol sensitivity; alcohol use disorder (AUD); alcoholism; animal models; cross species validation; endophenotype; genetics; genome-wide association studies (GWAS); rodents.
37	2020	MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments	Vestal BE, Moore CM, Wynn E, Saba L, Fingerlin T, Kechris K	BMC Bioinformatics. 2020 Aug 28;21(1):375.	PMID: 32859148	Software link	Background: As the barriers to incorporating RNA sequencing (RNA-Seq) into biomedical studies continue to decrease, the complexity and size of RNA-Seq experiments are rapidly growing. Paired, longitudinal, and other correlated designs are becoming commonplace, and these studies offer immense potential for understanding how transcriptional changes within an individual over time differ depending on treatment or environmental conditions. While several methods have been proposed for dealing with repeated measures within RNA-Seq analyses, they are either restricted to handling only paired measurements, can only test for differences between two groups, and/or have issues with maintaining nominal false positive and false discovery rates. In this work, we propose a Bayesian hierarchical negative binomial generalized linear mixed model framework that can flexibly model RNA-Seq counts from studies with arbitrarily many repeated observations, can include covariates, and also maintains nominal false positive and false discovery rates in its posterior inference. Results: In simulation studies, we showed that our proposed method (MCMSeq) best combines high statistical power (i.e. sensitivity or recall) with maintenance of nominal false positive and false discovery rates compared to the other available strategies, especially at the smaller sample sizes investigated. This behavior was then replicated in an application to real RNA-Seq data where MCMSeq was able to find previously reported genes associated with tuberculosis infection in a cohort with longitudinal measurements. Conclusions: Failing to account for repeated measurements when analyzing RNA-Seq experiments can result in significantly inflated false positive and false discovery rates. 
Of the methods we investigated, whether they modeled RNA-Seq counts directly or worked on transformed values, the Bayesian hierarchical model implemented in the mcmseq R package (available at https://github.com/stop-pre16/mcmseq) best combined sensitivity and nominal error rate control. Keywords: Correlated data; Longitudinal data; Markov chain Monte Carlo; RNA-Seq.
38	2020	Variability and heritability of mouse brain structure: Microscopic MRI atlases and connectomes for diverse strains	Wang N, Anderson RJ, Ashbrook DG, Gopalakrishnan V, Park Y, Priebe CE, Qi Y, Laoprasert R, Vogelstein JT, Williams RW, Johnson GA	Neuroimage. 2020 Nov 15;222:117274.	PMID: 32818613		Genome-wide association studies have demonstrated significant links between human brain structure and common DNA variants. Similar studies with rodents have been challenging because of smaller brain volumes. Using high field MRI (9.4 T) and compressed sensing, we have achieved microscopic resolution and sufficiently high throughput for rodent population studies. We generated whole brain structural MRI and diffusion connectomes for four diverse isogenic lines of mice (C57BL/6J, DBA/2J, CAST/EiJ, and BTBR) at spatial resolution 20,000 times higher than human connectomes. We measured narrow sense heritability (h2), i.e., the fraction of variance explained by strains in a simple ANOVA model, for volumes and scalar diffusion metrics, and estimates of residual technical error for 166 regions in each hemisphere and connectivity between the regions. Volumes of discrete brain regions had the highest mean heritability (0.71 ± 0.23 SD, n = 332), followed by fractional anisotropy (0.54 ± 0.26), radial diffusivity (0.34 ± 0.022), and axial diffusivity (0.28 ± 0.19). Connection profiles were statistically different in 280 of 322 nodes across all four strains. Nearly 150 of the connection profiles were statistically different between the C57BL/6J, DBA/2J, and CAST/EiJ lines. Microscopic whole brain MRI/DTI has allowed us to identify significant heritable phenotypes in brain volume, scalar DTI metrics, and quantitative connectomes. Keywords: Connectome; MR microscopy; MRI/DTI; Mouse brain.
39	2019	R/qtl2: Software for Mapping Quantitative Trait Loci with High-Dimensional Data and Multiparent Populations	Broman KW, Gatti DM, Simecek P, Furlotte NA, Prins P, Sen Ś, Yandell BS, Churchill GA	Genetics. 2019 Feb;211(2):495–502.	PMID: 30591514	Software link	R/qtl2 is an interactive software environment for mapping quantitative trait loci (QTL) in experimental populations. The R/qtl2 software expands the scope of the widely used R/qtl software package to include multiparent populations derived from more than two founder strains, such as the Collaborative Cross and Diversity Outbred mice, heterogeneous stocks, and MAGIC plant populations. R/qtl2 is designed to handle modern high-density genotyping data and high-dimensional molecular phenotypes, including gene expression and proteomics. R/qtl2 includes the ability to perform genome scans using a linear mixed model to account for population structure, and also includes features to impute SNPs based on founder strain genomes and to carry out association mapping. The R/qtl2 software provides all of the basic features needed for QTL mapping, including graphical displays and summary reports, and it can be extended through the creation of add-on packages. R/qtl2, which is free and open source software written in the R and C++ programming languages, comes with a test framework. Keywords: Collaborative Cross; Diversity Outbred mice; MAGIC; MPP; Multiparent Advanced Generation Inter-Cross (MAGIC); QTL; heterogeneous stock; multiparent populations; software.
40	2019	Cleaning Genotype Data from Diversity Outbred Mice	Broman KW, Gatti DM, Svenson KL, Sen Ś, Churchill GA	G3 (Bethesda). 2019 May 7;9(5):1571–1579.	PMID: 30877082		Data cleaning is an important first step in most statistical analyses, including efforts to map the genetic loci that contribute to variation in quantitative traits. Here we illustrate approaches to quality control and cleaning of array-based genotyping data for multiparent populations (experimental crosses derived from more than two founder strains), using MegaMUGA array data from a set of 291 Diversity Outbred (DO) mice. Our approach employs data visualizations that can reveal problems at the level of individual mice or with individual SNP markers. We find that the proportion of missing genotypes for each mouse is an effective indicator of sample quality. We use microarray probe intensities for SNPs on the X and Y chromosomes to confirm the sex of each mouse, and we use the proportion of matching SNP genotypes between pairs of mice to detect sample duplicates. We use a hidden Markov model (HMM) reconstruction of the founder haplotype mosaic across each mouse genome to estimate the number of crossovers and to identify potential genotyping errors. To evaluate marker quality, we find that missing data and genotyping error rates are the most effective diagnostics. We also examine the SNP genotype frequencies with markers grouped according to their minor allele frequency in the founder strains. For markers with high apparent error rates, a scatterplot of the allele-specific probe intensities can reveal the underlying cause of incorrect genotype calls. The decision to include or exclude low-quality samples can have a significant impact on the mapping results for a given study. We find that the impact of low-quality markers on a given study is often minimal, but reporting problematic markers can improve the utility of the genotyping array across many studies. 
Keywords: MPP; Multiparent Advanced Generation Inter-Cross (MAGIC); QTL; data cleaning; data diagnostics; multiparental populations; quantitative trait loci.
41	2019	Insight into genetic regulation of miRNA in mouse brain	Kordas G, Rudra P, Hendricks A, Saba L, Kechris K	BMC Genomics. 2019 Nov 13;20(1):849.	PMID: 31722663		Background: micro RNA (miRNA) are important regulators of gene expression and may influence phenotypes and disease traits. The connection between genetics and miRNA expression can be determined through expression quantitative trait loci (eQTL) analysis, which has been extensively used in a variety of tissues, and in both human and model organisms. miRNA play an important role in brain-related diseases, but eQTL studies of miRNA in brain tissue are limited. We aim to catalog miRNA eQTL in brain tissue using miRNA expression measured on a recombinant inbred mouse panel. Because samples were collected without any intervention or treatment (naïve), the panel allows characterization of genetic influences on miRNAs' expression levels. We used brain RNA expression levels of 881 miRNA and 1416 genomic locations to identify miRNA eQTL. To address multiple testing, we employed permutation p-values and subsequent zero permutation p-value correction. We also investigated the underlying biology of miRNA regulation using additional analyses, including hotspot analysis to search for regions controlling multiple miRNAs, and Bayesian network analysis to identify scenarios where a miRNA mediates the association between genotype and mRNA expression. We used addiction related phenotypes to illustrate the utility of our results. Results: Thirty-eight miRNA eQTL were identified after appropriate multiple testing corrections. Ten of these miRNAs had target genes enriched for brain-related pathways and mapped to four miRNA eQTL hotspots. Bayesian network analysis revealed four biological networks relating genetic variation, miRNA expression and gene expression. Conclusions: Our extensive evaluation of miRNA eQTL provides valuable insight into the role of miRNA regulation in brain tissue. 
Our miRNA eQTL analysis and extended statistical exploration identify miRNA candidates in brain for future study. Keywords: Bayesian networks; Brain; Hotspots; Mediation; eQTL; miRNA.
42	2019	Matrix Linear Models for High-Throughput Chemical Genetic Screens	Liang JW, Nichols RJ, Sen S	Genetics. 2019 Aug;212(4):1063–1073.	PMID: 31243057	Software link	We develop a flexible and computationally efficient approach for analyzing high-throughput chemical genetic screens. In such screens, a library of genetic mutants is phenotyped in a large number of stresses. Typically, interactions between genes and stresses are detected by grouping the mutants and stresses into categories, and performing modified t-tests for each combination. This approach does not have a natural extension if mutants or stresses have quantitative or nonoverlapping annotations (e.g., if conditions have doses or a mutant falls into more than one category simultaneously). We develop a matrix linear model (MLM) framework that allows us to model relationships between mutants and conditions in a simple, yet flexible, multivariate framework. It encodes both categorical and continuous relationships to enhance detection of associations. We develop a fast estimation algorithm that takes advantage of the structure of MLMs. We evaluate our method's performance in simulations and in an Escherichia coli chemical genetic screen, comparing it with an existing univariate approach based on modified t-tests. We show that MLMs perform slightly better than the univariate approach when mutants and conditions are classified in nonoverlapping categories, and substantially better when conditions can be ordered in dosage categories. Therefore, it is an attractive alternative to current methods, and provides a computationally scalable framework for larger and complex chemical genetic screens. A Julia language implementation of MLMs and the code used for this paper are available at https://github.com/janewliang/GeneticScreen.jl and https://bitbucket.org/jwliang/mlm_gs_supplement, respectively. Keywords: E. coli; chemical genetic screens; high-throughput data; linear models.
43	2019	Impact of Genetic Variation on Stress-Related Ethanol Consumption	Mulligan MK, Lu L, Cavigelli SA, Mormède P, Terenina E, Zhao W, Williams RW, Jones BC	Alcohol Clin Exp Res. 2019 Jul;43(7):1391–1402.	PMID: 31034606		Background: The effect of stress on alcohol consumption in humans is highly variable, and the underlying processes are not yet understood. Attempts to model a positive relationship between stress and increased ethanol (EtOH) consumption in animals have been only modestly successful. Our hypothesis is that individual differences in stress effects on EtOH consumption are mediated by genetics. Methods: We measured alcohol consumption, using the drinking-in-the-dark (DID) paradigm in females from 2 inbred mouse strains, C57BL/6J (B6) and DBA/2J (D2), and 35 of their inbred progeny (the BXD family). A control group was maintained in normal housing and a stress group was exposed to chronic mild stress (CMS), consisting of unpredictable stressors over 7 weeks. These included predator, social, and environmental perturbations. Alcohol intake was measured over 16 weeks in both groups during baseline (preceding 5-week period), CMS (intervening 7-week period), and post-CMS (final 4-week period). Results: We detected a strong effect of CMS on alcohol intake. A few strains demonstrated CMS-related increased alcohol consumption; however, most showed decreased intake. We identified 1 nearly significant quantitative trait locus on chromosome 5 that contains the neuronal nitric oxide synthase gene (Nos1). The expression of Nos1 is frequently changed following alcohol exposure, and variants in this gene segregating among the BXD population may modulate alcohol intake in response to stress. Conclusions: The results we present here represent the first study to combine chronic stress and alcohol consumption in a genetic reference population of mice. 
Differences in susceptibility to the effects of stressful environments vis-à-vis alcohol use disorders would suggest that the differences have at least some basis in genetic constitution. We have also nominated a likely candidate gene underlying the large individual differences in effects of stress on alcohol consumption. Keywords: Ethanol Chronic Mild Stress; Forward Genetic Analysis; Nos1; Quantitative Trait Loci Analysis.
44	2019	Evaluation of Sirtuin-3 probe quality and co-expressed genes using literature cohesion	Roy S, Zaman KI, Williams RW, Homayouni R	BMC Bioinformatics. 2019 Mar 14;20(Suppl 2):104.	PMID: 30871457		Background: Gene co-expression studies can provide important insights into molecular and cellular signaling pathways. The GeneNetwork database is a unique resource for co-expression analysis using data from a variety of tissues across genetically distinct inbred mice. However, extraction of biologically meaningful co-expressed gene sets is challenging due to variability in microarray platforms, probe quality, normalization methods, and confounding biological factors. In this study, we tested whether literature derived functional cohesion could be used as an objective metric in lieu of 'ground truth' to evaluate the quality of probes and microarray datasets. Results: We examined Sirtuin-3 (Sirt3) co-expressed gene sets extracted from either liver or brain tissues of BXD recombinant inbred mice in the GeneNetwork database. Depending on the microarray platform, there were as many as 26 probes that targeted different regions of Sirt3 primary transcript. Co-expressed gene sets (ranging from 100-1000 genes) associated with each Sirt3 probe were evaluated using the previously developed literature-derived cohesion p-value (LPv) and benchmarked against 'gold standards' derived from proteomic studies or Gene Ontology classifications. We found that the maximal F-measure was obtained at an average window size of 535 genes. Using set size of 500 genes, the Pearson correlations between LPv and F-measure as well as between LPv and mitochondrial gene enrichment p-values were 0.90 and 0.93, respectively. Importantly, we found that the LPv approach can distinguish high quality Sirt3 probes. 
Analysis of the most functionally cohesive Sirt3 co-expressed gene set revealed core metabolic pathways that were shared between hippocampus and liver as well as distinct pathways which were unique to each tissue. These results are consistent with other studies that suggest Sirt3 is a key metabolic regulator and has distinct functions in energy-producing vs. energy-demanding tissues. Conclusions: Our results provide proof-of-concept that literature cohesion analysis is useful for evaluating the quality of probes and microarray datasets, particularly when experimentally derived gold standards are unavailable. Our approach would enable researchers to rapidly identify biologically meaningful co-expressed gene sets and facilitate discovery from high throughput genomic data. Keywords: BXD mice; GeneNetwork.org; Latent Semantic Indexing; Microarray; Sirt3; Text mining.
45	2019	Unsupervised discovery of phenotype-specific multi-omics networks	Shi WJ, Zhuang Y, Russell PH, Hobbs BD, Parker MM, Castaldi PJ, Rudra P, Vestal B, Hersh CP, Saba LM, Kechris K	Bioinformatics. 2019 Nov 1;35(21):4336–4343.	PMID: 30957844	Software link	Motivation: Complex diseases often involve a wide spectrum of phenotypic traits. Better understanding of the biological mechanisms relevant to each trait promotes understanding of the etiology of the disease and the potential for targeted and effective treatment plans. There have been many efforts towards omics data integration and network reconstruction, but limited work has examined the incorporation of relevant (quantitative) phenotypic traits. Results: We propose a novel technique, sparse multiple canonical correlation network analysis (SmCCNet), for integrating multiple omics data types along with a quantitative phenotype of interest, and for constructing multi-omics networks that are specific to the phenotype. As a case study, we focus on miRNA-mRNA networks. Through simulations, we demonstrate that SmCCNet has better overall prediction performance compared to popular gene expression network construction and integration approaches under realistic settings. Applying SmCCNet to studies on chronic obstructive pulmonary disease (COPD) and breast cancer, we found enrichment of known relevant pathways (e.g. the Cadherin pathway for COPD and the interferon-gamma signaling pathway for breast cancer) as well as less known omics features that may be important to the diseases. Although those applications focus on miRNA-mRNA co-expression networks, SmCCNet is applicable to a variety of omics and other data types. It can also be easily generalized to incorporate multiple quantitative phenotypes simultaneously. The versatility of SmCCNet suggests great potential of the approach in many areas. 
Availability and implementation: The SmCCNet algorithm is written in R, and is freely available on the web at https://cran.r-project.org/web/packages/SmCCNet/index.html.
46	2019	Taar1 gene variants have a causal role in methamphetamine intake and response and interact with Oprm1	Stafford AM, Reed C, Baba H, Walter NA, Mootz JR, Williams RW, Neve KA, Fedorov LM, Janowsky AJ, Phillips TJ	Elife. 2019 Jul 9;8:e46472.	PMID: 31274109		We identified a locus on mouse chromosome 10 that accounts for 60% of the genetic variance in methamphetamine intake in mice selectively bred for high versus low methamphetamine consumption. We nominated the trace amine-associated receptor 1 gene, Taar1, as the strongest candidate and identified regulation of the mu-opioid receptor 1 gene, Oprm1, as another contributor. This study exploited CRISPR-Cas9 to test the causal role of Taar1 in methamphetamine intake and a genetically-associated thermal response to methamphetamine. The methamphetamine-related traits were rescued, converting them to levels found in methamphetamine-avoiding animals. We used a family of recombinant inbred mouse strains for interval mapping and to examine independent and epistatic effects of Taar1 and Oprm1. Both methamphetamine intake and the thermal response mapped to Taar1 and the independent effect of Taar1 was dependent on genotype at Oprm1. Our findings encourage investigation of the contribution of Taar1 and Oprm1 variants to human methamphetamine addiction. Keywords: CRISPR-Cas9; addiction; genetics; genomics; hypothermia; mouse; mu-opioid receptor; neuroscience; self administration; trace amine-associated receptor 1.
47	2019	Development of a tissue augmented Bayesian model for expression quantitative trait loci analysis	Zhuang YH, Wade K, Saba LM, Kechris K	Math Biosci Eng. 2019 Sep 26;17(1):122–143.	PMID: 31731343		Expression quantitative trait loci (eQTL) analyses detect genetic variants (SNPs) associated with RNA expression levels of genes. The conventional eQTL analysis is to perform individual tests for each gene-SNP pair using simple linear regression and to perform the test on each tissue separately ignoring the extensive information known about RNA expression in other tissue(s). Although Bayesian models have been recently developed to improve eQTL prediction on multiple tissues, they are often based on uninformative priors or treat all tissues equally. In this study, we develop a novel tissue augmented Bayesian model for eQTL analysis (TA-eQTL), which takes prior eQTL information from a different tissue into account to better predict eQTL for another tissue. We demonstrate that our modified Bayesian model has comparable performance to several existing methods in terms of sensitivity and specificity using allele-specific expression (ASE) as the gold standard. Furthermore, the tissue augmented Bayesian model improves the power and accuracy for local-eQTL prediction especially when the sample size is small. In summary, TA-eQTL's performance is comparable to existing methods but has additional flexibility to evaluate data from different platforms, can focus prediction on one tissue using only summary statistics from the secondary tissue(s), and provides a closed form solution for estimation. Keywords: Bayesian model; eQTL; allele-specific expression.
48	2018	Post-genomic behavioral genetics: From revolution to routine	Ashbrook DG, Mulligan MK, Williams RW	Genes Brain Behav. 2018 Mar;17(3):e12441.	PMID: 29193773		What was once expensive and revolutionary, full-genome sequence, is now affordable and routine. Costs will continue to drop, opening up new frontiers in behavioral genetics. This shift in costs from the genome to the phenome is most notable in large clinical studies of behavior and associated diseases in cohorts that exceed hundreds of thousands of subjects. Examples include the Women's Health Initiative (www.whi.org), the Million Veterans Program (www.research.va.gov/MVP), the 100 000 Genomes Project (genomicsengland.co.uk) and commercial efforts such as those by deCode (www.decode.com) and 23andme (www.23andme.com). The same transition is happening in experimental neuro- and behavioral genetics, and sample sizes of many hundreds of cases are becoming routine (www.genenetwork.org, www.mousephenotyping.org). There are two major consequences of this new affordability of massive omics datasets: (1) it is now far more practical to explore genetic modulation of behavioral differences and the key role of gene-by-environment interactions. Researchers are already doing the hard part: the quantitative analysis of behavior. Adding the omics component can provide powerful links to molecules, cells, circuits and even better treatment. (2) There is an acute need to highlight and train behavioral scientists in how best to exploit new omics approaches. This review addresses this second issue and highlights several new trends and opportunities that will be of interest to experts in animal and human behaviors. Keywords: GWAS; PheWAS; QTL mapping; behavior; complex traits; genomics; omics; quantitative trait locus; reverse genetics; systems genetics.
49	2018	Voluntary exposure to a toxin: the genetic influence on ethanol consumption	Hoffman PL, Saba LM, Vanderlinden LA, Tabakoff B	Mamm Genome. 2018 Feb;29(1–2):128–140.	PMID: 29196862		Ethyl alcohol is a toxin that, when consumed at high levels, produces organ damage and death. One way to prevent or ameliorate this damage in humans is to reduce the exposure of organs to alcohol by reducing alcohol ingestion. Both the propensity to consume large volumes of alcohol and the susceptibility of human organs to alcohol-induced damage exhibit a strong genetic influence. We have developed an integrative genetic/genomic approach to identify transcriptional networks that predispose complex traits, including propensity for alcohol consumption and propensity for alcohol-induced organ damage. In our approach, the phenotype is assessed in a panel of recombinant inbred (RI) rat strains, and quantitative trait locus (QTL) analysis is performed. Transcriptome data from tissues/organs of naïve RI rat strains are used to identify transcriptional networks using Weighted Gene Coexpression Network Analysis (WGCNA). Correlation of the first principal component of transcriptional coexpression modules with the phenotype across the rat strains, and overlap of QTLs for the phenotype and the QTLs for the coexpression modules (module eigengene QTL) provide the criteria for identification of the functionally related groups of genes that contribute to the phenotype (candidate modules). While we previously identified a brain transcriptional module whose QTL overlapped with a QTL for levels of alcohol consumption in HXB/BXH RI rat strains and 12 selected rat lines, this module did not account for all of the genetic variation in alcohol consumption. 
Our search for QTL overlap and correlation of coexpression modules with phenotype can, however, be applied to any organ in which the transcriptome has been measured, and this represents a holistic approach in the search for genetic contributors to complex traits. Previous work has implicated liver/brain interactions, particularly involving inflammatory/immune processes, as influencing alcohol consumption levels. We have now analyzed the liver transcriptome of the HXB/BXH RI rat panel in relation to the behavioral trait of alcohol consumption. We used RNA-Seq and microarray data to construct liver transcriptional networks, and identified a liver candidate transcriptional coexpression module that explained 24% of the genetic variance in voluntary alcohol consumption. The transcripts in this module focus attention on liver secretory products that influence inflammatory and immune signaling pathways. We propose that these liver secretory products can interact with brain mechanisms that affect alcohol consumption, and targeting these pathways provides a potential approach to reducing high levels of alcohol intake and also protecting the integrity of the liver and other organs.
50	2018	Reproducibility and replicability of rodent phenotyping in preclinical studies	Kafkafi N, Agassi J, Chesler EJ, Crabbe JC, Crusio WE, Eilam D, Gerlai R, Golani I, Gomez-Marin A, Heller R, Iraqi F, Jaljuli I, Karp NA, Morgan H, Nicholson G, Pfaff DW, Richter SH, Stark PB, Stiedl O, Stodden V, Tarantino LM, Tucci V, Valdar W, Williams RW, Würbel H, Benjamini Y	Neurosci Biobehav Rev. 2018 Apr;87:218–232.	PMID: 29357292		The scientific community is increasingly concerned with the proportion of published "discoveries" that are not replicated in subsequent studies. The field of rodent behavioral phenotyping was one of the first to raise this concern, and to relate it to other methodological issues: the complex interaction between genotype and environment; the definitions of behavioral constructs; and the use of laboratory mice and rats as model species for investigating human health and disease mechanisms. In January 2015, researchers from various disciplines gathered at Tel Aviv University to discuss these issues. The general consensus was that the issue is prevalent and of concern, and should be addressed at the statistical, methodological and policy levels, but is not so severe as to call into question the validity and the usefulness of model organisms as a whole. Well-organized community efforts, coupled with improved data and metadata sharing, have a key role in identifying specific problems and promoting effective solutions. Replicability is closely related to validity, may affect generalizability and translation of findings, and has important ethical implications. Keywords: Data sharing; False discoveries; GxE interaction; Heterogenization; Replicability; Reproducibility; Validity.
51	2018	Unsupervised, Statistically Based Systems Biology Approach for Unraveling the Genetics of Complex Traits: A Demonstration with Ethanol Metabolism	Lusk R, Saba LM, Vanderlinden LA, Zidek V, Silhavy J, Pravenec M, Hoffman PL, Tabakoff B	Alcohol Clin Exp Res. 2018 Jul;42(7):1177–1191.	PMID: 29689131		Background: A statistical pipeline was developed and used for determining candidate genes and candidate gene coexpression networks involved in 2 alcohol (i.e., ethanol [EtOH]) metabolism phenotypes, namely alcohol clearance and acetate area under the curve in a recombinant inbred (RI) (HXB/BXH) rat panel. The approach was also used to provide an indication of how EtOH metabolism can impact the normal function of the identified networks. Methods: RNA was extracted from alcohol-naïve liver tissue of 30 strains of HXB/BXH RI rats. The reconstructed transcripts were quantitated, and data were used to construct gene coexpression modules and networks. A separate group of rats, comprising the same 30 strains, were injected with EtOH (2 g/kg) for measurement of blood EtOH and acetate levels. These data were used for quantitative trait loci (QTL) analysis of the rate of EtOH disappearance and circulating acetate levels. The analysis pipeline required calculation of the module eigengene values, the correlation of these values with EtOH metabolism rates and acetate levels across the rat strains, and the determination of the eigengene QTLs. For a module to be considered a candidate for determining phenotype, the module eigengene values had to have significant correlation with the strain phenotypic values and the module eigengene QTLs had to overlap the phenotypic QTLs. Results: Of the 658 transcript coexpression modules generated from liver RNA sequencing data, a single module satisfied all criteria for being a candidate for determining the alcohol clearance trait. 
This module contained 2 alcohol dehydrogenase genes, including the gene whose product was previously shown to be responsible for the majority of alcohol elimination in the rat. This module was also the only module identified as a candidate for influencing circulating acetate levels. This module was also linked to the process of generation and utilization of retinoic acid as related to the autonomous immune response. Conclusions: We propose that our analytical pipeline can successfully identify genetic regions and transcripts which predispose a particular phenotype and our analysis provides functional context for coexpression module components. Keywords: Alcohol Metabolism; HXB/BXH Recombinant Inbred Rat Panel; Liver; Quantitative Trait Locus Mapping; RNA Sequencing; Weighted Gene Coexpression Network Analysis.
52	2018	Condition-adaptive fused graphical lasso (CFGL): An adaptive procedure for inferring condition-specific gene co-expression network	Lyu Y, Xue L, Zhang F, Koch H, Saba L, Kechris K, Li Q	PLoS Comput Biol. 2018 Sep;14(9):e1006436.	PMID: 30240439	Software link	Co-expression network analysis provides useful information for studying gene regulation in biological processes. Examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. One challenge in this type of analysis is that the sample sizes in each condition are usually small, making the statistical inference of co-expression patterns highly underpowered. A joint network construction that borrows information from related structures across conditions has the potential to improve the power of the analysis. One possible approach to constructing the co-expression network is to use the Gaussian graphical model. Though several methods are available for joint estimation of multiple graphical models, they do not fully account for the heterogeneity between samples and between co-expression patterns introduced by condition specificity. Here we develop the condition-adaptive fused graphical lasso (CFGL), a data-driven approach to incorporate condition specificity in the estimation of co-expression networks. We show that this method improves the accuracy with which networks are learned. The application of this method on a rat multi-tissue dataset and The Cancer Genome Atlas (TCGA) breast cancer dataset provides interesting biological insights. In both analyses, we identify numerous modules enriched for Gene Ontology functions and observe that the modules that are upregulated in a particular condition are often involved in condition-specific activities. 
Interestingly, we observe that the genes strongly associated with survival time in the TCGA dataset are less likely to be network hubs, suggesting that genes associated with cancer progression are likely to govern specific functions or execute final biological functions in pathways, rather than regulating a large number of biological processes. Additionally, we observed that the tumor-specific hub genes tend to have few shared edges with normal tissue, revealing tumor-specific regulatory mechanisms.
53	2018	Predictive modeling of miRNA-mediated predisposition to alcohol-related phenotypes in mouse	Rudra P, Shi WJ, Russell P, Vestal B, Tabakoff B, Hoffman P, Kechris K, Saba L	BMC Genomics. 2018 Aug 29;19(1):639.	PMID: 30157779		Background: MicroRNAs (miRNAs) are small non-coding RNAs that bind messenger RNAs and promote their degradation or repress their translation. There is increasing evidence of miRNAs playing an important role in alcohol-related disorders. However, the role of miRNAs as mediators of the genetic effect on alcohol phenotypes is not fully understood. We conducted a high-throughput sequencing study to measure miRNA expression levels in alcohol-naïve animals in the LXS panel of recombinant inbred (RI) mouse strains. We then combined the sequencing data with genotype data, microarray gene expression data, and data on alcohol-related behavioral phenotypes such as 'Drinking in the dark', 'Sleep time', and 'Low dose activation' from the same RI panel. SNP-miRNA-gene triplets with strong association within the triplet that were also associated with one of the 4 alcohol phenotypes were selected and a Bayesian network analysis was used to aggregate results into a directed network model. Results: We found several triplets with strong association within the triplet that were also associated with one of the alcohol phenotypes. The Bayesian network analysis found two networks where a miRNA mediates the genetic effect on the alcohol phenotype. The miRNAs were found to influence the expression of protein-coding genes, which in turn influences the quantitative phenotypes. The pathways in which these genes are enriched have been previously associated with alcohol-related traits. Conclusion: This work enhances association studies by identifying miRNAs that may be mediating the association between genetic markers (SNPs) and the alcohol phenotypes. 
It suggests a mechanism of how genetic variants are affecting traits of interest through the modification of miRNA expression. Keywords: Bayesian network; Ethanol; Systems genetics; microRNA.
54	2018	miR-MaGiC improves quantification accuracy for small RNA-seq	Russell PH, Vestal B, Shi W, Rudra PD, Dowell R, Radcliffe R, Saba L, Kechris K	BMC Res Notes. 2018 May 15;11(1):296.	PMID: 29764489	Software link	Objective: Many tools have been developed to profile microRNA (miRNA) expression from small RNA-seq data. These tools must contend with several issues: the small size of miRNAs, the small number of unique miRNAs, the fact that similar miRNAs can be transcribed from multiple loci, and the presence of miRNA isoforms known as isomiRs. Methods failing to address these issues can return misleading information. We propose a novel quantification method designed to address these concerns. Results: We present miR-MaGiC, a novel miRNA quantification method, implemented as a cross-platform tool in Java. miR-MaGiC performs stringent mapping to a core region of each miRNA and defines a meaningful set of target miRNA sequences by collapsing the miRNA space to "functional groups". We hypothesize that these two features, mapping stringency and collapsing, provide more optimal quantification to a more meaningful unit (i.e., miRNA family). We test miR-MaGiC and several published methods on 210 small RNA-seq libraries, evaluating each method's ability to accurately reflect global miRNA expression profiles. We define accuracy as total counts close to the total number of input reads originating from miRNAs. We find that miR-MaGiC, which incorporates both stringency and collapsing, provides the most accurate counts. Keywords: Expression quantification; MicroRNA; Small RNA-seq; miRNA.
55	2018	miRNA-regulated transcription associated with mouse strains predisposed to hypnotic effects of ethanol	Vestal B, Russell P, Radcliffe RA, Bemis L, Saba LM, Kechris K	Brain Behav. 2018 Jun;8(6):e00989.	PMID: 30106247		Introduction: Studying innate sensitivity to ethanol can be an important first step toward understanding alcohol use disorders. In brain, we investigated transcripts, with evidence of miRNA modulation related to a predisposition to the hypnotic effect of ethanol, as measured by loss of righting reflex (LORR). Methods: Expression of miRNAs (12 samples) and expression of mRNAs (353 samples) in brain were independently analyzed for an association with LORR in mice from the LXS recombinant inbred panel gathered across several small studies. These results were then integrated via a meta-analysis of miRNA-mRNA target pairs identified in miRNA-target interaction databases. Results: We found 112 significant miRNA-mRNA pairs where a large majority of miRNAs and mRNAs were highly interconnected. Most pairs indicated a pattern of increased levels of miRNAs and reduced levels of mRNAs being associated with more alcohol-sensitive strains. For example, CaMKIIn1 was targeted by multiple miRNAs associated with LORR. CAMK2N1 is an inhibitor of CAMK2, which among other functions, phosphorylates, or binds to GABAA and NMDA receptors. Conclusions: Our results suggest a novel role of miRNA-mediated regulation of an inhibitor of CAMK2 and its downstream targets including the GABAA and NMDA receptors, which have been previously implicated to have a role in ethanol-induced sedation and sensitivity. Keywords: RNA sequencing; calcium/calmodulin‐dependent protein kinase II inhibitor 1; ethanol sensitivity; loss of righting reflex; mRNA expression; microRNA expression; microRNA target interactions; mmu‐miR‐106b; mouse brain; recombinant Inbred mouse strains.