Statistical Methods for Whole Transcriptome Sequencing

Author :
Publisher :
Total Pages : 0
Release :
ISBN-10 : OCLC:1334941765
ISBN-13 :
Rating : 4/5 (65 Downloads)

Synopsis Statistical Methods for Whole Transcriptome Sequencing by : Cheng Jia

RNA-Sequencing (RNA-Seq) has enabled detailed, unbiased profiling of whole transcriptomes with incredible throughput. Recent technological breakthroughs have pushed back the frontiers of RNA expression measurement to the single-cell level (scRNA-Seq). With both bulk and single-cell RNA-Seq analyses, modeling of the noise structure embedded in the data is crucial for drawing correct inference. In this dissertation, I developed a series of statistical methods to account for the technical variations specific to RNA-Seq experiments in the context of isoform- or gene-level differential expression analyses. In the first part of my dissertation, I developed MetaDiff (https://github.com/jiach/MetaDiff), a random-effects meta-regression model that allows the incorporation of uncertainty in isoform expression estimation in isoform differential expression analysis. This framework was further extended to detect splicing quantitative trait loci with RNA-Seq data. In the second part of my dissertation, I developed TASC (Toolkit for Analysis of Single-Cell data; https://github.com/scrna-seq/TASC), a hierarchical mixture model, to explicitly adjust for cell-to-cell technical differences in scRNA-Seq analysis using an empirical Bayes approach. This framework can be adapted to perform differential gene expression analysis. In the third part of my dissertation, I developed TASC-B, a method extended from TASC to model transcriptional bursting-induced zero-inflation. This model can identify and test for the difference in the level of transcriptional bursting. Compared to existing methods, these new tools that I developed have been shown to better control the false discovery rate in situations where technical noise cannot be ignored. They also display superior power in both our simulation studies and real-world applications.
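As a rough illustration of how a random-effects meta-regression can carry per-sample estimation uncertainty into a differential expression test, the sketch below fits a weighted regression in which each sample's weight is the inverse of its own estimation variance plus a between-sample variance. This is not MetaDiff's implementation: the function name `weighted_meta_regression` is hypothetical, the between-sample variance `tau2` is treated as a known input (a real method would estimate it from the data), and only a single covariate is handled.

```python
import math


def weighted_meta_regression(x, y, se, tau2=0.0):
    """Fit y_i = a + b * x_i with weights 1 / (se_i^2 + tau2).

    x    : covariate per sample (e.g. 0/1 for two conditions)
    y    : estimated log expression per sample
    se   : standard error of each expression estimate
    tau2 : assumed between-sample variance (illustrative input, not estimated here)
    Returns (intercept, slope, standard error of the slope).
    """
    # Samples with noisier expression estimates receive smaller weights.
    w = [1.0 / (s * s + tau2) for s in se]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        raise ValueError("covariate has no variation across samples")
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # With inverse-variance weights, Var(slope) = 1 / sxx.
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope


if __name__ == "__main__":
    # Hypothetical data: two samples per condition; the second sample is very uncertain.
    groups = [0, 0, 1, 1]
    log_expr = [1.0, 5.0, 2.0, 2.0]
    std_err = [0.1, 10.0, 0.1, 0.1]
    a, b, se_b = weighted_meta_regression(groups, log_expr, std_err)
    print("intercept:", a, "slope:", b, "z:", b / se_b)
```

In this toy input the second sample has a large standard error, so it contributes almost nothing to the condition-0 mean. An unweighted comparison would instead treat its outlying value as equally reliable.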

Statistical Analysis of Next Generation Sequencing Data

Author :
Publisher : Springer
Total Pages : 438
Release :
ISBN-10 : 3319072129
ISBN-13 : 9783319072128
Rating : 4/5 (28 Downloads)

Synopsis Statistical Analysis of Next Generation Sequencing Data by : Somnath Datta

Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized medicine. About the editors: Somnath Datta is Professor and Vice Chair of Bioinformatics and Biostatistics at the University of Louisville. He is Fellow of the American Statistical Association, Fellow of the Institute of Mathematical Statistics and Elected Member of the International Statistical Institute. He has contributed to numerous research areas in Statistics, Biostatistics and Bioinformatics. Dan Nettleton is Professor and Laurence H. Baker Endowed Chair of Biological Statistics in the Department of Statistics at Iowa State University. He is Fellow of the American Statistical Association and has published research on a variety of topics in statistics, biology and bioinformatics.

Handbook of Statistical Genomics

Author :
Publisher : John Wiley & Sons
Total Pages : 1740
Release :
ISBN-10 : 1119429250
ISBN-13 : 9781119429258
Rating : 4/5 (58 Downloads)

Synopsis Handbook of Statistical Genomics by : David J. Balding

A timely update of a highly popular handbook on statistical genomics. This new, two-volume edition of a classic text provides a thorough introduction to statistical genomics, a vital resource for advanced graduate students, early-career researchers and new entrants to the field. It introduces new and updated information on developments that have occurred since the 3rd edition. Widely regarded as the reference work in the field, it features new chapters focusing on statistical aspects of data generated by new sequencing technologies, including sequence-based functional assays. It expands on previous coverage of the many processes between genotype and phenotype, including gene expression and epigenetics, as well as metabolomics. It also examines population genetics and evolutionary models and inference, with new chapters on the multi-species coalescent, admixture and ancient DNA, as well as genetic association studies including causal analyses and variant interpretation. The Handbook of Statistical Genomics focuses on explaining the main ideas, analysis methods and algorithms, citing key recent and historic literature for further details and references. It also includes a glossary of terms, acronyms and abbreviations, and features extensive cross-referencing between chapters, tying the different areas together. With heavy use of up-to-date examples and references to web-based resources, this continues to be a must-have reference in a vital area of research.
Provides much-needed, timely coverage of new developments in this expanding area of study
Numerous, brand new chapters, for example covering bacterial genomics, microbiome and metagenomics
Detailed coverage of application areas, with chapters on plant breeding, conservation and forensic genetics
Extensive coverage of human genetic epidemiology, including ethical aspects
Edited by one of the leading experts in the field along with rising stars as his co-editors
Chapter authors are world-renowned experts in the field, and newly emerging leaders
The Handbook of Statistical Genomics is an excellent introductory text for advanced graduate students and early-career researchers involved in statistical genetics.

RNA-seq Data Analysis

Author :
Publisher : CRC Press
Total Pages : 314
Release :
ISBN-10 : 1466595019
ISBN-13 : 9781466595019
Rating : 4/5 (19 Downloads)

Synopsis RNA-seq Data Analysis by : Eija Korpelainen

The State of the Art in Transcriptome Analysis. RNA sequencing (RNA-seq) data offers unprecedented information about the transcriptome, but harnessing this information with bioinformatics tools is typically a bottleneck. RNA-seq Data Analysis: A Practical Approach enables researchers to examine differential expression at gene, exon, and transcript le

Statistical Methods for RNA-sequencing Data

Author :
Publisher :
Total Pages : 0
Release :
ISBN-10 : OCLC:1232108046
ISBN-13 :
Rating : 4/5 (46 Downloads)

Synopsis Statistical Methods for RNA-sequencing Data by : Rhonda Bacher

Major methodological and technological advances in sequencing have inspired ambitious biological questions that were previously elusive. Addressing such questions with novel and complex data requires statistically rigorous tools. In this dissertation, I develop, evaluate, and apply statistical and computational methods for analysis of high-throughput sequencing data. A unifying theme of this work is that all these methods are aimed at RNA-seq data. The first method focuses on characterizing gene expression in RNA-seq experiments with ordered conditions. The second focuses on single-cell RNA-seq data, where we develop a method for normalization to account for a previously unknown technical artifact in the data. Finally, we develop a simulation in order to recapitulate the source of the artifact in silico.

Statistical Methods for Bulk and Single-cell RNA Sequencing Data

Author :
Publisher :
Total Pages : 207
Release :
ISBN-10 : OCLC:1103714866
ISBN-13 :
Rating : 4/5 (66 Downloads)

Synopsis Statistical Methods for Bulk and Single-cell RNA Sequencing Data by : Wei Li

Since the invention of next-generation RNA sequencing (RNA-seq) technologies, they have become a powerful tool to study the presence and quantity of RNA molecules in biological samples and have revolutionized transcriptomic studies on bulk tissues. Recently, the emerging single-cell RNA sequencing (scRNA-seq) technologies enable the investigation of transcriptomic landscapes at a single-cell resolution, providing a chance to characterize stochastic heterogeneity within a cell population. The analysis of bulk and single-cell RNA-seq data at four different levels (samples, genes, transcripts, and exons) involves multiple statistical and computational questions, some of which remain challenging to date. The first part of this dissertation focuses on the statistical challenges in the transcript-level analysis of bulk RNA-seq data. The next-generation RNA-seq technologies have been widely used to assess full-length RNA isoform structure and abundance in a high-throughput manner, enabling us to better understand the alternative splicing process and transcriptional regulation mechanism. However, accurate isoform identification and quantification from RNA-seq data are challenging due to the information loss in sequencing experiments. In Chapter 2, given the fast accumulation of multiple RNA-seq datasets from the same biological condition, we develop a statistical method, MSIQ, to achieve more accurate isoform quantification by integrating multiple RNA-seq samples under a Bayesian framework. The MSIQ method aims to (1) identify a consistent group of samples with homogeneous quality and (2) improve isoform quantification accuracy by jointly modeling multiple RNA-seq samples and allowing for higher weights on the consistent group. We show that MSIQ provides a consistent estimator of isoform abundance, and we demonstrate the accuracy of MSIQ compared with alternative methods through both simulation and real data studies.
In Chapter 3, we introduce a novel method, AIDE, the first approach that directly controls false isoform discoveries by implementing the statistical model selection principle. Solving the isoform discovery problem in a stepwise manner, AIDE prioritizes the annotated isoforms and precisely identifies novel isoforms whose addition significantly improves the explanation of observed RNA-seq reads. Our results demonstrate that AIDE has the highest precision compared to the state-of-the-art methods, and it is able to identify isoforms with biological functions in pathological conditions. The second part of this dissertation discusses two statistical methods to improve scRNA-seq data analysis, which is complicated by the excess missing values, the so-called dropouts due to low amounts of mRNA sequenced within individual cells. In Chapter 5, we introduce scImpute, a statistical method to accurately and robustly impute the dropouts in scRNA-seq data. The scImpute method automatically identifies likely dropouts, and only performs imputation on these values by borrowing information across similar cells. Evaluation based on both simulated and real scRNA-seq data suggests that scImpute is an effective tool to recover transcriptome dynamics masked by dropouts, enhance the clustering of cell subpopulations, and improve the accuracy of differential expression analysis. In Chapter 6, we propose a flexible and robust simulator, scDesign, to optimize the choices of sequencing depth and cell number in designing scRNA-seq experiments, so as to balance the exploration of the depth and breadth of transcriptome information. It is the first statistical framework for researchers to quantitatively assess practical scRNA-seq experimental design in the context of differential gene expression analysis. In addition to experimental design, scDesign also assists computational method development by generating high-quality synthetic scRNA-seq datasets under customized experimental settings.

Statistical Methods for the Analysis of Genomic Data

Author :
Publisher : MDPI
Total Pages : 136
Release :
ISBN-10 : 3039361406
ISBN-13 : 9783039361403
Rating : 4/5 (03 Downloads)

Synopsis Statistical Methods for the Analysis of Genomic Data by : Hui Jiang

In recent years, technological breakthroughs have greatly enhanced our ability to understand the complex world of molecular biology. Rapid developments in genomic profiling techniques, such as high-throughput sequencing, have brought new opportunities and challenges to the fields of computational biology and bioinformatics. Furthermore, by combining genomic profiling techniques with other experimental techniques, many powerful approaches (e.g., RNA-Seq, ChIP-Seq, single-cell assays, and Hi-C) have been developed in order to help explore complex biological systems. As a result of the increasing availability of genomic datasets, in terms of both volume and variety, the analysis of such data has become a critical challenge as well as a topic of great interest. Therefore, statistical methods that address the problems associated with these newly developed techniques are in high demand. This book includes a number of studies that highlight the state-of-the-art statistical methods for the analysis of genomic data and explore future directions for improvement.

Handbook of Statistical Genomics

Author :
Publisher : John Wiley & Sons
Total Pages : 1223
Release :
ISBN-10 : 1119429145
ISBN-13 : 9781119429142
Rating : 4/5 (42 Downloads)

Synopsis Handbook of Statistical Genomics by : David J. Balding

A timely update of a highly popular handbook on statistical genomics. This new, two-volume edition of a classic text provides a thorough introduction to statistical genomics, a vital resource for advanced graduate students, early-career researchers and new entrants to the field. It introduces new and updated information on developments that have occurred since the 3rd edition. Widely regarded as the reference work in the field, it features new chapters focusing on statistical aspects of data generated by new sequencing technologies, including sequence-based functional assays. It expands on previous coverage of the many processes between genotype and phenotype, including gene expression and epigenetics, as well as metabolomics. It also examines population genetics and evolutionary models and inference, with new chapters on the multi-species coalescent, admixture and ancient DNA, as well as genetic association studies including causal analyses and variant interpretation. The Handbook of Statistical Genomics focuses on explaining the main ideas, analysis methods and algorithms, citing key recent and historic literature for further details and references. It also includes a glossary of terms, acronyms and abbreviations, and features extensive cross-referencing between chapters, tying the different areas together. With heavy use of up-to-date examples and references to web-based resources, this continues to be a must-have reference in a vital area of research.
Provides much-needed, timely coverage of new developments in this expanding area of study
Numerous, brand new chapters, for example covering bacterial genomics, microbiome and metagenomics
Detailed coverage of application areas, with chapters on plant breeding, conservation and forensic genetics
Extensive coverage of human genetic epidemiology, including ethical aspects
Edited by one of the leading experts in the field along with rising stars as his co-editors
Chapter authors are world-renowned experts in the field, and newly emerging leaders
The Handbook of Statistical Genomics is an excellent introductory text for advanced graduate students and early-career researchers involved in statistical genetics.

Mathematical and Statistical Methods for Genetic Analysis

Author :
Publisher : Springer Science & Business Media
Total Pages : 277
Release :
ISBN-10 : 1475727399
ISBN-13 : 9781475727395
Rating : 4/5 (95 Downloads)

Synopsis Mathematical and Statistical Methods for Genetic Analysis by : Kenneth Lange

Geneticists now stand on the threshold of sequencing the genome in its entirety. The unprecedented insights into human disease and evolution offered by mapping and sequencing are transforming medicine and agriculture. This revolution depends vitally on the contributions made by applied mathematicians, statisticians, and computer scientists. Kenneth Lange has written a book to enable graduate students in the mathematical sciences to understand and model the epidemiological and experimental data encountered in genetics research. Mathematical, statistical, and computational principles relevant to this task are developed hand-in-hand with applications to gene mapping, risk prediction, and the testing of epidemiological hypotheses. The book covers many topics previously only accessible in journal articles, such as pedigree analysis algorithms, Markov chain Monte Carlo methods, reconstruction of evolutionary trees, radiation hybrid mapping, and models of recombination. The whole is backed by numerous exercise sets.