Truncated Bayesian Nonparametrics


Many datasets can be thought of as expressing a collection of underlying traits with unknown cardinality. Moreover, these datasets are often persistently growing, and we expect the number of expressed traits to likewise increase over time. Priors from Bayesian nonparametrics are well-suited to this modeling challenge: they generate a countably infinite number of underlying traits, which allows the number of expressed traits both to be random and to grow with the dataset size. We also require corresponding streaming, distributed inference algorithms that handle persistently growing datasets without slowing down over time. However, a key ingredient in streaming, distributed inference, namely an explicit representation of the latent variables used to statistically decouple the data, is not available for nonparametric priors, as we cannot simulate or store infinitely many random variables in practice. One approach is to approximate the nonparametric prior by developing a sequential representation, in which the traits are generated by a sequence of finite-dimensional distributions, and then truncating it at some finite level, thus allowing explicit representation. However, truncated sequential representations have been developed only for a small number of priors in Bayesian nonparametrics, and the order they impose on the traits creates identifiability issues in the streaming, distributed setting.
This thesis provides a comprehensive theoretical treatment of sequential representations and truncation in Bayesian nonparametrics. It details three sequential representations of a large class of nonparametric priors, and analyzes their truncation error and computational complexity. The results generalize and improve upon those existing in the literature. Next, the truncated explicit representations are used to develop the first streaming, distributed, asynchronous inference procedures for models from Bayesian nonparametrics. The combinatorial issues associated with trait identifiability in such models are resolved via a novel matching optimization. The resulting algorithms are fast, learning rate-free, and truncation-free. Taken together, these contributions provide the practitioner with the means to (1) develop multiple finite approximations for a given nonparametric prior; (2) determine which is the best for their application; and (3) use that approximation in the development of efficient streaming, distributed, asynchronous inference algorithms.
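As a rough illustration of what truncating a sequential representation means, the sketch below uses the widely known stick-breaking construction of the Dirichlet process, cut off at a finite level K. This is an assumption chosen for familiarity, not one of the thesis's own constructions; the function name `truncated_stick_breaking`, the concentration `alpha`, the level `K`, and the standard normal base measure are all hypothetical choices.

```python
import numpy as np


def truncated_stick_breaking(alpha, K, rng):
    """Return K weights from a stick-breaking sequence truncated at level K.

    Illustrative only: Dirichlet process stick-breaking, not the thesis's method.
    """
    # Each v_k is the fraction broken off the stick that is still left.
    v = rng.beta(1.0, alpha, size=K)
    # Truncation step: the last break takes all remaining mass,
    # so the K weights sum to one.
    v[-1] = 1.0
    # Mass remaining before break k is the product of (1 - v_j) for j < k.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


rng = np.random.default_rng(0)
weights = truncated_stick_breaking(alpha=2.0, K=50, rng=rng)
atoms = rng.normal(size=50)  # assumed base measure: standard normal
```

The finite pair `(weights, atoms)` can be stored and simulated explicitly, which is the property the abstract says the infinite prior lacks. Larger `K` leaves less mass in the final, artificially closed stick.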
Bayesian Nonparametrics for Causal Inference and Missing Data

Bayesian Nonparametrics for Causal Inference and Missing Data provides an overview of flexible Bayesian nonparametric (BNP) methods for modeling joint or conditional distributions and functional relationships, and their interplay with causal inference and missing data. This book emphasizes the importance of making untestable assumptions to identify estimands of interest, such as the missing at random assumption for missing data and unconfoundedness for causal inference in observational studies. Unlike parametric methods, the BNP approach can account for possible violations of assumptions and minimize concerns about model misspecification. The overall strategy is to first specify BNP models for observed data and then to specify additional uncheckable assumptions to identify estimands of interest. The book is divided into three parts. Part I develops the key concepts in causal inference and missing data and reviews relevant concepts in Bayesian inference. Part II introduces the fundamental BNP tools required to address causal inference and missing data problems. Part III shows how the BNP approach can be applied in a variety of case studies. The datasets in the case studies come from electronic health records data, survey data, cohort studies, and randomized clinical trials.
Features
• Thorough discussion of both BNP and its interplay with causal inference and missing data
• How to use BNP and g-computation for causal inference and non-ignorable missingness
• How to derive and calibrate sensitivity parameters to assess sensitivity to deviations from uncheckable causal and/or missingness assumptions
• Detailed case studies illustrating the application of BNP methods to causal inference and missing data
• R code and/or packages to implement BNP in causal inference and missing data problems
The book is primarily aimed at researchers and graduate students from statistics and biostatistics. It will also serve as a useful practical reference for mathematically sophisticated epidemiologists and medical researchers.
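The g-computation idea mentioned in the feature list can be sketched in a few lines: fit a model for the outcome given treatment and confounders, predict every unit's outcome under each treatment level, and average the difference. The example below is only a minimal sketch under stated assumptions: it substitutes an ordinary least squares outcome model for the BNP models the book describes, uses simulated data, and assumes unconfoundedness given the single confounder `x`. All variable names and the data-generating process are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Simulated data (hypothetical): x confounds both treatment and outcome.
x = rng.normal(size=n)
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-x)))  # treatment depends on x
y = 2.0 * a + 1.5 * x + rng.normal(size=n)     # true treatment effect is 2.0

# Outcome model E[Y | A, X], fit by least squares
# (a stand-in for a BNP model of the observed data).
design = np.column_stack([np.ones(n), a, x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)


def predict(a_value):
    """Predicted outcome for every unit with treatment set to a_value."""
    d = np.column_stack([np.ones(n), np.full(n, a_value), x])
    return d @ coef


# g-computation: average the predicted contrast over the observed confounders.
ate = np.mean(predict(1.0) - predict(0.0))
```

With a more flexible outcome model, only the fitting and `predict` steps would change; the averaging over the observed confounder distribution stays the same.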
Bayesian Nonparametrics

Author: Nils Lid Hjort
Language: en
Publisher: Cambridge University Press
Release Date: 2010-04-12
Bayesian nonparametrics works, both theoretically and computationally. The theory provides highly flexible models whose complexity grows appropriately with the amount of data. Computational issues, though challenging, are no longer intractable. All that is needed is an entry point: this intelligent book is the perfect guide to what can seem a forbidding landscape. Tutorial chapters by Ghosal, Lijoi and Prünster, Teh and Jordan, and Dunson advance from theory, to basic models and hierarchical modeling, to applications and implementation, particularly in computer science and biostatistics. These are complemented by companion chapters by the editors and Griffin and Quintana, providing additional models, examining computational issues, identifying future growth areas, and giving links to related topics. This coherent text gives ready access both to underlying principles and to state-of-the-art practice. Specific examples are drawn from information retrieval, NLP, machine vision, computational biology, biostatistics, and bioinformatics.