Generalized Kernel Two Sample Tests

Generalized Kernel Equating with Applications in R

Generalized Kernel Equating is a comprehensive guide for statisticians, psychometricians, and educational researchers aiming to master test score equating. This book introduces the Generalized Kernel Equating (GKE) framework, providing the necessary tools and methodologies for accurate and fair score comparisons. The book presents test score equating as a statistical problem and covers all commonly used data collection designs. It details the five steps of the GKE framework: presmoothing, estimating score probabilities, continuization, equating transformation, and evaluating the equating transformation. Various presmoothing strategies are explored, including log-linear models, item response theory models, beta4 models, and discrete kernel estimators. The estimation of score probabilities when using IRT models is described, and Gaussian kernel continuization is extended to other kernels such as uniform, logistic, Epanechnikov, and adaptive kernels. Several bandwidth selection methods are described. The kernel equating transformation and its variants are defined, and both equating-specific and statistical measures for evaluating equating transformations are included. Real data examples guide readers through the GKE steps, with detailed R code and explanations. Readers are equipped with advanced knowledge and the practical skills needed to implement test score equating methods.
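The continuization and equating-transformation steps described above can be illustrated with a minimal Python sketch (a simplified stand-in for the book's R code, not taken from it): two hypothetical discrete score distributions are continuized with a Gaussian kernel, and the equating transformation e_Y(x) = G_h^{-1}(F_h(x)) is computed by bisection. The toy score distributions and the bandwidth are made up, and the mean- and variance-preserving linear adjustment used in full kernel equating is omitted for brevity.

```python
import math

def gaussian_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def continuize(scores, probs, h):
    # Simplified Gaussian-kernel continuization of a discrete score
    # distribution: F_h(x) = sum_j r_j * Phi((x - x_j) / h).
    # (Full kernel equating also rescales so that the continuized
    # distribution keeps the discrete mean and variance; omitted here.)
    def F(x):
        return sum(r * gaussian_cdf((x - xj) / h) for xj, r in zip(scores, probs))
    return F

def invert(G, target, lo, hi, tol=1e-8):
    # Numerically invert a monotone CDF by bisection.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equate(x, F, G, lo, hi):
    # Kernel equating transformation: e_Y(x) = G^{-1}(F(x)).
    return invert(G, F(x), lo, hi)

# Hypothetical score distributions on test forms X and Y (0..5 points).
scores = [0, 1, 2, 3, 4, 5]
probs_X = [0.05, 0.15, 0.30, 0.30, 0.15, 0.05]
probs_Y = [0.10, 0.20, 0.30, 0.25, 0.10, 0.05]  # form Y is slightly harder

F = continuize(scores, probs_X, h=0.6)
G = continuize(scores, probs_Y, h=0.6)
for x in scores:
    print(f"score {x} on form X equates to {equate(x, F, G, -3, 8):.2f} on form Y")
```

The resulting transformation is monotone in x, as an equating function must be; the remaining GKE steps (presmoothing and evaluation) would precede and follow this computation.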
Theory of Rank Tests

The first edition of Theory of Rank Tests (1967) was the precursor to a unified and theoretically motivated treatise on the basic theory of tests based on ranks of the sample observations. For more than 25 years, it helped raise a generation of statisticians, cultivating their theoretical research in this fertile area as well as their use of these tools in application-oriented research. The present edition aims not only to revive this classical text by updating its findings, but also to incorporate several other important areas that were either not properly developed before 1965 or have gone through an evolutionary development during the past 30 years. This edition therefore aims to fulfill the needs of academic as well as professional statisticians who want to pursue nonparametrics in their academic projects, consultation, and applied research. Topics include:
- Asymptotic Methods
- Nonparametrics
- Convergence of Probability Measures
- Statistical Inference
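As a concrete example of a test based on the ranks of the sample observations, here is a minimal Python sketch of the Wilcoxon rank-sum test (an illustration, not code from the book). It assumes no ties in the pooled sample and uses the large-sample normal approximation for the two-sided p-value; the two samples are made up.

```python
import math

def rank_sum_test(x, y):
    # Wilcoxon rank-sum test (no ties assumed): rank the pooled sample,
    # sum the ranks belonging to x, and compare with the null expectation
    # via the large-sample normal approximation.
    m, n = len(x), len(y)
    pooled = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    W = sum(rank[v] for v in x)
    mean = m * (m + n + 1) / 2.0
    var = m * n * (m + n + 1) / 12.0
    z = (W - mean) / math.sqrt(var)
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return W, z, p

# Hypothetical samples: y is shifted upward relative to x.
x = [1.1, 2.3, 1.9, 0.8, 1.5]
y = [2.9, 3.4, 2.7, 3.8, 3.1]
W, z, p = rank_sum_test(x, y)
print(f"W = {W}, z = {z:.2f}, two-sided p = {p:.4f}")
```

Because the statistic depends on the data only through ranks, the test is distribution-free under the null hypothesis of identically distributed samples, which is the structural property the book's theory builds on.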
Prediction Games

Author: Michael Brückner
Language: en
Publisher: Universitätsverlag Potsdam
Release Date: 2012
In many applications one is faced with the problem of inferring some functional relation between input and output variables from given data. Consider, for instance, the task of email spam filtering, where one seeks a model that automatically assigns new, previously unseen emails to the class spam or non-spam. Building such a predictive model based on observed training inputs (e.g., emails) with corresponding outputs (e.g., spam labels) is a major goal of machine learning. Many learning methods assume that these training data are governed by the same distribution as the test data to which the predictive model will be exposed at application time. That assumption is violated when the test data are generated in response to the presence of a predictive model. This becomes apparent, for instance, in the above example of email spam filtering: email service providers employ spam filters, and spam senders engineer campaign templates so as to achieve a high rate of successful deliveries despite any filters. Most of the existing work casts such situations as learning robust models that are insensitive to small changes of the data generation process. The models are constructed under the worst-case assumption that these changes are chosen to produce the highest possible adverse effect on the performance of the predictive model. However, this approach is not capable of realistically modeling the true dependency between the model-building process and the process of generating future data. We therefore establish the concept of prediction games: we model the interaction between a learner, who builds the predictive model, and a data generator, who controls the process of data generation, as a one-shot game. The game-theoretic framework enables us to explicitly model the players' interests, their possible actions, their level of knowledge about each other, and the order in which they decide on an action.
We model the players' interests as minimizing their own cost functions, each of which depends on both players' actions. The learner's action is to choose the model parameters; the data generator's action is to perturb the training data, which reflects the modification of the data generation process relative to the past data. We extensively study three instances of prediction games, which differ in the order in which the players decide on their actions. We first assume that both players choose their actions simultaneously, that is, without knowledge of their opponent's decision. We identify conditions under which this Nash prediction game has a meaningful solution, that is, a unique Nash equilibrium, and derive algorithms that find the equilibrium prediction model. As a second case, we consider a data generator who is potentially fully informed about the move of the learner. This setting establishes a Stackelberg competition. We derive a relaxed optimization criterion to determine the solution of this game and show that this Stackelberg prediction game generalizes existing prediction models. Finally, we study the setting where the learner observes the data generator's action, that is, the (unlabeled) test data, before building the predictive model. As the test data and the training data may be governed by differing probability distributions, this scenario reduces to learning under covariate shift. We derive both a new integrated method and a two-stage method to account for this data set shift. In case studies on email spam filtering we empirically explore the properties of all derived models as well as several existing baseline methods. We show that spam filters resulting from the Nash prediction game and the Stackelberg prediction game outperform the other baseline methods in the majority of cases.
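The simultaneous-move (Nash) setting described above can be sketched with a deliberately simplified scalar game (an illustration only, not the thesis's actual model): each player minimizes its own quadratic cost, which depends on both actions, and iterated best responses converge to the unique Nash equilibrium, which for this toy is also available in closed form. All numbers (theta, rho, c) are made-up parameters.

```python
# Toy scalar Nash prediction game: the learner picks a model parameter w,
# the data generator picks a perturbation delta, and each minimizes its
# own cost, which depends on both players' actions.

theta = 2.0   # "true" target the unperturbed data would suggest
rho = 0.1     # learner's regularization weight
c = 3.0       # generator's cost of perturbing the data (c > 1 keeps it convex)

def learner_best_response(delta):
    # argmin_w (w - (theta + delta))**2 + rho * w**2
    return (theta + delta) / (1.0 + rho)

def generator_best_response(w):
    # argmin_delta -(w - (theta + delta))**2 + c * delta**2
    # (the generator wants the learner's fit to be poor, at a cost for
    # deviating from the original data; convex in delta for c > 1)
    return (theta - w) / (c - 1.0)

# Best-response dynamics: alternate until neither player wants to deviate.
w, delta = 0.0, 0.0
for _ in range(200):
    w = learner_best_response(delta)
    delta = generator_best_response(w)

# Closed-form unique Nash equilibrium of this toy game, for comparison.
w_star = theta * c / ((1.0 + rho) * (c - 1.0) + 1.0)
print(f"iterated w = {w:.6f}, closed-form w* = {w_star:.6f}")
```

The iteration converges here because the composed best-response map is a contraction; the thesis's conditions for a unique Nash equilibrium play the analogous role in the full, non-scalar setting.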