Adaptivity in Data Stream Mining

Adaptive Stream Mining

This book is a significant contribution to the subject of mining time-changing data streams and addresses the design of learning algorithms for this purpose. It makes new contributions on several aspects of the problem, identifying research opportunities and increasing the scope for applications. It also includes an in-depth study of stream mining and a theoretical analysis of the proposed methods and algorithms. The first part is concerned with the use of an adaptive sliding window algorithm (ADWIN). Because ADWIN has rigorous performance guarantees, using it in place of counters or accumulators offers the possibility of extending such guarantees to learning and mining algorithms not initially designed for drifting data. Testing with several methods, including Naïve Bayes, clustering, decision trees and ensemble methods, is discussed as well. The second part of the book describes a formal study of connected acyclic graphs, or 'trees', from the point of view of closure-based mining, presenting efficient algorithms for subtree testing and for mining ordered and unordered frequent closed trees. Lastly, a general methodology to identify closed patterns in a data stream is outlined. This is applied to develop an incremental method, a sliding-window based method, and a method that mines closed trees adaptively from data streams. These are used to introduce classification methods for tree data streams.
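
To make the adaptive sliding window idea concrete, here is a minimal Python sketch, not the book's implementation. It keeps every value in the window instead of using the compressed bucket structure of the real ADWIN, and it assumes inputs lie in [0, 1]. The class name `SimpleAdaptiveWindow`, its methods, and the default `delta` are hypothetical names chosen for illustration. The cut test compares the means of the older and newer parts of the window against a Hoeffding-style threshold and drops the oldest element while some split exceeds it.

```python
import math
from collections import deque


class SimpleAdaptiveWindow:
    """Simplified, uncompressed adaptive window (illustrative only).

    Assumes values in [0, 1]. Real ADWIN stores buckets rather than
    raw values, which this sketch deliberately omits for clarity.
    """

    def __init__(self, delta=0.002):
        self.delta = delta      # confidence parameter (assumed default)
        self.window = deque()   # oldest value on the left

    def _cut_threshold(self, n0, n1):
        # Hoeffding-style bound using the harmonic mean of the two
        # sub-window sizes and a confidence split across window length.
        n = n0 + n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        delta_prime = self.delta / n
        return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))

    def add(self, x):
        """Add one value; return True if the window was shrunk."""
        self.window.append(x)
        changed = False
        shrunk = True
        while shrunk and len(self.window) >= 2:
            shrunk = False
            n = len(self.window)
            total = sum(self.window)
            head_sum = 0.0
            # Try every split into an older part (n0) and a newer part (n1).
            for i, v in enumerate(self.window, start=1):
                if i == n:
                    break
                head_sum += v
                n0, n1 = i, n - i
                mean0 = head_sum / n0
                mean1 = (total - head_sum) / n1
                if abs(mean0 - mean1) > self._cut_threshold(n0, n1):
                    self.window.popleft()  # drop the oldest value, re-check
                    changed = shrunk = True
                    break
        return changed

    def mean(self):
        return sum(self.window) / len(self.window) if self.window else 0.0
```

Feeding such a window a run of 0.0 values followed by a run of 1.0 values should leave it holding mostly recent values, so `mean()` tracks the new level. A learner could read its statistics from the window in place of a plain counter, which is the substitution the blurb describes.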
Machine Learning for Data Streams

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework. Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations. The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.
Adaptivity in Data Stream Mining

In recent years data streams have become a ubiquitous source of information, and stream mining has emerged as a new field in database research. Because data streams are inherently dynamic, stream mining algorithms benefit from adapting to changes in the properties of a data stream. In addition, when stream mining is done in a dynamic environment such as a data stream management system or a sensor network, stream mining algorithms also profit from adapting to the changing conditions in this environment. This work investigates two kinds of adaptivity in data stream mining. First, a model for quality-driven, resource-adaptive stream mining is developed. The model is applied to stream mining algorithms so that they use available resources efficiently to achieve mining results of the highest possible quality. Every stream mining algorithm is unique in its parameters, quality measures, and resource consumption patterns. We generalize these characteristics and develop a model that captures the interactions and correlations between the variables involved in the stream mining process. We then express resource-adaptive stream mining as a multiobjective optimization problem and use its solution to tune the input parameters of stream mining algorithms, which results in high-quality mining and optimal resource utilization. The second topic investigated in this work is feature-adaptive stream mining, which is concerned with adjusting the focus of the mining process to interesting features detected in the data stream. This research is motivated by the need to efficiently detect environmental phenomena from sensor data streams. We propose methods to detect and predict heterogeneous outlier regions, which represent areas of environmental phenomena of different intensities. With the help of predictions about the location and size of outlier regions, the sampling rate of individual sensors is adapted so that sensors in the vicinity of environmental phenomena obtain new measurements more frequently than other sensors in the network, allowing precise and timely region tracking. The research in this work advances the state of the art in data stream mining by making stream mining algorithms better able to adapt to changes in the data stream and the mining environment.
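
The sampling-rate adaptation described above can be illustrated with a small Python sketch. It is a hypothetical illustration, not the method from the work itself: the function name `sampling_interval`, the circular region model, the linear falloff, and all default values are assumptions. Sensors inside a predicted outlier region sample at the shortest interval, and the interval grows with distance until it reaches the network default.

```python
import math


def sampling_interval(sensor_xy, region_center, region_radius,
                      min_interval=1.0, max_interval=60.0, falloff=2.0):
    """Return a sampling interval (seconds) for one sensor.

    Assumed model: the predicted outlier region is a circle. Sensors
    inside it use min_interval; beyond falloff * region_radius they use
    max_interval; in between, the interval is interpolated linearly.
    """
    dist = math.dist(sensor_xy, region_center)
    if dist <= region_radius:
        return min_interval
    outer = region_radius * falloff
    if dist >= outer:
        return max_interval
    frac = (dist - region_radius) / (outer - region_radius)
    return min_interval + frac * (max_interval - min_interval)


# Example: three sensors and a predicted region of radius 10 at the origin.
sensors = {"s1": (0.0, 0.0), "s2": (15.0, 0.0), "s3": (100.0, 0.0)}
intervals = {name: sampling_interval(xy, (0.0, 0.0), 10.0)
             for name, xy in sensors.items()}
```

A shorter interval means more frequent measurements, so under these assumptions `s1` samples most often and `s3` least often. A real deployment would recompute the intervals whenever the region prediction is updated.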