Applied Data Science Using PySpark



Applied Data Science Using PySpark


Author: Ramcharan Kakarla

Language: en

Publisher: Springer Nature

Release Date: 2024-12-01


This comprehensive guide, featuring hand-picked examples of daily use cases, will walk you through the end-to-end predictive model-building cycle using the latest techniques and industry tricks. In Chapters 1, 2, and 3, we will begin by setting up the environment and covering the basics of PySpark, focusing on data manipulation. Chapter 4 delves into the art of variable selection, demonstrating various techniques available in PySpark. In Chapters 5, 6, and 7, we explore machine learning algorithms, their implementations, and fine-tuning techniques. Chapters 8 and 9 will guide you through machine learning pipelines and various methods to operationalize and serve models using Docker/API. Chapter 10 will demonstrate how to unlock the power of predictive models to create a meaningful impact on your business. Chapter 11 introduces some of the most widely used and powerful modeling frameworks to unlock real value from data.

In this new edition, you will learn predictive modeling frameworks that can quantify customer lifetime values and estimate the return on your predictive modeling investments. This edition also includes methods to measure engagement and identify actionable populations for effective churn treatments. Additionally, a dedicated chapter on experimentation design has been added, covering steps to efficiently design, conduct, test, and measure the results of your models. All code examples have been updated to reflect the latest stable version of Spark.

You will:
- Gain an overview of end-to-end predictive model building
- Understand multiple variable selection techniques and their implementations
- Learn how to operationalize models
- Perform data science experiments and learn useful tips

Machine Learning with PySpark


Author: Pramod Singh

Language: en

Publisher: Apress

Release Date: 2018-12-14


Build machine learning models, natural language processing applications, and recommender systems with PySpark to solve various business challenges. This book starts with the fundamentals of Spark and its evolution and then covers the entire spectrum of traditional machine learning algorithms along with natural language processing and recommender systems using PySpark.

Machine Learning with PySpark shows you how to build supervised machine learning models such as linear regression, logistic regression, decision trees, and random forest. You’ll also see unsupervised machine learning models such as K-means and hierarchical clustering. A major portion of the book focuses on feature engineering to create useful features with PySpark to train the machine learning models. The natural language processing section covers text processing, text mining, and embedding for classification.

After reading this book, you will understand how to use PySpark’s machine learning library to build and train various machine learning models. Additionally, you’ll become comfortable with related PySpark components, such as data ingestion, data processing, and data analysis, that you can use to develop data-driven intelligent applications.

What You Will Learn
- Build a spectrum of supervised and unsupervised machine learning algorithms
- Implement machine learning algorithms with Spark MLlib libraries
- Develop a recommender system with Spark MLlib libraries
- Handle issues related to feature engineering, class balance, bias and variance, and cross validation for building an optimal fit model

Who This Book Is For
Data science and machine learning professionals.

Applied Data Science in Tourism


Author: Roman Egger

Language: en

Publisher: Springer Nature

Release Date: 2022-01-31


Access to large data sets has led to a paradigm shift in the tourism research landscape. Big data is enabling a new form of knowledge gain, while at the same time shaking the epistemological foundations and requiring new methods and analysis approaches. It allows for interdisciplinary cooperation between computer sciences and social and economic sciences, and complements the traditional research approaches.

This book provides a broad basis for the practical application of data science approaches such as machine learning, text mining, social network analysis, and many more, which are essential for interdisciplinary tourism research. Each method is presented in principle and viewed analytically; its advantages and disadvantages are weighed up, and typical fields of application are described. The correct methodical application is presented with a "how-to" approach, together with code examples, making the book accessible to a wider reader base including researchers, practitioners, and students entering the field.

"The book is a very well-structured introduction to data science – not only in tourism – and its methodological foundations, accompanied by well-chosen practical cases. It underlines an important insight: data are only representations of reality, you need methodological skills and domain background to derive knowledge from them."
- Hannes Werthner, Vienna University of Technology

"Roman Egger has accomplished a difficult but necessary task: make clear how data science can practically support and foster travel and tourism research and applications. The book offers a well-taught collection of chapters giving a comprehensive and deep account of AI and data science for tourism."
- Francesco Ricci, Free University of Bozen-Bolzano

"This well-structured and easy-to-read book provides a comprehensive overview of data science in tourism. It contributes largely to the methodological repository beyond traditional methods."
- Rob Law, University of Macau