Data Engineering For Ai

Download Data Engineering For Ai PDF/ePub or read online books in Mobi eBooks. Click Download or Read Online button to get Data Engineering For Ai book now. This website allows unlimited access to, at the time of writing, more than 1.5 million titles, including hundreds of thousands of titles in various foreign languages.
Advances in Artificial Intelligence and Data Engineering

This book presents selected peer-reviewed papers from the International Conference on Artificial Intelligence and Data Engineering (AIDE 2019). The topics covered are broadly divided into four groups: artificial intelligence, machine vision and robotics, ambient intelligence, and data engineering. The book discusses recent technological advances in the emerging fields of artificial intelligence, machine learning, robotics, virtual reality, augmented reality, bioinformatics, intelligent systems, cognitive systems, computational intelligence, neural networks, evolutionary computation, speech processing, Internet of Things, big data challenges, data mining, information retrieval, and natural language processing. Given its scope, this book can be useful for students, researchers, and professionals interested in the growing applications of artificial intelligence and data engineering.
Data Engineering with Apache Spark, Delta Lake, and Lakehouse

Author: Manoj Kukreja
language: en
Publisher: Packt Publishing Ltd
Release Date: 2021-10-22
Understand the complexities of modern-day data engineering platforms and explore strategies to deal with them with the help of use case scenarios led by an industry expert in big data Key FeaturesBecome well-versed with the core concepts of Apache Spark and Delta Lake for building data platformsLearn how to ingest, process, and analyze data that can be later used for training machine learning modelsUnderstand how to operationalize data models in production using curated dataBook Description In the world of ever-changing data and schemas, it is important to build data pipelines that can auto-adjust to changes. This book will help you build scalable data platforms that managers, data scientists, and data analysts can rely on. Starting with an introduction to data engineering, along with its key concepts and architectures, this book will show you how to use Microsoft Azure Cloud services effectively for data engineering. You'll cover data lake design patterns and the different stages through which the data needs to flow in a typical data lake. Once you've explored the main features of Delta Lake to build data lakes with fast performance and governance in mind, you'll advance to implementing the lambda architecture using Delta Lake. Packed with practical examples and code snippets, this book takes you through real-world examples based on production scenarios faced by the author in his 10 years of experience working with big data. Finally, you'll cover data lake deployment strategies that play an important role in provisioning the cloud resources and deploying the data pipelines in a repeatable and continuous way. By the end of this data engineering book, you'll know how to effectively deal with ever-changing data and create scalable data pipelines to streamline data science, ML, and artificial intelligence (AI) tasks. What you will learnDiscover the challenges you may face in the data engineering worldAdd ACID transactions to Apache Spark using Delta LakeUnderstand effective design strategies to build enterprise-grade data lakesExplore architectural and design patterns for building efficient data ingestion pipelinesOrchestrate a data pipeline for preprocessing data using Apache Spark and Delta Lake APIsAutomate deployment and monitoring of data pipelines in productionGet to grips with securing, monitoring, and managing data pipelines models efficientlyWho this book is for This book is for aspiring data engineers and data analysts who are new to the world of data engineering and are looking for a practical guide to building scalable data platforms. If you already work with PySpark and want to use Delta Lake for data engineering, you'll find this book useful. Basic knowledge of Python, Spark, and SQL is expected.
97 Things Every Data Engineer Should Know

Author: Tobias Macey
language: en
Publisher: "O'Reilly Media, Inc."
Release Date: 2021-06-11
Take advantage of the sky-high demand for data engineers today. With this in-depth book, current and aspiring engineers will learn powerful, real-world best practices for managing data big and small. Contributors from Google, Microsoft, IBM, Facebook, Databricks, and GitHub share their experiences and lessons learned for overcoming a variety of specific and often nagging challenges. Edited by Tobias Macey from MIT Open Learning, this book presents 97 concise and useful tips for cleaning, prepping, wrangling, storing, processing, and ingesting data. Data engineers, data architects, data team managers, data scientists, machine learning engineers, and software engineers will greatly benefit from the wisdom and experience of their peers. Projects include: Building pipelines Stream processing Data privacy and security Data governance and lineage Data storage and architecture Ecosystem of modern tools Data team makeup and culture Career advice.