Extreme Scale Computing

Extreme-Scale Computing

Author: Raymond J. Spiteri
language: en
Publisher: Springer Nature
Release Date: 2025-09-02
Scientific computing is essential for tackling complex problems across many domains, but how can scientists develop high-performance, high-quality software that scales efficiently? This book serves as an accessible introduction to extreme-scale computing, specifically designed for domain scientists who may not have formal computer science training but need to harness the power of C++ and parallel computing for large-scale applications. The book begins by covering the fundamentals of scientific computing software management, including essential tools like Linux, Git, and CMake, before diving into a detailed exploration of C++ for extreme-scale computing. Readers familiar with languages like Python will gain the necessary skills to transition to C++ and build scalable, efficient software. Beyond basic programming, this book delves into hardware-aware computing, teaching readers how to optimize software performance by understanding the underlying architecture of modern computational systems. It then introduces parallel computing techniques, covering MPI for distributed memory parallelism, shared memory parallelism, CUDA for GPU programming, and Kokkos for performance portability. Further chapters focus on efficient I/O, debugging, and profiling, all of which address the critical challenge of performance optimization in extreme-scale computing. The book concludes with an overview of popular libraries for extreme-scale computing, equipping readers with the tools they need to solve real-world computational problems. With a balance of theory, practical applications, and illustrative case studies, this book provides domain scientists with a comprehensive roadmap to mastering extreme-scale computing and developing highly parallel and performant software.
America's Next Generation Supercomputer

Author: United States. Congress. House. Committee on Science, Space, and Technology (2011). Subcommittee on Energy
language: en
Publisher:
Release Date: 2013
Fault-Tolerance Techniques for High-Performance Computing

This timely text presents a comprehensive overview of fault-tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as ABFT. Emphasis is placed on analytical performance models. This is followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models. Features: provides a survey of resilience methods and performance models; examines the various sources of errors and faults in large-scale systems; reviews the spectrum of techniques that can be applied to design a fault-tolerant MPI; investigates different approaches to replication; discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems.