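《CUDA範例精解:通用GPU編程(影印版)》 (CUDA by Example: An Introduction to General-Purpose GPU Programming, reprint edition) is a book published by Tsinghua University Press in 2010. Its authors are Sanders and Kandrot (US).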
Basic information
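- Title: 《CUDA範例精解:通用GPU編程(影印版)》 (CUDA by Example: An Introduction to General-Purpose GPU Programming, reprint edition)
- Authors: Sanders and Kandrot (US)
- ISBN: 9787302239956
- Pages: 290
- List price: 39 yuan
- Publisher: Tsinghua University Press
- Publication date: 2010.10.01
- Format: 16開 (16mo)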
Synopsis
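CUDA is a computing architecture designed to help developers write parallel programs. Combined with a broad software platform, the CUDA architecture lets programmers harness the power of graphics processing units (GPUs) to build high-performance applications. GPUs have, of course, long been used to implement complex graphics and game applications. CUDA now brings this valuable resource to programmers developing applications in other fields, including science, engineering, and finance. These programmers need no knowledge of graphics programming; they only need to be able to program in a suitably extended version of the C language.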
Table of contents
Foreword
Preface
Acknowledgments
About the Authors
1 WHY CUDA? WHY NOW?
  1.1 Chapter Objectives
  1.2 The Age of Parallel Processing
    1.2.1 Central Processing Units
  1.3 The Rise of GPU Computing
    1.3.1 A Brief History of GPUs
    1.3.2 Early GPU Computing
  1.4 CUDA
    1.4.1 What Is the CUDA Architecture?
    1.4.2 Using the CUDA Architecture
  1.5 Applications of CUDA
    1.5.1 Medical Imaging
    1.5.2 Computational Fluid Dynamics
    1.5.3 Environmental Science
  1.6 Chapter Review
2 GETTING STARTED
  2.1 Chapter Objectives
  2.2 Development Environment
    2.2.1 CUDA-Enabled Graphics Processors
    2.2.2 NVIDIA Device Driver
    2.2.3 CUDA Development Toolkit
    2.2.4 Standard C Compiler
  2.3 Chapter Review
3 INTRODUCTION TO CUDA C
  3.1 Chapter Objectives
  3.2 A First Program
    3.2.1 Hello, World!
    3.2.2 A Kernel Call
    3.2.3 Passing Parameters
  3.3 Querying Devices
  3.4 Using Device Properties
  3.5 Chapter Review
4 PARALLEL PROGRAMMING IN CUDA C
  4.1 Chapter Objectives
  4.2 CUDA Parallel Programming
    4.2.1 Summing Vectors
    4.2.2 A Fun Example
  4.3 Chapter Review
5 THREAD COOPERATION
  5.1 Chapter Objectives
  5.2 Splitting Parallel Blocks
    5.2.1 Vector Sums: Redux
    5.2.2 GPU Ripple Using Threads
  5.3 Shared Memory and Synchronization
    5.3.1 Dot Product
    5.3.2 Dot Product Optimized (Incorrectly)
    5.3.3 Shared Memory Bitmap
  5.4 Chapter Review
6 CONSTANT MEMORY AND EVENTS
  6.1 Chapter Objectives
  6.2 Constant Memory
    6.2.1 Ray Tracing Introduction
    6.2.2 Ray Tracing on the GPU
    6.2.3 Ray Tracing with Constant Memory
    6.2.4 Performance with Constant Memory
  6.3 Measuring Performance with Events
    6.3.1 Measuring Ray Tracer Performance
  6.4 Chapter Review
7 TEXTURE MEMORY
  7.1 Chapter Objectives
  7.2 Texture Memory Overview
  7.3 Simulating Heat Transfer
    7.3.1 Simple Heating Model
    7.3.2 Computing Temperature Updates
    7.3.3 Animating the Simulation
    7.3.4 Using Texture Memory
    7.3.5 Using Two-Dimensional Texture Memory
  7.4 Chapter Review
8 GRAPHICS INTEROPERABILITY
  8.1 Chapter Objectives
  8.2 Graphics Interoperation
  8.3 GPU Ripple with Graphics Interoperability
    8.3.1 The GPUAnimBitmap Structure
    8.3.2 GPU Ripple Redux
  8.4 Heat Transfer with Graphics Interop
  8.5 DirectX Interoperability
  8.6 Chapter Review
9 ATOMICS
  9.1 Chapter Objectives
  9.2 Compute Capability
    9.2.1 The Compute Capability of NVIDIA GPUs
    9.2.2 Compiling for a Minimum Compute Capability
  9.3 Atomic Operations Overview
  9.4 Computing Histograms
    9.4.1 CPU Histogram Computation
    9.4.2 GPU Histogram Computation
  9.5 Chapter Review
10 STREAMS
  10.1 Chapter Objectives
  10.2 Page-Locked Host Memory
  10.3 CUDA Streams
  10.4 Using a Single CUDA Stream
  10.5 Using Multiple CUDA Streams
  10.6 GPU Work Scheduling
  10.7 Using Multiple CUDA Streams Effectively
  10.8 Chapter Review
11 CUDA C ON MULTIPLE GPUS
  11.1 Chapter Objectives
  11.2 Zero-Copy Host Memory
    11.2.1 Zero-Copy Dot Product
    11.2.2 Zero-Copy Performance
  11.3 Using Multiple GPUs
  11.4 Portable Pinned Memory
  11.5 Chapter Review
12 THE FINAL COUNTDOWN
  12.1 Chapter Objectives
  12.2 CUDA Tools
    12.2.1 CUDA Toolkit
    12.2.2 CUFFT
    12.2.3 CUBLAS
    12.2.4 NVIDIA GPU Computing SDK
    12.2.5 NVIDIA Performance Primitives
    12.2.6 Debugging CUDA C
    12.2.7 CUDA Visual Profiler
  12.3 Written Resources
    12.3.1 Programming Massively Parallel Processors: A Hands-On Approach
    12.3.2 CUDA U
    12.3.3 NVIDIA Forums
  12.4 Code Resources
    12.4.1 CUDA Data Parallel Primitives Library
    12.4.2 CULAtools
    12.4.3 Language Wrappers
  12.5 Chapter Review
A ADVANCED ATOMICS
  A.1 Dot Product Revisited
    A.1.1 Atomic Locks
    A.1.2 Dot Product Redux: Atomic Locks
  A.2 Implementing a Hash Table
    A.2.1 Hash Table Overview
    A.2.2 A CPU Hash Table
    A.2.3 Multithreaded Hash Table
    A.2.4 A GPU Hash Table
    A.2.5 Hash Table Performance
  A.3 Appendix Review
Index