Parallel Computer Architecture (English Edition, 2nd Edition)

Parallel Computer Architecture (English Edition, 2nd Edition) is a book by David E. Culler, published by China Machine Press (機械工業出版社).

Basic Information

  • Author: David E. Culler
  • Publication date: September 1999
  • Publisher: China Machine Press (機械工業出版社)
  • Pages: 1025
  • ISBN: 9787111074403
  • Price: 88.00 yuan
  • Series: Computer Science Series (計算機科學叢書)

Summary

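The most exciting development in parallel computer architecture today is the convergence of traditionally disparate approaches to parallel implementation. Against this technical background, the book uses numerous examples, precise data, and the author's deep understanding of parallel architecture to reveal the power inherent in parallel architectures, and for the first time offers a thorough quantitative evaluation of design trade-offs. Drawing on hardware and software techniques, it examines in depth a number of major issues in parallel architecture design. The book distills the knowledge and experience of many experts. It is an authoritative text for students, researchers, and engineers, and a classic contribution to the field of parallel architecture.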

Table of Contents

Contents
Foreword
Preface
1 Introduction
1.1 Why Parallel Architecture
1.2 Convergence of Parallel Architectures
1.3 Fundamental Design Issues
1.4 Concluding Remarks
1.5 Historical References
1.6 Exercises
2 Parallel Programs
2.1 Parallel Application Case Studies
2.2 The Parallelization Process
2.3 Parallelization of an Example Program
2.4 Concluding Remarks
2.5 Exercises
3 Programming for Performance
3.1 Partitioning for Performance
3.2 Data Access and Communication in a Multimemory System
3.3 Orchestration for Performance
3.4 Performance Factors from the Processor's Perspective
3.5 The Parallel Application Case Studies: An In-Depth Look
3.6 Implications for Programming Models
3.7 Concluding Remarks
3.8 Exercises
4 Workload-Driven Evaluation
4.1 Scaling Workloads and Machines
4.2 Evaluating a Real Machine
4.3 Evaluating an Architectural Idea or Trade-off
4.4 Illustrating Workload Characterization
4.5 Concluding Remarks
4.6 Exercises
5 Shared Memory Multiprocessors
5.1 Cache Coherence
5.2 Memory Consistency
5.3 Design Space for Snooping Protocols
5.4 Assessing Protocol Design Trade-offs
5.5 Synchronization
5.6 Implications for Software
5.7 Concluding Remarks
5.8 Exercises
6 Snoop-Based Multiprocessor Design
6.1 Correctness Requirements
6.2 Base Design: Single-Level Caches with an Atomic Bus
6.3 Multilevel Cache Hierarchies
6.4 Split-Transaction Bus
6.5 Case Studies: SGI Challenge and Sun Enterprise
6.6 Extending Cache Coherence
6.7 Concluding Remarks
6.8 Exercises
7 Scalable Multiprocessors
7.1 Scalability
7.2 Realizing Programming Models
7.3 Physical DMA
7.4 User-Level Access
7.5 Dedicated Message Processing
7.6 Shared Physical Address Space
7.7 Clusters and Networks of Workstations
7.8 Implications for Parallel Software
7.9 Synchronization
7.10 Concluding Remarks
7.11 Exercises
8 Directory-Based Cache Coherence
8.1 Scalable Cache Coherence
8.2 Overview of Directory-Based Approaches
8.3 Assessing Directory Protocols and Trade-Offs
8.4 Design Challenges for Directory Protocols
8.5 Memory-Based Directory Protocols: The SGI Origin System
8.6 Cache-Based Directory Protocols: The Sequent NUMA-Q
8.7 Performance Parameters and Protocol Performance
8.8 Synchronization
8.9 Implications for Parallel Software
8.10 Advanced Topics
8.11 Concluding Remarks
8.12 Exercises
9 Hardware/Software Trade-Offs
9.1 Relaxed Memory Consistency Models
9.2 Overcoming Capacity Limitations
9.3 Reducing Hardware Cost
9.4 Putting It All Together: A Taxonomy and Simple COMA
9.5 Implications for Parallel Software
9.6 Advanced Topics
9.7 Concluding Remarks
9.8 Exercises
10 Interconnection Network Design
10.1 Basic Definitions
10.2 Basic Communication Performance
10.3 Organizational Structure
10.4 Interconnection Topologies
10.5 Evaluating Design Trade-Offs in Network Topology
10.6 Routing
10.7 Switch Design
10.8 Flow Control
10.9 Case Studies
10.10 Concluding Remarks
10.11 Exercises
11 Latency Tolerance
11.1 Overview of Latency Tolerance
11.2 Latency Tolerance in Explicit Message Passing
11.3 Latency Tolerance in a Shared Address Space
11.4 Block Data Transfer in a Shared Address Space
11.5 Proceeding Past Long-Latency Events
11.6 Precommunication in a Shared Address Space
11.7 Multithreading in a Shared Address Space
11.8 Lockup-Free Cache Design
11.9 Concluding Remarks
11.10 Exercises
12 Future Directions
12.1 Technology and Architecture
12.2 Applications and System Software
Appendix: Parallel Benchmark Suites
