BPF Performance Tools (English Edition): Insight into Linux System and Application Performance

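"BPF Performance Tools (English Edition): Insight into Linux System and Application Performance" was published by Publishing House of Electronics Industry in January 2021. The author is Brendan Gregg (USA).

As a pioneer of and expert in BPF technology, Brendan Gregg presents more than 150 ready-to-use analysis and debugging tools in this book, discusses the scenarios in which they apply, and provides a step-by-step guide to developing custom tools. Readers learn how to analyze CPUs, memory, storage devices, file systems, networking, programming languages, applications, containers, hypervisors, security, and the kernel. Gregg leads readers from basic tools to advanced ones, helping them gather more useful and deeper technical information.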

Basic Information

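  • Title: BPF Performance Tools (English Edition): Insight into Linux System and Application Performance
  • Author: Brendan Gregg (USA)
  • Publication date: January 2021
  • Publisher: Publishing House of Electronics Industry
  • Pages: 840
  • ISBN: 9787121386947
  • Price: 219.00 yuan
  • Format: 16mo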

Content Summary

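As a comprehensive introduction to BPF technology, "BPF Performance Tools (English Edition): Insight into Linux System and Application Performance" covers everything from the origins of BPF to its future direction. It systematically introduces the BPF programming model, gives a complete account of the two main BPF front-end programming frameworks, BCC and bpftrace, and provides a series of implementation examples that demonstrate the practical capabilities and future prospects of BPF.
The book's other focus is the tuning of Linux system performance and application performance. It covers strategies, tools, and practical case studies for system performance tuning. It introduces the relevant BPF tools and also emphasizes how these tools can be used together with traditional Linux performance tools, so that readers can choose the best approach.
The tools presented in the book are small and polished, and they come with simple, readable source code. They fully demonstrate the appeal of BPF technology: safe, efficient, and fast extension of the system. BPF will be used in more and more scenarios in Linux, and those uses will become increasingly important. The book is intended to help readers as they learn BPF and follow its development.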

Table of Contents

Part I: Technologies
1 Introduction
1.1 What Are BPF and eBPF?
1.2 What Are Tracing, Snooping, Sampling, Profiling, and Observability?
1.3 What Are BCC, bpftrace, and IO Visor?
1.4 A First Look at BCC: Quick Wins
1.5 BPF Tracing Visibility
1.6 Dynamic Instrumentation: kprobes and uprobes
1.7 Static Instrumentation: Tracepoints and USDT
1.8 A First Look at bpftrace: Tracing open()
1.9 Back to BCC: Tracing open()
1.10 Summary
2 Technology Background
2.1 BPF Illustrated
2.2 BPF
2.3 Extended BPF (eBPF)
2.3.1 Why Performance Tools Need BPF
2.3.2 BPF Versus Kernel Modules
2.3.3 Writing BPF Programs
2.3.4 Viewing BPF Instructions: bpftool
2.3.5 Viewing BPF Instructions: bpftrace
2.3.6 BPF API
2.3.7 BPF Concurrency Controls
2.3.8 BPF sysfs Interface
2.3.9 BPF Type Format (BTF)
2.3.10 BPF CO-RE
2.3.11 BPF Limitations
2.3.12 BPF Additional Reading
2.4 Stack Trace Walking
2.4.1 Frame Pointer–Based Stacks
2.4.2 debuginfo
2.4.3 Last Branch Record (LBR)
2.4.4 ORC
2.4.5 Symbols
2.4.6 More Reading
2.5 Flame Graphs
2.5.1 Stack Trace
2.5.2 Profiling Stack Traces
2.5.3 Flame Graph
2.5.4 Flame Graph Features
2.5.5 Variations
2.6 Event Sources
2.7 kprobes
2.7.1 How kprobes Work
2.7.2 kprobes Interfaces
2.7.3 BPF and kprobes
2.7.4 kprobes Additional Reading
2.8 uprobes
2.8.1 How uprobes Work
2.8.2 uprobes Interfaces
2.8.3 BPF and uprobes
2.8.4 uprobes Overhead and Future Work
2.8.5 uprobes Additional Reading
2.9 Tracepoints
2.9.1 Adding Tracepoint Instrumentation
2.9.2 How Tracepoints Work
2.9.3 Tracepoint Interfaces
2.9.4 Tracepoints and BPF
2.9.5 BPF Raw Tracepoints
2.9.6 Additional Reading
2.10 USDT
2.10.1 Adding USDT Instrumentation
2.10.2 How USDT Works
2.10.3 BPF and USDT
2.10.4 USDT Additional Reading
2.11 Dynamic USDT
2.12 PMCs
2.12.1 PMC Modes
2.12.2 PEBS
2.12.3 Cloud Computing
2.13 perf_events
2.14 Summary
3 Performance Analysis
3.1 Overview
3.1.1 Goals
3.1.2 Activities
3.1.3 Multiple Performance Issues
3.2 Performance Methodologies
3.2.1 Workload Characterization
3.2.2 Drill-Down Analysis
3.2.3 USE Method
3.2.4 Checklists
3.3 Linux 60-Second Analysis
3.3.1 uptime
3.3.2 dmesg | tail
3.3.3 vmstat 1
3.3.4 mpstat -P ALL 1
3.3.5 pidstat 1
3.3.6 iostat -xz 1
3.3.7 free -m
3.3.8 sar -n DEV 1
3.3.9 sar -n TCP,ETCP 1
3.3.10 top
3.4 BCC Tool Checklist
3.4.1 execsnoop
3.4.2 opensnoop
3.4.3 ext4slower
3.4.4 biolatency
3.4.5 biosnoop
3.4.6 cachestat
3.4.7 tcpconnect
3.4.8 tcpaccept
3.4.9 tcpretrans
3.4.10 runqlat
3.4.11 profile
3.5 Summary
4 BCC
4.1 BCC Components
4.2 BCC Features
4.2.1 Kernel-Level Features
4.2.2 BCC User-Level Features
4.3 BCC Installation
4.3.1 Kernel Requirements
4.3.2 Ubuntu
4.3.3 RHEL
4.3.4 Other Distributions
4.4 BCC Tools
4.4.1 Highlighted Tools
4.4.2 Tool Characteristics
4.4.3 Single-Purpose Tools
4.4.4 Multi-Purpose Tools
4.5 funccount
4.5.1 funccount Examples
4.5.2 funccount Syntax
4.5.3 funccount One-Liners
4.5.4 funccount Usage
4.6 stackcount
4.6.1 stackcount Example
4.6.2 stackcount Flame Graphs
4.6.3 stackcount Broken Stack Traces
4.6.4 stackcount Syntax
4.6.5 stackcount One-Liners
4.6.6 stackcount Usage
4.7 trace
4.7.1 trace Example
4.7.2 trace Syntax
4.7.3 trace One-Liners
4.7.4 trace Structs
4.7.5 trace Debugging File Descriptor Leaks
4.7.6 trace Usage
4.8 argdist
4.8.1 argdist Syntax
4.8.2 argdist One-Liners
4.8.3 argdist Usage
4.9 Tool Documentation
4.9.1 Man Page: opensnoop
4.9.2 Examples File: opensnoop
4.10 Developing BCC Tools
4.11 BCC Internals
4.12 BCC Debugging
4.12.1 printf() Debugging
4.12.2 BCC Debug Output
4.12.3 BCC Debug Flag
4.12.4 bpflist
4.12.5 bpftool
4.12.6 dmesg
4.12.7 Resetting Events
4.13 Summary
5 bpftrace
5.1 bpftrace Components
5.2 bpftrace Features
5.2.1 bpftrace Event Sources
5.2.2 bpftrace Actions
5.2.3 bpftrace General Features
5.2.4 bpftrace Compared to Other Observability Tools
5.3 bpftrace Installation
5.3.1 Kernel Requirements
5.3.2 Ubuntu
5.3.3 Fedora
5.3.4 Post-Build Steps
5.3.5 Other Distributions
5.4 bpftrace Tools
5.4.1 Highlighted Tools
5.4.2 Tool Characteristics
5.4.3 Tool Execution
5.5 bpftrace One-Liners
5.6 bpftrace Documentation
5.7 bpftrace Programming
5.7.1 Usage
5.7.2 Program Structure
5.7.3 Comments
5.7.4 Probe Format
5.7.5 Probe Wildcards
5.7.6 Filters
5.7.7 Actions
5.7.8 Hello, World!
5.7.9 Functions
5.7.10 Variables
5.7.11 Map Functions
5.7.12 Timing vfs_read()
5.8 bpftrace Usage
5.9 bpftrace Probe Types
5.9.1 tracepoint
5.9.2 usdt
5.9.3 kprobe and kretprobe
5.9.4 uprobe and uretprobe
5.9.5 software and hardware
5.9.6 profile and interval
5.10 bpftrace Flow Control
5.10.1 Filter
5.10.2 Ternary Operators
5.10.3 If Statements
5.10.4 Unrolled Loops
5.11 bpftrace Operators
5.12 bpftrace Variables
5.12.1 Built-in Variables
5.12.2 Built-ins: pid, comm, and uid
5.12.3 Built-ins: kstack and ustack
5.12.4 Built-ins: Positional Parameters
5.12.5 Scratch
5.12.6 Maps
5.13 bpftrace Functions
5.13.1 printf()
5.13.2 join()
5.13.3 str()
5.13.4 kstack() and ustack()
5.13.5 ksym() and usym()
5.13.6 kaddr() and uaddr()
5.13.7 system()
5.13.8 exit()
5.14 bpftrace Map Functions
5.14.1 count()
5.14.2 sum(), avg(), min(), and max()
5.14.3 hist()
5.14.4 lhist()
5.14.5 delete()
5.14.6 clear() and zero()
5.14.7 print()
5.15 bpftrace Future Work
5.15.1 Explicit Address Modes
5.15.2 Other Additions
5.15.3 ply
5.16 bpftrace Internals
5.17 bpftrace Debugging
5.17.1 printf() Debugging
5.17.2 Debug Mode
5.17.3 Verbose Mode
5.18 Summary
Part II: Using BPF Tools
6 CPUs
6.1 Background
6.1.1 CPU Fundamentals
6.1.2 BPF Capabilities
6.1.3 Strategy
6.2 Traditional Tools
6.2.1 Kernel Statistics
6.2.2 Hardware Statistics
6.2.3 Hardware Sampling
6.2.4 Timed Sampling
6.2.5 Event Statistics and Tracing
6.3 BPF Tools
6.3.1 execsnoop
6.3.2 exitsnoop
6.3.3 runqlat
6.3.4 runqlen
6.3.5 runqslower
6.3.6 cpudist
6.3.7 cpufreq
6.3.8 profile
6.3.9 offcputime
6.3.10 syscount
6.3.11 argdist and trace
6.3.12 funccount
6.3.13 softirqs
6.3.14 hardirqs
6.3.15 smpcalls
6.3.16 llcstat
6.3.17 Other Tools
6.4 BPF One-Liners
6.4.1 BCC
6.4.2 bpftrace
6.5 Optional Exercises
6.6 Summary
7 Memory
7.1 Background
7.1.1 Memory Fundamentals
7.1.2 BPF Capabilities
7.1.3 Strategy
7.2 Traditional Tools
7.2.1 Kernel Log
7.2.2 Kernel Statistics
7.2.3 Hardware Statistics and Sampling
7.3 BPF Tools
7.3.1 oomkill
7.3.2 memleak
7.3.3 mmapsnoop
7.3.4 brkstack
7.3.5 shmsnoop
7.3.6 faults
7.3.7 ffaults
7.3.8 vmscan
7.3.9 drsnoop
7.3.10 swapin
7.3.11 hfaults
7.3.12 Other Tools
7.4 BPF One-Liners
7.4.1 BCC
7.4.2 bpftrace
7.5 Optional Exercises
7.6 Summary
8 File Systems
8.1 Background
8.1.1 File Systems Fundamentals
8.1.2 BPF Capabilities
8.1.3 Strategy
8.2 Traditional Tools
8.2.1 df
8.2.2 mount
8.2.3 strace
8.2.4 perf
8.2.5 fatrace
8.3 BPF Tools
8.3.1 opensnoop
8.3.2 statsnoop
8.3.3 syncsnoop
8.3.4 mmapfiles
8.3.5 scread
8.3.6 fmapfault
8.3.7 filelife
8.3.8 vfsstat
8.3.9 vfscount
8.3.10 vfssize
8.3.11 fsrwstat
8.3.12 fileslower
8.3.13 filetop
8.3.14 writesync
8.3.15 filetype
8.3.16 cachestat
8.3.17 writeback
8.3.18 dcstat
8.3.19 dcsnoop
8.3.20 mountsnoop
8.3.21 xfsslower
8.3.22 xfsdist
8.3.23 ext4dist
8.3.24 icstat
8.3.25 bufgrow
8.3.26 readahead
8.3.27 Other Tools
8.4 BPF One-Liners
8.4.1 BCC
8.4.2 bpftrace
8.4.3 BPF One-Liners Examples
8.5 Optional Exercises
8.6 Summary
9 Disk I/O
9.1 Background
9.1.1 Disk Fundamentals
9.1.2 BPF Capabilities
9.1.3 Strategy
9.2 Traditional Tools
9.2.1 iostat
9.2.2 perf
9.2.3 blktrace
9.2.4 SCSI Logging
9.3 BPF Tools
9.3.1 biolatency
9.3.2 biosnoop
9.3.3 biotop
9.3.4 bitesize
9.3.5 seeksize
9.3.6 biopattern
9.3.7 biostacks
9.3.8 bioerr
9.3.9 mdflush
9.3.10 iosched
9.3.11 scsilatency
9.3.12 scsiresult
9.3.13 nvmelatency
9.4 BPF One-Liners
9.4.1 BCC
9.4.2 bpftrace
9.4.3 BPF One-Liners Examples
9.5 Optional Exercises
9.6 Summary
10 Networking
10.1 Background
10.1.1 Networking Fundamentals
10.1.2 BPF Capabilities
10.1.3 Strategy
10.1.4 Common Tracing Mistakes
10.2 Traditional Tools
10.2.1 ss
10.2.2 ip
10.2.3 nstat
10.2.4 netstat
10.2.5 sar
10.2.6 nicstat
10.2.7 ethtool
10.2.8 tcpdump
10.2.9 /proc
10.3 BPF Tools
10.3.1 sockstat
10.3.2 sofamily
10.3.3 soprotocol
10.3.4 soconnect
10.3.5 soaccept
10.3.6 socketio
10.3.7 socksize
10.3.8 sormem
10.3.9 soconnlat
10.3.10 so1stbyte
10.3.11 tcpconnect
10.3.12 tcpaccept
10.3.13 tcplife
10.3.14 tcptop
10.3.15 tcpsnoop
10.3.16 tcpretrans
10.3.17 tcpsynbl
10.3.18 tcpwin
10.3.19 tcpnagle
10.3.20 udpconnect
10.3.21 gethostlatency
10.3.22 ipecn
10.3.23 superping
10.3.24 qdisc-fq
10.3.25 qdisc-cbq, qdisc-cbs, qdisc-codel, qdisc-fq_codel, qdisc-red, and qdisc-tbf
10.3.26 netsize
10.3.27 nettxlat
10.3.28 skbdrop
10.3.29 skblife
10.3.30 ieee80211scan
10.3.31 Other Tools
10.4 BPF One-Liners
10.4.1 BCC
10.4.2 bpftrace
10.4.3 BPF One-Liners Examples
10.5 Optional Exercises
10.6 Summary
11 Security
11.1 Background
11.1.1 BPF Capabilities
11.1.2 Unprivileged BPF Users
11.1.3 Configuring BPF Security
11.1.4 Strategy
11.2 BPF Tools
11.2.1 execsnoop
11.2.2 elfsnoop
11.2.3 modsnoop
11.2.4 bashreadline
11.2.5 shellsnoop
11.2.6 ttysnoop
11.2.7 opensnoop
11.2.8 eperm
11.2.9 tcpconnect and tcpaccept
11.2.10 tcpreset
11.2.11 capable
11.2.12 setuids
11.3 BPF One-Liners
11.3.1 BCC
11.3.2 bpftrace
11.3.3 BPF One-Liners Examples
11.4 Summary
12 Languages
12.1 Background
12.1.1 Compiled
12.1.2 JIT Compiled
12.1.3 Interpreted
12.1.4 BPF Capabilities
12.1.5 Strategy
12.1.6 BPF Tools
12.2 C
12.2.1 C Function Symbols
12.2.2 C Stack Traces
12.2.3 C Function Tracing
12.2.4 C Function Offset Tracing
12.2.5 C USDT
12.2.6 C One-Liners
12.3 Java
12.3.1 libjvm Tracing
12.3.2 jnistacks
12.3.3 Java Thread Names
12.3.4 Java Method Symbols
12.3.5 Java Stack Traces
12.3.6 Java USDT Probes
12.3.7 profile
12.3.8 offcputime
12.3.9 stackcount
12.3.10 javastat
12.3.11 javathreads
12.3.12 javacalls
12.3.13 javaflow
12.3.14 javagc
12.3.15 javaobjnew
12.3.16 Java One-Liners
12.4 Bash Shell
12.4.1 Function Counts
12.4.2 Function Argument Tracing (bashfunc.bt)
12.4.3 Function Latency (bashfunclat.bt)
12.4.4 /bin/bash
12.4.5 /bin/bash USDT
12.4.6 bash One-Liners
12.5 Other Languages
12.5.1 JavaScript (Node.js)
12.5.2 C++
12.5.3 Golang
12.6 Summary
13 Applications
13.1 Background
13.1.1 Application Fundamentals
13.1.2 Application Example: MySQL Server
13.1.3 BPF Capabilities
13.1.4 Strategy
13.2 BPF Tools
13.2.1 execsnoop
13.2.2 threadsnoop
13.2.3 profile
13.2.4 threaded
13.2.5 offcputime
13.2.6 offcpuhist
13.2.7 syscount
13.2.8 ioprofile
13.2.9 libc Frame Pointers
13.2.10 mysqld_qslower
13.2.11 mysqld_clat
13.2.12 signals
13.2.13 killsnoop
13.2.14 pmlock and pmheld
13.2.15 naptime
13.2.16 Other Tools
13.3 BPF One-Liners
13.3.1 BCC
13.3.2 bpftrace
13.4 BPF One-Liners Examples
13.4.1 Counting libpthread Conditional Variable Functions for One Second
13.5 Summary
14 Kernel
14.1 Background
14.1.1 Kernel Fundamentals
14.1.2 BPF Capabilities
14.2 Strategy
14.3 Traditional Tools
14.3.1 Ftrace
14.3.2 perf sched
14.3.3 slabtop
14.3.4 Other Tools
14.4 BPF Tools
14.4.1 loads
14.4.2 offcputime
14.4.3 wakeuptime
14.4.4 offwaketime
14.4.5 mlock and mheld
14.4.6 Spin Locks
14.4.7 kmem
14.4.8 kpages
14.4.9 memleak
14.4.10 slabratetop
14.4.11 numamove
14.4.12 workq
14.4.13 Tasklets
14.4.14 Other Tools
14.5 BPF One-Liners
14.5.1 BCC
14.5.2 bpftrace
14.6 BPF One-Liners Examples
14.7 Challenges
14.8 Summary
15 Containers
15.1 Background
15.1.1 BPF Capabilities
15.1.2 Challenges
15.1.3 Strategy
15.2 Traditional Tools
15.2.1 From the Host
15.2.2 From the Container
15.2.3 systemd-cgtop
15.2.4 kubectl top
15.2.5 docker stats
15.2.6 /sys/fs/cgroups
15.2.7 perf
15.3 BPF Tools
15.3.1 runqlat
15.3.2 pidnss
15.3.3 blkthrot
15.3.4 overlayfs
15.4 BPF One-Liners
15.5 Optional Exercises
15.6 Summary
16 Hypervisors
16.1 Background
16.1.1 BPF Capabilities
16.1.2 Suggested Strategies
16.2 Traditional Tools
16.3 Guest BPF Tools
16.3.1 Xen Hypercalls
16.3.2 xenhyper
16.3.3 Xen Callbacks
16.3.4 cpustolen
16.3.5 HVM Exit Tracing
16.4 Host BPF Tools
16.4.1 kvmexits
16.4.2 Future Work
16.5 Summary
Part III: Additional Topics
17 Other BPF Performance Tools
17.1 Vector and Performance Co-Pilot (PCP)
17.1.1 Visualizations
17.1.2 Visualization: Heat Maps
17.1.3 Visualization: Tabular Data
17.1.4 BCC Provided Metrics
17.1.5 Internals
17.1.6 Installing PCP and Vector
17.1.7 Connecting and Viewing Data
17.1.8 Configuring the BCC PMDA
17.1.9 Future Work
17.1.10 Further Reading
17.2 Grafana and Performance Co-Pilot (PCP)
17.2.1 Installation and Configuration
17.2.2 Connecting and Viewing Data
17.2.3 Future Work
17.2.4 Further Reading
17.3 Cloudflare eBPF Prometheus Exporter (with Grafana)
17.3.1 Build and Run the ebpf Exporter
17.3.2 Configure Prometheus to Monitor the ebpf_exporter Instance
17.3.3 Set Up a Query in Grafana
17.3.4 Further Reading
17.4 kubectl-trace
17.4.1 Tracing Nodes
17.4.2 Tracing Pods and Containers
17.4.3 Further Reading
17.5 Other Tools
17.6 Summary
18 Tips, Tricks, and Common Problems
18.1 Typical Event Frequency and Overhead
18.1.1 Frequency
18.1.2 Action Performed
18.1.3 Test Yourself
18.2 Sample at 49 or 99 Hertz
18.3 Yellow Pigs and Gray Rats
18.4 Write Target Software
18.5 Learn Syscalls
18.6 Keep It Simple
18.7 Missing Events
18.8 Missing Stack Traces
18.8.1 How to Fix Broken Stack Traces
18.9 Missing Symbols (Function Names) When Printing
18.9.1 How to Fix Missing Symbols: JIT Runtimes (Java, Node.js, ...)
18.9.2 How to Fix Missing Symbols: ELF binaries (C, C++, ...)
18.10 Missing Functions When Tracing
18.11 Feedback Loops
18.12 Dropped Events
Part IV: Appendixes
A bpftrace One-Liners
B bpftrace Cheat Sheet
C BCC Tool Development
D C BPF
E BPF Instructions
Glossary
Bibliography

About the Author

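Brendan Gregg, a senior performance engineer at Netflix, is a major contributor to BPF (eBPF). He helped develop and maintain the two main BPF front-end programming frameworks, pioneered the use of BPF for observability, and created dozens of BPF-based performance analysis tools. He is also the author of the best-selling book "Systems Performance: Enterprise and the Cloud".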