SCaLE15x: Linux 4.x Tracing: Performance Analysis with bcc/BPF
Video: https://www.youtube.com/watch?v=w8nFRoFJ6EQ
Talk about bcc/eBPF at the Southern California Linux Expo 15x (2017) by Brendan Gregg.
Description: "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Complier Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
PDF: SCALE2017_perf_analysis_eBPF.pdf
Keywords (from pdftotext):
slide 1:
Linux 4.x Tracing: Performance Analysis with bcc/BPF. Brendan Gregg, Senior Performance Architect. Mar 2017
slide 2:
Linux tracing in the last 3 years…
slide 3:
How do we use these superpowers?
slide 4:
Takeaways:
1. Understanding the value of Linux tracing superpowers
2. Upgrade to Linux 4.4+ (4.9 is better)
3. Ask for eBPF support in your perf analysis/monitoring tools
slide 5:
slide 6:
Ye Olde BPF: Berkeley Packet Filter
# tcpdump host 127.0.0.1 and port 22 -d
(000) ldh  [12]
(001) jeq  #0x800       jt 2  jf 18
(002) ld   [26]
(003) jeq  #0x7f000001  jt 6  jf 4
(004) ld   [30]
(005) jeq  #0x7f000001  jt 6  jf 18
(006) ldb  [23]
(007) jeq  #0x84        jt 10 jf 8
(008) jeq  #0x6         jt 10 jf 9
(009) jeq  #0x11        jt 10 jf 18
(010) ldh  [20]
(011) jset #0x1fff      jt 18 jf 12
(012) ldxb 4*([14]&0xf)
(013) ldh  [x + 14]
[...]
Optimizes packet filter performance. 2 x 32-bit registers & scratch memory. User-defined bytecode executed by an in-kernel sandboxed virtual machine. Steven McCanne and Van Jacobson, 1993.
slide 7:
Enhanced BPF, aka eBPF or just "BPF": 10 x 64-bit registers, maps (hashes), actions. Alexei Starovoitov, 2014+
slide 8:
Enhanced BPF Use Cases [diagram]: user-defined BPF programs for SDN configuration, DDoS mitigation, intrusion detection, container security, and observability are loaded into the kernel runtime (verifier, BPF actions) and attached to event targets: sockets, kprobes, uprobes, tracepoints, perf_events.
slide 9:
Enhanced BPF is in Linux
slide 10:
Demo
slide 11:
New Observability Tools
• Efficient, production safe, useful metrics:
# biolatency -mT 1
Tracing block device I/O... Hit Ctrl-C to end.
06:20:16
msecs        : count  distribution
0 -> 1       : 36     |**************************************|
2 -> 3       : 1      |*                                      |
4 -> 7       : 3      |***                                    |
8 -> 15      : 17     |*****************                      |
16 -> 31     : 33     |**********************************     |
32 -> 63     : 7      |*******                                |
64 -> 127    : 6      |******                                 |
[…]
These CLI tools may be useful even if you never use them, as examples of what to implement in GUIs
slide 12:
New Visualizations and GUIs. E.g., Netflix self-service UI: Flame Graphs, Tracing Reports. Should be open sourced; you may also build/buy your own.
slide 13:
Introducing enhanced BPF: BPF TRACING
slide 14:
A Linux Tracing Timeline
- 1990s: static tracers, prototype dynamic tracers
- 2000: LTT + DProbes (dynamic tracing; not integrated)
- 2004: kprobes (2.6.9)
- 2005: DTrace (not Linux), SystemTap (out-of-tree)
- 2008: ftrace (2.6.27)
- 2009: perf_events (2.6.31)
- 2009: tracepoints (2.6.32)
- 2010-2016: ftrace & perf_events enhancements
- 2012: uprobes (3.5)
- 2014-2017: enhanced BPF patches: supporting tracing events
- 2016-2017: ftrace hist triggers
- also: LTTng, ktap, sysdig, ...
slide 15:
Linux Events & BPF Support [diagram: the version in which BPF support arrived for each event source]: kprobes (Linux 4.1), uprobes (Linux 4.3), BPF output (Linux 4.4), BPF stacks (Linux 4.6), tracepoints (Linux 4.7), sampling and PMCs (Linux 4.9).
slide 16:
Event Tracing Efficiency. E.g., tracing TCP retransmits.
Old way: packet capture. Kernel send/receive feeds tcpdump (1. read, 2. dump), which buffers via the file system and disks to an analyzer (1. read, 2. process, 3. print).
New way: dynamic tracing. A tracer (1. configure, 2. read) instruments tcp_retransmit_skb() directly.
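The "new way" can be sketched with bcc's trace multi-tool (introduced on slide 44); tcpretrans, on slide 38, is the purpose-built tool for this:

# trace -K tcp_retransmit_skb      # print each call, with a kernel stack trace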
slide 17:
BPF Tracing Internals [diagram]: an observability program loads BPF bytecode (BPF program + event config) into the kernel through the verifier, then attaches it to static tracing (tracepoints), dynamic tracing (kprobes, uprobes), and sampling/PMCs (perf_events); per-event data is returned via async copy, statistics via maps.
slide 18:
Introducing bcc: BPF COMPILER COLLECTION
slide 19:
bcc
• BPF Compiler Collection
  – https://github.com/iovisor/bcc
  – Lead developer: Brenden Blanco
• Includes tracing tools
• Provides BPF front-ends: Python, Lua, C++, C helper libraries, golang (gobpf)
[diagram: tracing layers; a bcc tool sits on the bcc Python/lua front-ends in user space, over BPF and kernel events in kernel space]
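As a sketch of the Python front-end, here is essentially bcc's own examples/hello_world.py ("kprobe__" auto-attaches a kprobe, as slide 55 notes):

from bcc import BPF

# C BPF program: fire on the sys_clone kprobe and emit a message
prog = """
int kprobe__sys_clone(void *ctx) {
    bpf_trace_printk("Hello, World!\\n");
    return 0;
}
"""

# compile, load, and attach; then echo the kernel's trace pipe
BPF(text=prog).trace_print()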
slide 20:
Raw BPF: samples/bpf/sock_example.c (87 lines, truncated)
slide 21:
C/BPF: samples/bpf/tracex1_kern.c (58 lines, truncated)
slide 22:
bcc/BPF (C & Python): bcc examples/tracing/bitehist.py (entire program)
slide 23:
ply/BPF: https://github.com/iovisor/ply/blob/master/README.md (entire program)
slide 24:
The Tracing Landscape, Mar 2017 (my opinion) [chart: x-axis "Stage of Development", alpha → mature; y-axis "Ease of use", brutal → less brutal; bubble size "Scope & Capability"; plotted: sysdig, perf, ftrace (hist triggers; recent changes), LTTng, ktap, stap, dtrace4L., ply/BPF, bcc/BPF, C/BPF, raw BPF, and more]
slide 25:
For end-users: PERFORMANCE ANALYSIS WITH BCC/BPF
slide 26:
Pre-BPF: Linux Perf Analysis in 60s
1. uptime
2. dmesg -T | tail
3. vmstat 1
4. mpstat -P ALL 1
5. pidstat 1
6. iostat -xz 1
7. free -m
8. sar -n DEV 1
9. sar -n TCP,ETCP 1
10. top
http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html
slide 27:
bcc Installation
• https://github.com/iovisor/bcc/blob/master/INSTALL.md
• e.g., Ubuntu Xenial:
# echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" |\
  sudo tee /etc/apt/sources.list.d/iovisor.list
# sudo apt-get update
# sudo apt-get install bcc-tools
  – Also available as an Ubuntu snap
  – Ubuntu 16.04 is good, 16.10 better: more tools work
• Installs many tools
  – In /usr/share/bcc/tools, and …/tools/old for older kernels
slide 28:
bcc tools
slide 29:
bcc General Performance Checklist:
1. execsnoop
2. opensnoop
3. ext4slower (…)
4. biolatency
5. biosnoop
6. cachestat
7. tcpconnect
8. tcpaccept
9. tcpretrans
10. gethostlatency
11. runqlat
12. profile
slide 30:
1. execsnoop
• Trace new processes and debug short-lived process issues:
# execsnoop
PCOMM    PID  RET ARGS
bash     …    0   /usr/bin/man ls
preconv  …    0   /usr/bin/preconv -e UTF-8
man      …    0   /usr/bin/tbl
man      …    0   /usr/bin/nroff -mandoc -rLL=169n -rLT=169n -Tutf8
man      …    0   /usr/bin/pager -s
nroff    …    0   /usr/bin/locale charmap
nroff    …    0   /usr/bin/groff -mtty-char -Tutf8 -mandoc -rLL=169n ...
groff    …    0   /usr/bin/troff -mtty-char -mandoc -rLL=169n -rLT=16...
groff    …    0   /usr/bin/grotty
[…]
Efficient: only traces exec()
slide 31:
2. opensnoop
• Find config, log, and data files, and inefficient file usage:
# opensnoop
PID    COMM          FD ERR PATH
27159  catalina.sh   …  0   /apps/tomcat8/bin/setclasspath.sh
…      redis-server  …  0   /proc/4057/stat
…      redis-server  …  0   /proc/2360/stat
30668  sshd          …  0   /proc/sys/kernel/ngroups_max
30668  sshd          …  0   /etc/group
30668  sshd          …  0   /root/.ssh/authorized_keys
30668  sshd          …  2   /var/run/nologin
30668  sshd          …  2   /etc/nologin
30668  sshd          …  0   /etc/login.defs
30668  sshd          …  0   /etc/passwd
[…]
Like "strace -feopen", but system-wide and with low overhead
slide 32:
3. ext4slower
• Trace slow FS I/O, to better identify I/O issues and outliers:
# ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME      COMM   PID  T  BYTES  OFF_KB  LAT(ms)  FILENAME
06:49:17  bash   …    R  128    …       7.75     cksum
06:49:17  cksum  …    R  39552  …       1.34     [
06:49:17  cksum  …    R  96     …       5.36     2to3-2.7
06:49:17  cksum  …    R  96     …       14.94    2to3-3.4
06:49:17  cksum  …    R  10320  …       6.82     411toppm
06:49:17  cksum  …    R  65536  …       4.01     a2p
06:49:17  cksum  …    R  55400  …       8.77     ab
06:49:17  cksum  …    R  36792  …       16.34    aclocal-1.14
[…]
A more reliable and complete indicator than measuring disk I/O latency
Also: btrfsslower, xfsslower, zfsslower
slide 33:
4. biolatency
• Identify multimodal latency and outliers with a histogram:
# biolatency -mT 1
Tracing block device I/O... Hit Ctrl-C to end.
06:20:16
msecs        : count  distribution
0 -> 1       : 36     |**************************************|
2 -> 3       : 1      |*                                      |
4 -> 7       : 3      |***                                    |
8 -> 15      : 17     |*****************                      |
16 -> 31     : 33     |**********************************     |
32 -> 63     : 7      |*******                                |
64 -> 127    : 6      |******                                 |
[…]
The "count" column is summarized in-kernel. Average latency (iostat/sar) may not be representative with multiple modes or outliers.
slide 34:
5. biosnoop
• Dump disk I/O events for detailed analysis; tcpdump for disks:
# biosnoop
TIME(s)  COMM         PID  DISK   SECTOR  BYTES  LAT(ms)
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        supervise    …    xvda1  …       …      …
…        xfsaild/md0  …    xvdc   …       …      …
[…]
Can import this into a spreadsheet and do a scatter plot of time vs. latency, etc.
slide 35:
6. cachestat
• Measure file system cache hit ratio statistics:
# cachestat
HITS  MISSES  DIRTIES  READ_HIT%  WRITE_HIT%  BUFFERS_MB  CACHED_MB
…     …       …        80.4%      19.6%       …           …
…     …       …        96.2%      3.7%        …           …
…     …       …        89.6%      10.4%       …           …
…     …       …        100.0%     0.0%        …           …
…     …       …        100.0%     0.0%        …           …
…     …       …        55.2%      4.5%        …           …
…     …       …        100.0%     0.0%        …           …
…     …       …        99.9%      0.0%        …           …
…     …       …        100.0%     0.0%        …           …
…     …       …        100.0%     0.0%        …           …
[…]
These stats should be added to /proc
slide 36:
7. tcpconnect
• Trace active ("outbound") TCP connections:
# tcpconnect
PID    COMM          IP SADDR                     DADDR                     DPORT
25333  recordProgra  4  127.0.0.1                 …                         …
25338  curl          4  100.66.3.172              …                         …
25340  curl          4  100.66.3.172              …                         …
25342  curl          4  100.66.3.172              …                         …
25344  curl          4  100.66.3.172              …                         …
25365  recordProgra  4  127.0.0.1                 …                         …
26119  ssh           6  ::1                       ::1                       …
25388  recordProgra  4  127.0.0.1                 …                         …
25220  ssh           6  fe80::8a3:9dff:fed5:6b19  fe80::8a3:9dff:fed5:6b19  22
[…]
Efficient: dynamic tracing of TCP connect functions only; does not trace send/receive
slide 37:
8. tcpaccept
• Trace passive ("inbound") TCP connections:
# tcpaccept
PID  COMM          IP RADDR                     LADDR                     LPORT
…    sshd          4  11.16.213.254             …                         …
…    redis-server  4  127.0.0.1                 …                         …
…    redis-server  4  127.0.0.1                 …                         …
…    redis-server  4  127.0.0.1                 …                         …
…    redis-server  4  127.0.0.1                 …                         …
…    sshd          6  ::1                       ::1                       …
…    redis-server  4  127.0.0.1                 …                         …
…    redis-server  4  127.0.0.1                 …                         …
…    sshd          6  fe80::8a3:9dff:fed5:6b19  fe80::8a3:9dff:fed5:6b19  22
[…]
Efficient: no send/receive tracing
slide 38:
9. tcpretrans
• Trace TCP retransmits with kernel state (not on the wire):
# tcpretrans
TIME      PID  IP LADDR:LPORT         T> RADDR:RPORT          STATE
01:55:05  0    4  10.153.223.157:22   R> 69.53.245.40:34619   ESTABLISHED
01:55:05  0    4  10.153.223.157:22   R> 69.53.245.40:34619   ESTABLISHED
01:55:17  0    4  10.153.223.157:22   R> 69.53.245.40:22957   ESTABLISHED
[…]
Efficient: no send/receive tracing
slide 39:
10. gethostlatency
• Trace DNS latency system-wide (it's always DNS):
# gethostlatency
TIME      PID    COMM  LATms  HOST
06:10:24  28011  wget  90.00  www.iovisor.org
06:10:28  28127  wget  0.00   www.iovisor.org
06:10:41  28404  wget  9.00   www.netflix.com
06:10:48  28544  curl  35.00  www.netflix.com.au
06:11:10  29054  curl  31.00  www.plumgrid.com
06:11:16  29195  curl  3.00   www.facebook.com
06:11:24  25313  wget  3.00   www.usenix.org
06:11:25  29404  curl  72.00  foo
06:11:28  29475  curl  1.00   foo
[…]
Instruments getaddrinfo(), gethostbyname(), etc.
slide 40:
11. runqlat
• Examine CPU scheduler run queue latency as a histogram:
# runqlat -m 5
Tracing run queue latency... Hit Ctrl-C to end.
msecs        : count  distribution
0 -> 1       : 3818   |****************************************|
2 -> 3       : 39     |                                        |
4 -> 7       : 39     |                                        |
8 -> 15      : 62     |                                        |
16 -> 31     : 2214   |***********************                 |
32 -> 63     : 226    |**                                      |
[…]
As efficient as possible: scheduler calls can become frequent
slide 41:
12. profile
• An efficient profiler. Counts unique stacks in kernel:
# profile
Sampling at 49 Hertz of all threads by user + kernel stack... Ctrl-C to end.
[…]
    ffffffff813d0af8 __clear_user
    ffffffff813d5277 iov_iter_zero
    ffffffff814ec5f2 read_iter_zero
    ffffffff8120be9d __vfs_read
    ffffffff8120c385 vfs_read
    ffffffff8120d786 sys_read
    ffffffff817cc076 entry_SYSCALL_64_fastpath
    00007fc5652ad9b0 read
    dd (25036)
        7
(this stack was sampled 7 times)
[…]
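A common next step is turning profile output into a flame graph. A sketch, assuming profile's folded-output mode (-f) and the FlameGraph scripts, which come from the bcc and FlameGraph projects rather than this slide:

# profile -f 30 > out.folded       # 30 s of samples, one folded stack per line
# git clone https://github.com/brendangregg/FlameGraph
# FlameGraph/flamegraph.pl out.folded > profile.svg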
slide 42:
Advanced Analysis
• Find/draw a functional diagram (e.g., of the storage I/O subsystem)
• Apply performance methods: http://www.brendangregg.com/methodology.html
  – Workload Characterization
  – Latency Analysis
  – USE Method
  – Performance Mantras
• Start with the Q's, then find the A's
• Use multi-tools: funccount, trace, argdist, stackcount (see the sketch below)
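For reference, canonical invocations of the two multi-tools not demonstrated on the following slides (these are from the bcc documentation, not these slides; trace and argdist appear on slides 44 and 45):

# funccount 'vfs_*'        # count kernel function calls matching a pattern
# stackcount submit_bio    # count unique stack traces leading to an event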
slide 43:
Performance Mantras
1. Don't do it
2. Do it, but don't do it again
3. Do it less
4. Do it later
5. Do it when they're not looking
6. Do it concurrently
7. Do it cheaper
(eliminate unnecessary work) from Craig Hanson and Pat Crain, and the performance engineering community
slide 44:
trace
• Trace custom events, by Sasha Goldshtein. trace -h lists example one-liners:
# trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
TIME      PID   COMM  FUNC      -
05:18:23  4490  …     sys_read  read 1048576 bytes
05:18:23  4490  …     sys_read  read 1048576 bytes
05:18:23  4490  …     sys_read  read 1048576 bytes
# trace -h
[...]
trace -K blk_account_io_start
    Trace this kernel function, and print info with a kernel stack trace
trace 'do_sys_open "%s", arg2'
    Trace the open syscall and print the filename being opened
trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
    Trace the read syscall and print a message for reads > 20000 bytes
trace r::do_sys_return
    Trace the return from the open syscall
trace 'c:open (arg2 == 42) "%s %d", arg1, arg2'
    Trace the open() call from libc only if the flags (arg2) argument is 42
trace 't:block:block_rq_complete "sectors=%d", args->nr_sector'
    Trace the block_rq_complete kernel tracepoint and print # of tx sectors
[...]
slide 45:
argdist
• Function argument distributions, by Sasha Goldshtein:
# argdist -H 'p::tcp_cleanup_rbuf(struct sock *sk, int copied):int:copied'
[15:34:45]
copied          : count  distribution
0 -> 1          : 15088  |**********************************|
2 -> 3          : 0      |                                  |
4 -> 7          : 0      |                                  |
8 -> 15         : 0      |                                  |
16 -> 31        : 0      |                                  |
32 -> 63        : 0      |                                  |
64 -> 127       : 4786   |***********                       |
128 -> 255      : 1      |                                  |
256 -> 511      : 1      |                                  |
512 -> 1023     : 4      |                                  |
1024 -> 2047    : 11     |                                  |
2048 -> 4095    : 5      |                                  |
4096 -> 8191    : 27     |                                  |
8192 -> 16383   : 105    |                                  |
16384 -> 32767  : 0      |                                  |
slide 46:
Coming to a GUI near you: BCC/BPF VISUALIZATIONS
slide 47:
Latency Heatmaps
slide 48:
CPU + Off-CPU Flame Graphs
• Can now be BPF optimized
http://www.brendangregg.com/flamegraphs.html
slide 49:
Conquer Performance: on-CPU + off-CPU means we can measure everything. Except sometimes one off-CPU stack isn't enough…
slide 50:
Off-Wake Flame Graphs
• Shows blocking stack with waker stack
  – Better understand why blocked
  – Merged in-kernel using BPF
  – Include multiple waker stacks == chain graphs
• We couldn't do this before
slide 51:
Overview for tool developers: HOW TO PROGRAM BCC/BPF
slide 52:
bcc Tutorials
1. https://github.com/iovisor/bcc/blob/master/INSTALL.md
2. …/docs/tutorial.md
3. …/docs/tutorial_bcc_python_developer.md
4. …/docs/reference_guide.md
5. …/CONTRIBUTING-SCRIPTS.md
slide 53:
bitehist.py Output
# ./bitehist.py
Tracing... Hit Ctrl-C to end.
kbytes       : count  distribution
0 -> 1       : 3      |                                      |
2 -> 3       : 0      |                                      |
4 -> 7       : 211    |**********                            |
8 -> 15      : 0      |                                      |
16 -> 31     : 0      |                                      |
32 -> 63     : 0      |                                      |
64 -> 127    : 1      |                                      |
128 -> 255   : 800    |**************************************|
slide 54:
bitehist.py Code: bcc examples/tracing/bitehist.py
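The slide shows the program as an image; its core is approximately the following (from the bcc repository; details vary by version):

from bcc import BPF
from time import sleep

# C BPF program: in-kernel histogram map, filled from a kprobe on
# block I/O completion
b = BPF(text="""
#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>

BPF_HISTOGRAM(dist);

int kprobe__blk_account_io_completion(struct pt_regs *ctx, struct request *req)
{
    // bucket the I/O size (Kbytes) into a power-of-2 histogram
    dist.increment(bpf_log2l(req->__data_len / 1024));
    return 0;
}
""")

print("Tracing... Hit Ctrl-C to end.")
try:
    sleep(99999999)
except KeyboardInterrupt:
    print()

# print the histogram; the counts were summarized in kernel context,
# and only this summary is copied to user space
b["dist"].print_log2_hist("kbytes")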
slide 55:
bitehist.py Annotated [the same program with labels: Map, C BPF Program, Python Program, Event, Statistics]. "kprobe__" is a shortcut for BPF.attach_kprobe(). bcc examples/tracing/bitehist.py
slide 56:
Current Complications: initialize all variables, extra bpf_probe_read()s, BPF_PERF_OUTPUT(), verifier errors.
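To make the BPF_PERF_OUTPUT() item concrete, a minimal sketch of bcc's per-event output pattern; the sys_sync probe target and struct fields here are illustrative, while the macros and calls follow the bcc reference guide (perf_buffer_poll() in current bcc; older versions used kprobe_poll()):

from bcc import BPF
import ctypes as ct

b = BPF(text="""
#include <uapi/linux/ptrace.h>

struct data_t {
    u32 pid;
    u64 ts;
};
BPF_PERF_OUTPUT(events);

int kprobe__sys_sync(struct pt_regs *ctx) {
    struct data_t data = {};   // initialize all variables, or the verifier complains
    data.pid = bpf_get_current_pid_tgid() >> 32;
    data.ts = bpf_ktime_get_ns();
    events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}
""")

# user-level mirror of struct data_t
class Data(ct.Structure):
    _fields_ = [("pid", ct.c_uint), ("ts", ct.c_ulonglong)]

def print_event(cpu, data, size):
    event = ct.cast(data, ct.POINTER(Data)).contents
    print("sync() by PID %d at %d ns" % (event.pid, event.ts))

# async copy of per-event data from kernel to user space
b["events"].open_perf_buffer(print_event)
while True:
    b.perf_buffer_poll()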
slide 57:
Bonus Round: PLY
slide 58:
File opens
# ply -c 'kprobe:do_sys_open { printf("opened: %s\n", mem(arg(1), "128s")); }'
1 probe active
opened: /sys/kernel/debug/tracing/events/enable
opened: /etc/ld.so.cache
opened: /lib/x86_64-linux-gnu/libselinux.so.1
opened: /lib/x86_64-linux-gnu/libc.so.6
opened: /lib/x86_64-linux-gnu/libpcre.so.3
opened: /lib/x86_64-linux-gnu/libdl.so.2
opened: /lib/x86_64-linux-gnu/libpthread.so.0
opened: /proc/filesystems
opened: /usr/lib/locale/locale-archive
opened: .
[...]
slide 59:
Count vfs calls
# ply -c 'kprobe:vfs_* { @[func()].count(); }'
WRN kprobe_attach_pattern: 'vfs_caches_init_early' will not be probed
WRN kprobe_attach_pattern: 'vfs_caches_init' will not be probed
49 probes active
^C de-activating probes
vfs_fstat          …
vfs_getattr        …
vfs_getattr_nosec  …
vfs_open           …
vfs_read           …
vfs_write          …
[...]
slide 60:
Read return size
# ply -c 'kretprobe:SyS_read { @ret.quantize(retval()); }'
1 probe active
^C de-activating probes
@ret:
[   …,   …]   7  |█████████▍
[   …,   …]  24  |████████████████████████████████│
[   …,   …]   5  |██████▋
[   …,   …]   0  |
[   …,  15]   1  |█▍
[  16,  31]   1  |█▍
[  32,  63]   3  |████
[  64, 127]   3  |████
[ 128, 255]   2  |██▋
[ 256, 511]   1  |█▍
[ 512,  1k)  11  |██████████████▋
slide 61:
Read return size (ASCII)
# ply -A -c 'kretprobe:SyS_read { @ret.quantize(retval()); }'
1 probe active
^C de-activating probes
@ret:
[   …,   …]   7  |###################
[   …,   …]  12  |################################|
[   …,   …]   7  |###################
[   …,   …]   0  |
[   …,  15]   1  |###
[  16,  31]   2  |#####
[  32,  63]   7  |###################
[  64, 127]   3  |########
[ 128, 255]   2  |#####
[ 256, 511]   1  |###
[ 512,  1k)  11  |#############################
slide 62:
Read latency
# ply -A -c 'kprobe:SyS_read { @start[tid()] = nsecs(); }
    kretprobe:SyS_read /@start[tid()]/ { @ns.quantize(nsecs() - @start[tid()]);
    @start[tid()] = nil; }'
2 probes active
^C de-activating probes
[...]
@ns:
[ 512,   1k)   3  |########
[  1k,   2k)   7  |###################
[  2k,   4k)  12  |################################|
[  4k,   8k)   3  |########
[  8k,  16k)   2  |#####
[ 16k,  32k)   0  |
[ 32k,  64k)   0  |
[ 64k, 128k)   3  |########
[128k, 256k)   1  |###
[256k, 512k)   1  |###
[512k,   1M)   2  |#####
[...]
slide 63:
Counting Off-CPU stacks
# ply -c 'kprobe:schedule { @[stack()].count() }'
1 probe active
^C de-activating probes

schedule+0x1
sys_exit+0x17
do_syscall_64+0x5e
return_from_SYSCALL_64
…

schedule+0x1
fuse_dev_read+0x63
new_sync_read+0xd2
__vfs_read+0x26
vfs_read+0x96
sys_read+0x55
do_syscall_64+0x5e
return_from_SYSCALL_64
…

schedule+0x1
do_syscall_64+0xa2
return_from_SYSCALL_64
…
[...]
slide 64:
ply One-Liners
# Trace file opens:
ply -c 'kprobe:do_sys_open { printf("opened: %s\n", mem(arg(1), "128s")); }'
# Counting vfs functions by process name:
ply -c 'kprobe:vfs_* { @[comm(), func()].count(); }'
# Counting off-CPU stacks:
ply -c 'kprobe:schedule { @[stack()].count() }'
# Syscall read return size as a histogram:
ply -c 'kretprobe:SyS_read { @ret.quantize(retval()); }'
# Syscall read latency as a histogram:
ply -A -c 'kprobe:SyS_read { @start[tid()] = nsecs(); }
    kretprobe:SyS_read /@start[tid()]/ { @ns.quantize(nsecs() - @start[tid()]);
    @start[tid()] = nil; }'
[...]
also see ply/oneliners.md
slide 65:
ply
• A new BPF-based dynamic tracer for Linux
  – Created by Tobias Waldekranz
  – https://github.com/iovisor/ply
  – https://wkz.github.io/ply/
• High-level language
  – Simple one-liners
  – Short scripts
• In development
  – kprobes and tracepoints only; uprobes/perf_events not yet
  – Successful so far as a proof of concept
  – Not production tested yet (bcc is)
slide 66:
Future work: CHALLENGES
slide 67:
Challenges: marketing, documentation, training, community. Without these, we may have another ftrace: a built-in "secret" of Linux. Not good for adoption! The https://www.iovisor.org project helps, but tracing (observability) is only one part.
slide 68:
Takeaways:
1. Understanding the value of Linux tracing superpowers
2. Upgrade to Linux 4.4+ (4.9 is better)
3. Ask for eBPF support in your perf analysis/monitoring tools
Please contribute:
- https://github.com/iovisor/bcc
- https://github.com/iovisor/ply
BPF Tracing in Linux:
• 3.19: sockets
• 3.19: maps
• 4.1: kprobes
• 4.3: uprobes
• 4.4: BPF output
• 4.6: stacks
• 4.7: tracepoints
• 4.9: profiling
• 4.9: PMCs
slide 69:
Links & References
iovisor bcc:
- https://github.com/iovisor/bcc
- https://github.com/iovisor/bcc/tree/master/docs
- http://www.brendangregg.com/blog/ (search for "bcc")
- http://www.brendangregg.com/ebpf.html#bcc
- http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/
- I'll change your view of Linux tracing: https://www.youtube.com/watch?v=GsMs3n8CB6g
- On designing tracing tools: https://www.youtube.com/watch?v=uibLwoVKjec
BPF:
- https://www.kernel.org/doc/Documentation/networking/filter.txt
- https://github.com/iovisor/bpf-docs
- https://suchakra.wordpress.com/tag/bpf/
Flame Graphs:
- http://www.brendangregg.com/flamegraphs.html
- http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
- http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
Dynamic Instrumentation:
- http://ftp.cs.wisc.edu/par-distr-sys/papers/Hollingsworth94Dynamic.pdf
- https://en.wikipedia.org/wiki/DTrace
- DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD; Brendan Gregg, Jim Mauro; Prentice Hall, 2011
Netflix Tech Blog on Vector:
- http://techblog.netflix.com/2015/04/introducing-vector-netflixs-on-host.html
Linux Performance: http://www.brendangregg.com/linuxperf.html
slide 70:
Thanks
– Questions?
– iovisor bcc: https://github.com/iovisor/bcc
– http://www.brendangregg.com
– http://slideshare.net/brendangregg
– bgregg@netflix.com
– @brendangregg
Thanks to Alexei Starovoitov (Facebook), Brenden Blanco (PLUMgrid/VMware), Sasha Goldshtein (Sela), Daniel Borkmann (Cisco), Wang Nan (Huawei), and other BPF and bcc contributors!