PerconaLive: Linux Performance 2018
Keynote for PerconaLive 2018 by Brendan Gregg. Video: https://youtu.be/sV3XfrfjrPo?t=30m51s
Description: "At over one thousand code commits per week, it's hard to keep up with Linux developments. This keynote will summarize recent Linux performance features, for a wide audience: the KPTI patches for Meltdown, eBPF for performance observability, Kyber for disk I/O scheduling, BBR for TCP congestion control, and more. This is about exposure: knowing what exists, so you can learn and use it later when needed. Get the most out of your systems, whether they are databases or application servers, with the latest Linux kernels and exciting features."
PDF: Percona2018_Linux_Performance.pdf
Keywords (from pdftotext):
slide 1:
Linux Performance
Brendan Gregg, Senior Performance Architect
Apr 2018
slide 2:
http://neuling.org/linux-next-size.html
slide 3:
Post frequency:
4 per year: https://kernelnewbies.org/Linux_4.15
4 per week: https://lwn.net/Kernel/
400 per day: LKML http://vger.kernel.org/vger-lists.html#linux-kernel
slide 4:
https://meltdownattack.com/
slide 5:
KPTI
Linux 4.15 & backports
Cloud
Hypervisor (patches)
Linux Kernel (KPTI)
Application (retpoline)
CPU (microcode)
slide 6:
Server A: 31353 MySQL queries/sec
serverA# mpstat 1
Linux 4.14.12-virtual (bgregg-c5.9xl-i-xxx) 02/09/2018 _x86_64_ (36 CPU)
01:09:13 AM  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
01:09:14 AM  all  86.89   0.00  13.08
01:09:15 AM  all  86.77   0.00  13.23
01:09:16 AM  all  86.93   0.00  13.02
[...]
Server B: 22795 queries/sec (27% slower)
serverB# mpstat 1
Linux 4.14.12-virtual (bgregg-c5.9xl-i-xxx) 02/09/2018 _x86_64_ (36 CPU)
01:09:44 AM  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
01:09:45 AM  all  82.94   0.00  17.06
01:09:46 AM  all  82.78   0.00  17.22
01:09:47 AM  all  83.14   0.00  16.86
[...]
slide 7:
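As a worked check of the slide 6 throughput figures above, the "27% slower" label can be reproduced from the two queries/sec values. This is a minimal sketch; the function name `percent_slower` is illustrative and not from the talk.

```python
def percent_slower(baseline_qps, observed_qps):
    """Relative throughput loss of observed vs. baseline, in percent."""
    return (baseline_qps - observed_qps) / baseline_qps * 100.0

# Values as printed on the slide.
server_a_qps = 31353
server_b_qps = 22795

drop = percent_slower(server_a_qps, server_b_qps)
print(f"Server B is {drop:.1f}% slower than Server A")
```

The result rounds to the 27% shown on the slide.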
Linux KPTI patches for Meltdown flush the Translation Lookaside Buffer
Diagram labels: Virtual Address, CPU, Physical Address, MMU, TLB (hit, miss (walk)), Main Memory, Page Table
slide 8:
Server A: TLB miss walks 3.5%
serverA# ./tlbstat 1
K_CYCLES K_INSTR [...]  IPC  DTLB_WALKS  ITLB_WALKS  K_DTLBCYC  K_ITLBCYC  DTLB%  ITLB%
                        1.04 86588626    115441706   1507279               1.57   1.92
                        1.04 86281319    115306404   1507472               1.57   1.92
                        1.04 86564448    115555259   1511158               1.58   1.93
                        1.04 86187531    115292395   1508524               1.57   1.92
Server B: TLB miss walks 19.2% (16% higher)
serverB# ./tlbstat 1
K_CYCLES K_INSTR [...]  IPC  DTLB_WALKS  ITLB_WALKS  K_DTLBCYC  K_ITLBCYC  DTLB%  ITLB%
                        0.84 911337888   719553692   10476524              10.92  8.19
                        0.84 913726197   721751988   10518488              10.96  8.25
                        0.84 912994135   721492911   10524675              10.97  8.26
                        0.84 912009660   720027006   10501926              10.93  8.24
slide 9:
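The slide 8 headline figures above appear to be the sum of the DTLB% and ITLB% columns, averaged over the sampled intervals. That reading is an assumption, not something the slide states, but the sketch below shows it reproduces both the 3.5% and 19.2% totals, and that the "16% higher" gap is a difference in percentage points.

```python
def mean_tlb_walk_pct(samples):
    """Average of DTLB% + ITLB% across (dtlb_pct, itlb_pct) samples."""
    totals = [dtlb + itlb for dtlb, itlb in samples]
    return sum(totals) / len(totals)

# (DTLB%, ITLB%) pairs as printed on the slide.
server_a = [(1.57, 1.92), (1.57, 1.92), (1.58, 1.93), (1.57, 1.92)]
server_b = [(10.92, 8.19), (10.96, 8.25), (10.97, 8.26), (10.93, 8.24)]

a = mean_tlb_walk_pct(server_a)
b = mean_tlb_walk_pct(server_b)
print(f"Server A: {a:.1f}%  Server B: {b:.1f}%  gap: {b - a:.1f} points")
```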
http://www.brendangregg.com/blog/2018-02-09/kpti-kaiser-meltdown-performance.html
slide 10:
Enhanced BPF, Linux 4.*
also known as just "BPF"
User-Defined BPF Programs: SDN Configuration, DDoS Mitigation, Intrusion Detection, Container Security, Observability
Kernel Runtime: verifier, BPF, BPF actions
Event Targets: sockets, kprobes, uprobes, tracepoints, perf_events
slide 11:
eBPF bcc Linux 4.4+ https://github.com/iovisor/bcc
slide 12:
Identify multimodal disk I/O latency and outliers with eBPF biolatency
# biolatency -mT 10
Tracing block device I/O... Hit Ctrl-C to end.
19:19:04
     msecs        : count    distribution
       0 -> 1     : 238     |*********
       2 -> 3     : 424     |*****************
       4 -> 7     : 834     |*********************************
       8 -> 15    : 506     |********************
      16 -> 31    : 986     |****************************************|
      32 -> 63    : 97      |***
      64 -> 127   : 7
     128 -> 255   : 27
19:19:14
     msecs        : count    distribution
       0 -> 1     : 427     |*******************
       2 -> 3     : 424     |******************
[…]
slide 13:
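The slide 12 histogram above uses power-of-two latency buckets with bars scaled to the largest count. The following is only a user-space sketch of that bucketing and scaling idea, not the bcc tool's actual code (which aggregates in-kernel with BPF); the helper names `log2_bucket` and `bar` are hypothetical.

```python
def log2_bucket(value):
    """Return the (low, high) power-of-two bucket that contains value."""
    if value < 2:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)
    return (low, 2 * low - 1)

def bar(count, max_count, width=40):
    """Star bar scaled so the largest count fills the full width."""
    return "*" * (count * width // max_count)

# Counts from the first interval on the slide, one per bucket.
slide_counts = [238, 424, 834, 506, 986, 97, 7, 27]
slide_buckets = [log2_bucket(v) for v in (0, 2, 4, 8, 16, 32, 64, 128)]

peak = max(slide_counts)
for (low, high), count in zip(slide_buckets, slide_counts):
    print(f"{low:>6} -> {high:<6} : {count:<6} |{bar(count, peak):<40}|")
```

With integer scaling of this kind, the bar lengths for the six bars that survive in the slide text (9, 17, 33, 20, 40 and 3 stars) match the printed counts. The two peaks, at 4 -> 7 ms and 16 -> 31 ms, are the multimodal pattern the slide title refers to.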
eBPF bcc offcputime Linux 4.8+
slide 14:
eBPF XDP Linux 4.8+ https://www.netronome.com/blog/frnog-30-faster-networking-la-francaise/
slide 15:
BBR Linux 4.9
TCP congestion control algorithm
Bottleneck Bandwidth and RTT
1% packet loss: we see 3x better throughput
https://twitter.com/amernetflix/status/892787364598132736
https://blog.apnic.net/2017/05/09/bbr-new-kid-tcp-block/
https://queue.acm.org/detail.cfm?id=3022184
slide 16:
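The slide does not show how BBR is enabled. As a hedged illustration, Linux lets a program request a congestion control algorithm per socket through the `TCP_CONGESTION` socket option (system-wide, the `net.ipv4.tcp_congestion_control` sysctl serves the same purpose). The sketch below assumes a Linux host; the helper name `try_set_congestion_control` is hypothetical, and the call fails cleanly where BBR is not built in or loaded.

```python
import socket

def try_set_congestion_control(sock, name="bbr"):
    """Ask the kernel to use the named algorithm on this socket.

    Returns True on success, False if the option or algorithm is unavailable.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)  # Linux-only constant
    if opt is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode())
    except OSError:
        return False
    return True

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    ok = try_set_congestion_control(s)
    print("bbr selected" if ok else "bbr not available on this system")
```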
Linux 4.12 Kyber
Multiqueue block I/O scheduler
Tune target read & write latency
Up to 300x lower 99th latencies in our testing
Diagram, Kyber (simplified): reads (sync), writes (async), dispatch, dispatch, queue size adjust, completions
https://lwn.net/Articles/720675/
slide 17:
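The slide does not cover how to check whether Kyber is in use. On Linux, each block device exposes its available I/O schedulers in `/sys/block/<dev>/queue/scheduler`, with the active one in square brackets. The sketch below parses that format; the sample line and the function names are illustrative assumptions, and the device name must be supplied by the caller.

```python
def parse_scheduler_line(text):
    """Split a sysfs scheduler line into (active, available)."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # brackets mark the active scheduler
            active = name
        available.append(name)
    return active, available

def read_scheduler(dev):
    """Read and parse the scheduler file for a block device such as 'sda'."""
    with open(f"/sys/block/{dev}/queue/scheduler") as f:
        return parse_scheduler_line(f.read())

# Illustrative sample line, not captured from a real system.
sample = "mq-deadline [kyber] bfq none"
active, available = parse_scheduler_line(sample)
print(f"active={active} available={available}")
```

Writing a scheduler name to the same file as root selects it, provided the kernel was built with that scheduler.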
More perf 4.4 - 4.16 (2016 - 2018)
Major features:
• TCP listener lockless (4.4)
• copy_file_range() (4.5)
• madvise() MADV_FREE (4.5)
• epoll multithread scalability (4.5)
• Kernel Connection Multiplexor (4.6)
• Writeback management (4.10)
• Hybrid block polling (4.10)
• BFQ I/O scheduler (4.12)
• Async I/O improvements (4.13)
• In-kernel TLS acceleration (4.13)
• Socket MSG_ZEROCOPY (4.14)
• Asynchronous buffered I/O (4.14)
• Longer-lived TLB entries with PCID (4.14)
• mmap MAP_SYNC (4.15)
• Software-interrupt context hrtimers (4.16)
Many minor improvements to:
• perf
• CPU scheduling
• futexes
• NUMA
• Huge pages
• Slab allocation
• TCP, UDP
• Drivers
• Processor support
• GPUs
slide 18:
Take Aways
1. Run latest
2. Browse major features, e.g., https://kernelnewbies.org/Linux_4.15
slide 19:
Some Linux perf Resources
http://www.brendangregg.com/linuxperf.html
https://kernelnewbies.org/LinuxChanges
https://lwn.net/Kernel
https://github.com/iovisor/bcc
http://blog.stgolabs.net/search/label/linux
http://www.brendangregg.com/blog/2018-02-09/kpti-kaiser-meltdown-performance.html