MacIT 2014: Analyzing OS X Systems Performance with the USE Method

Talk for MacIT 2014, by Brendan Gregg.

Description: "This talk is about systems performance on OS X, and introduces the USE Method to check for common performance bottlenecks and errors. This methodology can be used by beginners and experts alike, and begins by constructing a checklist of the questions we’d like to ask of the system, before reaching for tools to answer them. The focus is resources: CPUs, GPUs, memory capacity, network interfaces, storage devices, controllers, interconnects, as well as some software resources such as mutex locks. These areas are investigated by a wide variety of tools, including vm_stat, iostat, netstat, top, latency, the DTrace scripts in /usr/bin (which were written by Brendan), custom DTrace scripts, Instruments, and more. This is a tour of the tools needed to solve our performance needs, rather than understanding tools just because they exist. This talk will make you aware of many areas of OS X that you can investigate, which will be especially useful for the time when you need to get to the bottom of a performance issue."


PDF: MacIT2014_Analyzing_OSX_Perf_USE_Method.pdf

Keywords (from pdftotext):

slide 1:
    Analyzing
    OS X Systems Performance
    with the
    USE Method
    Brendan Gregg, Senior Performance Architect, Netflix	
    March, 2014	
    
slide 2:
    Find the Bottleneck
    [Functional diagram of the Darwin operating system and hardware.
    Software: Applications, System Libraries, System Call Interface, XNU Kernel
    (BSD, OSFMK), VFS, HFS+/..., Block Devices, Sockets, TCP/UDP, Ethernet,
    Scheduler, Virtual Memory, I/O Kit, Device Drivers.
    Hardware: CPU, GPU, FSB, Northbridge, Memory Bus, DRAM, DMI, Southbridge,
    Device Interconnect (PCIe/USB), I/O Controller, Disk, Disk, Interface
    Transports, Network Controller, Port, Port, Other Devices.]
    
slide 3:
    This Talk
    • Summarizes casual to serious performance analysis of OS X	
    • From the systems perspective, not the application	
    • Many application issues can be found easily this way	
    • Covering not just current tools, but suggestions for future work	
    • May change how you think about performance!
    
slide 4:
    whoami
    • Senior Performance Architect at Netflix	
    • Primary author of the DTrace book	
    • Wrote many DTrace scripts included with OS X.
    Eg: dtruss, iosnoop, iotop, opensnoop, execsnoop,
    procsystime, bitesize.d, seeksize.d, setuids.d, etc...	
    • These were ported and enhanced by Apple engineering (thanks!)	
    • Created the USE method and USE method checklist for OS X
    
slide 5:
    Agenda
    • The Tools Method	
    • The USE Method	
    • Future work
    
slide 6:
    The Tools Method
    
slide 7:
    The Tools Method
    • A tool-based performance analysis approach, commonly followed
    today. For reference, I've called it the "Tools Method".	
    • 1. List available performance tools	
    • 2. For each tool, list its useful metrics	
    • 3. For each metric, list possible interpretation	
    • Simple, useful, but analysis is limited to what the tools provide easily
    
slide 8:
    Tool Examples
    • Activity Monitor	
    • atMonitor, Temperature Monitor Lite	
    • Command Line	
    • DTrace	
    • Instruments
    
slide 9:
    Activity Monitor
    • High level process and
    system summaries. A GUI
    version of top(1)	
    • Table shows processes
    by %CPU, memory	
    • CPU load over time	
    • Quit, info, and system
    diagnosis buttons
    
slide 10:
    Activity Monitor
    Network
    • Quick way to see current
    and recent network
    throughput	
    • Like the CPU summary,
    shows aggregate device
    stats, and not per-device
    
slide 11:
    Activity Monitor
    CPU Usage
    • Per-CPU utilization from previous 0.5 - 5 seconds (tunable)	
    • Handy to leave running. Look for single hot CPUs/threads
    
slide 12:
    Activity Monitor
    Floating CPU Window
    • Earlier OS X also had a compact version (gone in Mavericks)	
    • Was nice, but what I really want is a compact visualization for both
    per-CPU and historical data
    
slide 13:
    Activity Monitor
    CPU/Disk Suggestion
    • Could show both per-device and history using a utilization heat map:
    • http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/
    
slide 14:
    Activity Monitor
    Sample Process
    • The cog button ("System
    diagnostics options") has a
    "Sample process" option for
    profiling CPU code paths	
    • Explains %CPU usage	
    • Although output usually very
    long and time consuming to
    read (see scroll bar):
    
slide 15:
    Activity Monitor
    Flame Graphs ?
    • Suggestion: include a
    Flame Graph view	
    • Visualizes entire
    profile output in
    one screen	
    • http://github.com/brendangregg/FlameGraph
    
slide 16:
    atMonitor
    • 3rd party app. Version 2.7b crashes for me if "Top Window" is visible.	
    • Shows many useful metrics: per-CPU, RAM, GPU, per-disk, and per-network interface utilization percentages with histories.	
    • Currently the easiest way to see GPU, disk, and network utilization. 	
    • Utilization is easy to interpret. I/O per second is not.
    
slide 17:
    Temperature Monitor Lite
    • Another 3rd party application	
    • Easy way to infer GPU utilization	
    • Normal:	
    • Video:
    
slide 18:
    Command Line
    • Accessed via the Terminal application	
    • Numerous performance tools available, from UNIX/BSD/OSX	
    • Eg, the uptime(1) command shows recent and historic CPU load:
    $ uptime
    14:36 up 43 days, 2:39, 30 users, load averages: 0.72 1.02 1.29
    • These numbers are the 1, 5, and 15 minute load averages. Values are
    really constants in an exponential decay moving sum.
    • Interpret: if average > number of CPUs, then CPUs are overloaded
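
The interpretation rule above can be sketched in a few lines. This is only an illustration: the function names and the CPU counts are assumptions, and the sample line is the uptime output shown on this slide.

```python
import re


def parse_load_averages(uptime_line):
    """Return the 1, 5, and 15 minute load averages from an uptime(1) line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


def cpus_overloaded(load_average, cpu_count):
    """Rule of thumb from the slide: load average above the CPU count."""
    return load_average > cpu_count


line = "14:36 up 43 days, 2:39, 30 users, load averages: 0.72 1.02 1.29"
one_min, five_min, fifteen_min = parse_load_averages(line)

# The CPU count here (4) is a hypothetical value, not taken from the talk.
for label, value in (("1m", one_min), ("5m", five_min), ("15m", fifteen_min)):
    print(label, value, "overloaded" if cpus_overloaded(value, 4) else "ok")
```

On a real system the CPU count would come from the machine itself rather than a hard-coded number.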
    
slide 19:
    Command Line: top
    • top(1): high level process and system summary:
    $ top -o cpu
    Processes: 272 total, 4 running, 268 sleeping, 1546 threads 14:47:36
    Load Avg: 1.14, 0.75, 0.95 CPU usage: 13.95% user, 2.78% sys, 83.26% idle
    SharedLibs: 12M resident, 5112K data, 0B linkedit.
    MemRegions: 339218 total, 6689M resident, 184M private, 2153M shared.
    PhysMem: 3429M wired, 6502M active, 5910M inactive, 15G used, 537M free.
    VM: 552G vsize, 1052M framework vsize, 111312590(1) pageins, 1437348(0) pageouts
    Networks: packets: 120030109/127G in, 70582570/38G out.
    Disks: 22089197/1050G read, 26756359/1163G written.
    hey...
    PID
    COMMAND
    %CPU TIME
    #TH
    bash
    100.0 47:42.28 1/1
    94370 top
    17.2 00:03.77 1/1
    52617 firefox
    47:30:58 45/1
    92489- Google Chrom 2.2
    13:31.85 34
    [...]
    #WQ
    #PORT #MREGS RPRVT RSHRD
    236K
    816K
    4368K 216K
    576- 177307+ 1984M+ 200M
    273M
    271M
    RSIZE
    760K
    5116K
    2530M+
    734M
    
slide 20:
    Command Line: vm_stat
    • vm_stat(1): virtual memory statistics, including free memory, paging
    $ vm_stat 1
    Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%)
      free active   spec inactive    wire  faults     copy   0fill reactive  pageins pageout
    101297  1662K  29920  1509998  888520  17650M  106072K  15926M  6833792  111312K
    100919  1658K  29920  1509998  893230
    101183  1658K  29918  1509998  893169
    100517  1658K  29921  1509998  893354
     96590  1657K  29923  1514414  894426
     93184  1662K  28486  1514414  894484
     91224  1663K  28486  1514414  894886
     89195  1649K  29924  1514413  909225
     87550  1636K  29917  1514155  923179
     61596  1644K  28309  1515551  941688
     52932  1669K  28442  1515663  925755
     76395  1681K  28417  1515685  889983
     73520  1679K  28449  1515777  894905
     60335  1684K  29073  1515560  903152
    [...]
    
slide 21:
    Command Line: iostat
    • iostat(1): block device I/O statistics. Disks, USB drives.
    $ iostat 1
              disk0               disk2        cpu     load average
        KB/t tps  MB/s      KB/t tps  MB/s  us sy id    1m   5m  15m
       47.03  13  0.60             0  0.00   5  2 92  0.94 1.01 0.99
      972.42  19 18.02    128.00 141 17.60   2  3 95  0.94 1.01 0.99
      315.60  10  3.08    128.00  24  3.00   6  2 92  0.94 1.01 0.99
               1  0.00             0  0.00   6  2 92  0.94 1.01 0.99
               8  7.99    128.00  69  8.61   6  2 92  0.94 1.01 0.99
     1024.00  18 17.97    128.00 143 17.85   2  2 95  0.86 0.99 0.99
     1024.00  17 16.98    128.00 142 17.72   2  2 96  0.86 0.99 0.99
      165.27 272 43.84    127.13 146 18.10   6  5 89  0.95 1.01 0.99
     1024.00  18 17.98    128.00 143 17.85   2  2 96  0.95 1.01 0.99
    [...]
    • No percent utilization/busy, like other OSes? Makes it hard to interpret.
    
slide 22:
    Command Line: netstat
    • netstat(1): various network statistics. -i for interface stats:
    $ netstat -iI en0 1
                input        (en0)           output
       packets  errs      bytes    packets  errs      bytes colls
    [...]
    • No percent utilization, but can figure it out: throughput / known max
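
A minimal sketch of the "throughput / known max" calculation. The function name, the byte count, and the 1 Gbit/s link speed are all hypothetical; the only input taken from netstat would be the bytes column for one interval.

```python
def interface_utilization(bytes_in_interval, interval_seconds, link_bits_per_second):
    """Return utilization (0.0 to 1.0) for one direction of an interface.

    bytes_in_interval: bytes moved during the interval (input or output).
    link_bits_per_second: the known maximum speed of the link.
    """
    if interval_seconds <= 0 or link_bits_per_second <= 0:
        raise ValueError("interval and link speed must be positive")
    throughput_bits_per_second = bytes_in_interval * 8 / interval_seconds
    return throughput_bits_per_second / link_bits_per_second


# Hypothetical example: 12.5 Mbytes received in one second on a 1 Gbit/s link.
utilization = interface_utilization(12_500_000, 1.0, 1_000_000_000)
print("%.0f%%" % (utilization * 100))
```

Input and output are best checked separately, since a full-duplex link can be saturated in one direction only.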
    
slide 23:
    Command Line: tcpdump
    • tcpdump(1): sniff and examine network packets:
    $ tcpdump -n
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on en0, link-type EN10MB (Ethernet), capture size 65535 bytes
    18:00:55.228744 IP 10.0.1.92.53 > 10.0.1.148.49228: 26359 1/0/0 A 69.192.253.15 (81)
    18:00:55.311056 ARP, Reply 10.0.1.162 is-at 2c:54:2d:a4:25:4c, length 28
    18:00:55.342793 IP 74.125.28.189.443 > 10.0.1.148.62998: Flags [P.], seq
    3544891232:3544891287, ack 3832081572, win 661, options [nop,nop,TS val 2936982235 ecr
    2331923799], length 55
    18:00:55.342933 IP 10.0.1.148.62998 > 74.125.28.189.443: Flags [.], ack 55, win 8188,
    options [nop,nop,TS val 2331932237 ecr 2936982235], length 0
    18:00:56.477029 IP 10.0.1.148.50359 > 67.195.141.201.443: Flags [P.], seq
    696365506:696365533, ack 1903095540, win 16384, length 27
    18:00:56.477158 IP 10.0.1.148.50359 > 67.195.141.201.443: Flags [F.], seq 27, ack 1, win
    16384, length 0
    [...]
    • Also dump to a file and examine later. Does incur overhead.
    
slide 24:
    Observability So Far...
    • We can see all the things!	
    • Not really...
    
slide 25:
    Observability So Far...
    [The Darwin operating system and hardware functional diagram from slide 2,
    annotated with the tools covered so far: top, netstat, Activity Monitor,
    atMonitor, tcpdump, iostat, vm_stat, Temperature Monitor.]
    
slide 26:
    DTrace
    • Programmable, real-time, dynamic
    and static tracing	
    • Write your own one-liners and
    scripts, or use other people's;
    including those in /usr/bin	
    • There is a great book about it...
    
slide 27:
    DTrace: Scripts
    • Over 40 DTrace scripts are shipped with OS X (which I mostly wrote
    originally). Listing them:
    $ man -k dtrace
    bitesize.d(1m)     - analyse disk I/O size by process. Uses DTrace
    cpuwalk.d(1m)      - Measure which CPUs a process runs on. Uses DTrace
    creatbyproc.d(1m)  - snoop creat()s by process name. Uses DTrace
    dappprof(1m)       - profile user and lib function usage. Uses DTrace
    dapptrace(1m)      - trace user and library function usage. Uses DTrace
    diskhits(1m)       - disk access by file offset. Uses DTrace
    dispqlen.d(1m)     - dispatcher queue length by CPU. Uses DTrace
    dtrace(1)          - generic front-end to the DTrace facility
    dtruss(1m)         - process syscall details. Uses DTrace
    errinfo(1m)        - print errno for syscall fails. Uses DTrace
    execsnoop(1m)      - snoop new process execution. Uses DTrace
    [...]
    
slide 28:
    DTrace: iosnoop
    • iosnoop(1m): trace block device I/O
    $ iosnoop
    UID PID D BLOCK SIZE COMM PATHNAME
    176 R 148471184 8192 SystemUIServer ??/vm/swapfile10
    176 R 835310312 4096 SystemUIServer ??/vm/swapfile4
    503 92489 W 746204600 61440 Google Chrome ??/Chrome/.com.google.Chrome.hw1Inp
    503 92489 W 746204720 23472 Google Chrome ??/Default/.com.google.Chrome.76k4tG
    19 W 425711304 4096 syslogd ??/DiagnosticMessages/2014.02.14.asl
    19 W 57246896 syslogd ??/DiagnosticMessages/StoreData
    19 W 425710304 4096 syslogd ??/DiagnosticMessages/2014.02.14.asl
    503 52617 W 214894232 4096 firefox ??/iw4rbel9.default/_CACHE_CLEAN_
    19 W 57246896 syslogd ??/DiagnosticMessages/StoreData
    19 W 425710304 4096 syslogd ??/DiagnosticMessages/2014.02.14.asl
    [...]
    • Identify processes and files causing disk I/O
    
slide 29:
    DTrace: hfsslower.d
    • hfsslower.d: trace HFS calls slower than a threshold. Eg, 10 ms:
    $ ~/dtbook_scripts/Chap5/hfsslower.d 10
    TIME                 PROCESS  D    KB  ms FILE
    2014 Feb 14 17:35:59 Terminal R  5751  16 data.data
    2014 Feb 14 17:35:59 Terminal R  6166  17 data.data
    2014 Feb 14 17:35:59 Terminal W 11921  15 data.data
    [...]
    • Traces all application I/O to the file system, not just disk I/O	
    • Script is on http://www.dtracebook.com
    
slide 30:
    DTrace: execsnoop
    • execsnoop(1m): trace process execution
    $ execsnoop -v
    STRTIME UID PID PPID ARGS
    2014 Feb 14 19:40:55 551 man
    2014 Feb 14 19:40:55 551 man
    2014 Feb 14 19:40:55 94837 groff
    2014 Feb 14 19:40:55 94837 tbl
    2014 Feb 14 19:40:55 94838 cat
    2014 Feb 14 19:40:56 94841 grotty
    2014 Feb 14 19:40:56 94841 troff
    2014 Feb 14 19:40:56 94842 less
    2014 Feb 14 19:40:58 92489 Google Chrome He
    2014 Feb 14 19:41:03 92489 Google Chrome He
    [...]
    • Shows what programs are launched
    
slide 31:
    DTrace: dtruss
    • dtruss(1m): trace system calls, from one or many processes
    dtruss -en bash
    PID/THRD      ELAPSD SYSCALL(args)                            = return
    475/0x1199:    87917 read(0x0, "a\0", 0x1)                    = 1 0
    475/0x1199:       12 write_nocancel(0x2, "a\0", 0x1)          = 1 0
    475/0x1199:        3 sigprocmask(0x1, 0x0, 0x7FFF55F898E0)    = 0x0 0
    475/0x1199:        2 sigaltstack(0x0, 0x7FFF55F898D0, 0x0)    = 0 0
    475/0x1199:    48163 read(0x0, "t\0", 0x1)                    = 1 0
    475/0x1199:       10 write_nocancel(0x2, "t\0", 0x1)          = 1 0
    475/0x1199:        3 sigprocmask(0x1, 0x0, 0x7FFF55F898E0)    = 0x0 0
    475/0x1199:        2 sigaltstack(0x0, 0x7FFF55F898D0, 0x0)    = 0 0
    475/0x1199:       12 write_nocancel(0x2, "m\0", 0x1)          = 1 0
    475/0x1199:        2 sigprocmask(0x1, 0x0, 0x7FFF55F898E0)    = 0x0 0
    [...]
    • dtruss is a script - edit it to add/modify it as desired
    
slide 32:
    DTrace: sotop
    • sotop: summarize socket I/O by-process, top-style:
    $ sotop
    PROCESS PID READS WRITES READ_KB WRITE_KB CPU
    kernel_task
    firefox
    Terminal
    WindowServer
    SIDPLAY
    Google Chrome H
    Google Chrome H
    clear
    Google Chrome
    [...]
    • Also from the DTrace book.
    
slide 33:
    Instruments
    • Advanced analysis GUI	
    • Includes many
    "Instruments", which
    profile applications
    in different ways:	
    • Data sources include
    DTrace, CPU counters
    
slide 34:
    Instruments
    Thread States
    
slide 35:
    Instruments
    Low Level CPU Counters
    • Performance monitor
    counter (PMC) and
    performance monitor
    interrupts can be
    instrumented	
    • Hard work, but can be
    used to understand
    bus and interconnect
    activity
    
slide 36:
    Observability So Far...
    [The Darwin operating system and hardware functional diagram from slide 2,
    annotated with the tools covered so far: dtruss, netstat, sotop, execsnoop,
    Activity Monitor, Instruments, atMonitor, dtrace, top, tcpdump, hfsslower,
    iosnoop, iostat, vm_stat, Temperature Monitor.]
    
slide 37:
    Tools Method in Practice
    • Tools Method provides reasonable coverage	
    • Some observability gaps, some uneven coverage	
    • Can improve coverage by adding more tools: ps, ping, traceroute,
    latency, df, sysctl, plockstat, opensnoop, dispqlen.d, runocc.d, nfsstat,
    iopending, soconnect_mac.d, httpdstat.d, sc_usage, fs_usage, ...	
    • I could keep covering tools for the rest of this talk...
    
slide 38:
    [The Darwin operating system and hardware functional diagram from slide 2,
    annotated with more tools: soconnect_mac.d, soaccept_mac.d, dtruss, sc_usage,
    errinfo, kill.d, plockstat, dapptrace, httpdstat.d, top, ps, opensnoop,
    netstat, sotop, execsnoop, Activity Monitor, Instruments, atMonitor, dtrace,
    tcpdump, fs_usage, hfsslower.d, df, nfsstat, iostat, iosnoop, iopending,
    maclife.d, macvfssnoop.d, dispqlen.d, runocc.d, latency, priclass.d,
    pridist.d, vm_stat, bitesize.d, seeksize.d, Temperature Monitor, ping,
    traceroute.]
    • Most DTrace scripts are in /usr/bin. Some are from my DTrace book
    and are available online
    • Custom Instruments using CPU counters/interrupts can be added for
    bus observability
    
slide 39:
    The Focus on Tools
    • Useful, however, learning tools & metrics becomes laborious.	
    • Still limited by what the tools provide, or provide easily.	
    • You can try to approach this in a different way...
    
slide 40:
    Instead of starting with the tools, start with the questions
    
slide 41:
    The USE Method
    
slide 42:
    The USE Method
    • For every resource, check:	
    • 1. Utilization	
    • 2. Saturation	
    • 3. Errors
    
slide 43:
    The USE Method
    • For every resource, check:	
    • 1. Utilization: time resource was busy, or degree used	
    • 2. Saturation: degree of queued extra work	
    • 3. Errors: any errors
    
slide 44:
    Queueing System
    • If it helps, consider
    all resources as a
    queueing system:	
    • Also check errors
    [Queueing system diagram, labeled: Saturation, Utilization, Errors]
    
slide 45:
    Hardware Resources
    • CPUs	
    • Main Memory	
    • Network Interfaces	
    • Storage Devices	
    • Controllers, Interconnects	
    • Find the functional diagram and examine every item in the data path...
    
slide 46:
    Hardware
    Functional Diagram
    • For each check:	
    • 1. Utilization	
    • 2. Saturation	
    • 3. Errors
    [Hardware functional diagram: CPU, GPU, FSB, Northbridge, Memory Bus, DRAM,
    DMI, Southbridge, Device Interconnect (PCIe/USB), I/O Controller, Disk, Disk,
    Interface Transports, Network Controller, Port, Port, Other Devices.]
    
slide 47:
    USE Method Checklists
    • Build a checklist for all combinations, identifying tools/metrics to use
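
One way to sketch the "all combinations" step: pair every resource with each of the three USE types and leave the metric blank to be filled in later. The resource names come from the Hardware Resources slide; the function name and the dictionary layout are assumptions made for illustration.

```python
from itertools import product

# Resource names from the "Hardware Resources" slide.
RESOURCES = [
    "CPUs",
    "Main Memory",
    "Network Interfaces",
    "Storage Devices",
    "Controllers, Interconnects",
]
USE_TYPES = ["Utilization", "Saturation", "Errors"]


def build_checklist(resources, use_types=USE_TYPES):
    """Return one blank checklist row per (resource, type) combination."""
    return [
        {"resource": resource, "type": use_type, "metric": None}
        for resource, use_type in product(resources, use_types)
    ]


checklist = build_checklist(RESOURCES)
for row in checklist:
    print("%-28s %-12s %s" % (row["resource"], row["type"], row["metric"] or "?"))
```

Rows whose metric stays blank are the observability gaps: questions that no current tool answers easily.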
    
slide 48:
    OS X Checklist
    Resource  Type         Metric
    CPU       Utilization
    CPU       Saturation
    CPU       Errors
    
slide 49:
    OS X Checklist
    Resource  Type         Metric
    CPU       Utilization  system-wide: iostat 1, "us" + "sy"; per-cpu: DTrace [1];
                           Activity Monitor → CPU Usage or Floating CPU Window;
                           per-process: top -o cpu, "%CPU"; Activity Monitor →
                           Activity Monitor, "%CPU"; ...
    CPU       Saturation   system-wide: uptime, "load averages" > CPU count; latency,
                           "SCHEDULER" and "INTERRUPTS"; per-cpu: dispqlen.d (DTT),
                           non-zero "value"; runocc.d (DTT), non-zero "%runocc";
                           per-process: Instruments → Thread States, "On run queue";
                           DTrace [2]
    CPU       Errors       dmesg; /var/log/system.log; Instruments → Counters, for PMC
                           and whatever error counters are supported (eg, thermal
                           throttling)
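
As a rough sketch of the system-wide utilization check ("us" + "sy" from iostat), assuming each iostat row ends with the us, sy, id columns followed by three load averages, as in the iostat example on slide 21. The function name is hypothetical.

```python
def cpu_utilization_percent(iostat_row):
    """Return "us" + "sy" from one row of iostat output.

    Assumes the last six fields are: us, sy, id, then three load averages.
    Any per-disk columns before them are ignored.
    """
    fields = iostat_row.split()
    if len(fields) < 6:
        raise ValueError("expected at least six fields: %r" % iostat_row)
    user, system = int(fields[-6]), int(fields[-5])
    return user + system


# CPU and load average columns from the iostat example on slide 21.
print(cpu_utilization_percent("5 2 92 0.94 1.01 0.99"))
```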
    
slide 50:
    OS X Checklist
    
slide 51:
    OS X Checklist, cont.
    Resource         Type         Metric
    Memory Capacity  Utilization
    Memory Capacity  Saturation
    Memory Capacity  Errors
    
slide 52:
    OS X Checklist, cont.
    Resource         Type         Metric
    Memory Capacity  Utilization  system-wide: vm_stat 1, main memory free = "free" +
                                  "inactive", in units of pages; Activity Monitor →
                                  Activity Monitor → System Memory, "Free" for main
                                  memory; per-process: top -o rsize, "RSIZE" is resident
                                  main memory size, "VSIZE" is virtual memory size;
                                  ps -alx, "RSS" is resident set size, "SZ" is virtual
                                  memory size; aux similar (legacy format)
    Memory Capacity  Saturation   system-wide: vm_stat 1, "pageout"; per-process:
                                  anonpgpid.d (DTT), DTrace vminfo:::anonpgin [3]
                                  (frequent anonpgin == pain); Instruments → Memory
                                  Monitor, high rate of "Page Ins" and "Page Outs";
                                  sysctl vm.memory_pressure [4]
    Memory Capacity  Errors       System Information → Hardware → Memory, "Status" for
                                  physical failures; DTrace failed malloc()s
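
The memory utilization formula above (free main memory = "free" + "inactive", in pages) can be sketched as follows. The page counts are taken from the first row of the vm_stat example on slide 20; the function names are assumptions, and values that vm_stat prints with a K or M suffix are not handled.

```python
import re


def page_size_bytes(vm_stat_header):
    """Extract the page size from the first line of vm_stat output."""
    match = re.search(r"page size of (\d+) bytes", vm_stat_header)
    if match is None:
        raise ValueError("page size not found in: %r" % vm_stat_header)
    return int(match.group(1))


def free_main_memory_bytes(free_pages, inactive_pages, page_size):
    """Main memory free = "free" + "inactive" pages, converted to bytes."""
    return (free_pages + inactive_pages) * page_size


header = "Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%)"
free_bytes = free_main_memory_bytes(101297, 1509998, page_size_bytes(header))
print("%.2f GiB free" % (free_bytes / 2**30))
```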
    
slide 53:
    OS X Checklist, cont.
    
slide 54:
    OS X Checklist, cont.
    • Full list: http://www.brendangregg.com/USEmethod/use-macosx.html	
    • Includes references from earlier tables
    
slide 55:
    Software Resources
    • Can be studied using USE metrics as well, if possible	
    • OS X Checklist includes some example software resources:	
    • Processes, file descriptors, kernel mutexes, user-level mutexes
    
slide 56:
    Mutex Lock
    • Can you think of what these could mean for a mutex lock?:	
    • Utilization	
    • Saturation	
    • Errors
    
slide 57:
    Mutex Lock
    • Can you think of what these could mean for a mutex lock?:	
    • Utilization: held time per second	
    • Saturation: measure of contention time or waiters	
    • Errors: EDEADLK, EINVAL
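
A small sketch of the first two definitions, assuming hold and wait intervals have already been collected as (start, end) timestamps in seconds by some tracing tool. The function names and the sample intervals are hypothetical.

```python
def _time_in_window(intervals, window_start, window_end):
    """Sum the parts of each (start, end) interval that fall inside the window."""
    total = 0.0
    for start, end in intervals:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            total += clipped_end - clipped_start
    return total


def mutex_utilization(hold_intervals, window_start, window_end):
    """Utilization: fraction of the window during which the lock was held."""
    held = _time_in_window(hold_intervals, window_start, window_end)
    return held / (window_end - window_start)


def mutex_average_waiters(wait_intervals, window_start, window_end):
    """Saturation: total contention time per second (average waiter count)."""
    waited = _time_in_window(wait_intervals, window_start, window_end)
    return waited / (window_end - window_start)


# Hypothetical one-second window: two holds, and two threads waiting on the lock.
holds = [(0.0, 0.25), (0.5, 0.75)]
waits = [(0.0, 0.5), (0.25, 0.75)]
print(mutex_utilization(holds, 0.0, 1.0), mutex_average_waiters(waits, 0.0, 1.0))
```

Hold intervals for a single mutex cannot overlap, so utilization stays at or below 1.0, while wait intervals can overlap, so the waiter average can exceed it.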
    
slide 58:
    Future Work
    
slide 59:
    Future Work
    • Tools/Metrics for USE Method	
    • More methodologies, and then tools
    
slide 60:
    USE Method Tools
    • Tools can be developed to fetch USE metrics more easily	
    • Especially for busses and interconnects	
    • Would love to see USE metrics in Activity Monitor
    
slide 61:
    USE Method New Uses
    • Can be applied to new areas, developing new metrics	
    • May not always work, but worth trying	
    • Find a functional diagram of your system, application, or environment,
    and look for U.S.E. metrics for each component
    
slide 62:
    USE Metrics for all of:
    [Functional diagram of the Darwin operating system and hardware.
    Software: Applications, System Libraries, System Call Interface, XNU Kernel
    (BSD, OSFMK), VFS, HFS+/..., Block Devices, Sockets, TCP/UDP, Ethernet,
    Scheduler, Virtual Memory, I/O Kit, Device Drivers.
    Hardware: CPU, GPU, FSB, Northbridge, Memory Bus, DRAM, DMI, Southbridge,
    Device Interconnect (PCIe/USB), I/O Controller, Disk, Disk, Interface
    Transports, Network Controller, Port, Port, Other Devices.]
    
slide 63:
    Stranger Example: TCP
    $ netstat -s
    tcp:
    80444499 packets sent
    28706719 data packets (3613656050 bytes)
    76599 data packets (65712152 bytes) retransmitted
    68 resends initiated by MTU discovery
    41687640 ack-only packets (248964 delayed)
    0 URG only packets
    0 window probe packets
    9286129 window update packets
    707685 control packets
    0 data packets sent after flow control
    177149270 packets received
    16296459 acks (for 3602941580 bytes)
    556237 duplicate acks
    0 acks for unsent data
    154775303 packets (1214952475 bytes) received in-sequence
    200501 completely duplicate packets (151553377 bytes)
    1884 old duplicate packets
    79 packets with some dup. data (17270 bytes duped)
    6102493 out-of-order packets (4236017281 bytes)
    67 packets (0 bytes) of data after window
    0 window probes
    14180 window update packets
    72825 packets received after close
    85 bad resets
    0 discarded for bad checksums
    0 discarded for bad header offset fields
    0 discarded because packet too short
    378961 connection requests
    613 connection accepts
    37 bad connection attempts
    0 listen queue overflows
    332688 connections established (including accepts)
    381180 connections closed (including 13038 drops)
    14527 connections updated cached RTT on close
    14527 connections updated cached RTT variance on close
    5495 connections updated cached ssthresh on close
    1721 embryonic connections dropped
    16204052 segments updated rtt (of 8674926 attempts)
    374184 retransmit timeouts
    4465 connections dropped by rexmit timeout
    0 connections dropped after retransmitting FIN
    91 persist timeouts
    0 connections dropped by persist timeout
    12784 keepalive timeouts
    262 keepalive probes sent
    1214 connections dropped by keepalive
    1312411 correct ACK header predictions
    152849516 correct data packet header predictions
    17244 SACK recovery episodes
    21329 segment rexmits in SACK recovery episodes
    25852298 byte rexmits in SACK recovery episodes
    180630 SACK options (SACK blocks) received
    5682514 SACK options (SACK blocks) sent
    0 SACK scoreboard overflow
    [...]
    • "netstat -s" output has over 50 metrics for TCP	
    • Do you understand them all?	
    • Could USE metrics provide a high level summary, treating TCP as a
    software resource? (might be a stretch)
    
slide 64:
    USE Method: TCP
    • TCP as a software resource metrics:	
    • Utilization	
    • Saturation	
    • Errors
    
slide 65:
    USE Method: TCP
    • TCP as a software resource metrics:	
    • Utilization: time data was buffered per second	
    • Saturation: listen queue overflows	
    • Errors: bad connection attempts, bad resets, bad checksums, ...	
    • I think I'd classify retransmits and duplicates as errors.
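
A sketch of how the saturation and error counters above might be pulled out of "netstat -s" text. The sample lines are copied from slide 63; the metric names, regular expressions, and function name are assumptions. Utilization (time data was buffered per second) is not in this output, so it is left out.

```python
import re

# Hypothetical mapping from USE categories to counters printed by "netstat -s".
TCP_USE_PATTERNS = {
    "saturation: listen queue overflows": r"(\d+) listen queue overflows",
    "errors: bad connection attempts": r"(\d+) bad connection attempts",
    "errors: bad resets": r"(\d+) bad resets",
    "errors: bad checksums": r"(\d+) discarded for bad checksums",
    "errors: retransmitted data packets":
        r"(\d+) data packets \(\d+ bytes\) retransmitted",
}


def tcp_use_summary(netstat_s_output):
    """Return a dict of USE-style TCP counters (None if a counter is absent)."""
    summary = {}
    for name, pattern in TCP_USE_PATTERNS.items():
        match = re.search(pattern, netstat_s_output)
        summary[name] = int(match.group(1)) if match else None
    return summary


sample = """tcp:
    28706719 data packets (3613656050 bytes)
    76599 data packets (65712152 bytes) retransmitted
    85 bad resets
    0 discarded for bad checksums
    37 bad connection attempts
    0 listen queue overflows
"""
for name, value in tcp_use_summary(sample).items():
    print(name, "=", value)
```

These counters are totals since boot, so a real check would sample twice and compare the difference.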
    
slide 66:
    Other Methodologies
    • Other methodologies include:	
    • Drill Down Analysis Method	
    • Workload Characterization	
    • Thread State Analysis (TSA) Method	
    • These too can pose questions that tools then answer
    
slide 67:
    References
    • http://www.brendangregg.com/USEmethod/use-macosx.html	
    • http://www.brendangregg.com/usemethod.html	
    • http://dtracebook.com - has DTrace book scripts online	
    • http://dtrace.org/blogs/brendan/2011/10/10/top-10-dtrace-scripts-for-mac-os-x/	
    • http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/ - utilization heat maps	
    • http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html - flame graphs
    
slide 68:
    Thanks
    • http://www.brendangregg.com	
    • bgregg@netflix.com	
    • @brendangregg