MacIT2014_Analyzing_OSX_Perf_USE

MacIT 2014: Analyzing OS X Systems Performance with the USE Method

Talk for MacIT 2014, by Brendan Gregg.

Description: "This talk is about systems performance on OS X, and introduces the USE Method to check for common performance bottlenecks and errors. This methodology can be used by beginners and experts alike, and begins by constructing a checklist of the questions we’d like to ask of the system, before reaching for tools to answer them. The focus is resources: CPUs, GPUs, memory capacity, network interfaces, storage devices, controllers, interconnects, as well as some software resources such as mutex locks. These areas are investigated by a wide variety of tools, including vm_stat, iostat, netstat, top, latency, the DTrace scripts in /usr/bin (which were written by Brendan), custom DTrace scripts, Instruments, and more. This is a tour of the tools needed to solve our performance needs, rather than understanding tools just because they exist. This talk will make you aware of many areas of OS X that you can investigate, which will be especially useful for the time when you need to get to the bottom of a performance issue."


PDF: MacIT2014_Analyzing_OSX_Perf_USE_Method.pdf

Keywords (from pdftotext):

slide 1:

Analyzing
OS X Systems Performance
with the
USE Method
Brendan Gregg, Senior Performance Architect, Netflix	
March, 2014

slide 2:

Find the Bottleneck
[Diagram: functional block diagram of the system]
Darwin Operating System: Applications; System Libraries; System Call Interface; XNU Kernel (BSD, OSFMK); VFS; HFS+/...; Block Devices; Sockets; TCP/UDP; Ethernet; Scheduler; Virtual Memory; I/O Kit; Device Drivers
Hardware: CPU; GPU; FSB; Northbridge; Memory Bus; DRAM; DMI; Southbridge; Device Interconnect (PCIe/USB); I/O Controller; Disk; Disk; Network Controller; Port; Port; Interface Transports; Other Devices

slide 3:

This Talk
• Summarizes casual to serious performance analysis of OS X	
• From the systems perspective, not the application	
• Many application issues can be found easily this way	
• Covering not just current tools, but suggestions for future work	
• May change how you think about performance!

slide 4:

whoami
• Senior Performance Architect at Netflix	
• Primary author of the DTrace book	
• Wrote many DTrace scripts included with OS X.
Eg: dtruss, iosnoop, iotop, opensnoop, execsnoop,
procsystime, bitesize.d, seeksize.d, setuids.d, etc...	
• These were ported and enhanced by Apple engineering (thanks!)	
• Created the USE method and USE method checklist for OS X

slide 5:

Agenda
• The Tools Method	
• The USE Method	
• Future work

slide 6:

The Tools Method

slide 7:

The Tools Method
• A tool-based performance analysis approach, commonly followed
today. For reference, I've called it the "Tools Method".	
• 1. List available performance tools	
• 2. For each tool, list its useful metrics	
• 3. For each metric, list possible interpretations	
• Simple, useful, but analysis is limited to what the tools provide easily

slide 8:

Tool Examples
• Activity Monitor	
• atMonitor, Temperature Monitor Lite	
• Command Line	
• DTrace	
• Instruments

slide 9:

Activity Monitor
• High level process and
system summaries. A GUI
version of top(1)	
• Table shows processes
by %CPU, memory	
• CPU load over time	
• Quit, info, and system
diagnosis buttons

slide 10:

Activity Monitor
Network
• Quick way to see current
and recent network
throughput	
• Like the CPU summary,
shows aggregate device
stats, and not per-device

slide 11:

Activity Monitor
CPU Usage
• Per-CPU utilization from previous 0.5 - 5 seconds (tunable)	
• Handy to leave running. Look for single hot CPUs/threads

slide 12:

Activity Monitor
Floating CPU Window
• Earlier OS X also had a compact version (gone in Mavericks)	
• Was nice, but what I really want is a compact visualization for both
per-CPU and historical data

slide 13:

Activity Monitor
CPU/Disk Suggestion
• Could show both per-device and history using a utilization heat map:
• http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/

slide 14:

Activity Monitor
Sample Process
• The cog button ("System
diagnostics options") has a
"Sample process" option for
profiling CPU code paths	
• Explains %CPU usage	
• Although output usually very
long and time consuming to
read (see scroll bar):

slide 15:

Activity Monitor
Flame Graphs ?
• Suggestion: include a
Flame Graph view	
• Visualizes entire
profile output in
one screen	
• http://github.com/
brendangregg/
FlameGraph

slide 16:

atMonitor
• 3rd party app. Version 2.7b crashes for me if "Top Window" is visible.	
• Shows many useful metrics: per-CPU, RAM, GPU, per-disk, and per-network interface utilization percentages with histories.	
• Currently the easiest way to see GPU, disk, and network utilization. 	
• Utilization is easy to interpret. I/O per second is not.

slide 17:

Temperature Monitor Lite
• Another 3rd party application	
• Easy way to infer GPU utilization	
• Normal:	
• Video:

slide 18:

Command Line
• Accessed via the Terminal application	
• Numerous performance tools available, from UNIX/BSD/OSX	
• Eg, the uptime(1) command shows recent and historic CPU load:
$ uptime
14:36 up 43 days, 2:39, 30 users, load averages: 0.72 1.02 1.29
• These numbers are the 1, 5, and 15 minute load averages. The minute values are
really constants in an exponential decay moving sum.
• Interpret: if average > number of CPUs, then CPUs are overloaded
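
The interpretation rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the talk: the function names are made up for this example, and the CPU count of 8 used in the usage note is an assumption (the slide does not say how many CPUs the system had).

```python
import os

def load_vs_cpus(load_averages, cpu_count):
    """Return the 1, 5, and 15 minute load averages divided by the CPU count.

    A ratio above 1.0 suggests more runnable work than CPUs.
    """
    return tuple(avg / cpu_count for avg in load_averages)

def is_overloaded(load_averages, cpu_count):
    """Apply the rule of thumb to the 1 minute average only."""
    return load_averages[0] > cpu_count

if __name__ == "__main__":
    loads = os.getloadavg()      # the same three numbers uptime(1) prints
    cpus = os.cpu_count() or 1   # cpu_count() can return None
    print(loads, cpus, is_overloaded(loads, cpus))
```

With the uptime output from this slide, `is_overloaded((0.72, 1.02, 1.29), 8)` would be false on a hypothetical 8-CPU system.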

slide 19:

Command Line: top
• top(1): high level process and system summary:
$ top -o cpu
Processes: 272 total, 4 running, 268 sleeping, 1546 threads            14:47:36
Load Avg: 1.14, 0.75, 0.95 CPU usage: 13.95% user, 2.78% sys, 83.26% idle
SharedLibs: 12M resident, 5112K data, 0B linkedit.
MemRegions: 339218 total, 6689M resident, 184M private, 2153M shared.
PhysMem: 3429M wired, 6502M active, 5910M inactive, 15G used, 537M free.
VM: 552G vsize, 1052M framework vsize, 111312590(1) pageins, 1437348(0) pageouts
Networks: packets: 120030109/127G in, 70582570/38G out.
Disks: 22089197/1050G read, 26756359/1163G written.
hey...
PID    COMMAND      %CPU  TIME     #TH  #WQ #PORT #MREGS  RPRVT  RSHRD  RSIZE
       bash         100.0 47:42.28 1/1                    236K   816K   760K
94370  top          17.2  00:03.77 1/1                    4368K  216K   5116K
52617  firefox            47:30:58 45/1     576-  177307+ 1984M+ 200M   2530M+
92489- Google Chrom 2.2   13:31.85 34                     273M   271M   734M
[...]

slide 20:

Command Line: vm_stat
• vm_stat(1): virtual memory statistics, including free memory, paging
$ vm_stat 1
Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%)
    free active   spec inactive   wire  faults     copy   0fill reactive pageins pageout
  101297  1662K  29920  1509998 888520  17650M  106072K  15926M  6833792 111312K
  100919  1658K  29920  1509998 893230
  101183  1658K  29918  1509998 893169
  100517  1658K  29921  1509998 893354
   96590  1657K  29923  1514414 894426
   93184  1662K  28486  1514414 894484
   91224  1663K  28486  1514414 894886
   89195  1649K  29924  1514413 909225
   87550  1636K  29917  1514155 923179
   61596  1644K  28309  1515551 941688
   52932  1669K  28442  1515663 925755
   76395  1681K  28417  1515685 889983
   73520  1679K  28449  1515777 894905
   60335  1684K  29073  1515560 903152
[...]

slide 21:

Command Line: iostat
• iostat(1): block device I/O statistics. Disks, USB drives.
$ iostat 1
          disk0               disk2         cpu     load average
    KB/t tps  MB/s      KB/t tps  MB/s   us sy id    1m   5m  15m
   47.03  13  0.60             0  0.00    5  2 92  0.94 1.01 0.99
  972.42  19 18.02    128.00 141 17.60    2  3 95  0.94 1.01 0.99
  315.60  10  3.08    128.00  24  3.00    6  2 92  0.94 1.01 0.99
           1  0.00             0  0.00    6  2 92  0.94 1.01 0.99
           8  7.99    128.00  69  8.61    6  2 92  0.94 1.01 0.99
 1024.00  18 17.97    128.00 143 17.85    2  2 95  0.86 0.99 0.99
 1024.00  17 16.98    128.00 142 17.72    2  2 96  0.86 0.99 0.99
  165.27 272 43.84    127.13 146 18.10    6  5 89  0.95 1.01 0.99
 1024.00  18 17.98    128.00 143 17.85    2  2 96  0.95 1.01 0.99
[...]
• No percent utilization/busy, like other OSes? Makes it hard to interpret.

slide 22:

Command Line: netstat
• netstat(1): various network statistics. -i for interface stats:
$ netstat -iI en0 1
            input        (en0)           output
   packets  errs      bytes    packets  errs      bytes colls
[...]
• No percent utilization, but can figure it out: throughput / known max
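
The "throughput / known max" calculation can be sketched as follows. This is an illustration only: the function name, the one-second interval, and the 1 Gbit/s link speed are assumptions for the example, and the byte count would come from the "bytes" columns that netstat -i prints per interval.

```python
def interface_utilization(bytes_per_interval, interval_seconds, link_bits_per_second):
    """Percent utilization of one direction (input or output) of an interface."""
    bits_per_second = bytes_per_interval * 8 / interval_seconds
    return 100.0 * bits_per_second / link_bits_per_second

# Hypothetical numbers: 12,500,000 bytes received in 1 second on a 1 Gbit/s link
print(interface_utilization(12_500_000, 1, 1_000_000_000))
```

Input and output are calculated separately here, since a full-duplex link can carry its rated speed in each direction.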

slide 23:

Command Line: tcpdump
• tcpdump(1): sniff and examine network packets:
$ tcpdump -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 65535 bytes
18:00:55.228744 IP 10.0.1.92.53 > 10.0.1.148.49228: 26359 1/0/0 A 69.192.253.15 (81)
18:00:55.311056 ARP, Reply 10.0.1.162 is-at 2c:54:2d:a4:25:4c, length 28
18:00:55.342793 IP 74.125.28.189.443 > 10.0.1.148.62998: Flags [P.], seq
3544891232:3544891287, ack 3832081572, win 661, options [nop,nop,TS val 2936982235 ecr
2331923799], length 55
18:00:55.342933 IP 10.0.1.148.62998 > 74.125.28.189.443: Flags [.], ack 55, win 8188,
options [nop,nop,TS val 2331932237 ecr 2936982235], length 0
18:00:56.477029 IP 10.0.1.148.50359 > 67.195.141.201.443: Flags [P.], seq
696365506:696365533, ack 1903095540, win 16384, length 27
18:00:56.477158 IP 10.0.1.148.50359 > 67.195.141.201.443: Flags [F.], seq 27, ack 1, win
16384, length 0
[...]
• Also dump to a file and examine later. Does incur overhead.

slide 24:

Observability So Far...
• We can see all the things!	
• Not really...

slide 25:

Observability So Far...
[Diagram: the functional block diagram from slide 2 (Darwin Operating System and Hardware), annotated with the tools covered so far: Activity Monitor, atMonitor, Temperature Monitor, top, vm_stat, iostat, netstat, tcpdump]

slide 26:

DTrace
• Programmable, real-time, dynamic
and static tracing	
• Write your own one-liners and
scripts, or use other people's;
including those in /usr/bin	
• There is a great book about it...

slide 27:

DTrace: Scripts
• Over 40 DTrace scripts are shipped with OS X (which I mostly wrote
originally). Listing them:
$ man -k dtrace
bitesize.d(1m)     - analyse disk I/O size by process. Uses DTrace
cpuwalk.d(1m)      - Measure which CPUs a process runs on. Uses DTrace
creatbyproc.d(1m)  - snoop creat()s by process name. Uses DTrace
dappprof(1m)       - profile user and lib function usage. Uses DTrace
dapptrace(1m)      - trace user and library function usage. Uses DTrace
diskhits(1m)       - disk access by file offset. Uses DTrace
dispqlen.d(1m)     - dispatcher queue length by CPU. Uses DTrace
dtrace(1)          - generic front-end to the DTrace facility
dtruss(1m)         - process syscall details. Uses DTrace
errinfo(1m)        - print errno for syscall fails. Uses DTrace
execsnoop(1m)      - snoop new process execution. Uses DTrace
[...]

slide 28:

DTrace: iosnoop
• iosnoop(1m): trace block device I/O
$ iosnoop
UID PID D BLOCK SIZE COMM PATHNAME
176 R 148471184 8192 SystemUIServer ??/vm/swapfile10
176 R 835310312 4096 SystemUIServer ??/vm/swapfile4
503 92489 W 746204600 61440 Google Chrome ??/Chrome/.com.google.Chrome.hw1Inp
503 92489 W 746204720 23472 Google Chrome ??/Default/.com.google.Chrome.76k4tG
19 W 425711304 4096 syslogd ??/DiagnosticMessages/2014.02.14.asl
19 W 57246896 syslogd ??/DiagnosticMessages/StoreData
19 W 425710304 4096 syslogd ??/DiagnosticMessages/2014.02.14.asl
503 52617 W 214894232 4096 firefox ??/iw4rbel9.default/_CACHE_CLEAN_
19 W 57246896 syslogd ??/DiagnosticMessages/StoreData
19 W 425710304 4096 syslogd ??/DiagnosticMessages/2014.02.14.asl
[...]
• Identify processes and files causing disk I/O

slide 29:

DTrace: hfsslower.d
• hfsslower.d: trace HFS calls slower than a threshold. Eg, 10 ms:
$ ~/dtbook_scripts/Chap5/hfsslower.d 10
TIME                 PROCESS          D   KB     ms FILE
2014 Feb 14 17:35:59 Terminal         R 5751     16 data.data
2014 Feb 14 17:35:59 Terminal         R 6166     17 data.data
2014 Feb 14 17:35:59 Terminal         W 11921    15 data.data
[...]
• Traces all application I/O to the file system, not just disk I/O	
• Script is on http://www.dtracebook.com

slide 30:

DTrace: execsnoop
• execsnoop(1m): trace process execution
$ execsnoop -v
STRTIME UID PID PPID ARGS
2014 Feb 14 19:40:55 551 man
2014 Feb 14 19:40:55 551 man
2014 Feb 14 19:40:55 94837 groff
2014 Feb 14 19:40:55 94837 tbl
2014 Feb 14 19:40:55 94838 cat
2014 Feb 14 19:40:56 94841 grotty
2014 Feb 14 19:40:56 94841 troff
2014 Feb 14 19:40:56 94842 less
2014 Feb 14 19:40:58 92489 Google Chrome He
2014 Feb 14 19:41:03 92489 Google Chrome He
[...]
• Shows what programs are launched

slide 31:

DTrace: dtruss
• dtruss(1m): trace system calls, from one or many processes
dtruss -en bash
PID/THRD     ELAPSD SYSCALL(args)                           = return
475/0x1199:   87917 read(0x0, "a\0", 0x1)                   = 1 0
475/0x1199:      12 write_nocancel(0x2, "a\0", 0x1)         = 1 0
475/0x1199:       3 sigprocmask(0x1, 0x0, 0x7FFF55F898E0)   = 0x0 0
475/0x1199:       2 sigaltstack(0x0, 0x7FFF55F898D0, 0x0)   = 0 0
475/0x1199:   48163 read(0x0, "t\0", 0x1)                   = 1 0
475/0x1199:      10 write_nocancel(0x2, "t\0", 0x1)         = 1 0
475/0x1199:       3 sigprocmask(0x1, 0x0, 0x7FFF55F898E0)   = 0x0 0
475/0x1199:       2 sigaltstack(0x0, 0x7FFF55F898D0, 0x0)   = 0 0
475/0x1199:      12 write_nocancel(0x2, "m\0", 0x1)         = 1 0
475/0x1199:       2 sigprocmask(0x1, 0x0, 0x7FFF55F898E0)   = 0x0 0
[...]
• dtruss is a script - edit it to add/modify it as desired

slide 32:

DTrace: sotop
• sotop: summarize socket I/O by-process, top-style:
$ sotop
PROCESS          PID  READS  WRITES  READ_KB  WRITE_KB  CPU
kernel_task
firefox
Terminal
WindowServer
SIDPLAY
Google Chrome H
Google Chrome H
clear
Google Chrome
[...]
• Also from the DTrace book.

slide 33:

Instruments
• Advanced analysis GUI	
• Includes many
"Instruments", which
profile applications
in different ways:	
• Data sources include
DTrace, CPU counters

slide 34:

Instruments
Thread States

slide 35:

Instruments
Low Level CPU Counters
• Performance monitor
counter (PMC) and
performance monitor
interrupts can be
instrumented	
• Hard work, but can be
used to understand
bus and interconnect
activity

slide 36:

Observability So Far...
[Diagram: the functional block diagram from slide 2 (Darwin Operating System and Hardware), annotated with the tools covered so far: Activity Monitor, atMonitor, Temperature Monitor, Instruments, dtrace, dtruss, execsnoop, sotop, hfsslower, iosnoop, top, vm_stat, iostat, netstat, tcpdump]

slide 37:

Tools Method in Practice
• Tools Method provides reasonable coverage	
• Some observability gaps, some uneven coverage	
• Can improve coverage by adding more tools: ps, ping, traceroute,
latency, df, sysctl, plockstat, opensnoop, dispqlen.d, runocc.d, nfsstat,
iopending, soconnect_mac.d, httpdstat.d, sc_usage, fs_usage, ...	
• I could keep covering tools for the rest of this talk...

slide 38:

[Diagram: the functional block diagram from slide 2 (Darwin Operating System and Hardware), annotated with a fuller set of tools: Activity Monitor, atMonitor, Temperature Monitor, Instruments, dtrace, top, ps, dtruss, sc_usage, errinfo, kill.d, plockstat, dapptrace, httpdstat.d, opensnoop, execsnoop, sotop, soconnect_mac.d, soaccept_mac.d, netstat, tcpdump, ping, traceroute, fs_usage, hfsslower.d, maclife.d, macvfssnoop.d, df, nfsstat, iostat, iosnoop, iopending, bitesize.d, seeksize.d, dispqlen.d, runocc.d, latency, priclass.d, pridist.d, vm_stat]
• Most DTrace scripts are in /usr/bin. Some are from my DTrace book and are available online
• Custom Instruments using CPU counters/interrupts can be added for bus observability

slide 39:

The Focus on Tools
• Useful, however, learning tools & metrics becomes laborious.	
• Still limited by what the tools provide, or provide easily.	
• You can try to approach this in a different way...

slide 40:

Instead of starting with the tools, start with the questions

slide 41:

The USE Method

slide 42:

The USE Method
• For every resource, check:	
• 1. Utilization	
• 2. Saturation	
• 3. Errors

slide 43:

The USE Method
• For every resource, check:	
• 1. Utilization: time resource was busy, or degree used	
• 2. Saturation: degree of queued extra work	
• 3. Errors: any errors

slide 44:

Queueing System
• If it helps, consider
all resources as
a queueing system:	
• Also check errors
[Diagram: a queueing system, labeled with Saturation, Errors, and Utilization]

slide 45:

Hardware Resources
• CPUs	
• Main Memory	
• Network Interfaces	
• Storage Devices	
• Controllers, Interconnects	
• Find the functional diagram and examine every item in the data path...

slide 46:

Hardware
Functional Diagram
• For each check:	
• 1. Utilization	
• 2. Saturation	
• 3. Errors
[Diagram: hardware functional diagram: CPU; GPU; FSB; Northbridge; Memory Bus; DRAM; DMI; Southbridge; Device Interconnect (PCIe/USB); I/O Controller; Disk; Disk; Network Controller; Port; Port; Interface Transports; Other Devices]

slide 47:

USE Method Checklists
• Build a checklist for all combinations, identifying tools/metrics to use
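
As a rough sketch of "all combinations", the checklist skeleton can be generated from the resource list on the Hardware Resources slide and the three USE types. The dictionary layout and function name below are assumptions made for illustration; the metric column starts empty because finding the tool or metric for each row is the actual work.

```python
from itertools import product

# Resource names from the "Hardware Resources" slide
RESOURCES = [
    "CPUs",
    "Main Memory",
    "Network Interfaces",
    "Storage Devices",
    "Controllers, Interconnects",
]
TYPES = ["Utilization", "Saturation", "Errors"]

def build_checklist(resources, types=TYPES):
    """Return one row per (resource, type) pair, with the metric still to be filled in."""
    return [
        {"resource": resource, "type": use_type, "metric": None}
        for resource, use_type in product(resources, types)
    ]

checklist = build_checklist(RESOURCES)
for row in checklist:
    print(f'{row["resource"]:<28}{row["type"]:<13}{row["metric"] or "?"}')
```

Rows that stay at "?" are themselves useful: they mark observability gaps.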

slide 48:

OS X Checklist
Resource  Type         Metric
CPU       Utilization
CPU       Saturation
CPU       Errors

slide 49:

OS X Checklist
Resource  Type         Metric
CPU       Utilization  system-wide: iostat 1, "us" + "sy"; per-cpu: DTrace [1]; Activity
                       Monitor → CPU Usage or Floating CPU Window; per-process: top
                       -o cpu, "%CPU"; Activity Monitor → Activity Monitor, "%CPU"; ...
CPU       Saturation   system-wide: uptime, "load averages" > CPU count; latency,
                       "SCHEDULER" and "INTERRUPTS"; per-cpu: dispqlen.d (DTT),
                       non-zero "value"; runocc.d (DTT), non-zero "%runocc";
                       per-process: Instruments → Thread States, "On run queue"; DTrace [2]
CPU       Errors       dmesg; /var/log/system.log; Instruments → Counters, for PMC and
                       whatever error counters are supported (eg, thermal throttling)
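
The system-wide CPU utilization check (iostat 1, "us" + "sy") can be sketched as below. This assumes each iostat data line ends with the six fields us, sy, id and the three load averages, as in the iostat output shown earlier in the talk; the function name and the sample line are illustrative only.

```python
def cpu_utilization_from_iostat(line):
    """Return system-wide CPU utilization (%) as "us" + "sy" from one iostat data line.

    Assumes the line ends with the six fields: us sy id 1m 5m 15m.
    """
    fields = line.split()
    us, sy, _idle = (int(value) for value in fields[-6:-3])
    return us + sy

# Illustrative line: two disks (KB/t tps MB/s each), then cpu and load averages
sample = "972.42 19 18.02 128.00 141 17.60 2 3 95 0.94 1.01 0.99"
print(cpu_utilization_from_iostat(sample))
```

Counting fields from the end of the line avoids depending on how many disks iostat is reporting.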

slide 51:

OS X Checklist, cont.
Resource         Type         Metric
Memory Capacity  Utilization
Memory Capacity  Saturation
Memory Capacity  Errors

slide 52:

OS X Checklist, cont.
Resource         Type         Metric
Memory Capacity  Utilization  system-wide: vm_stat 1, main memory free = "free" + "inactive", in
                              units of pages; Activity Monitor → Activity Monitor → System
                              Memory, "Free" for main memory; per-process: top -o rsize,
                              "RSIZE" is resident main memory size, "VSIZE" is virtual memory
                              size; ps -alx, "RSS" is resident set size, "SZ" is virtual memory size;
                              ps aux similar (legacy format)
Memory Capacity  Saturation   system-wide: vm_stat 1, "pageout"; per-process: anonpgpid.d
                              (DTT), DTrace vminfo:::anonpgin [3] (frequent anonpgin == pain);
                              Instruments → Memory Monitor, high rate of "Page Ins" and "Page
                              Outs"; sysctl vm.memory_pressure [4]
Memory Capacity  Errors       System Information → Hardware → Memory, "Status" for physical
                              failures; DTrace failed malloc()s
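
The system-wide memory utilization metric is in units of pages, so it needs converting before it means much. A minimal sketch of that arithmetic, using the first row and the 4096 byte page size from the vm_stat slide; the function name is made up for this example.

```python
def main_memory_free_bytes(free_pages, inactive_pages, page_size=4096):
    """Main memory free = "free" + "inactive" pages, converted to bytes."""
    return (free_pages + inactive_pages) * page_size

# First row of the vm_stat 1 output shown earlier: free=101297, inactive=1509998
free_bytes = main_memory_free_bytes(101297, 1509998)
print(f"{free_bytes / 2**30:.2f} GiB free")
```

The page size should be read from the vm_stat header line rather than assumed, since it can differ between systems.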

slide 54:

OS X Checklist, cont.
• Full list: http://www.brendangregg.com/USEmethod/use-macosx.html	
• Includes
references
from earlier
tables

slide 55:

Software Resources
• Can be studied using USE metrics as well, if possible	
• OS X Checklist includes some example software resources:	
• Processes, file descriptors, kernel mutexes, user-level mutexes

slide 56:

Mutex Lock
• Can you think of what these could mean for a mutex lock?:	
• Utilization	
• Saturation	
• Errors

slide 57:

Mutex Lock
• Can you think of what these could mean for a mutex lock?:	
• Utilization: held time per second	
• Saturation: measure of contention time or waiters	
• Errors: EDEADLK, EINVAL
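
To make these definitions concrete, here is a small sketch that turns per-lock measurements into USE metrics. Everything in it is hypothetical: the function name, the input format (hold intervals and wait times in seconds, as might be collected by a tracing tool), and the sample numbers.

```python
def mutex_use_metrics(hold_intervals, wait_times, error_count, window_seconds):
    """Summarize one mutex over an observation window.

    hold_intervals: (acquire_time, release_time) pairs, in seconds
    wait_times: seconds each waiter spent blocked on the lock
    error_count: failed lock calls (eg, EDEADLK, EINVAL)
    """
    held_seconds = sum(end - start for start, end in hold_intervals)
    return {
        "utilization_pct": 100.0 * held_seconds / window_seconds,
        "saturation_wait_seconds": sum(wait_times),
        "errors": error_count,
    }

metrics = mutex_use_metrics(
    hold_intervals=[(0.0, 0.25), (0.5, 0.75)],
    wait_times=[0.125, 0.125],
    error_count=0,
    window_seconds=1.0,
)
print(metrics)
```

In this made-up window the lock was held for half of the second, and waiters spent a total of a quarter of a second blocked on it.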

slide 58:

Future Work

slide 59:

Future Work
• Tools/Metrics for USE Method	
• More methodologies, and then tools

slide 60:

USE Method Tools
• Tools can be developed to fetch USE metrics more easily	
• Especially for busses and interconnects	
• Would love to see USE metrics in Activity Monitor

slide 61:

USE Method New Uses
• Can be applied to new areas, developing new metrics	
• May not always work, but worth trying	
• Find a functional diagram of your system, application, or environment,
and look for U.S.E. metrics for each component

slide 62:

USE Metrics for all of:
[Diagram: the functional block diagram from slide 2, covering every Darwin Operating System and Hardware component, from Applications and the XNU Kernel down to the CPU, GPU, Memory Bus, DRAM, Device Interconnect (PCIe/USB), I/O Controller, Disks, Network Controller, Ports, and Other Devices]

slide 63:

Stranger Example: TCP
$ netstat -s
tcp:
80444499 packets sent
28706719 data packets (3613656050 bytes)
76599 data packets (65712152 bytes) retransmitted
68 resends initiated by MTU discovery
41687640 ack-only packets (248964 delayed)
0 URG only packets
0 window probe packets
9286129 window update packets
707685 control packets
0 data packets sent after flow control
177149270 packets received
16296459 acks (for 3602941580 bytes)
556237 duplicate acks
0 acks for unsent data
154775303 packets (1214952475 bytes) received in-sequence
200501 completely duplicate packets (151553377 bytes)
1884 old duplicate packets
79 packets with some dup. data (17270 bytes duped)
6102493 out-of-order packets (4236017281 bytes)
67 packets (0 bytes) of data after window
0 window probes
14180 window update packets
72825 packets received after close
85 bad resets
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
378961 connection requests
613 connection accepts
37 bad connection attempts
0 listen queue overflows
332688 connections established (including accepts)
381180 connections closed (including 13038 drops)
14527 connections updated cached RTT on close
14527 connections updated cached RTT variance on close
5495 connections updated cached ssthresh on close
1721 embryonic connections dropped
16204052 segments updated rtt (of 8674926 attempts)
374184 retransmit timeouts
4465 connections dropped by rexmit timeout
0 connections dropped after retransmitting FIN
91 persist timeouts
0 connections dropped by persist timeout
12784 keepalive timeouts
262 keepalive probes sent
1214 connections dropped by keepalive
1312411 correct ACK header predictions
152849516 correct data packet header predictions
17244 SACK recovery episodes
21329 segment rexmits in SACK recovery episodes
25852298 byte rexmits in SACK recovery episodes
180630 SACK options (SACK blocks) received
5682514 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
[...]
• "netstat -s" output has over 50 metrics for TCP	
• Do you understand them all?	
• Could USE metrics provide a high level summary, treating TCP as a software resource? (might be a stretch)

slide 64:

USE Method: TCP
• Metrics for TCP as a software resource:	
• Utilization	
• Saturation	
• Errors

slide 65:

USE Method: TCP
• Metrics for TCP as a software resource:	
• Utilization: time data was buffered per second	
• Saturation: listen queue overflows	
• Errors: bad connection attempts, bad resets, bad checksums, ...	
• I think I'd classify retransmits and duplicates as errors.
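
As a sketch of how such a summary might be pulled out of "netstat -s", the following parses counter lines and groups a few of them by USE type. The sample text reuses values from the netstat -s slide; the function names and the grouping are assumptions for illustration. Utilization is left out because netstat -s has no counter for time that data was buffered.

```python
import re

# A few TCP counters in "netstat -s" format, with values from the earlier slide
SAMPLE = """\
tcp:
    37 bad connection attempts
    0 listen queue overflows
    85 bad resets
    0 discarded for bad checksums
    374184 retransmit timeouts
"""

def parse_counters(text):
    """Map each counter description to its integer value."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters

def tcp_use_summary(counters):
    """Group selected counters under the Saturation and Errors headings."""
    error_names = ("bad connection attempts", "bad resets", "discarded for bad checksums")
    return {
        "saturation": {"listen queue overflows": counters.get("listen queue overflows", 0)},
        "errors": {name: counters.get(name, 0) for name in error_names},
    }

summary = tcp_use_summary(parse_counters(SAMPLE))
print(summary)
```

These counters are totals since boot, so in practice it is the change between two samples that matters.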

slide 66:

Other Methodologies
• Other methodologies include:	
• Drill Down Analysis Method	
• Workload Characterization	
• Thread State Analysis (TSA) Method	
• These too can pose questions that tools then answer

slide 67:

References
• http://www.brendangregg.com/USEmethod/use-macosx.html	
• http://www.brendangregg.com/usemethod.html	
• http://dtracebook.com - has DTrace book scripts online	
• http://dtrace.org/blogs/brendan/2011/10/10/top-10-dtrace-scripts-for-mac-os-x/	
• http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/ - utilization heat maps	
• http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html - flame graphs

slide 68:

Thanks
• http://www.brendangregg.com	
• bgregg@netflix.com	
• @brendangregg