DTrace Topics: Introduction
Description: "Introduction to DTrace (Dynamic Tracing), written by Brendan Gregg and delivered in 2007. While aimed at a Solaris-based audience, this introduction is still largely relevant today (2012). Since then, DTrace has appeared in other operating systems (Mac OS X, FreeBSD, and is being ported to Linux), and, many user-level providers have been developed to aid tracing of other languages."| next prev 1/56 | |
| next prev 2/56 | |
| next prev 3/56 | |
| next prev 4/56 | |
| next prev 5/56 | |
| next prev 6/56 | |
| next prev 7/56 | |
| next prev 8/56 | |
| next prev 9/56 | |
| next prev 10/56 | |
| next prev 11/56 | |
| next prev 12/56 | |
| next prev 13/56 | |
| next prev 14/56 | |
| next prev 15/56 | |
| next prev 16/56 | |
| next prev 17/56 | |
| next prev 18/56 | |
| next prev 19/56 | |
| next prev 20/56 | |
| next prev 21/56 | |
| next prev 22/56 | |
| next prev 23/56 | |
| next prev 24/56 | |
| next prev 25/56 | |
| next prev 26/56 | |
| next prev 27/56 | |
| next prev 28/56 | |
| next prev 29/56 | |
| next prev 30/56 | |
| next prev 31/56 | |
| next prev 32/56 | |
| next prev 33/56 | |
| next prev 34/56 | |
| next prev 35/56 | |
| next prev 36/56 | |
| next prev 37/56 | |
| next prev 38/56 | |
| next prev 39/56 | |
| next prev 40/56 | |
| next prev 41/56 | |
| next prev 42/56 | |
| next prev 43/56 | |
| next prev 44/56 | |
| next prev 45/56 | |
| next prev 46/56 | |
| next prev 47/56 | |
| next prev 48/56 | |
| next prev 49/56 | |
| next prev 50/56 | |
| next prev 51/56 | |
| next prev 52/56 | |
| next prev 53/56 | |
| next prev 54/56 | |
| next prev 55/56 | |
| next prev 56/56 |
PDF: dtrace_topics_intro.pdf
Keywords (from pdftotext):
slide 1:
# dtrace -n 'syscall:::entry { @[exe
dtrace: description 'syscall:::entry
DTrace Topics:
Introduction
Brendan Gregg
Sun Microsystems
April 2007
iscsitgtd
nscd
operapluginclean
screen-4.0.2
devfsadm
httpd
sendmail
xload
evince
operapluginwrapp
xclock
xntpd
FvwmIconMan
fmd
FvwmPager
dtrace
gnome-terminal
fvwm2
x64
akd
opera
Xorg
soffice.bin
slide 2:DTrace Topics: Introduction • This presentation is an introduction to DTrace, and is part of the “DTrace Topics” collection. >gt; Difficulty: >gt; Audience: Everyone • These slides cover: >gt; What is DTrace >gt; What is DTrace for >gt; Who uses DTrace >gt; DTrace Essentials >gt; Usage Featuresslide 3:
What is DTrace • DTrace is a dynamic troubleshooting and analysis tool first introduced in the Solaris 10 and OpenSolaris operating systems. • DTrace is many things, in particular: >gt; A tool >gt; A programming language interpreter >gt; An instrumentation framework • DTrace provides observability across the entire software stack from one tool. This allows you to examine software execution like never before.slide 4:
DTrace example #1
• Tracing new processes system-wide,
# dtrace -n 'syscall::exece:return { trace(execname); }'
dtrace: description 'syscall::exece:return ' matched 1 probe
CPU
FUNCTION:NAME
0 76044
exece:return
man
0 76044
exece:return
0 76044
exece:return
neqn
0 76044
exece:return
tbl
0 76044
exece:return
nroff
0 76044
exece:return
col
0 76044
exece:return
0 76044
exece:return
0 76044
exece:return
0 76044
exece:return
more
System calls are only one layer of the software stack.
slide 5:The Entire Software Stack • How did you analyze these? Dynamic Languages Memory allocation Examples: Java, JavaScript, ... User Executable /usr/bin/* Libraries /usr/lib/* Syscall Interface man -s2 Kernel File Systems Device Drivers Scheduler Hardware VFS, DNLC, UFS, ZFS, TCP, IP, ... sd, st, hme, eri, ... disk data controllerslide 6:
The Entire Software Stack • It was possible, but difficult: Dynamic Languages Previously: debuggers User Executable truss -ua.out apptrace, sotruss Libraries truss Syscall Interface Memory allocation Kernel File Systems Device Drivers Scheduler Hardware prex; tnf* lockstat mdb kstat, PICs, guessworkslide 7:
The Entire Software Stack • DTrace is all seeing: DTrace visibility: Yes, with providers Dynamic Languages Memory allocation User Executable Yes Libraries Yes Syscall Interface Yes Kernel File Systems Device Drivers Scheduler Hardware Yes No. Indirectly, yesslide 8:
What DTrace is like • DTrace has the combined capabilities of numerous previous tools and more: Tool Capability truss -ua.out tracing user functions apptrace tracing library calls truss prex; tnf* tracing system calls tracing some kernel functions lockstat profiling the kernel mdb -k accessing kernel VM mdb -p accessing process VM Plus a programming language similar to C and awk.slide 9:
Syscall Example
• Using truss:
Only examine 1 process
$ truss date
execve("/usr/bin/date", 0x08047C9C, 0x08047CA4) argc = 1
resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
resolvepath("/usr/bin/date", "/usr/bin/date", 1023) = 13
xstat(2, "/usr/bin/date", 0x08047A58)
= 0
open("/var/ld/ld.config", O_RDONLY)
= 3
fxstat(2, 3, 0x08047988)
= 0
mmap(0x00000000, 152, PROT_READ, MAP_SHARED, 3, 0) = 0xFEFB0000
close(3)
= 0
mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1
sysconfig(_CONFIG_PAGESIZE)
= 4096
[...]
Output is
limited to
provided
options
truss slows down the target
slide 10:
Syscall Example
• Using DTrace:
You choose the output
# dtrace -n 'syscall:::entry { printf("%16s %x %x", execname, arg0, arg1); }'
dtrace: description 'syscall:::entry ' matched 233 probes
CPU
FUNCTION:NAME
1 75943
read:entry
Xorg f 8047130
1 76211
setitimer:entry
Xorg 0 8047610
1 76143
writev:entry
Xorg 22 80477f8
1 76255
pollsys:entry
Xorg 8046da0 1a
1 75943
read:entry
Xorg 22 85121b0
1 76035
ioctl:entry
soffice.bin 6 5301
1 76035
ioctl:entry
soffice.bin 6 5301
1 76255
pollsys:entry
soffice.bin 8047530 2
[...]
Minimum performance cost
Watch every process
slide 11:What is DTrace for • Troubleshooting software bugs >gt; Proving what the problem is, and isn't. >gt; Measuring the magnitude of the problem. • Detailed observability >gt; Observing devices, such as disk or network activity. >gt; Observing applications, whether they are from Solaris, 3rd party, or in-house. • Capturing profiling data for performance analysis >gt; If there is latency somewhere, DTrace can find itslide 12:
What isn't DTrace • DTrace isn't a replacement for kstat or SMNP >gt; kstat already provides inexpensive long term monitoring. • DTrace isn't sentient, it needs to borrow your brain to do the thinking • DTrace isn't “dTrace”slide 13:
Who is DTrace for • Application Developers >gt; Fetch in-flight profiling data without restarting the apps, even on customer production servers. >gt; Detailed visibility of all the functions that they wrote, and the rest of the software stack. >gt; Add static probes as a stable debug interface. • Application Support >gt; Provides a comprehensive insight into application behavior. >gt; Analyze faults and root-cause performance issues. >gt; Prove where issues are, and measure their magnitude.slide 14:
Who is DTrace for • System Administrators >gt; Troubleshoot, analyze, investigate where never before. >gt; See more of your system - fills in many observability gaps. • Database Administrators >gt; Analyze throughput performance issues across all system components. • Security Administrators >gt; Customized short-term auditing >gt; Malware decipheringslide 15:
Who is DTrace for • Kernel Engineers >gt; Fetch kernel trace data from almost every function. >gt; Function arguments are auto-casted providing access to all struct members. >gt; Fetch nanosecond timestamps for function execution. >gt; Troubleshoot device drivers, including during boot. >gt; Add statically defined trace points for debugging.slide 16:
How to use DTrace • DTrace can be used by either: >gt; Running prewritten one-liners and scripts – DTrace one-liners are easy to use and ofter useful, http://www.solarisinternals.com/dtrace – The DtraceToolkit contains over 100 scripts ready to run, http://www.opensolaris.org/os/community/dtrace/dtracetoolkit >gt; Writing your own one-liners and scripts – Encouraged – the possibilities are endless – It helps to know C – It can help to know operating system fundamentalsslide 17:
DTrace wins • Finding unnecessary work >gt; Having deep visibility often finds work being performed that isn't needed. Eliminating these can produce the biggest DTrace wins – 2x, 20x, etc. • Solving performance issues >gt; Being able to measure where the latencies are, and show what their costs are. These can produce typical performance wins – 5%, 10%, etc.slide 18:
DTrace wins • Finding bugs >gt; Many bugs are found though static debug frameworks; DTrace is a dynamic framework that allows custom and comprehensive debug info to be fetched when needed. • Proving performance issues >gt; Many valuable DTrace wins have no immediate percent improvement, they are about gathering evidence to prove the existence and magnitude of issues.slide 19:
Example scenario: The past • Take a performance issue on a complex customer system, Customer: “Why is our system slow?” • With previous observability tools, customers could often find problems but not take the measurements needed to prove that they found the problem. >gt; What is the latency cost for this issue? As a percent?slide 20:
Example scenario: The past Application Vendor: “The real problem may be the database.” Database Vendor: “The real problem may be the OS.” OS Vendor: “The real problem may be the application.” • The “blame wheel”slide 21:
Example scenario: The past Customer: “I think I've found the issue in the application code.” Application Vendor: “That issue is costly to fix. We are happy to fix it, so long as you can prove that this is the issue.” • The lack of proof can mean stalemate.slide 22:
Example scenario: The future A happy ending • With DTrace, all players can examine all of the software themselves. Customer: “I measured the problem, it is in the application.” Application Vendor: “I'd better fix that right away.” – Example: “80% of the average transaction time is spent in the application waiting for user-level locks.”slide 23:
Example scenario: The future An alternate happy ending for application vendors Application Vendor: “We measured the problem and found it was in the OS.” OS Vendor: “We'd better fix that right away.” – Example: “80% of our average transaction time is consumed by a bug in libc.”slide 24:
Answers to initial questions • DTrace is not available for Solaris 9. • You need to be root, or have the correct privileges, to run /usr/sbin/dtrace. • There is a GUI called chime. • DTrace is safe for production use, provided you don't deliberately try to cause harm. • DTrace has low impact when in use, and zero impact when not.slide 25:
What's next: • We just covered: >gt; What is DTrace >gt; What is DTrace for >gt; Who uses DTrace • Next up is: >gt; DTrace Essentials >gt; Usage Featuresslide 26:
Terminology
• Example #1
consumer
probe
action
# dtrace -n 'syscall::exece:return { trace(execname); }'
dtrace: description 'syscall::exece:return ' matched 1 probe
CPU
FUNCTION:NAME
0 76044
exece:return
man
0 76044
exece:return
0 76044
exece:return
neqn
0 76044
exece:return
tbl
0 76044
exece:return
nroff
[...]
slide 27:Consumer • Consumers of libdtrace(3LIB), dtrace lockstat plockstat intrstat command line and scripting interface kernel lock statistics user-level lock statistics run-time interrupt statistics • libdtrace is currently a private interface and not to be used directly (nor is there any great reason to); the supported interface is dtrace(1M). >gt; NOTE: You are still encouraged to use libkstat(3LIB) and proc(4) directly, rather than wrapping /usr/bin consumers.slide 28:
Privileges $ id uid=1001(user1) gid=1(other) $ /usr/sbin/dtrace -n 'syscall::exece:return' dtrace: failed to initialize dtrace: DTrace requires additional privileges • Non-root users need certain DTrace privileges to be able to use DTrace. • These privileges are from the Solaris 10 “Least Privilege” feature.slide 29:
Probes • Data is generated from instrumentation points called “probes”. • DTrace provides thousands of probes. • Probe examples: Probe Name Description syscall::read:entry A read() syscall began proc:::exec-success A process created successfully io:::start An I/O was issued (disk/vol/NFS) io:::done An I/O completedslide 30:
Probe Names • Probe names are a four-tuple: Provider Module Function Name syscall::exece:return >gt; Provider >gt; Module >gt; Function >gt; Name A library of related probes. The module the function belongs to, either a kernel module or user segment. The function name that contains the probe. The name of the probe.slide 31:
Listing Probes • dtrace -l lists all currently available probes that you have privilege to see, with one probe per line: # dtrace -l PROVIDER dtrace dtrace dtrace sched [...] # dtrace -l | wc -l MODULE FUNCTION NAME BEGIN END ERROR fx_yield schedctl-yi • Here the root user sees 69,879 available probes. • The probe count changes – it is dynamic (DTrace).slide 32:
Tracing Probes • dtrace -n takes a probe name and enables tracing: # dtrace -n syscall::exece:return dtrace: description 'syscall::exece:return' matched 1 probe CPU FUNCTION:NAME 0 76044 exece:return 0 76044 exece:return • The default output contains: CPU id that event occured on (if this changes, the output may be shuffled) DTrace probe id – ID Part of the probe name – FUNCTION:NAME – CPUslide 33:
Providers • Examples of providers: Provider syscall proc sched sysinfo vminfo profile pid fbt Description system call entries and returns process and thread events kernel scheduling events system statistic events virtual memory events system I/O events fixed rate sampling user-level tracing raw kernel tracingslide 34:
Providers • Example of probes: Provider Example probe syscall proc sched sysinfo vminfo profile pid syscall::read:entry proc:::exec-success sched:::on-cpu sysinfo:::readch vminfo:::maj_fault io:::start profile:::profile-1000hz pid172:libc:fopen:entry pid172:a.out:main:entry fbt::bdev_strategy:entry fbtslide 35:
Providers • Providers are documented in the DTrace Guide, as separate chapters. • Providers are dynamic, the number of available probes can vary. • Some providers are “unstable interface”, such as fbt and sdt. >gt; This means that their probes, while useful, may vary in name and arguments between Solaris versions. >gt; Try to use stable providers instead (if possible).slide 36:
Provider Documentation • Some providers assume a little background knowledge, other providers assume a lot. Knowing where to find supporting documentation is important. • Where do you find documentation on: >gt; Syscalls? >gt; User Libraries? >gt; Application Code? >gt; Kernel functions?slide 37:
Provider Documentation • Additional documentation may be found here: Target syscalls libraries app code raw kernel Provider syscall pid:lib* pid:a.out fbt Additional Docs man(2) man(3C) source code? Solaris Internals 2nd Ed, http://cvs.opensolaris.orgslide 38:
Actions • When a probe fires, an action executes. • Actions are written in the D programming language. • Actions can: >gt; print output >gt; save data to variables, and perform calculations >gt; walk kernel or process memory • With destruction actions allowed, actions can: >gt; raise signals on processes >gt; execute shell commands >gt; write to some areas of memoryslide 39:
trace() Example
# dtrace -n 'syscall::exece:return { trace(execname); }'
dtrace: description 'syscall::exece:return ' matched 1 probe
CPU
FUNCTION:NAME
0 76044
exece:return
man
0 76044
exece:return
0 76044
exece:return
neqn
0 76044
exece:return
tbl
0 76044
exece:return
nroff
0 76044
exece:return
col
[...]
• The trace() action accepts one argument and prints
it when the probe fired.
slide 40:
printf() Example
# dtrace -n 'syscall::exece:return { printf("%6d %s\n", pid, execname); }'
dtrace: description 'syscall::exece:return ' matched 1 probe
CPU
FUNCTION:NAME
0 74415
exece:return
4301 sh
0 74415
exece:return
4304 neqn
0 74415
exece:return
4305 nroff
0 74415
exece:return
4306 sh
0 74415
exece:return
4308 sh
[...]
• DTrace ships with a powerful printf(), to print
formatted output.
slide 41:Default Variables • Numerous predefined variables can be used, eg: >gt; pid, tid Process ID, Thread ID >gt; timestamp Nanosecond timestamp since boot >gt; probefunc Probe function name (3rd field) >gt; execname Process name >gt; arg0, ... Function arguments and return value >gt; errno Last syscall failure error code >gt; curpsinfo Struct contating current process info, eg, curpsinfo->gt;pr_psargs – process + args • Pointers and structs! DTrace can walk memory using C syntax, and has kernel types predefined.slide 42:
curthread • curthread is a pointer to current kthread_t From here you can walk kernel memory and answer endless questions about OS internals. • Eg, the current process user_t is, curthread->gt;t_procp->gt;p_user • You might not ever use curthread, but it is good to know that you can. (And there are other ways to get inside the kernel). Opinion: curthread is like the down staircase in nethack, angband, moria, ...slide 43:
Variable Types • DTrace supports the following variable types: >gt; Integers >gt; Structs >gt; Pointers >gt; Strings >gt; Associative arrays >gt; Aggregates • Including types from /usr/include/sys, eg uint32_t.slide 44:
Aggregations
• A great feature of DTrace is to process data as it is
captured, such as using aggregations.
• Eg, frequency counting syscalls:
# dtrace -n 'syscall:::entry { @num[probefunc] = count(); }'
dtrace: description 'syscall:::entry ' matched 233 probes
[...]
writev
write
read
pollsys
ioctl
@num is the aggregation variable, probefunc is the key,
and count() is the aggregating function.
slide 45:Aggregating Functions • These include: count events, useful for frequency counts >gt; sum(value) sum the value >gt; avg(value) average the value >gt; min(value) find the value minimum >gt; max(value) find the value maximum >gt; quantize(value) print power-2 distribution plots >gt; count()slide 46:
Quantize
• Very cool function, here we quantize write sizes:
# dtrace -n 'sysinfo:::writech { @dist[execname] = quantize(arg0); }'
dtrace: description 'sysinfo:::writech ' matched 4 probes
[...]
value ------------- Distribution ------------- count
4 |
8 |
16 |
32 |@@@@@@@@@@@@@@@@@@@
64 |@@@@@@@@@@@@@@@@@@@@@
128 |
[...]
• Here we see that ls processes usually write
between 32 and 127 bytes. Makes sense?
slide 47:ls -l # ls -l /etc dttotal 793 lrwxrwxrwx 1 root drwxr-xr-x 4 root drwxr-xr-x 2 adm drwxr-xr-x 3 root lrwxrwxrwx 1 root drwxr-xr-x 5 root drwxr-xr-x 7 root drwxr-xr-x 4 root drwxr-xr-x 2 root -rw-r--r-1 root -rw-r--r-1 root lrwxrwxrwx 1 root [...] root sys adm root root sys bin bin sys bin bin root 12 Mar 21 03:28 TIMEZONE ->gt; default/init 6 Apr 16 06:59 X11 3 Mar 20 09:25 acct 3 Apr 16 23:11 ak 12 Mar 21 03:28 aliases ->gt; mail/aliases 5 Feb 20 23:29 amd64 18 Mar 20 09:20 apache 7 Feb 20 23:12 apache2 5 Feb 20 23:27 apoc 1012 Mar 20 09:33 auto_home 1066 Mar 20 09:33 auto_master 16 Mar 21 03:28 autopush ->gt; ../sbin/autopu ls writes one line at a time, each around 80 chars long.slide 48:
Predicates
• DTrace predicates are used to filter probes, so that
the action fires when a conditional is true.
probename /predicate/ { action }
• Eg, syscalls for processes called “bash”:
# dtrace -n 'syscall:::entry /execname == "bash"/ { @num[probefunc] =
count(); }'
dtrace: description 'syscall:::entry ' matched 233 probes
exece
[...]
read
write
lwp_sigmask
sigaction
slide 49:Scripting • If your one-liners get too long, write scripts. Eg, bash-syscalls.d: #!/usr/sbin/dtrace -s syscall:::entry /execname == "bash"/ @num[probefunc] = count(); • Getting it running: # chmod 755 bash-syscalls.d # ./bash-syscalls.d dtrace: script './bash-syscalls.d' matched 233 probes [...]slide 50:
What's next: • We just covered: >gt; What is DTrace >gt; What is DTrace for >gt; Who uses DTrace >gt; DTrace Essentials • Next up is: >gt; Usage Featuresslide 51:
Measuring Time • Access to high resolution timestamps is of particular use for performance analysis. >gt; timestamp >gt; vtimestamp time since boot in nanoseconds thread on-CPU timestamp • Measuring these for application and operating system function calls will answer: >gt; timestamp >gt; vtimestamp where is the latency? why are the CPUs busy?slide 52:
Printing Stacks
• Printing user and kernel stack traces explains both
why and the how something happened.
• Why is bash calling read()? Using ustack():
# dtrace -n 'syscall::read:entry /execname == "bash"/ { ustack(); }'
dtrace: description 'syscall::read:entry ' matched 1 probe
CPU
FUNCTION:NAME
0 74314
read:entry
libc.so.1`_read+0x7
bash`rl_getc+0x22
bash`rl_read_key+0xad
bash`readline_internal_char+0x5f
bash`0x80b1171
bash`0x80b118c
bash`readline+0x3a
[...]
Ahh, readline()
slide 53:End of Intro • DTrace is a big topic, but you don't need to know it all to get value from DTrace. • To learn more, browse “DTrace Topics”, http://www.solarisinternals.com/dtrace. Here you will find: >gt; A wiki version of this presentation >gt; The PDF for this presentation >gt; dozens of other DTrace Topics (eg, one-liners!) • Also see the “Solaris Performance and Tools” book, http://www.sun.com/books/catalog/solaris_perf_tools.xmlslide 54:
Sampling
• DTrace isn't just about tracing events, DTrace can
also sample at customized rates.
• Eg, sampling 5-level user stack traces from Xorg:
# dtrace -n 'profile-1001 /execname == "Xorg"/ { @[ustack(5)] = count(); }'
dtrace: description 'profile-1001 ' matched 1 probe
libfb.so`fbSolid+0x2c6
libfb.so`fbFill+0xb8
libfb.so`fbPolyFillRect+0x1d5
nvidia_drv.so`0xfe09e87b
Xorg`miColorRects+0x124
nvidia_drv.so`_nv000592X+0x3d
0x1016c00
nvidia was on-CPU
87 times
slide 55:See Also • DTrace home: http://www.opensolaris.org/os/community/dtrace >gt; Main site of links >gt; DTrace-discuss mailing list • Team DTrace blogs: >gt; http://blogs.sun.com/bmc >gt; http://blogs.sun.com/mws >gt; http://blogs.sun.com/ahl • DTraceToolkit: >gt; http://www.opensolaris.org/os/community/dtrace/dtracetoolkitslide 56:
dtrace:::END Brendan Gregg brendan@sun.com

























































