Latency Heat Maps


Disk I/O Latency Heat Map

Response time - or latency - is crucial to understand in detail, but many of the common presentations of this data hide important details and patterns. Latency heat maps are an effective way to reveal these. I often use tools that provide heat maps directly, but sometimes I have separate trace output that I'd like to convert into a heat map. To answer this need, I just wrote trace2heatmap.pl, which generates interactive SVGs.

Problem

I whipped up a simple example to explain this, using disk I/O latency (I have plenty of real-world examples, but explaining them can get sidetracked). This is a single disk system, with a single process performing a sequential synchronous write workload.

Using iostat(1M) to examine average latency (asvc_t):

$ iostat -xnz 1
[...]
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  220.0    0.0 9635.8  0.0  1.0    0.0    4.6   0  99 c1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  203.0    0.0 8976.2  0.0  1.0    0.0    5.1   0  99 c1d0

I could plot average latency (as many monitoring products do), but the average is seriously misleading, and doesn't explain what's really happening. And since latency is so important for performance, I want to know exactly what is happening.

I had "iosnoop -Dots" running, which collected two minutes of per-I/O latency and other details:

# iosnoop -Dots > out.iosnoop
^C
# more out.iosnoop
STIME(us)   TIME(us)    DELTA(us) DTIME(us) UID   PID D    BLOCK   SIZE      COMM PATHNAME
9339835115  9339835827  712       730       100 23885 W 253757952 131072    odsync <none>
9339835157  9339836008  850       180       100 23885 W 252168272   4096    odsync <none>
9339926948  9339927672  723       731       100 23885 W 251696640 131072    odsync <none>
[...15,000 lines truncated...]

I/O latency is the "DELTA(us)" column. This file was thousands of lines long - too much to read.

Latency Histogram: With Outliers

The latency distribution can be examined as a histogram (using R, and a subset of the trace file):

This shows that the average has been dragged up by latency outliers: I/O with very high latency.

This is a fairly common occurrence, and it's very useful to know when it has occurred. Those outliers may be individually causing problems, and can be easily be plucked from the trace file for further analysis; eg:

# awk '$3 > 50000' out.iosnoop_marssync01
STIME(us)   TIME(us)    DELTA(us) DTIME(us) UID   PID D    BLOCK   SIZE      COMM PATHNAME
9343218579  9343276502  57922     57398      0      0 W 142876112  4096      sched <none>
9343218595  9343276605  58010     103        0      0 W 195581749  5632      sched <none>
9343278072  9343328860  50788     50091      0      0 W 195581794  4608      sched <none>
[...]

Most of the I/O in the histogram was in a single column on the left.

Latency Histogram: Zoomed

Zooming in, by generating a histogram of the 0 - 2 ms range:

The I/O distribution is bi-modal. This also commonly occurs for latency or response time in different subsystems. Eg, the application has a "fast path" and a "slow path", or a resource has cache hits vs cache misses, etc.

But there is still more hidden here. The average latency reported by iostat hinted that there was per-second variance. This histogram is reporting the entire two minutes of iosnoop output.

Latency Histogram: Animation

I rendered the iosnoop output as per-second histograms, and generated the following animation (a subset of the frames):

Not only is this bi-modal, but the modes move over time. This had been obscured by rendering all data as a single histogram.

Heat Map

Using trace2heatmap.pl (github) to generate a heat map from the iosnoop output.

Mouse-over for details, and compare to the animation above. If this embedded SVG isn't working in your browser, try the standalone SVG or PNG.

The command used was:

$ awk '{ print $2, $3 }' out.iosnoop | ./trace2heatmap.pl --unitstime=us \
    --unitslatency=us --maxlat=2000 --grid > heatmap.svg

Without "--grid", the grid lines are not drawn (making it more Tufte-friendly); see the example.

trace2heatmap.pl gets the job done, but it's probably a bit buggy - I spent three hours writing it (and more than three hours writing this post about it!), really for just the trace files I don't already have heat maps for.

Heat Maps Explained

It may already be obvious how these work. Each frame of the histogram animation becomes a column in a latency heat map, with the histogram bar height represented as a color:

You can also see my summary for how heat maps work, and my longer article about them: "Visualizing System Latency" (ACMQ, CACM).

Scatter Plot

A heat map is similar to a scatter plot, however, placing points in buckets (the rectangles) rather than showing them individually. Because there is a finite number of buckets in a heatmap, the storage cost is fixed, unlike a scatter plot which must store x and y coordinates for every data point. This can make heat maps suitable for visualizing millions of data elements. Detail may also be seen beyond the point where a scatter plot becomes too cluttered.

Production Use

If you want to add heat maps to your monitoring product, then great! However, note that tracing per-event latency can be expensive to perform. Tools like DTrace can minimize the overheads as much as possible using per-CPU buffers and asynchronous kernel-user transfers; other tools (eg, strace, tcpdump) are expected to have higher overhead. This can cause problems for production use: you want to understand the overhead before tracing events.

Heat maps have been used successfully in production – and recorded at a one-second granularity 24x7x365 – by some products built upon DTrace. These use the DTrace aggregating feature to pass a quantized summary of latency, instead of every event, to user-level, reducing the data transfer by a large factor (eg, 1000x). This summary may consist of a per-second array with about 200 elements for different latency ranges, each containing the count of events, and is from the DTrace aggregating actions @quantize, @lquantize, or @llquantize (best). This array is then resampled (downsampled) to the resolution desired for the heat map (usually down to 30 or so levels). Examples of products that do this are the Oracle ZFS Storage Appliance, and Joyent Cloud Analytics.

Linux

On Linux, heat maps can be generated from many different sources of data. My perf_events heat map is one example.

Background

Bryan and I developed latency heat maps in 2008 for the ZFS Storage Appliance. For more background, see Visualizing System Latency.

Thanks to Deirdré Straughan for helping editing this page.

For more heat maps, see the main page.


Last updated: 08-Feb-2014