Hybrid Storage Pool: Top Speeds

08 Oct 2009

I originally posted this at http://blogs.sun.com/brendan/entry/hybrid_storage_pool_top_speeds.

I'd like to show the benefit of the Hybrid Storage Pool (HSP) in the simplest possible way, as an addition to my top speed results for the Sun Storage 7410. The HSP uses both read and write optimized flash based solid state disks (SSDs) to improve the performance of file I/O, with the ZFS L2ARC and SLOG (separate ZIL) technologies:


[Diagrams: old model vs. new model with ZFS]

I've previously demonstrated both HSP technologies in detail and explained how they work.

Here I'll test how these HSP technologies perform with simple workloads, to measure best-case latency.

Disclaimer: Performance isn't simple. Measuring the speed of simple workloads can be easy; inferring from this how a system will behave in any given production environment can be quite hard. The best way to understand performance is to test the system with a workload that closely resembles the production environment.

Latency

The results I'll measure for the L2ARC and SLOG will be latency, not IOPS or throughput. It doesn't make sense to compare IOPS values alone while ignoring the latency of those IOPS ("10,000 IOPS" from flash memory does not equal "10,000 IOPS" from slow rotating disks!). Throughput isn't a useful comparison either: while the HSP delivers great throughput, it usually does so by minimizing use of the L2ARC and SLOG and driving multiple disk spindles instead.
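To put numbers on why IOPS alone are misleading (my own arithmetic, not a result from these tests): by Little's Law, average outstanding I/O = IOPS × latency, so 10,000 IOPS at 0.1 ms of latency implies about 1 outstanding I/O, while 10,000 IOPS at 10 ms requires about 100 outstanding I/Os. The same IOPS figure can describe very different delivered performance.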

The latency presented here will be measured from the NFS layer using Analytics on the Sun Storage server. This latency will include the overheads from processing the NFS protocol, the ZFS filesystem, and waiting for the flash based device I/O itself. The aim is to portray the delivered performance of the whole product, rather than the performance of individual components.

Testing Notes

A couple of notes about performance testing SSDs.

L2ARC

See my screenshots for the full introduction to the L2ARC. To characterize it, I'll perform random 512 byte reads over NFSv3 from a single thread on a single client. The target is a 7410 (Barcelona) with 6 L2ARC devices (Readzillas). Some of the workload will hit from DRAM, some from the L2ARC, and some from disk. The amount from each will depend on cache warmth and working set size; what we are interested in here is the latency.

To find the top speed, the ZFS record size is set to 512 bytes to match the I/O size; but before you try this yourself, understand that ZFS record sizes smaller than 4 Kbytes do begin to cost noticeable amounts of DRAM (a bigger metadata/data ratio) and significantly reduce streaming I/O performance. If this sounds too unusual to be interesting, note that a workload involving thousands of tiny files may behave in a similar way: the ZFS record size only takes effect once a file grows bigger than that size.
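To make the workload concrete, here is a minimal sketch of the kind of client-side load generator this describes: a single thread issuing random 512 byte reads against a file on the NFSv3 mount, with the share's record size assumed to be set to 512 bytes as described above. The mount path, file name, and working set size are hypothetical placeholders; the original post doesn't show the tool used.

/*
 * Minimal sketch: random 512 byte reads from a single thread against a
 * file on an NFSv3 mount.  Path and working set size are hypothetical.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define	IO_SIZE	512

int
main(void)
{
	const char *path = "/mnt/pool0/testfile";	/* hypothetical NFSv3 mount */
	off_t filesize = 10LL * 1024 * 1024 * 1024;	/* example 10 Gbyte working set */
	long long nblocks = filesize / IO_SIZE;
	char buf[IO_SIZE];
	long long i;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1) {
		perror("open");
		return (1);
	}

	srand48(12345);
	for (i = 0; i < 1000000; i++) {
		/* pick a random 512 byte aligned offset within the file */
		off_t offset = (off_t)(drand48() * nblocks) * IO_SIZE;
		if (pread(fd, buf, IO_SIZE, offset) == -1) {
			perror("pread");
			break;
		}
	}
	(void) close(fd);
	return (0);
}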

The following screenshots show NFS read latency heat maps at different vertical ranges (click for larger versions), each corresponding to the HSP technology that returned the read. I've also listed the latency range that covers most of the I/O:

DRAM latency: 0 - 10 us

This shows hits from the ZFS DRAM cache (the ARC: Adaptive Replacement Cache), which are fast as expected.

L2ARC latency: 128 - 260 us

The L2ARC is returning these 512 byte reads with consistently low latency. The range may sound high when compared to advertised SSD 512 byte read latency, but remember that these are aged SSDs, and this time includes the ZFS and NFS overheads.

Disk latency: 1900 - 9700 us

These 7200 RPM disks are returning the reads with latency from 2 to 10 ms, as expected for random disk I/O (platter rotation time + head seek time.)
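That range follows from the mechanics (my own arithmetic, not from the original measurements): at 7200 RPM a platter revolution takes 60/7200 s, about 8.3 ms, so average rotational latency alone is around 4.2 ms; adding a few milliseconds of head seek time lands in the observed 2 to 10 ms range.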

SLOG

See my screenshots for the full introduction to the SLOG. To characterize it, I'll perform 512 byte O_DSYNC writes over NFSv3. Since these will all be handled by the SSD SLOG devices (Logzillas), I'll also show what happens without them: the synchronous write latency to disk. As before, I'll show screenshots and latency ranges.
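As a similarly minimal sketch (again with a hypothetical path; the original doesn't show the tool used), the client opens the file with O_DSYNC so every 512 byte write must reach stable storage on the server before write() returns: the SLOG device when present, otherwise the pool's disks.

/*
 * Minimal sketch: 512 byte O_DSYNC writes from a single thread to a
 * file on an NFSv3 mount.  The path is a hypothetical placeholder.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define	IO_SIZE	512

int
main(void)
{
	const char *path = "/mnt/pool0/writefile";	/* hypothetical NFSv3 mount */
	char buf[IO_SIZE];
	long long i;
	int fd;

	(void) memset(buf, 'z', sizeof (buf));

	/* O_DSYNC: write() returns only after the data is on stable storage */
	if ((fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0644)) == -1) {
		perror("open");
		return (1);
	}

	for (i = 0; i < 1000000; i++) {
		if (write(fd, buf, IO_SIZE) == -1) {
			perror("write");
			break;
		}
	}
	(void) close(fd);
	return (0);
}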

SLOG latency: 137 - 181 us

Most of the I/O is in a tight band in the 100 to 200 us range (the screenshot doesn't look so tight due to the false color palette; see the range average values on the left).

Disk latency: 1950 - 9170 us

Without the SSD based SLOG devices, synchronous write I/O is served here from 7200 RPM disks, with a latency between 2 and 9 ms.

Reality Check

While I've been showing top speeds of the L2ARC and SLOG devices, I can help set expectations by describing cases where you will see slower (or faster!) results than these top speeds.

Summary

NFS latency ranges for the L2ARC:

Here DRAM latency doesn't quite register a pixel in height on the linear plot. The logarithmic plot shows the role of the SSD based L2ARC very well, in between DRAM and disk.
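(For reference, the ranges measured above: DRAM 0 - 10 us, L2ARC 128 - 260 us, disk 1900 - 9700 us.)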

NFS latency ranges for the SLOG:
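(For reference, the ranges measured above: SLOG 137 - 181 us, disk 1950 - 9170 us.)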

Conclusion

When the HSP uses read and write optimized flash memory SSDs, NFS I/O latency can reach as low as 0.1 ms, compared to 2 to 10 ms without these SSDs (disk + DRAM only). These results aren't really surprising considering the technology.

I've updated my top speeds with these latency results. And as top speeds go, this is as good as it gets. Your latency will be higher with larger I/O sizes.