Systems Performance: Enterprise and the Cloud, 2nd Edition (2020)

This is the official site for the book Systems Performance: Enterprise and the Cloud, 2nd Edition, published by Addison-Wesley (2020). Here I'll describe the book, link to related content, and list errata.

You can find it on Amazon as paperback and Kindle, and on InformIT as paperback, EPUB, MOBI, and PDF. (If you purchase through the Amazon or InformIT links, the book's technical editor earns a commission.)

The first edition has been very successful: it has become recommended or required reading at many companies, and has been translated into Chinese, Japanese, Polish, and Korean. I've had many emails from people studying the book for the Facebook engineering interview. I'm glad it's helpful.

There is also a companion book, BPF Performance Tools, that provides advanced coverage of BPF performance analysis tools.


What is New in the Second Edition?

The second edition adds content on BPF, BCC, bpftrace, perf, and Ftrace, mostly removes Solaris, makes numerous updates to Linux and cloud computing, and includes general improvements and additions. Since writing the first edition, I have gained over six years of experience as a senior performance engineer at Netflix, working on new technologies with other engineering experts. This experience has helped me to improve this book.

My blog post about the book includes a visualization of every page, with text colored to show what has changed.

Chapters are structured to first cover durable skills (models, architecture, and methodologies) and then implementation with tools and tuning. Readers of the first edition will notice this: most chapters begin with only light changes, and the changes increase as each chapter progresses.

Why Systems Performance

Systems performance is an important skill for all computer users, whether you're trying to understand why your laptop is slow or optimizing the performance of a large-scale compute environment (for example, Facebook's datacenters or the Netflix cloud). Systems performance is the study of application, operating system, kernel, and hardware performance.

There are two general goals:

Improving price/performance
Reducing latency outliers
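
To make these two goals concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the book: the function names, the requests-per-dollar definition of price/performance, the nearest-rank percentile, and all of the numbers are assumptions chosen for the example.

```python
import math
import statistics

def price_performance(requests_per_sec, dollars_per_hour):
    """Requests served per dollar spent (higher is better).
    This definition is an assumption made for illustration."""
    return requests_per_sec * 3600 / dollars_per_hour

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at
    least pct percent of the samples are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latency samples in milliseconds: 98 fast requests
# and two slow outliers.
latencies_ms = [10] * 98 + [500, 900]

mean_ms = statistics.mean(latencies_ms)   # average over all requests
p50_ms = percentile(latencies_ms, 50)     # typical request
p99_ms = percentile(latencies_ms, 99)     # exposes the slow tail

# Hypothetical instance: 2000 requests/sec at $0.50 per hour.
per_dollar = price_performance(2000, 0.50)

print(mean_ms, p50_ms, p99_ms, per_dollar)
```

With these made-up samples the average stays low while the 99th percentile shows the slow requests, which is why outliers are usually studied with percentiles or full distributions rather than averages.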

Other activities of systems performance include benchmarking to evaluate systems, capacity planning, bottleneck elimination, and scalability analysis – so that you discover scalability limiters early, in time to fix them.

Operating Systems

Topics are introduced in an OS-agnostic way, then Linux is covered as the primary example.

Audience

This book is primarily for system administrators, site reliability engineers, performance engineers, support staff, and other operators in enterprise and cloud environments. It is also a useful reference for developers, database administrators, and web server administrators who would like to understand operating system and application performance.

Why This Book is Different

While it covers performance tools and the background for understanding them, what makes this book different is the inclusion of many performance methodologies, including those covered briefly in my USENIX 2012 talk. I've been teaching and developing systems performance classes on and off for over ten years, and have found methodologies to be crucial for giving students a starting point and then guiding them through performance activities. The USE Method is a methodology I developed for this purpose.
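
As a rough illustration of how a methodology provides a starting point, the USE Method checks utilization, saturation, and errors for every resource. The Python sketch below is a toy version of that checklist and is not code from the book: the `ResourceSample` type, the `use_findings` function, the 80% utilization limit, and the sample values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ResourceSample:
    name: str
    utilization_pct: float  # percent of time busy, or capacity in use
    saturation: float       # queued work, e.g. a run-queue length
    errors: int             # count of error events

def use_findings(samples, util_limit_pct=80.0):
    """Walk every resource and flag errors, saturation, and high
    utilization. The 80% limit is an assumed value for illustration."""
    findings = []
    for s in samples:
        if s.errors > 0:
            findings.append(f"{s.name}: {s.errors} errors")
        if s.saturation > 0:
            findings.append(f"{s.name}: saturated (queue {s.saturation:g})")
        if s.utilization_pct >= util_limit_pct:
            findings.append(f"{s.name}: utilization {s.utilization_pct:g}%")
    return findings

# Hypothetical measurements for three resources.
samples = [
    ResourceSample("cpu", 95.0, 3, 0),
    ResourceSample("disk", 20.0, 0, 0),
    ResourceSample("net", 10.0, 0, 2),
]

for line in use_findings(samples):
    print(line)
```

In practice the measurements would come from real observability tools; the point of the sketch is only that the checklist is short, complete across resources, and gives a student somewhere to begin.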

The draft is roughly 800 pages.

PDF Download eBook ePUB

An eBook will be available at some point. It will not be a Safari "rough cut" after my experience with the BPF book.

Errata

1st Printing:

Thanks to all the reviewers, and to Deirdré Straughan for editing another one of my books!