Systems Performance: Enterprise and the Cloud, 2nd Edition (2020)
This is the official site for the book Systems Performance: Enterprise and the Cloud, 2nd Edition, published by Addison-Wesley (2020). Here I'll describe the book, link to related content, and list errata.
The first edition has been very successful, becoming required or recommended reading at many companies. Thanks for all your support!
I will add links here to the second edition when it appears online (Amazon, etc.). The intended release date is around October 2020, although a rough cut will likely appear on Safari much sooner.
What is New in the Second Edition?
The second edition adds content on BPF, BCC, bpftrace, perf, and Ftrace, mostly removes Solaris, makes numerous updates to Linux and cloud computing, and includes general improvements and additions. Since writing the first edition, I have been a senior performance engineer at Netflix for six years, where I have worked on new technologies with other engineering experts. These technologies are covered in the book.
Chapters are structured to first cover durable skills (models, architecture, and methodologies) and then implementation with tools and tuning. This will be evident to those who read the first edition: most chapters begin with only light changes since the first edition, but the changes increase as each chapter progresses.
Why Systems Performance
Systems performance is an important skill for all computer users, whether you're trying to understand why your laptop is slow or optimizing the performance of a large-scale production environment. Systems performance is the study of both operating system (kernel) and application performance.
There are two general goals:
- Improving price/performance
- Reducing latency outliers
Other activities of systems performance include benchmarking to evaluate systems, capacity planning, bottleneck elimination, and scalability analysis – so that you discover scalability limiters early, in time to fix them.
This book introduces topics in an OS-agnostic way, then uses Linux as a primary example implementation.
This book is primarily for system administrators, system reliability engineers, performance engineers, support staff, and other operators in enterprise and cloud environments. It is also a useful reference for developers, database administrators, and web server administrators who would like to understand operating system and application performance.
Why This Book is Different
While it covers performance tools and the background for understanding them, what makes this book different is the inclusion of many performance methodologies, including those covered briefly in my USENIX 2012 talk. I've been teaching and developing systems performance classes on and off for ten years, and have found methodologies to be crucial for giving students a starting point, and then guiding them through performance activities. The USE Method is one example I developed for this purpose.
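As a rough illustration of what a methodology-driven starting point can look like, here is a minimal sketch of a USE Method style checklist: for every resource, check utilization, saturation, and errors. Everything in it is an assumption made for illustration and is not taken from the book: the `ResourceMetrics` type, the `use_check` function, the sample numbers, and the 0.7 utilization threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ResourceMetrics:
    """Hypothetical per-resource measurements (illustrative only)."""
    name: str
    utilization: float  # fraction of time the resource was busy, 0.0 to 1.0
    saturation: float   # amount of queued or waiting work (0 means none)
    errors: int         # count of error events


def use_check(resources, util_threshold=0.7):
    """Return (resource, metric, value) findings worth investigating.

    The threshold is an assumed value for this sketch, not a rule.
    """
    findings = []
    for r in resources:
        if r.errors > 0:
            findings.append((r.name, "errors", r.errors))
        if r.saturation > 0:
            findings.append((r.name, "saturation", r.saturation))
        if r.utilization >= util_threshold:
            findings.append((r.name, "utilization", r.utilization))
    return findings


# Made-up sample data, not real measurements.
SAMPLE = [
    ResourceMetrics("cpu", utilization=0.95, saturation=4.0, errors=0),
    ResourceMetrics("disk", utilization=0.20, saturation=0.0, errors=0),
    ResourceMetrics("network", utilization=0.10, saturation=0.0, errors=3),
]

if __name__ == "__main__":
    for name, metric, value in use_check(SAMPLE):
        print(f"{name}: check {metric} ({value})")
```

The point of the sketch is the iteration order rather than the thresholds: every resource gets the same three questions, which gives a student a complete checklist to work through before reaching for any particular tool.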
Table of Contents
3. Operating Systems
4. Observability Tools
8. File Systems
11. Cloud Computing
16. Case Study
The draft is roughly 800 pages.
Thanks to all the reviewers, and to Deirdré Straughan for editing another one of my books!