The person who became known on the Internet for yelling at servers is now becoming famous for another, somewhat related, feat: creating a new type of data visualization for characterizing system performance.
Brendan Gregg, lead performance engineer at cloud provider Joyent, has developed a visualization technique called a flame graph that can be effective for charting how system resources such as CPUs and memory are used. It has subsequently been picked up by a number of engineers who have used it to enhance popular diagnostic tools such as DTrace and Windows XPerf.
Gregg explained how the flame graph works Thursday at the USENIX LISA (Large Installation System Administration) conference in Washington, D.C. Flame graphs could save hours of diagnostic time for system administrators, performance engineers, support staff and others trying to figure out why a system is running more slowly than expected.
"We've had stack traces for a long while, but what Brendan has done has given us a really fast way of seeing aspects that weren't easily visible before," said one attendee of the presentation, noting that flame graphs would have come in handy for him at work during a recent dispute with a software vendor over a performance issue.
The vendor might have been able to solve the problem in a few hours using a flame graph rather than the three weeks it ended up taking, he said.
Gregg's expertise lies in the area of measuring system performance. His book on the topic was published this year by Prentice Hall.
In 2008, Gregg, then an employee at Sun Microsystems, attracted attention for showing how disk I/O could be slowed by sudden loud noises, a fact he demonstrated by yelling, quite loudly, at a server. The resulting vibrations slowed the disks.
Gregg made a YouTube video to demonstrate latency heat maps, a new type of visualization he had developed to chart system latency. The video went viral in the IT community.
The flame graph came about "under duress," Gregg said. A customer had voiced concern over an application that was running about 40 percent slower than expected. To investigate the problem, Gregg had to sort through 500,000 lines of diagnostic data. He quickly realized it was far too much data to easily comprehend.
Inspired by visualization guru Edward Tufte, Gregg brainstormed ways to visualize the entire data set within a single screen. What he came up with "merged and collapsed together the common elements," while preserving how the elements relate to one another and the share of resources each one consumed.
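The merging step can be illustrated with a short Python sketch. It is not Gregg's implementation: it assumes the diagnostic data has already been reduced to a list of sampled stack traces, and the function names (collapse_stacks, frame_shares) and the sample data are invented for illustration. Identical stacks are collapsed into a single entry with a count, and each call path is then credited with the share of samples in which it appears.

```python
from collections import Counter


def collapse_stacks(samples):
    """Merge identical stack traces and count how often each one occurred."""
    # Each sample is a list of function names, outermost caller first.
    return Counter(";".join(stack) for stack in samples)


def frame_shares(folded):
    """Return the percentage of samples in which each call path appears."""
    total = sum(folded.values())
    if total == 0:
        return {}
    shares = Counter()
    for stack, count in folded.items():
        frames = stack.split(";")
        # Credit every prefix of the stack, so a caller's share
        # includes the time spent in the functions it called.
        for depth in range(1, len(frames) + 1):
            shares[";".join(frames[:depth])] += count
    return {path: 100.0 * n / total for path, n in shares.items()}


# Hypothetical samples; the function names are invented for illustration.
samples = [
    ["main", "parse", "read_line"],
    ["main", "parse", "read_line"],
    ["main", "parse", "tokenize"],
    ["main", "render"],
]

folded = collapse_stacks(samples)
shares = frame_shares(folded)
for path in sorted(shares):
    print(f"{shares[path]:5.1f}%  {path}")
```

In this sketch, four samples collapse into three distinct stacks, and the resulting percentages are the kind of proportions a visualization could draw as bar lengths. Joining frames with semicolons is simply a convenient key format chosen here.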
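Flame graphs are composed of stacks of horizontal bars, with each bar representing a function in a sampled stack trace. The vertical axis shows stack depth rather than time: the bars at the bottom are the outermost callers, each bar sits on top of the function that called it, and the topmost bars are the functions that were running when the samples were taken. The horizontal axis is not a timeline either. Identical stacks are merged and sorted alphabetically, and the width of each bar represents the percentage of samples, and therefore of the resource being measured, in which that function appeared.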