For instance, if MySQL administrators need more memory for a database, they can just enter that single requirement into Chef. Any change made to a configuration file is then propagated across all of Facebook within minutes.
"If you make a change, it goes across the world in 30 minutes or less," Dibowitz said.
Giving line-of-business engineers this control, in turn, helps the core infrastructure team. "This allows us to do way more work with way fewer people," Dibowitz said.
Facebook, however, had to modify the way Chef handles out-of-date configuration changes. The software offers no easy way of automatically deleting changes when they are no longer needed. "This is not a comfortable thing to do when you are at the scale of hundreds of thousands of systems," Dibowitz said.
Typically, organizations use Chef by building their own "cookbooks," or lists of configuration changes that Chef then applies to a system.
Facebook, however, took the process a step further. The company's engineers developed a set of templates for defining the defaults in each configuration file, calculating a numerical hash to match the original template.
Each time Chef runs a configuration file, it can then use the hash value to delete all entries that have been designated as no longer needed.
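The idea of regenerating a file from its defaults and using a hash to spot stale entries can be illustrated with a small sketch. This is not Facebook's code and not Chef's API; it is written in Python purely for illustration, and every name in it (DEFAULTS, render, digest, converge, the setting names) is hypothetical. It assumes the whole file is rendered from a defaults template plus current overrides, and that a hash comparison decides whether the file on disk must be rewritten.

```python
import hashlib

# Hypothetical defaults template for one configuration file.
DEFAULTS = {"max_connections": "100", "buffer_pool_mb": "512"}


def render(defaults, overrides):
    """Render the entire file from the defaults plus the current overrides."""
    merged = {**defaults, **overrides}
    return "".join(f"{key} = {merged[key]}\n" for key in sorted(merged))


def digest(text):
    """Hash the file content so two versions can be compared cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def converge(current_text, defaults, overrides):
    """Return (new_text, changed); rewrite only when the hashes differ."""
    desired = render(defaults, overrides)
    if digest(current_text) == digest(desired):
        return current_text, False
    return desired, True


# A file on disk that still carries an entry nobody asks for any more.
on_disk = "buffer_pool_mb = 2048\nmax_connections = 100\nold_debug_flag = 1\n"
new_text, changed = converge(on_disk, DEFAULTS, {"buffer_pool_mb": "2048"})
```

Because the file is always rebuilt from the template, the hypothetical old_debug_flag line disappears on the next run without anyone writing an explicit delete step, and a second run against the rewritten file reports no change.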
It is an uncommon way of using Chef, Dibowitz admitted. This way of working with Chef does not require any additional tooling, though. "This is all stuff built into Chef, we're just using it differently," he said.
The company has rolled out Chef to manage all of its servers and is in the process of migrating other software and hardware onto Chef as well.
Facebook has also fed some of the code modifications it has made back to the company that shepherds the Chef code base, which is itself now called Chef after changing its name from Opscode in December.
The engineers also wrote some additional open source utilities for Chef, some of which have been posted on GitHub. One such tool is called Grocery Delivery, which allows multiple Chef clusters to keep their cookbooks in sync. As a result of the code being on GitHub, others have contributed additional changes back to Facebook.
Perhaps most important for Facebook, though, Chef gives system managers greater control over their own configurations. Dibowitz admitted that the approach has attracted some criticism, which he doesn't mind.
"Our job at Facebook is to get out of the way of our engineers and let them do their jobs," Dibowitz said. "Our job is not only to make these systems reliable but also to give everyone the access they need to get their jobs done."