At the time, the cluster consisted of 2,000 machines: 800 16-core systems and 1,200 eight-core systems. Each system in the cluster stored between 12 and 24 terabytes of data.
Facebook had a pair of potential methods for moving the cluster to a new data center, Yang said in his post.
The company could physically move each node to the new location, a task that could have been completed in a few days "with enough hands at the job," he said. The company decided against that route because it would have resulted in an unacceptably long downtime, he said.
Instead, Facebook decided to build a new, larger Hadoop cluster and simply replicate the data from the old cluster onto it. That was the more complex option, because the source data Facebook wanted to replicate sat on a live system where files were continuously being created and deleted, Yang said in his blog post.
Thus Facebook engineers built a new replication system that could handle the unprecedented cluster size and data load. "Because replication minimizes downtime, it was the approach that we decided to use for this massive migration," he said.
According to Yang, the data replication project was accomplished in two steps.
First, most of the data and directories from the original Hadoop cluster were copied in bulk to the new one using an open source tool called DistCp.
Then all changes to the files and data that happened after the bulk copying was done were replicated to the new cluster using Facebook's newly developed file replication system. The file changes were captured by a Hive plug-in that was also developed in-house by Facebook developers.
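The two steps Yang describes, a bulk copy followed by a replay of later changes, can be illustrated with a toy model. The Python sketch below is only an illustration under stated assumptions: each cluster is modeled as an in-memory dictionary, and the names bulk_copy, apply_changes and change_log are hypothetical. It does not use DistCp, Hive or Facebook's replication system.

```python
# Toy model: a "cluster" is a dict mapping file path -> contents.
# Every name here is hypothetical; this is not DistCp or Facebook's tool.

def bulk_copy(source):
    """Step 1: copy a point-in-time snapshot of the source cluster."""
    return dict(source)

def apply_changes(replica, change_log):
    """Step 2: replay, in order, the changes recorded after the snapshot."""
    for op, path, data in change_log:
        if op == "put":
            replica[path] = data
        elif op == "delete":
            replica.pop(path, None)
    return replica

source = {"/logs/a": "1", "/logs/b": "2"}
replica = bulk_copy(source)

# The source stays live after the snapshot, so every change is logged.
change_log = []

def put(path, data):
    source[path] = data
    change_log.append(("put", path, data))

def delete(path):
    del source[path]
    change_log.append(("delete", path, None))

put("/logs/c", "3")          # file created after the bulk copy
delete("/logs/a")            # file deleted after the bulk copy
put("/logs/b", "2-updated")  # file overwritten after the bulk copy

apply_changes(replica, change_log)
print(replica == source)
```

Replaying the log in the order the changes occurred is what lets the replica converge on the live source even though the snapshot was already stale when the copy finished.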
At switchover time, Facebook temporarily shut down Hadoop's ability to create new files and let its replication system finish replicating all data onto the new cluster. It then changed its DNS settings so they pointed to the new server.
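The switchover sequence can be sketched the same way: stop new files, drain what is left to replicate, then repoint the name that clients use. The Python below is a simplified illustration, not Facebook's procedure. The Cluster class, the switch_over function, the pending queue and the alias hadoop.example.internal are all assumptions made for the example, and deletions are left out to keep it short.

```python
from collections import deque

class Cluster:
    """Hypothetical stand-in for a Hadoop cluster."""
    def __init__(self, name):
        self.name = name
        self.files = {}
        self.accepting_new_files = True

    def create(self, path, data):
        if not self.accepting_new_files:
            raise RuntimeError(f"{self.name}: file creation is disabled")
        self.files[path] = data

def switch_over(old, new, pending, dns, alias):
    """Freeze the old cluster, drain pending copies, repoint the alias."""
    old.accepting_new_files = False      # 1. stop new files on the old cluster
    while pending:                       # 2. finish replicating what is queued
        path = pending.popleft()
        new.files[path] = old.files[path]
    if new.files != old.files:           # refuse to cut over to a stale replica
        raise RuntimeError("replica is not caught up")
    dns[alias] = new.name                # 3. point clients at the new cluster

old, new = Cluster("old-cluster"), Cluster("new-cluster")
dns = {"hadoop.example.internal": old.name}  # hypothetical alias

old.create("/data/a", "1")
new.files["/data/a"] = "1"         # already replicated earlier
old.create("/data/b", "2")
pending = deque(["/data/b"])       # still waiting to be replicated

switch_over(old, new, pending, dns, "hadoop.example.internal")
print(dns["hadoop.example.internal"])
```

Freezing file creation first is what bounds the downtime: once no new changes can arrive, the pending queue can only shrink, so the cutover waits only for the remaining backlog.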
According to Yang, the fast, internally built data replication tool was a key contributor to the success of the migration project.
In addition to its use for data migration, the replication tool is used to provide new disaster-recovery functionality to the Hadoop cluster, he said.
"We showed that it was possible to efficiently keep an active multi-petabyte cluster properly replicated, with only a small amount of lag," he said. "With replication deployed, operations could be switched over to the replica cluster with relatively little work in case of a disaster."
Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld.