Forking is often viewed as a last resort for software projects. However, the growth of GitHub and of distributed version-control systems, along with a reluctant acknowledgement from a key vendor of the popular Subversion version-control system, suggests that forking is going to become commonplace in 2011. Plan ahead to ensure your company is ready for this shift in development methodology.
Centralized version control rules the day -- today
Version-control systems (VCSes) fall into two broad categories: centralized and distributed. The merits of each have been widely debated.
A centralized VCS relies on a central server hosting the main or trusted version of a project, often referred to as the "trunk." Developers check code out and in against that central copy of the project. The complete repository, including its history, exists only on the central server; each developer holds just a working copy. A developer can see a colleague's change only after that colleague has checked it into the main trunk.
Distributed VCSes, on the other hand, are designed so that any repository can be considered the "main" or "trusted" version of the project. Each developer has the entire project's source code in a local repository on his or her computer. As such, developers on a team can exchange changes directly between their local repositories before merging those changes into a common, shared repository.
The vast majority of VCSes used in public open source projects and internal enterprise software projects are centralized. An analysis of more than 240,000 open source projects tracked by Ohloh shows an overwhelming preference for centralized VCSes such as Svn (as Apache Subversion is usually called), Svnsync, and CVS. Distributed VCSes such as Git, Mercurial, and Bazaar account for just 14 percent of usage.
Data from the 2010 Eclipse User Survey, which can be used as a proxy for usage patterns in internal enterprise software projects, reveals a similar preference for centralized VCSes. Distributed VCSes account for just 11 percent of the systems used by the 1,528 respondents to this question in the survey.
| Name | Responses | % of responses |
| Distributed VCS: Git/GitHub | 115 | 7.5 |
| Distributed VCS: Mercurial | 51 | 3.3 |
| Centralized VCS: Subversion | 989 | 64.7 |
| Centralized VCS: CVS | 214 | 14.0 |
| Centralized VCS: Other | 159 | 10.4 |
Source: 2010 Eclipse User Survey
This data suggests that open source projects are ahead of the curve in adopting distributed VCSes. However, the 3-percentage-point difference in usage between the two data sources could be well within the margin of error for each of the surveys.
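For readers who want to check the arithmetic, the following is a minimal sketch that recomputes the distributed share from the response counts in the Eclipse table and compares it with the 14 percent figure quoted for Ohloh. The variable names and the grouping of Git/GitHub and Mercurial as "distributed" are illustrative assumptions, not part of either survey.

```python
# Response counts copied from the 2010 Eclipse User Survey table.
responses = {
    "Git/GitHub": 115,
    "Mercurial": 51,
    "Subversion": 989,
    "CVS": 214,
    "Other (centralized)": 159,
}

# Assumption: these two entries are the distributed systems in the table.
DISTRIBUTED = {"Git/GitHub", "Mercurial"}

total = sum(responses.values())
distributed = sum(n for name, n in responses.items() if name in DISTRIBUTED)
eclipse_share = 100 * distributed / total

# Figure quoted in the article for the Ohloh analysis (not recomputed here).
ohloh_share = 14.0

print(f"Respondents: {total}")
print(f"Distributed share (Eclipse): {eclipse_share:.1f}%")
print(f"Gap vs. Ohloh: {ohloh_share - eclipse_share:.1f} percentage points")
```

The counts sum to the 1,528 respondents cited above, and the distributed share rounds to the 11 percent quoted in the text.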
Suffice it to say, distributed VCSes are not commonplace in today's software development practice. But that's about to change.