It starts small. One curious user downloads Git to see what all the fuss is about. Before long he’s extolling its virtues to his buddy, and they start pushing/pulling work back and forth. Next comes the developer version of a grassroots movement to use Git on more projects. Sure, various challenges pop up, but there are tools on the market and workflows that help assuage Git’s initial growing pains.
When you go global, however, things get messy. It’s easy enough to rely on various Git hosting approaches when everybody connects from the same geographic area. But what do you do when your developers and other stakeholders who need to contribute content to the project are scattered around the world? I recently heard a customer of ours explain that they literally cannot bring together all the talent required for a new product on a single continent. That’s a scary thought.
The purpose of this article is to explain some of the challenges and outline the key shortcomings to address when using Git in globally distributed teams, touching on:
- Distributed repositories
- Branch and workflow management
- Reliability and IT support
A number of third-party solutions attempt to address these in various ways, including Perforce Helix. In fact, Helix provides a comprehensive suite of tools that addresses all of those problems and more.
Replicate your content
There are key factors to consider in choosing a solution, the first and perhaps most obvious of which is simple network latency. The speed of light starts to seem slow when you’re connecting to a Git server on the other side of the world. Git’s protocol for transferring content across the network is pretty efficient, but there is no substitute for closing the distance to the repository. Nobody wants to work with servers in a different hemisphere.
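To put rough numbers on the latency point, here is a back-of-the-envelope sketch. It assumes light travels through optical fiber at about 200,000 km/s (roughly two-thirds of its speed in a vacuum) and ignores routing, queuing, and server time, so real figures will be worse. The function name, the 15,000 km path, and the count of ten round trips are illustrative assumptions, not measurements.

```python
# Rough lower bound on network latency imposed by distance alone.
FIBER_KM_PER_S = 200_000  # assumption: about 2/3 the speed of light in a vacuum


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over a fiber path of this length."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


if __name__ == "__main__":
    rtt = min_round_trip_ms(15_000)  # hypothetical cross-hemisphere fiber path
    print(f"one round trip: {rtt:.0f} ms")
    print(f"ten round trips: {10 * rtt / 1000:.1f} s")
```

Even in this best case, an operation that needs several exchanges with a distant server costs seconds that a local replica would reduce to almost nothing.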
It’s therefore important to look for solutions that allow replication to local repositories, preferably with as much automation as possible for pushes and pulls to and from the other offices. Centralized solutions like Perforce Helix, for example, supply such features via a federated architecture, which replicates content over WAN links to provide LAN access for users.
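For readers who want a feel for what the simplest form of such automation looks like, the sketch below keeps a read-only local mirror of an upstream repository using standard Git commands (`git clone --mirror` and `git remote update --prune`). It is a minimal illustration, not how Perforce Helix or any other product implements replication: it is one-way only, does nothing about pushes back to the upstream or about conflicts, and the function names and arguments are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def mirror_commands(upstream_url: str, mirror_dir: str, exists: bool) -> List[List[str]]:
    """Return the git commands that create or refresh a local read-only mirror."""
    if not exists:
        # First run: make a bare mirror of every ref in the upstream repository.
        return [["git", "clone", "--mirror", upstream_url, mirror_dir]]
    # Later runs: fetch new refs and drop refs that were deleted upstream.
    return [["git", "-C", mirror_dir, "remote", "update", "--prune"]]


def sync_mirror(upstream_url: str, mirror_dir: str) -> None:
    """Run the commands; meant to be called periodically by a scheduler."""
    exists = Path(mirror_dir).exists()
    for cmd in mirror_commands(upstream_url, mirror_dir, exists):
        subprocess.run(cmd, check=True)  # raises if git reports an error
```

Separating the command list from its execution keeps the logic easy to inspect and test without touching the network.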
Equally important is the question of handling conflicts. Git’s underlying data model was clearly designed with a single user in mind. This makes sense for the distributed nature of the system, of course, but it can lead to a variety of issues when you’re trying to keep multiple replicas in sync around the world. You’ll want to look for systems built to replicate your content to local repositories with more than a little intelligence, as every merge issue you can avoid is more time to spend on your own content.
Branching and workflow
The branching structure you choose for separating individuals’ work from testing, preproduction, production, and so forth will play a role in how many of those conflicts arise in the first place. Even in this day and age I’m regularly surprised at suboptimal choices that don’t provide clean mechanisms for handing off content to the next stage of the production pipeline. Be sure your branching strategy carves up your workflow at the joints before going global.
For example, a typical development arrangement might include components developed internally by other teams, third-party open source components, and contributions from multiple outsourced teams in addition to all the code written internally for a particular product. The branching structure needs to let each of these groups work independently, handing off units of work with the appropriate granularity yet in such a way that the devops team can easily assemble all the pieces necessary for a release.
Further, the pipeline downstream from development may include quality assurance, preproduction, production, compliance, and perhaps other steps. Not every project is a simple web app whose files can be pushed to a server and released. With more complicated projects, particularly in the enterprise, one simple “golden branch” likely won’t cut it.
QA may need to iterate on its own test cases and environments multiple times to validate the release, working back and forth with development as needed. Subsequently, preproduction may need to receive that QA-blessed content and iterate on its own validation, verification, or other procedures, while QA moves on to other projects. And heaven help you if compliance finds a fundamental breach without any good way to bring all the teams together to resolve the issue without blocking all other work.
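One way to picture a pipeline like this is as an ordered list of stage branches, with content handed off only from one stage to the next. The sketch below encodes that rule. The branch names and the strict one-step promotion policy are assumptions chosen for illustration; a real pipeline may add compliance or other stages and will need a path for sending fixes back upstream.

```python
from typing import Optional

# Hypothetical stage branches, ordered from development to release.
STAGES = ["development", "qa", "preproduction", "production"]


def next_stage(branch: str) -> Optional[str]:
    """Return the branch that `branch` hands off to, or None for the last stage."""
    i = STAGES.index(branch)  # raises ValueError for a branch outside the pipeline
    return STAGES[i + 1] if i + 1 < len(STAGES) else None


def can_promote(source: str, target: str) -> bool:
    """Allow a merge only from a stage into the stage directly after it."""
    return source in STAGES and target in STAGES and next_stage(source) == target
```

With a rule like this enforced by tooling rather than by habit, QA can keep iterating on its branch while preproduction works from the last content it received.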
Broadly speaking it’s useful to pick a proven Git workflow and let it guide your branching strategy, one that all your contributors can embrace. No matter what you choose, you’ll need clear conventions for the content that each repository represents, though this can be particularly tricky to maintain if your solution doesn’t give you the tools to tame “Git sprawl.” It’s critical to include the devops team in the discussion from the outset, as the resulting choices will largely dictate the general level of pain for all.
In short, nailing down the branching structure in advance can be crucial to maximizing productivity across multiple groups or departments working in parallel. You don’t want teams blocked or deadlocked waiting on other teams. Trust me, your future self will thank your present self later.
The importance of IT
Another oft-undervalued factor to consider is reliability. High availability and disaster recovery are often abbreviated as HA/DR, which might be doing organizations a disservice insofar as the acronym makes it too easy to forget what the “D” represents. Geographically distributed stakeholders radically increase the demand for uptime.
In a single time zone your IT personnel will typically have clear downtime in the evenings when emergencies can at least be addressed outside of regular working hours, unpleasant a task as that is. But there is no easy downtime when your organization is in every time zone. Having at least a hot backup can be crucial, and regularly validating the recovery plan is mission critical.
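As one small piece of validating a recovery plan, a scheduled job can restore a backup repository into a scratch location and check its integrity with `git fsck`. The sketch below assumes the backup is itself a Git repository reachable at a path or URL; the function names are hypothetical, and a real drill would cover far more than repository integrity (access controls, hooks, and the hosting system’s own data, for example).

```python
import subprocess
import tempfile
from typing import List


def restore_drill_commands(backup_repo: str, scratch_dir: str) -> List[List[str]]:
    """Return the git commands for a restore-and-verify drill."""
    return [
        # Restore every ref from the backup into a throwaway bare repository.
        ["git", "clone", "--mirror", backup_repo, scratch_dir],
        # Check the restored object database for missing or corrupt objects.
        ["git", "-C", scratch_dir, "fsck", "--full"],
    ]


def run_restore_drill(backup_repo: str) -> bool:
    """Return True only if both the restore and the integrity check succeed."""
    with tempfile.TemporaryDirectory() as tmp:
        scratch = f"{tmp}/restore.git"
        # all() stops at the first failing command, so fsck is skipped if the clone fails.
        return all(
            subprocess.run(cmd).returncode == 0
            for cmd in restore_drill_commands(backup_repo, scratch)
        )
```

Running a check like this on a schedule, and alerting when it returns False, turns the recovery plan into something that is exercised rather than merely documented.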
The way you organize and implement IT can make or break the experience for your contributors. It’s bad enough when a remote office can’t push or pull content to or from your Git hosting system; it’s worse when their only lifeline is an IT group an ocean away. It’s not enough to find a hosting system that gives you the aforementioned local performance and WAN synchronization. You also need great support available around the clock and around the globe.
Using Git with globally distributed teams comes with burdens above and beyond the typical challenges of global computer networks. You need to make sure all pieces are in place, and your workflows, content development, and content delivery points match the shape of your branching strategy and your organization. You need simple fallback plans for common sorts of failures, and you need a regularly validated recovery plan for complete disaster, the sort of thing insurance companies like to call “acts of God.” Plan ahead to protect your intellectual property.
John Williston is a veteran software developer for Windows, .Net, and the web. He is currently a product marketing manager at Perforce Software.
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to email@example.com.