GitHub for the rest of us

How GitHub is making complex collaboration work for non-programmers


Visualizing change

Version control and change visualization are deeply wired into the work of software development. Nowadays no competent programmer would even think of discussing a proposed new version of some code without a "diff" that shows exactly what will change.

That expectation is another part of the unevenly distributed future inaccessible to most other knowledge workers. It's a fundamental kind of digital literacy, relevant to everyone in an organization, but not yet pervasive. Obstacles to its spread are both cultural ("we've never done it that way before") and technical ("my work product is not a text file").

The digital artifacts of software development are still files containing lines of text that hark back to punch cards. And we still visualize changes to those files on a line-by-line basis. Compilers and IDEs understand code in terms of modules and methods, but version control systems don't share that understanding. Attributing a change to module X or method Y, and observing such change over time, is cognitive grunt work that could in theory be machine-supported but in practice isn't.
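To make the line-by-line view concrete, here is a minimal sketch using Python's standard difflib module. The file name cart.py and the function total are hypothetical, invented for illustration. The diff reports which lines changed; nothing in it says "the method total changed," which is the attribution work left to the reader.

```python
import difflib

# Two versions of a hypothetical file, split into lines (illustrative only).
old = """def total(items):
    result = 0
    for item in items:
        result += item.price
    return result
""".splitlines(keepends=True)

new = """def total(items):
    result = 0
    for item in items:
        result += item.price * item.quantity
    return result
""".splitlines(keepends=True)


def line_diff(before, after, name="cart.py"):
    """Return a unified diff of two line lists, as a list of strings."""
    return list(
        difflib.unified_diff(before, after, fromfile=f"a/{name}", tofile=f"b/{name}")
    )


diff_lines = line_diff(old, new)
print("".join(diff_lines), end="")
```

Removed lines are prefixed with a minus sign and added lines with a plus sign. The tool compares text, so it has no notion of modules or methods.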

This impedance mismatch exists for deep historical reasons and won't be resolved anytime soon. Meanwhile there are two ways to address it, and GitHub is pursuing both.

One approach is to convert rich documents into text files. That's a common practice in government agencies that have adopted GitHub for collaboration, according to Ben Balter. He's created a tool that can convert the Word documents widely used in such agencies into Markdown, a plain-text format used on GitHub and in many other environments. That workaround is less than ideal for two reasons. Roundtripping documents through format converters is perilous -- and Markdown isn't a standard format. There are many variants; in fact, the one used on GitHub is known as GitHub Flavored Markdown.

Ideally, GitHub would understand rich formats, and there's been progress on that front. It's long been possible to compare changed images in a visual way. A year ago, "prose diffs" enabled inline color-coded highlighting of differences between HTML renderings of Markdown files. This approach also helped make differences in tabular formats like CSV data and HTML tables more legible, but didn't leverage any deep awareness of document structure.

Such awareness is now available for one format: GeoJSON. It encodes geospatial information in a JSON format that GitHub uses not only to display a map as one rendering of the data, but also to show the map's revision history visually, using a slider to scroll through versions. Extending that approach to Word documents, PDF files, and spreadsheets would make GitHub-style collaboration vastly more appealing to people whose work products are expressed in those formats.
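For readers who haven't seen the format, the following sketch builds two versions of a small GeoJSON document and compares them feature by feature instead of line by line. It is an illustration of structure-aware comparison under simple assumptions, not a description of how GitHub implements its map view. The helper names, place names, and coordinates are all hypothetical.

```python
import json


def point_feature(name, lon, lat):
    """Build a GeoJSON Feature holding one Point (longitude comes first)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }


def feature_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


def changed_features(old, new):
    """Names of features added, removed, or altered between two versions.

    Assumes every feature carries a unique "name" property (an assumption
    of this sketch, not a GeoJSON requirement).
    """
    before = {f["properties"]["name"]: f["geometry"] for f in old["features"]}
    after = {f["properties"]["name"]: f["geometry"] for f in new["features"]}
    names = before.keys() | after.keys()
    return sorted(n for n in names if before.get(n) != after.get(n))


v1 = feature_collection([point_feature("Office", -122.40, 37.78)])
v2 = feature_collection([
    point_feature("Office", -122.41, 37.78),  # moved slightly
    point_feature("Depot", -122.27, 37.80),   # newly added
])

print(json.dumps(v2, indent=2))
print(changed_features(v1, v2))
```

Because the comparison works on features, it can say which places changed, something a line-based diff of the same JSON text cannot do directly.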

GitHub as a platform

GitHub can't be all things to all of its millions of hosted projects, but it can enable others to build on top of it and integrate with it. Tools that use GitHub APIs to wrap project management features around existing repositories include HuBoard and ZenHub.

Continuous integration systems like Travis and Jenkins can use the status API to report the outcome of tests associated with a commit. Moreover, the CRUD API enables programmatic commits that create, update, or delete files in a repository. Ben Balter has used it, for example, in an application that takes input from HTML forms and appends it to a CSV file in a GitHub repository. 
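The two kinds of requests described above can be sketched as follows. This is not Ben Balter's application code. It only builds the requests, assuming GitHub's REST endpoints for commit statuses (POST to /repos/{owner}/{repo}/statuses/{sha}) and for repository contents (PUT to /repos/{owner}/{repo}/contents/{path}). The owner, repository, path, and context names are hypothetical, and actually sending the requests would also require an authentication token, which is omitted here.

```python
import base64
import csv
import io

API = "https://api.github.com"


def status_request(owner, repo, sha, passed, context="ci/example"):
    """Return (method, url, body) reporting a test outcome for one commit."""
    body = {
        "state": "success" if passed else "failure",
        "description": "Tests passed" if passed else "Tests failed",
        "context": context,  # hypothetical label for the reporting system
    }
    return "POST", f"{API}/repos/{owner}/{repo}/statuses/{sha}", body


def append_row_request(owner, repo, path, current_b64, current_sha, row):
    """Return (method, url, body) for a commit that appends one CSV row.

    current_b64 and current_sha are the file's base64 content and blob SHA,
    as returned by a prior GET on the same contents URL.
    """
    text = base64.b64decode(current_b64).decode("utf-8")
    if text and not text.endswith("\n"):
        text += "\n"
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)  # handles quoting
    text += buffer.getvalue()
    body = {
        "message": f"Append row to {path}",
        "content": base64.b64encode(text.encode("utf-8")).decode("ascii"),
        "sha": current_sha,  # tells GitHub which version is being updated
    }
    return "PUT", f"{API}/repos/{owner}/{repo}/contents/{path}", body


existing = base64.b64encode(b"name,email\n").decode("ascii")
method, url, body = append_row_request(
    "octo-org", "signups", "data/signups.csv", existing, "abc123",
    ["Ada", "ada@example.com"],
)
print(method, url)
print(base64.b64decode(body["content"]).decode("utf-8"), end="")
```

Each such PUT becomes an ordinary commit, so form submissions collected this way carry the same history and attribution as any other change to the repository.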

Of course GitHub isn't the platform for everybody. O'Reilly Media's Atlas, a hosted system for publishing books in multiple formats, is built directly on top of Git. But for many nontraditional uses of Git, GitHub's interface -- and evolving extensions to it -- will be a powerful combination.

A culture of collaboration

Like Git, GitHub enables many styles of collaboration. It encourages key best practices, like issuing pull requests from disposable branches, but doesn't try to legislate others. Consider labels, which are keywords assigned to issues and pull requests. (Other social systems might call these tags, but in Git tags identify specific points in a repository's history.) Nothing requires you to use labels, but if you do -- and if your team adopts a thoughtful and consistent vocabulary -- you enable filtered views that help everybody make sense of a project.

Crafting useful commit messages is another way to be a helpful collaborator. Programmers joke about commit messages like "I changed some stuff" and have, over the years, learned how and why to narrate their work more effectively. Most other knowledge workers, though, aren't used to formalizing small units of work, contributing them to widely shared spaces, and describing them in ways meaningful to others.

These practices, layered on a common understanding that all shared artifacts ought to be versioned and all changes carefully controlled and documented, should never have been restricted to software development. We're all doing distributed work that must be coordinated with care, attention to detail, and team awareness. Git made that possible for programmers. GitHub is making it possible for the rest of us.

Copyright © 2015 IDG Communications, Inc.
