If the open source model has a sweet spot, it's in programming tools. Linus Torvalds's fabled "world domination" on the desktops of clerks or CEOs may never arrive, but it's already here on the computers of programmers everywhere. Even in the deepest corners of proprietary stacks, open source tools can be found, often dominating.
The reason is clear: Open source licenses are designed to allow users to revise, fix, and extend their code. The barber or cop may not be familiar enough with code to contribute, but programmers sure know how to fiddle with their tools.
The result is a fertile ecology of ideas and source code, fed by the enthusiasm of application developers who know how to "scratch an itch." Programmers are a knowledgeable and opinionated bunch; open source lets them share their knowledge and implement what they want.
Here is a very unscientific survey of worthwhile open source tools that have caught our eye. Some are entirely new projects; others are old favorites that continue to generate new ways to surprise us as they morph to support the latest programming trends.
This is the beauty of open source. Tweak and recompile, and your old programming tool can be new again.
Ruby may be the second most popular language on GitHub, but that won't do you any good if you want to program for the iPhone, a platform that prefers Objective-C, the way God intended when he first created the NeXT machine.
Rhomobile Rhodes is an open source platform for bundling up Ruby websites and stuffing them into an iPhone app. You can even use jQuery Mobile to handle the layout if you wish. It's like building a Web app, but you have to remember that the user has big fat fingers instead of a much more precise mouse pointer.
While many developers continue to use CVS and Subversion, a number of projects are moving to Git, a source-control tool that works well for less centralized teams where a dominant central repository might not exist.
Git makes practically every copy its own central repository and offers sophisticated tools for merging the resulting proliferation of repositories. With SVN or CVS, users check out just a copy, a subordinate version of the code that must eventually rejoin the center. Git users, on the other hand, create stand-alone repositories with all the rights and privileges of the center. With Git, you can create four or five repositories on your development box and eventually merge them all. To use an analogy, Git is like democracy, while CVS represents the old feudal world.
Of course, not everyone welcomes the flexibility Git provides. Some see this freedom as an invitation to confusion. Proponents counter that you're not required to use all of Git's power, but it's there to help out when the project requires more than a central government. Some developers have created Repo to combat the complexity of Git. A tool for pushing changes through multiple repositories, Repo is, in a way, the re-emergence of central control for the Git ecosystem.
Open source programming tool on the rise: Gerrit
The rise of code reviews at larger development shops could lead to only one thing: the creation of a tool to automate the process. Enter Gerrit.
Meant to work closely with Git and Repo, Gerrit allows code validators to send comments to the central Git repository, creating an extensive meta layer of discussion on top of the code itself. In the old days, discussions took place in header comments, but by separating comments to a dedicated layer, Gerrit allows for a more sophisticated discussion that doesn't force future readers to wade through old change discussions before getting to the code.
The power of Hadoop was put publicly on display in the form of Watson, IBM's "DeepQA" machine that recently beat the two greatest human champions in a game of "Jeopardy." The framework was used to orchestrate dozens of algorithms searching for an answer in parallel.
Hadoop is a general tool kit for splitting work into pieces that can be computed on separate servers, then joined together into a final product. Google pioneered the idea when it needed to choreograph a vast army of servers to crawl the Web, and now Hadoop offers a general framework that's being used again and again in similar situations.
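The split, compute, and join pattern described above can be sketched in a few lines of plain Python. This is not Hadoop's API: the function names `map_chunk`, `reduce_counts`, and `word_count` are made up for illustration, and a local thread pool stands in for the separate servers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map step: count the words in one chunk of the input."""
    counts = defaultdict(int)
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1
    return dict(counts)


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into one result."""
    total = defaultdict(int)
    for partial in partials:
        for word, n in partial.items():
            total[word] += n
    return dict(total)


def word_count(lines, workers=4):
    """Split the input, count each chunk separately, then join the results."""
    # Ceiling division, so every line lands in exactly one chunk.
    size = max(1, -(-len(lines) // workers))
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    # Assumption: threads play the role of the separate servers here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)
```

Calling `word_count(["a b a", "b c"], workers=2)` counts each line in its own worker and merges the two partial tallies. A real cluster adds what this sketch leaves out: moving data between machines, retrying failed pieces, and storing the input in a distributed file system.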
Hadoop's original simple core may be several years old now, but there's a great deal of interest in spinoffs that bundle Hadoop with code for tackling specific problems. Mahout, for one, is a scalable machine-learning framework that analyzes large data sets for patterns that might emerge. Hive offers a data warehouse that can be queried with parallel search using HiveQL. This method is fast becoming a popular approach for dealing with massive quantities of Web logs.
These plug-ins are usually pretty easy to string together and glue into a coherent display. There are even some bigger collections of plug-ins that harmonize the widgets. jQuery Mobile, for instance, is dedicated to producing applications that run well on the small screens of smartphones.
Open source programming tool on the rise: Emacs LISP
Every so often, I come back to emacs and recognize just how wonderful it is, 20-odd years after it first took hold. Even today, it is easier to record macros, rebind keys, and customize the tool kit in emacs than in many of the bigger, flashier programming tools out there.
While it's probably not fair to call emacs "new" or "rising," the platform isn't dropping off anyone's radar. GitHub ranks "emacs lisp" as the 13th most popular language based on projects and interest. By comparison, C# is 12th. Most of the code is built by programmers and for programmers only. One project, Rinari, for instance, turns emacs into a Ruby IDE. Another, MozRepl, allows Mozilla users to monkey around with the guts of Firefox using emacs.
It's hard to write about programming tools without mentioning Eclipse. While the IDE is well-established, plug-ins continue to re-invigorate it. Take, for instance, the fact that Eclipse plug-ins exist for practically every important language available. PHP, Ruby, Python, and C all live comfortably in this IDE, thanks to the evolving Eclipse plug-in ecosystem.
Almost as important as the plug-ins are the sophisticated ecologies that support them, many of which are open source. The Eclipse Marketplace is one such site devoted to helping users discover the tools they need. The site includes a social networking layer, showing who likes a particular plug-in and which plug-ins offer similar or competing solutions, thereby opening your search beyond simple lists of the most popular or the most downloaded.
Most people see the browser as a mechanism for getting to Facebook or finding directions from Google Maps. Programmers, however, are increasingly able to take advantage of programming tools built right into the browser, with the Firefox plug-in Firebug leading the way.
The Firebug ecology is so fertile that it has spawned a subcategory of plug-ins that extend Firebug itself, often in surprising ways. FirePython, for example, doesn't actually live on the browser; it gets inserted into the server, where it delivers debugging information to the browser.
Thanks in some measure to Firebug's popularity among developers, all the major browsers now offer detailed information about the images, scraps of code, and whatnot that make up the page on view -- an approach that will only become more common as more software is written to take advantage of increasingly robust browsers.
Programmers often say, "I like the libraries and the distribution and the reliability of X, but I can't stand the syntax." This is why we now have a proliferation of preprocessors that modify the code before the compiler takes hold. They let you target language X while writing in a somewhat different syntax, because whatever you write is converted into X after you write it and before the compiler reads it.
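A toy example makes the idea concrete. The sketch below invents a one-keyword dialect in which `unless <cond>:` means `if not (<cond>):`, rewrites it into ordinary Python, and only then hands the result to the real compiler. The dialect and the names `preprocess` and `run` are hypothetical; production preprocessors parse the source properly instead of matching lines with a regular expression.

```python
import re

# Hypothetical dialect: "unless <cond>:" stands for "if not (<cond>):".
_UNLESS = re.compile(r"^(\s*)unless\s+(.+):\s*$")


def preprocess(source):
    """Rewrite the toy dialect into plain Python, one line at a time."""
    out = []
    for line in source.splitlines():
        match = _UNLESS.match(line)
        if match:
            indent, cond = match.groups()
            line = f"{indent}if not ({cond}):"
        out.append(line)
    return "\n".join(out)


def run(source):
    """Preprocess first, then let the real compiler see only plain Python."""
    namespace = {}
    exec(compile(preprocess(source), "<dialect>", "exec"), namespace)
    return namespace
```

Feeding `run` the three lines `x = 3`, `unless x > 5:`, and an indented `y = 'small'` leaves `y` set in the returned namespace, even though Python itself has no `unless` keyword.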
Over the past few years, tools for building Java projects have evolved from something one person ran on a desktop occasionally into tools that run on a server every few seconds to coordinate the work of a team of programmers. The server constantly monitors the source tree, executing an Ant or Maven script whenever new code appears. Compilation and test results are then posted for all the developers to see. Fancy dashboards that display bugs and fixes in real time are a popular attraction.
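The monitoring loop at the heart of such a server can be sketched as follows. This is a minimal illustration, not how any of the tools named in this article work internally: `tree_fingerprint` and `watch_and_build` are hypothetical names, polling by hashing every file is the simplest possible change detector, and the build command is whatever the caller supplies.

```python
import hashlib
import subprocess
import time
from pathlib import Path


def tree_fingerprint(root):
    """Hash the relative path and contents of every file under root."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest.update(str(path.relative_to(root)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def watch_and_build(root, build_cmd, interval=5.0, max_polls=None):
    """Run build_cmd whenever the tree changes; return the build exit codes.

    The first poll always triggers a build because nothing has been seen yet.
    max_polls=None polls forever, which is how a server would run it.
    """
    results = []
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = tree_fingerprint(root)
        if current != last:
            # New code appeared: run the build script and record its status.
            results.append(subprocess.run(build_cmd, cwd=root).returncode)
            last = current
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return results
```

A call such as `watch_and_build("project", ["ant", "test"])` would rerun a hypothetical `test` target each time a file changes. The sketch assumes build output is written outside the watched tree; otherwise each build would look like a new change. Real servers usually ask the repository for new commits instead of hashing files, and they publish the results instead of just collecting exit codes.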
The wealth of rising open source projects in this area indicates that programmers still haven't found the optimal mix of features. CruiseControl is the original open source build tool, and it is well-integrated with most repositories and bug databases. Apache's Continuum is highly integrated with Maven, and users of Continuum like to say that all you need to do is "point the pom.xml file at the repository." Another popular project, once known only as Hudson, is more open to using build scripts written for Ant or a few other tools. In late 2010, the team broke in two: the group dominated by Oracle's paid developers kept the name "Hudson," while the others are creating a new open source build management tool called Jenkins.
Many users stress that constantly building the software and often deploying it almost immediately afterward increases the harmony of the team and prevents programmers from drifting down different paths that require too much time to harmonize. By continuously rebuilding the software and applying unit tests, the team is more likely to converge.
Graphics processing units are best known for popping up triangles from mythical worlds where people are always shooting at each other. This is rapidly changing as both video card manufacturers and programmers are realizing that the chips are massively parallel computers primed to manipulate almost any code, not just game realms. Scientists everywhere are learning that the cool graphics card used to play Grand Theft Auto can also run simulations to help cure humans. Many scientific problems can be structured to include a huge number of events that happen simultaneously, a perfect job for a massively parallel computer that does things simultaneously: in other words, a video card.
The OpenVidia repository is filled with projects that perform image recognition, searching, and more. It makes a perfect excuse for every programmer to ask their boss for an expensive graphics card with the potential to generate a very high frame rate -- er, I mean a very high rate of curing cancer in simulations.
The NoSQL trend started several years ago, but it keeps heating up as more websites recognize that their future is in vast quantities of data that don't need all of the belts and suspenders protections offered by serious databases like Oracle.
The latest tools make it easier to deploy NoSQL into clouds, many of which are now sold directly to the IT department. Amazon's SimpleDB can be paid for by the byte, and many other teams are offering additional NoSQL tools as services. Cassandra, for example, is supported by DataStax. MongoDB has inspired more than a handful of cloud hosts. The tools continue to proliferate, and there are almost too many to list. Thank goodness someone is maintaining a list of all the NoSQL databases.