Knorr: That kind of data reconciliation is the dirtiest job in IT.
Nath: It gets dirtier particularly in home-grown applications, because the semantics of any particular data element are unclear. For example, if you put a flag in a particular field, it may mean one thing; if you put a date in it, it means something else. That level of interpretation is sometimes hidden within the code, so it's not obvious in any form, which mostly means that the metadata associated with this information is very, very lightweight. It exists either in an implicit form within the code or in some user's head or in some specification. It's not available for anybody to interpret.
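To make the overloaded-field problem concrete, here is a minimal Python sketch of the kind of interpretation rule Nath describes as buried in code. The field, its "Y"/"N" flag convention, the date format, and every name below are hypothetical, chosen only for illustration; nothing here comes from Trigo or zAgile.

```python
from datetime import date, datetime
from typing import NamedTuple, Union


class Interpretation(NamedTuple):
    meaning: str                     # what the raw value turned out to mean
    value: Union[bool, date, str]    # the value in a typed form


def interpret_status_field(raw: str) -> Interpretation:
    """Spell out the implicit rule: a flag means one thing, a date another.

    Hypothetical convention: "Y"/"N" says whether a product is discontinued,
    while an ISO date says when it was discontinued.
    """
    text = raw.strip()
    if text.upper() in {"Y", "N"}:
        return Interpretation("is_discontinued", text.upper() == "Y")
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        # Neither a known flag nor a date: surface it instead of guessing.
        return Interpretation("unknown", text)
    return Interpretation("discontinued_on", parsed)
```

Once the rule is written down as data like this, the "meaning" label is metadata that another system can read, rather than something that lives only in the code or in a user's head.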
Knorr: So in the process of building Trigo, these problems came into sharp relief. And that was the genesis of zAgile?
Nath: Yes, but the challenge of integrating product information was only half of the inspiration. I had another problem: managing 160 developers in four countries. It was always impossible to keep teams in sync with respect to methodology, processes, timelines, status, and so on.
It was frustrating because I knew that each tool and application had the needed information in its own repository, but there was no efficient way to aggregate and reconcile all that data into a real-time or even a weekly report. As it turned out, the problems of integrating multiple development teams were similar to the ones we were encountering with integrating product information.
But information management highlights only one dimension of the problem: taking data from multiple sources and creating a consistent and centralized repository out of it. There's a bigger problem, and it keeps getting bigger, which has to do with integrating social collaboration and processes in the enterprise. So how do you pull all of that together? The problems of integration were compounding before we even tackled one dimension of it.
Knorr: So how did you determine that the Semantic Web held the solution?
Nath: I wasn't happy with my other options. When I looked around at existing solutions, they were always custom. Whether it was ETL, application integration with a service bus, or even data federation, it was the same thing time and again. We were required to map every single artifact, and even then there wasn't a holistic, contextual relevance to the content.
Whenever you integrate content, you have the same problem. The integration entails mapping data to data and does not capture any intrinsic understanding of the process or context of how the data is related. It's essentially dumb mapping. We needed to represent taxonomies and manage attribute-level inheritance, which aren't very natural at all using conventional database technology. And I saw that semantic technologies could do not only that but a heck of a lot more -- and with ease.
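As a rough illustration of what "taxonomies with attribute-level inheritance" means, the Python sketch below walks a small category tree and collects the attributes each category inherits from its ancestors. The category names and attributes are invented for the example, and a real semantic platform would express this in an ontology language rather than in dictionaries.

```python
from typing import Dict, List, Optional

# Hypothetical taxonomy: each category points to its parent (None at the root).
PARENT: Dict[str, Optional[str]] = {
    "Product": None,
    "Electronics": "Product",
    "Laptop": "Electronics",
}

# Attributes declared directly on each category (also hypothetical).
OWN_ATTRIBUTES: Dict[str, List[str]] = {
    "Product": ["sku", "name"],
    "Electronics": ["voltage"],
    "Laptop": ["screen_size"],
}


def inherited_attributes(category: str) -> List[str]:
    """Return a category's attributes, ancestors first, then its own."""
    chain: List[str] = []
    node: Optional[str] = category
    while node is not None:          # climb from the category up to the root
        chain.append(node)
        node = PARENT[node]
    attributes: List[str] = []
    for ancestor in reversed(chain):  # root first, so general attributes lead
        attributes.extend(OWN_ATTRIBUTES.get(ancestor, []))
    return attributes
```

Here `inherited_attributes("Laptop")` yields `["sku", "name", "voltage", "screen_size"]`. In a relational schema the same behavior typically needs extra tables or recursive queries, which is the awkwardness Nath is pointing at.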
The fundamental premise of our platform and architecture is integration based on semantic reconciliation. That gives you the ability to define, regardless of what the tool is or what's coming from the tool, a single, agreed-upon classification scheme. Not only that, you can capture a lot more information. The richness of metadata that you can capture simply isn't there with relational models.
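The idea of a single, agreed-upon classification scheme can be sketched very loosely in Python. In this sketch a plain lookup table stands in for the ontology, and the tool names, local type names, and shared concepts are all assumptions made for illustration, not zAgile's actual model.

```python
from typing import Dict, List, Tuple

# Hypothetical shared scheme: (tool, tool-specific type) -> agreed concept.
SHARED_SCHEME: Dict[Tuple[str, str], str] = {
    ("issue_tracker", "Bug"): "Defect",
    ("issue_tracker", "Story"): "Requirement",
    ("wiki", "SpecPage"): "Requirement",
}


def reconcile(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Tag each tool-specific record with its concept in the shared scheme."""
    reconciled = []
    for record in records:
        key = (record["tool"], record["type"])
        # Anything the scheme does not cover is flagged, not silently dropped.
        concept = SHARED_SCHEME.get(key, "Unclassified")
        reconciled.append({**record, "concept": concept})
    return reconciled
```

With this in place, a wiki "SpecPage" and a tracker "Story" both come out as "Requirement", regardless of which tool produced them. The sketch only covers the classification step; it does not capture the richer relationships and metadata that Nath attributes to semantic models.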
Knorr: Can you give me an example?
Nath: Sure. Here's a simple one. We just implemented a solution for a customer yesterday that would be impossible using conventional integration schemes. On this customer's internal website, there were pages in a wiki that represented software requirements, which were in turn implemented using an issue tracking system. There's a very simple point-to-point integration scheme: A page implicitly represents a requirement definition, and on this page they would add links to implementation tasks for easy reference.