Garlik applies Semantic technology to consumer data

UK startup wards off ID thieves by tracking identity data

One of the most exciting developments at this year's Semantic Technology Conference is the blossoming of new and robust startups that are using Semantic Web concepts as the foundation for their business.

One such company that's getting a lot of attention at this year's show is Garlik, a U.K.-based startup in the consumer privacy space. Garlik's DataPatrol product uses Semantic Web technology to track personal information for more than 50,000 customers in the United Kingdom, and the company has plans to expand the service into other markets.

CEO Tom Ilube sat down with InfoWorld at the Semantic Technology Conference to talk about his company and about the decision to build a new business on Semantic Web technology that has yet to find widespread adoption.

InfoWorld: Tell us a little about Garlik and how the company got started.

Tom Ilube: Garlik got started in 2005 when I left my role as CIO of a large U.K. bank, Egg. We were an online bank with four million customers. We grew from nothing to that size in about five years, and we were a pure consumer play. When I left Egg and was thinking about the next thing I wanted to do, I wanted it to be a consumer-oriented initiative. I was thinking five years down the road and looking around for what would be one of the biggest issues -- something that seemed like it would grow and grow -- and decided that the whole issue of personal information and personal privacy was it.

InfoWorld: Good instinct.

TI: It wasn't rocket science. You could just look around and see 'Yeah, people are putting all this information out there.' There didn't seem to be companies taking a very strong consumer position in that space. There were a lot of companies that were collecting your information and selling it to other companies, but not many saying, 'We're out here to protect consumer identities.' So we took that position deliberately but also approached it from the position of protection and promotion -- that there would be some information you want to protect and other information about yourself that you want to promote. We tried to think of Garlik as a brand that can span anything you want to do to look after your profile and your identity.

We also looked at what technology we can deploy to go on that journey. We took about six months and looked at a tremendous number of different technologies. We were asking ourselves, 'What will the world look like in five years when it comes to data?' One thing we observed was the shift from a world of documents to one of data. You don't see that up close every day, but if you take a step back you can see all the data that's out there in different formats, like podcasts and video and underlying databases that are exposed. Just continue that trend another five years and you see the Web as a whole dominated by data and information in that form with a little rump of documents, where that rump is actually the whole Web as we know it today. So the documents don't disappear, but they become dominated by that Web of data, and most of the information we're interested in is in that Web.

InfoWorld: How does Semantic Web technology help you with that problem?

TI: We asked ourselves, 'What's emerging to make sense of this Web of data?' and that's what led us to the Semantic Web. Because the Semantic Web assumes data rather than documents. It assumes that what's important is the metadata that describes data and attaches meaning to data. It allows you to ask questions about the meaning of data, and that's what attracted us.

InfoWorld: What's the core intellectual property at Garlik?

TI: I guess our core IP is our understanding of the consumer and their demands. We're very focused on the consumer and what the consumer thinks and how they behave. What we had to do was build an underlying technology platform from the ground up that assumes Semantic marked-up data, and if it isn't marked up, to turn it into that so we can work with it. We do lots of harvesting in a directed way -- harvesting things that are interesting to us. So we tailor our harvesting by the semantics we're interested in, and we're not interested in anything else. So we don't have a problem with 'How do we harvest the whole Web?' We're not interested in the whole Web, we're just interested in our customers and their information -- where is it and how can we extract it, turn it into RDF [Resource Description Framework] and reason over it? Semantic Web technology allows us to do those useful things for our customers and helps our customers gain a better understanding of their personal information.
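
[Editor's illustration: the extract, convert-to-RDF, and reason steps Ilube describes can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Garlik's platform: the records, the example.org namespace, the property names, and the "shared email means same person" rule are all hypothetical, and plain Python tuples stand in for RDF triples.]

```python
# Illustrative sketch only: hypothetical data and names, not Garlik's code.
from typing import Optional

Triple = tuple[str, str, str]   # (subject, predicate, object), as in RDF
EX = "http://example.org/"      # placeholder namespace (assumption)


def record_to_triples(record_id: str, record: dict[str, str]) -> list[Triple]:
    """Turn one harvested record into subject-predicate-object triples."""
    subject = EX + record_id
    return [(subject, EX + field, value) for field, value in record.items()]


def match(triples: list[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> list[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def infer_same_person(triples: list[Triple]) -> list[Triple]:
    """Toy rule: records that share an email value describe the same person."""
    by_email: dict[str, list[str]] = {}
    for s, p, o in triples:
        if p == EX + "email":
            by_email.setdefault(o, []).append(s)
    inferred = []
    for subjects in by_email.values():
        for a in subjects:
            for b in subjects:
                if a < b:  # emit each pair once
                    inferred.append((a, EX + "sameAs", b))
    return inferred


# Hypothetical harvested records from two kinds of source.
harvested = {
    "listing1": {"name": "A. Example", "email": "a@example.org", "source": "directory"},
    "listing2": {"name": "Alex Example", "email": "a@example.org", "source": "forum"},
    "listing3": {"name": "B. Sample", "email": "b@example.org", "source": "directory"},
}

graph = [t for rid, rec in harvested.items() for t in record_to_triples(rid, rec)]
links = infer_same_person(graph)
```

[Here `match(graph, p=EX + "email", o="a@example.org")` finds every record exposing that address, and `links` holds the inferred connection between the two listings. A production system would use a real RDF store and query language rather than list scans.]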

InfoWorld: You're working with technologies that are still in their infancy in some ways. What types of things have you learned and what roadblocks have you encountered?

TI: One question you always have with technology that's still at the research stage is, 'How real is this emerging technology?' We've all had that experience where someone describes some new technology and you say, 'Yeah, I'll take one of those,' and then you realize that the description is all there is. What we found with Semantic Web technology is that most of it is real, but it's not industrial strength and not scalable. In some cases, the guys solving problems are working on things that are problems, but that don't need to be solved for a few years. So you hear a lot about developing more complex logic-based reasoning for a Semantic world, but until you have information in a form you can work with to do basic reasoning, there's no need to worry about the more complex stuff. So, for us, it's been a process of winnowing out stuff that we don't need to worry about and finding bits of the jigsaw that we do need to worry about and then taking those and making them industrial strength -- with reputation, redundancy, and the things you have to have in big corporate organizations.

InfoWorld: You're a startup. What do you see as the role of the big IT firms like Oracle and BEA and IBM in promoting Semantic Web technology?

TI: It's interesting to see big IT vendors focusing on the big corporate customers. I think that they'll find those are tough nuts to crack. I think what you often find -- when I look at adoption of Web technology with big corporate customers -- is they get validation by looking at the startup community. You know, 'If Google and Yahoo are using it, we should be as well.' I would encourage big IT vendors to focus on startups and make sure they're using this technology. When I was a CIO at Egg, before I bought a mainstream platform, I'd look around and make sure others were using it too, and the early adopters are going to be the startups, so the big vendors need to engage with that community.

InfoWorld: What advice would you give to an enterprise IT person who's considering incorporating Semantic Web technologies?

TI: My advice would be to carve out a short two- or three-month pilot project that involves data integration and use that to show that using RDF and ontologies and some harvesting or access to databases can very quickly repurpose siloed information to a graphical front end. I'd also suggest that they focus more than the Semantic Technology conference does at the moment on the front end. This community is highly technical and, as with all technical communities, the front ends can be kind of grey. In the enterprise, you really need to sell this stuff, so you need a UI that will cause people to go 'Oh, wow. How did you do that?' and realize 'We've been trying to integrate these five databases for the last three years and you did it in three months.'
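
[Editor's illustration: the kind of pilot Ilube suggests, pulling siloed records into one graph through a shared vocabulary, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the two "databases" are in-memory lists, the column names and mappings are invented, and tuples stand in for RDF triples.]

```python
# Illustrative sketch only: hypothetical silos, columns, and vocabulary.
from collections import defaultdict

# Two siloed sources that name the same customer and fields differently.
crm_rows = [{"cust_id": "42", "full_name": "A. Example", "mail": "a@example.org"}]
billing_rows = [{"customer": "42", "balance_gbp": "120.50"}]

# Map each silo's columns onto shared property names (an ontology in miniature).
CRM_MAP = {"full_name": "name", "mail": "email"}
BILLING_MAP = {"balance_gbp": "balance"}


def to_triples(rows, key_column, column_map):
    """Convert rows to (subject, property, value) triples using a column mapping."""
    triples = []
    for row in rows:
        subject = "customer/" + row[key_column]  # shared identifier across silos
        for column, prop in column_map.items():
            if column in row:
                triples.append((subject, prop, row[column]))
    return triples


def unified_view(triples):
    """Group triples by subject to get one merged record per customer."""
    view = defaultdict(dict)
    for subject, prop, value in triples:
        view[subject][prop] = value
    return dict(view)


graph = (to_triples(crm_rows, "cust_id", CRM_MAP)
         + to_triples(billing_rows, "customer", BILLING_MAP))
view = unified_view(graph)
```

[After this runs, `view["customer/42"]` combines the name and email from the first silo with the balance from the second. The merged records are what a graphical front end would then present; adding a third silo means adding one more mapping rather than reworking the existing ones.]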