What are graph databases good for? Here's a killer app

Still using an RDBMS for friend-of-a-friend queries? Big mistake. Enlist a graph database using Neo4j instead


For example: Tommy is friends with Billy and Bryan. Bryan is friends with Kelly and Jamie. Jamie is friends with Steven. Billy is friends with Keith. We can express the "distance" of each relationship as the number of hops from Tommy:

Tommy = 0 (the origin)

Billy = 1

Bryan = 1

Kelly = 2

Jamie = 2

Keith = 2

Steven = 3
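As a rough illustration of how those distances fall out of the friendships, here is a minimal in-memory sketch using a breadth-first search. It is not taken from the Granny code; the class name FriendDistance and its methods are made up for this example, and friendships are assumed to be mutual.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper, not part of the article's code.
class FriendDistance {
    // The friendships from the example, treated as undirected.
    static final String[][] FRIENDSHIPS = {
        {"Tommy", "Billy"}, {"Tommy", "Bryan"},
        {"Bryan", "Kelly"}, {"Bryan", "Jamie"},
        {"Jamie", "Steven"}, {"Billy", "Keith"}
    };

    // Build an adjacency list, storing each friendship in both directions.
    static Map<String, List<String>> adjacency(String[][] pairs) {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            graph.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
            graph.computeIfAbsent(pair[1], k -> new ArrayList<>()).add(pair[0]);
        }
        return graph;
    }

    // Breadth-first search: maps every reachable person to the hop count from origin.
    static Map<String, Integer> distances(Map<String, List<String>> graph, String origin) {
        Map<String, Integer> distance = new LinkedHashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(origin, 0);
        queue.add(origin);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String friend : graph.getOrDefault(current, Collections.<String>emptyList())) {
                // The first visit is always along a shortest path in BFS.
                if (!distance.containsKey(friend)) {
                    distance.put(friend, distance.get(current) + 1);
                    queue.add(friend);
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        distances(adjacency(FRIENDSHIPS), "Tommy")
            .forEach((name, hops) -> System.out.println(name + " = " + hops));
    }
}
```

Running main prints one "name = distance" line per person, with the same values as the list above.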

Doing this in a relational database structure is pretty painful. In the JPA/SQL version of granny, I need two tables and multiple trips to the database just to walk the graph of relationships. I also have to do pretty much all the ordering and logic myself.

In other words, this is a classic graph database problem. Relationships matter as much as, if not more than, the data itself. Neo4j is the most popular graph database on the market these days. While graph databases are part of the NoSQL movement, they really solve different problems than, say, Couchbase or MongoDB. We aren't necessarily concerned with handling massive scale or doing analytics across terabytes of big data a la Hadoop's HBase. In fact, most graph databases are transactional, and the reason they are NoSQL is that SQL is simply inadequate to express the problems, as you can see in the amount of code it took in the findSuggestions method.
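To make the relational pain a little more concrete, here is a rough sketch of walking friendships one level at a time. It is not the findSuggestions method from GrannyJPA. The class RelationalWalk, the friendship row layout, and the method names are assumptions for illustration, and an in-memory array stands in for the join table so the example runs without a database. Each call to friendsOf plays the role of one round trip.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the article's JPA code.
class RelationalWalk {
    // Stand-in for rows of an assumed FRIENDSHIP(person, friend) join table.
    static final String[][] FRIENDSHIP_ROWS = {
        {"Tommy", "Billy"}, {"Tommy", "Bryan"},
        {"Bryan", "Kelly"}, {"Bryan", "Jamie"},
        {"Jamie", "Steven"}, {"Billy", "Keith"}
    };

    // Counts simulated round trips to the database.
    static int queriesIssued = 0;

    // Simulates one query along the lines of:
    //   SELECT person, friend FROM friendship
    //   WHERE person IN (...) OR friend IN (...)
    static Set<String> friendsOf(Set<String> people) {
        queriesIssued++;
        Set<String> result = new LinkedHashSet<>();
        for (String[] row : FRIENDSHIP_ROWS) {
            if (people.contains(row[0])) result.add(row[1]);
            if (people.contains(row[1])) result.add(row[0]);
        }
        return result;
    }

    // Walks outward one level per query; depth tracking, de-duplication,
    // and ordering all have to be done by hand in application code.
    static Map<String, Integer> suggestions(String origin, int maxDepth) {
        Map<String, Integer> depthByPerson = new LinkedHashMap<>();
        Set<String> seen = new LinkedHashSet<>();
        seen.add(origin);
        Set<String> frontier = new LinkedHashSet<>();
        frontier.add(origin);
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            Set<String> next = new LinkedHashSet<>();
            for (String candidate : friendsOf(frontier)) {
                if (seen.add(candidate)) { // add() returns false if already seen
                    depthByPerson.put(candidate, depth);
                    next.add(candidate);
                }
            }
            frontier = next;
        }
        return depthByPerson;
    }

    public static void main(String[] args) {
        System.out.println(suggestions("Tommy", 2));
        System.out.println("round trips: " + queriesIssued);
    }
}
```

In this sketch the number of round trips grows with the depth of the walk, and the bookkeeping lives in the application rather than in the query.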

For the Granny4j version using Neo4j the main query comes down to this:

// select friends and friends of friends, ordered by depth of the relationship
// (the unused "n=node(*)" start point is dropped; it bound every node in the graph without affecting the result)
String findFriendsQuery = "START person=node({userNode}) MATCH p = (person)-[:FRIEND*1..2]-(friend) RETURN DISTINCT p ORDER BY length(p)";

As you can see, there is a lot less code -- and it does the job. It's also more efficient. Check out all the code for Granny4j.

Why is this important? Theoretically, you can hire offshore developers for as little as $15 an hour who know SQL -- meaning the technology and people who know it are commoditized. Neo4j presumably requires more expensive expertise that is in shorter supply. Nonetheless, there's always a correlation between lines of code and the number of bugs. We can decrease downtime and errors by decreasing the number of bugs per line, but it's an expensive process, and ultimately, it's easier to decrease the number of lines of code.

There's also a big efficiency issue. Even on my laptop, the unit test for GrannyJPA takes considerably longer than the one for Granny4j. If you consider this at the kind of scale that a major retailer would require and take into account the law of diminishing returns, there's a real performance and scalability issue.

The biggest objections to introducing a new structured storage technology usually concern the organization's lack of experience with the technology or the "single source of record." While the latter concern is indeed a problem when combining many types of NoSQL databases with an existing SQL database, it wouldn't be a problem with Neo4j. Like most graph databases, Neo4j is transactional. As for the former consideration, that exists with any new or, in this case, different technology.

Personally, I'd rather be moving forward at a deliberate pace and finding new efficiencies than standing in place because it's what I've always done. Moreover, graph database technology isn't that much younger than the RDBMS.

As I've mentioned in the past, it all comes down to data structures. By using the RDBMS for everything over the last few decades, the industry has done the equivalent of using a list for every data structure. You wouldn't use only one data structure for every type of data in memory, so why would you do that just because you're storing the data?

Sadly, my mom refuses to use any of the fancy tools I've developed. She's just stopped asking me what to get the kids and asks my wife instead.

This article, "What are graph databases good for? Here's a killer app," was originally published at InfoWorld.com. Keep up on the latest developments in application development and NoSQL, and read more of Andrew Oliver's Strategic Developer blog at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.
