As the sources, types, and amounts of data continue to expand, so will the need for different kinds of analytics to make something of that data. Unfortunately, there is not a one-size-fits-all approach to analytics -- no magic pill that will get your organization the insight it needs to stay competitive. Graph analytics has emerged as the new hot topic, but to what end? What is the impact of graph analytics technology on organizations seeking to discover the cause, effect, and influence of events on business outcomes?
In exploring how graph analytics can be applied to solving problems, it’s worth noting how graph analytics is different from relational analytics. To put it simply, relational analytics typically explore relationships by comparing "one-to-one" or maybe even "one-to-many." For example, using relational analytics, it would be easy to identify one person and his or her 10 friends. It would also be easy to find any number of people and all of their friends. The people of interest may be in one table and their friends in another, so a simple join is possible.
By contrast, graph analytics can compare "many-to-many." With relational analytics, it becomes much more difficult to answer questions about the second level of "indirect" friends a person has, but this is where graph analytics shine. Graph analytics make it possible to ask not only about a person’s friends but also about the friends of those friends, and so on across the network. Building on these kinds of questions allows researchers to find key influencers within an entire network, not merely within the direct relationships of a subset of that network.
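To make the "friends of friends" question concrete, here is a minimal Python sketch of a two-hop lookup using a breadth-first traversal. The names, the `KNOWS` data, and both helper functions are hypothetical illustrations, not taken from any particular graph product.

```python
from collections import deque

# Hypothetical "knows" relationships, stored as an undirected adjacency map.
KNOWS = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "dev"},
    "cara": {"ana", "dev", "eli"},
    "dev": {"ben", "cara"},
    "eli": {"cara", "finn"},
    "finn": {"eli"},
}


def within_hops(graph, start, max_hops):
    """Return {person: hop distance} for everyone reachable within max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the requested depth
        for friend in graph.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # the starting person is not their own friend
    return distance


def friends_of_friends(graph, person):
    """People exactly two hops away: indirect friends who are not direct friends."""
    reach = within_hops(graph, person, 2)
    return {name for name, hops in reach.items() if hops == 2}
```

Raising `max_hops` extends the same traversal to third-level connections and beyond without changing the code. In a relational model, each additional level would typically mean another self-join.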
Graph analytics can also infer paths through these complex relationships to find connections that are not easy to see in relational analytics. Relational analytics are ideal for analysis of structured, unchanging data via tables and columns. But being able to look at your data through different analytic lenses, such as graph, is useful for unstructured, constantly changing data because it gives users information and context about relationships in a network and deeper insights that improve the accuracy of predictions and decision-making. Graph analytics are not a replacement for relational analytics; organizations will always have a need for both. Thus, it’s important to determine which scenarios are the best fit for each.
Even a simple list of person->knows->person can be extremely complex. But with graph analysis we can identify the key individuals in the graph and visualize them (node size indicates influence). We can also group the individuals into communities that have common sets of relationships (the edge colors indicate community membership).
Specializing in open-ended questions
One area where graph analytics particularly earns its stripes is in data discovery. While most of the discussion around big data has centered on how to answer a particular question or achieve a specific outcome, graph analytics enables us, in many cases, to discover the "unknown unknowns" -- to see patterns in the data when we don’t know the right question to ask in the first place. Graph analytics makes this possible by teasing out relationships that aren’t obvious -- to identify a “needle in a stack of needles,” so to speak. As patterns begin to emerge from multiple data sets, we start to gain a more complete picture of everything that actually affects business outcomes, so that we can address them appropriately.
In this way, we begin to determine the contextual impact of the data on a business -- how all the data elements that we are gathering from multiple applications and sources (CRM, ERP, logistics software, sales, IoT, weather, government, social media, etc.) interrelate and affect the business. In particular, we can discover the impact of events and their relationship to a business. No one might ever intuitively make the kind of connections that can be discovered through graph analytics. In a way, it’s a practical application of the chaos theory popularized by the book and film Jurassic Park: If a butterfly in South America could cause a hurricane in Florida, you’d never know unless you used graph analytics to examine the myriad relationships that lie between them.
What can graph analytics accomplish that other analytic approaches cannot? Based on mathematical graph theory, graph analytics model the strength and direction of relationships within a given system. Graph analytics can be used not only to detect a correlation, but also to determine its nature and how significant it really is within the overall system.
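As a rough illustration of what "strength and direction" can mean in practice, the Python sketch below stores relationships as weighted, directed edges and sums the weight flowing into or out of a node. The event names, the weights, and the helper functions are all invented for the example. Real systems would derive edge weights from data.

```python
# Hypothetical weighted, directed edges: (source, target, strength).
EDGES = [
    ("promo_email", "site_visit", 0.6),
    ("site_visit", "purchase", 0.3),
    ("weather_alert", "site_visit", 0.2),
    ("promo_email", "purchase", 0.1),
]


def build_graph(edges):
    """Build {source: {target: weight}}; direction runs from source to target."""
    graph = {}
    for source, target, weight in edges:
        graph.setdefault(source, {})[target] = weight
        graph.setdefault(target, {})  # keep nodes that have no outgoing edges
    return graph


def out_strength(graph, node):
    """Total weight of the relationships leaving a node."""
    return sum(graph.get(node, {}).values())


def in_strength(graph, node):
    """Total weight of the relationships arriving at a node."""
    return sum(targets.get(node, 0.0) for targets in graph.values())
```

Because each edge is stored only under its source, the relationship from `promo_email` to `site_visit` does not imply one in the opposite direction. That asymmetry is what separates a directed relationship from a plain correlation.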
Graph analysis applications
Graphs can be used to model all sorts of relationships and processes in all kinds of systems. For example, in social or informational systems, graph analytics might be used to compare financial trade data with social, geographic, and other data, or to find patterns across varied data sets that signal the onset of cyber attacks. Additionally, it might be applied to social media to enrich the customer view with patterns and relationships, or to detect patterns in communication that might indicate a threat to national defense.
In biological systems, graph analytics may yield new, vastly more effective treatments by analyzing relationships in proteins, chemical pathways, DNA, cells, and organs, and by determining how they are affected by combinations of lifestyle choices and medications. If there is indeed a cure for cancer, graph analytics will undoubtedly play a crucial role in discovering it.
While the applications of graph analytics are unlimited, there are a few common ways that we can classify the approaches. You might use graph techniques to identify “centralities,” such as items or events that lie at the root of other surrounding events or patterns. Of course, in social media, this has tremendous application for finding the “influencers” -- the people who actually start the trends and shape opinions that affect your brand.
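One common way to score "centrality" is PageRank, which ranks a node highly when other well-ranked nodes point to it. The sketch below is a simplified pure-Python power iteration over a hypothetical "follows" network. The data, the damping factor, and the iteration count are illustrative assumptions, and a production system would more likely rely on a graph library or engine than on hand-rolled code.

```python
# Hypothetical "who follows whom" data: follower -> accounts they follow.
FOLLOWS = {
    "ana": [],
    "ben": ["ana"],
    "cara": ["ana"],
    "dev": ["ana", "cara"],
    "eli": ["ana", "ben"],
}


def pagerank(follows, damping=0.85, iterations=50):
    """Approximate PageRank scores with a fixed number of power iterations."""
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes that follow no one is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not follows.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in follows.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share  # each followed account gets an equal share
        rank = new_rank
    return rank


def top_influencer(follows):
    """Return the node with the highest score."""
    scores = pagerank(follows)
    return max(scores, key=scores.get)
```

In this toy network, `ana` is followed by everyone else, so the scoring surfaces her as the central account. The same idea applies to events or items rather than people.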
A second application of graph analytics is identifying connections between two or more items. One example of this in the life sciences industry is pairing proteins with certain medications and chemical pathways in disease research. Another, in financial services, is identifying preliminary indicators of cyber attacks so that they can be prevented.
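Finding a connection between two items often comes down to a path search. The following Python sketch uses a breadth-first search to return one shortest chain of links between two nodes. The drug, protein, pathway, and disease identifiers are placeholders made up for the example, not real biological data.

```python
from collections import deque

# Hypothetical links between drugs, proteins, pathways, and a disease.
LINKS = [
    ("drug_a", "protein_x"),
    ("protein_x", "pathway_1"),
    ("pathway_1", "disease_q"),
    ("drug_b", "protein_y"),
    ("protein_y", "pathway_1"),
    ("drug_c", "protein_z"),
]


def undirected(edges):
    """Build an adjacency map in which every link can be walked both ways."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue  # already reached by a path at least as short
            previous[neighbor] = node
            if neighbor == goal:
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # the two items are not connected
```

A `None` result is informative too: it says that, within the data available, no chain of relationships links the two items.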
A third major application of graph analytics involves identifying communities that revolve around a certain theme. For example, the FBI might be interested in identifying groups of people who have been communicating about bomb-making. A more benign example might be identifying groups of people who rally around polka music (so that you can sell them lederhosen, of course).
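As a deliberately simple stand-in for community detection, the Python sketch below keeps only the conversations about one theme and then groups people into connected components. The message data and function name are hypothetical. Real community detection usually relies on more refined methods, such as modularity-based clustering or label propagation, which can split a connected group into several communities.

```python
# Hypothetical communication records: (sender, receiver, topic).
MESSAGES = [
    ("ana", "ben", "polka"),
    ("ben", "cara", "polka"),
    ("dev", "eli", "polka"),
    ("ana", "dev", "jazz"),
    ("finn", "ana", "jazz"),
]


def communities_for_topic(messages, topic):
    """Group people who are linked, directly or indirectly, by one topic."""
    # Keep only the relationships that involve the chosen theme.
    graph = {}
    for sender, receiver, message_topic in messages:
        if message_topic != topic:
            continue
        graph.setdefault(sender, set()).add(receiver)
        graph.setdefault(receiver, set()).add(sender)

    seen = set()
    groups = []
    for person in graph:
        if person in seen:
            continue
        # Depth-first walk collects everyone connected to this person.
        group = set()
        stack = [person]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            group.add(current)
            stack.extend(graph[current] - seen)
        groups.append(group)
    return groups
```

With the sample data, the polka conversations fall into two separate groups, while the jazz conversations form a single group that cuts across them.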
You’re already using graph analytics
In addition to being used in government and science, graph analysis technology has become a part of our daily lives. Consider Facebook and LinkedIn, for example, which help us make connections based on relationships detected through graph technology. As social networks are by definition based on relationships, it should come as no surprise that graph analytics will play a major role in helping us make sense of the vast unstructured data sets being generated by social media.
Consequently, this is perhaps the area where companies are most eager to apply graph technology, seeking to identify the social influencers and circle of followers that most affect their brand. Also, graph analytics can help companies discover how certain interests on social networks correlate with interest in their brand. For example, does a shared interest in a certain musical performer correlate with enthusiasm for a certain brand of pickup truck? Ford and GM marketing would certainly like to know.
Considering the far-reaching benefits of graph analytics, you may justifiably ask why graph technologies have not been more widely applied, and why we’ve only recently begun to see substantive discussion about them. There are a few reasons, but one of the biggest is that effective graph analytics require the ability to analyze very large, extremely varied data sets, often in real time. Until now, graph analytics offerings have lacked the speed, the scale, the ease-of-use, and the openness that are required to meet real-world needs -- capabilities that have only recently become available largely thanks to cloud and open-source technologies.
With that said, there are a few questions organizations need to keep in mind as they begin considering the use of graph analytics. Will the approach deliver results in the speed and scale needed to make a difference? Can it handle ever-increasing data types and volumes, and do so without breaking the bank? And finally, how well does it play with other types of analytic outputs?
For businesses, the ability to derive actionable intelligence from data will largely depend on the flexibility of the platforms they employ. Graph analytics is still just one arrow in the analytics quiver, and its value will largely depend on the ability to provide right-time insights for your organization that were otherwise undiscoverable.
Actian CTO Michael Hoskins directs Actian’s technology innovation strategies and evangelizes game-changing trends in big data, analytics, Hadoop, and cloud. Mike received the AITP Austin chapter's Information Technologist of the Year Award for his leadership in developing Actian DataFlow, a highly parallelized framework to leverage multicore. Follow Mike on Twitter: @MikeHSays.
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.