Social graph models are powerful enablers for fine-grained predictive modeling of human behavior because they help identify the likely behaviors of individuals in their fuller context -- of groups, relationships, and influence. These models offer microscopically detailed views of the customer experience by focusing on human actions and interactions.
Social graph analysis can be simple if you're only interested in a few individuals, only investigating one type of connection among them, and only mining one static pool of behavioral data associated with them. On the other hand, if you're trying to assess the shifting behavioral patterns of every possible relationship among every person, place, and thing on the planet, plus all the things they might be saying to each other, dynamically and in real time with perfect predictions about what they might do at every point in the future ... you're living in a science-fiction fantasy world.
The world is slowly waking up to the potential of social graph analysis to transform a wide range of applications in the public, private, and research sectors. It's developing rapidly into one of the most promising new segments in the big data market, and it's the core application of various commercial and open source graph databases (often lumped under the "NoSQL" umbrella).
In many industries, social graph analysis already powers antifraud, influence analysis, sentiment monitoring, market segmentation, engagement optimization, experience optimization, and other applications where complex behavioral patterns must be rapidly identified.
For all its potential, social graph analysis is also a big data resource hog waiting to happen. At the most basic level, you can model social graphs as networks of nodes and links, of entities and relationships, or of individuals and connections; graph analytics professionals also use the terms "vertex" and "edge" to refer to pretty much the same things. We're starting to hear about massively parallel public-sector graph analytics infrastructures that process graphs consisting of 4.4 trillion nodes (records) and 70 trillion edges (relationships among those records). Facebook's own social graph analytics infrastructure handles billions of nodes and low trillions of edges in its own right.
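To make the node-and-edge vocabulary concrete, here is a minimal sketch of a social graph held in memory as an adjacency list. The class name `SocialGraph`, its methods, and the sample people and relationship types are all hypothetical and chosen only for illustration; production graph databases use very different storage layouts.

```python
from collections import defaultdict


class SocialGraph:
    """Toy in-memory graph: nodes are entities, edges are typed relationships."""

    def __init__(self):
        # Adjacency list: node -> set of (neighbor, relationship type) pairs.
        self._adj = defaultdict(set)

    def add_edge(self, a, b, kind):
        # Assumption: relationships are undirected, so store both directions.
        if a == b:
            raise ValueError("self-loops are not supported in this sketch")
        self._adj[a].add((b, kind))
        self._adj[b].add((a, kind))

    def neighbors(self, node, kind=None):
        # Optionally filter by relationship type; sorted for stable output.
        return sorted(
            n for n, k in self._adj.get(node, ()) if kind is None or k == kind
        )

    def node_count(self):
        return len(self._adj)

    def edge_count(self):
        # Each undirected edge is stored twice, once per endpoint.
        return sum(len(pairs) for pairs in self._adj.values()) // 2


# Hypothetical sample data.
g = SocialGraph()
g.add_edge("alice", "bob", "friend")
g.add_edge("alice", "carol", "coworker")
g.add_edge("bob", "carol", "friend")

print(g.neighbors("alice"))            # all of alice's connections
print(g.neighbors("alice", "friend"))  # only her "friend" connections
print(g.node_count(), g.edge_count())
```

Even this toy version hints at why scale becomes a problem: every additional relationship type or data source multiplies the number of edges that must be stored and traversed.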
Think about that: Web-scale graph analytics initiatives already operate at a scale -- storage, processing, memory, interconnect, data center square footage, power consumption, and so on -- that dwarfs almost every other type of big data deployment you can name. And the scalability requirements are sure to grow by leaps and bounds as the size of graph models balloons; the range of data sources they ingest from expands; the volume, variety, and concurrency of workloads they execute grows; and the need for real-time low-latency speed ramps to the next level.
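A back-of-envelope calculation gives a feel for that scale. The sketch below assumes, purely for illustration, that each edge is stored as a bare pair of 8-byte node IDs; real systems also hold properties, indexes, and replicas, so actual footprints would be considerably larger. The function name `edge_list_bytes` is hypothetical.

```python
def edge_list_bytes(num_edges, bytes_per_id=8):
    # Assumption: an edge is just (source ID, target ID), nothing else.
    return num_edges * 2 * bytes_per_id


edges = 70 * 10**12              # the 70 trillion edges cited above
raw = edge_list_bytes(edges)     # bytes for the bare edge list
petabytes = raw / 10**15         # decimal petabytes

print(f"{raw:,} bytes, or about {petabytes:.2f} PB for edges alone")
```

Under these assumptions the bare edge list already exceeds a petabyte, before a single attribute or behavioral record is attached to it.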
Graph analysis will push big data evolution to the next plateau of scale and sophistication. Hadoop is one segment of the evolutionary picture, but it's not necessarily the centerpiece. All-in-memory, massively parallel graph database architectures will drive the show, as will a wide range of NoSQL databases that specialize in discovering, correlating, and preprocessing behavioral data from every possible source.
If you're serious about graph analysis, you're going to need to ramp up all three big data Vs -- volume, velocity, and variety -- to do it effectively. As component costs drop and quantum computing architectures take hold, it's not inconceivable that many organizations will, within our lifetimes, start to operate global graph-analysis clouds that are exabyte-scale, zero-latency, and all-in-memory.
This story, "Graph analysis will make big data even bigger," was originally published at InfoWorld.com.