Navigating time in knowledge graphs

The temporal benefits of cognitive knowledge graphs can affect almost any business problem, including basic issues of data management such as data quality, data cleansing, and integration

a clock half-submerged in water
Jonny Lindner (CC0)

The concept of time presents several distinct challenges for data management, particularly as it applies to databases and other data stores. Those difficulties stem from the nature of time, which is ongoing, and from the ways it is expressed in repositories. Because time is ongoing, data are relevant both at a given state (a point in time) and over periods of time, which increases complexity.

In the databases of large organizations, it’s not uncommon for time to be expressed in hundreds of different ways, varying according to domain, use case, state, progression, or database administrator. Even smaller organizations are plagued by such issues, especially when working with big data sets or decentralized computing environments.

Traditional relational approaches exacerbate these hardships by scattering temporal values across countless tables in different IT systems, making it virtually impossible to run horizontal queries related to time (or most other variables). A further problem is that the values in columns change over time, making it arduous to understand what the data mean.

With knowledge graphs, particularly those naturally imbued with artificial intelligence technologies, such common temporal issues are nonexistent. These knowledge graphs represent time in a uniform manner no matter how large the repository or how varied its use cases. They are intrinsically queryable, yielding comprehensive results across the enterprise for time or any other variable.

Best of all, they support an array of AI techniques for identifying changes in value over time, which reinforces data quality and data cleansing while easing integration efforts.

Single representation

The kernel of the knowledge graph approach to navigating time is the single way in which time is represented in these stores. With a relatively simple event-based schema, all data are transformed into events with a uniform structure, regardless of their schema or structure at origination, and time is portrayed in no more than three ways: start time, end time, and origination time. These are dictated entirely by the corresponding metadata of the data event.

Origination time is useful for events without end times (such as emails), although in most cases users can simply replace this classification with a start time that has no end time. Doing so reduces the representation of time to just two forms. Regardless of which method an organization prefers, the results are much more homogeneous than those of conventional approaches, in which each department, business domain, and user applies a different variation or abbreviation for what are in actuality just two forms of the same concept.
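A minimal Python sketch of what such an event-based schema might look like follows. It is an illustration under stated assumptions, not the schema of any particular knowledge graph product: the Event class, the source field names (sent_at, admit_date, discharge_date), and the choice to treat timestamps without a time zone as UTC are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    """A uniform event record; every field name here is illustrative."""
    event_id: str
    event_type: str
    start: datetime                 # always stored as timezone-aware UTC
    end: Optional[datetime] = None  # None for events with no end time


def to_utc(value: str) -> datetime:
    """Parse an ISO 8601 string; naive values are assumed to be UTC."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def email_to_event(row: dict) -> Event:
    # An email has only an origination time, kept as a start with no end.
    return Event(row["message_id"], "email", to_utc(row["sent_at"]))


def stay_to_event(row: dict) -> Event:
    # A hospital stay has both a start time and an end time.
    return Event(
        row["stay_id"],
        "hospital_stay",
        to_utc(row["admit_date"]),
        to_utc(row["discharge_date"]),
    )


# Two source records with differently named time fields become uniform events.
email = email_to_event(
    {"message_id": "m1", "sent_at": "2019-03-01T09:30:00+01:00"}
)
stay = stay_to_event(
    {"stay_id": "s1", "admit_date": "2019-03-01T08:00:00",
     "discharge_date": "2019-03-04T12:00:00"}
)
```

Whatever the source system called its time fields, downstream code only ever sees start and end, which is the two-form representation described above.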

Query time

One of the most vital facets of knowledge graphs is that they enable rapid querying of their myriad data elements, including those for time. This trait is largely due to their reliance on a hierarchy of classification systems (known as taxonomies) that underpin every concept, expressed in business terms, within the graph-based repository. This point is pivotal because many organizations have devised taxonomies yet failed to implement them in their actual data stores.

Universal taxonomies across the enterprise are embedded in knowledge graphs, which use this understanding of terminology in many ways. Aspects of time can then be included in queries, yielding invaluable insight into patient recovery, medication, or treatment efficacy in verticals such as health care or life sciences. Organizations can also use them with machine learning for predictive and prescriptive purposes.
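To make the idea of combining a taxonomy with time in one query concrete, here is a small in-memory Python sketch. It stands in for a real graph store and query language; the taxonomy, the concept names, and the helper functions (is_a, overlaps, query) are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class ClinicalEvent(NamedTuple):
    event_id: str
    concept: str
    start: datetime
    end: Optional[datetime]


# Hypothetical taxonomy: each child concept points to its parent concept.
TAXONOMY = {
    "chemotherapy": "treatment",
    "radiotherapy": "treatment",
    "treatment": "clinical_event",
    "lab_test": "clinical_event",
}


def is_a(concept: Optional[str], ancestor: str) -> bool:
    """Walk up the taxonomy to see whether concept falls under ancestor."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = TAXONOMY.get(concept)
    return False


def overlaps(event: ClinicalEvent, window_start: datetime,
             window_end: datetime) -> bool:
    """True if the event intersects the window; no end means a point in time."""
    event_end = event.end if event.end is not None else event.start
    return event.start <= window_end and event_end >= window_start


def query(events: List[ClinicalEvent], ancestor: str,
          window_start: datetime, window_end: datetime) -> List[ClinicalEvent]:
    """Filter by taxonomy concept and by time in a single pass."""
    return [e for e in events
            if is_a(e.concept, ancestor)
            and overlaps(e, window_start, window_end)]


events = [
    ClinicalEvent("e1", "chemotherapy",
                  datetime(2019, 1, 10), datetime(2019, 1, 12)),
    ClinicalEvent("e2", "radiotherapy",
                  datetime(2019, 3, 5), datetime(2019, 3, 6)),
    ClinicalEvent("e3", "lab_test", datetime(2019, 1, 11), None),
]

jan_start, jan_end = datetime(2019, 1, 1), datetime(2019, 1, 31)
january_treatments = query(events, "treatment", jan_start, jan_end)
january_clinical = query(events, "clinical_event", jan_start, jan_end)
```

Because every event carries the same start and end fields, the time filter is written once and applies to any concept in the taxonomy, broad or narrow.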

These AI algorithms exploit the semantic environment of the graph stores to identify relationships between data elements (which might otherwise go unnoticed) and to determine the probability of future events. In health care, these predictions might include adverse reactions to otherwise mild food allergies or an ideal treatment path for a disease. In telecommunications or retail, the prescriptions could include upselling or cross-selling opportunities based on timely recommendations. By simplifying time’s representation and including it in the machine learning process, these graphs can potentially deliver these results in real-time scenarios for immediate benefit.
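As a rough illustration of how uniform timestamps feed such predictions, the Python sketch below estimates how often one kind of event is followed by another within a time window. It is a plain frequency count standing in for a trained machine learning model, and the function name follow_up_rate, the concept names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Each observation is (subject_id, concept, start_time).
Observation = Tuple[str, str, datetime]


def follow_up_rate(observations: Iterable[Observation], first: str,
                   then: str, within: timedelta) -> Optional[float]:
    """Share of subjects with a `first` event who also have a `then`
    event starting after it and no later than `within` afterwards."""
    by_subject = defaultdict(list)
    for subject, concept, start in observations:
        by_subject[subject].append((concept, start))

    exposed = 0
    followed = 0
    for items in by_subject.values():
        first_times = [t for c, t in items if c == first]
        if not first_times:
            continue  # subject never had the first event
        exposed += 1
        then_times = [t for c, t in items if c == then]
        if any(timedelta(0) < t2 - t1 <= within
               for t1 in first_times for t2 in then_times):
            followed += 1
    return followed / exposed if exposed else None


observations = [
    ("p1", "drug_x", datetime(2019, 1, 1)),
    ("p1", "rash", datetime(2019, 1, 3)),   # reaction two days later
    ("p2", "drug_x", datetime(2019, 1, 1)),
    ("p2", "rash", datetime(2019, 2, 20)),  # outside the seven-day window
    ("p3", "drug_x", datetime(2019, 1, 5)),
    ("p4", "rash", datetime(2019, 1, 2)),   # never took drug_x
]

rate = follow_up_rate(observations, "drug_x", "rash", timedelta(days=7))
```

A production system would use far richer features and a real model, but even this count depends on every event exposing its time in the same form.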

Periods of time

Over time, the temporal benefits of cognitive knowledge graphs can affect almost any business problem, including basic issues of data management such as data quality, data cleansing, and integration. AI measures have long been used to indicate the most effective means of integrating data sets. AI knowledge graphs produce these benefits and more advanced ones, such as determining the meaning of values in columns when those values have changed over a period of time for different uses.

Machine learning algorithms can identify the meaning of those values by comparing them to other tables in the same or different databases, while the semantic nature of these graphs provides an environment in which discerning relationships is optimized. All these temporal problems, which arise from how time is represented, queried, and demonstrated over periods of time, are readily alleviated with knowledge graphs.

This article is published as part of the IDG Contributor Network.