Semantics in the Cognitive Corporation™ Framework
When depicting the Cognitive Corporation™ as a graphic, the use of semantic technology is not highlighted. Semantic technology serves two key roles in the Cognitive Corporation™ – data storage (part of Know) and data integration, which connects all of the concepts. I’ll explore the integration role since it is a vital part of supporting a learning organization.
In my last post I talked about the fact that integration between components has to be based on the meaning of the data, not simply passing compatible data types between systems. Semantic technology supports this need through its design. What key capabilities does semantic technology offer in support of integration? Here I’ll highlight a few.
Logical and Physical Structures are (largely) Separate
Compared with a relational database, semantic technology loosens the tie between the logical and physical structures of the data. In a relational database it is the physical structure (tables and columns), along with the foreign keys, that maintains the relationships in the data. Think back to relational database design class: in a normalized database, all of the column values are related to the table’s key.
This tight tie between data relationships (logical) and data structure (physical) imposes a steep cost if a different set of logical data relationships is desired. Traditionally, we create data marts and data warehouses to allow us to represent multiple logical data relationships. These are copies of the data with differing physical structures and foreign key relationships. We may need these new structures to allow us to report differently on our data or to integrate with different systems which need the altered logical representations.
With semantic data we can take a single physical representation of the data (our triples) and apply different logical representations in the form of ontologies. To be fair, the physical structure (subject->predicate->object) forces certain constraints on the ontology, but a logical transformation is far simpler than a physical one even with such constraints.
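To make the idea concrete, here is a minimal sketch in Python. It assumes a deliberately simplified picture in which a "logical view" is nothing more than a renaming of stored predicates. Real ontology languages do far more than this, and every identifier below (the `hr:`, `rpt:` and `org:` terms, `apply_view`) is invented for illustration.

```python
# One physical representation: a list of (subject, predicate, object) triples.
# All identifiers here are hypothetical.
TRIPLES = [
    ("emp:42", "hr:worksIn", "dept:sales"),
    ("emp:42", "hr:reportsTo", "emp:7"),
    ("emp:7", "hr:worksIn", "dept:sales"),
]

# Two hypothetical logical views. Each maps stored predicates to the terms
# a consuming system expects; predicates not listed are hidden from the view.
REPORTING_VIEW = {"hr:worksIn": "rpt:memberOf"}
ORG_CHART_VIEW = {"hr:worksIn": "org:unit", "hr:reportsTo": "org:managedBy"}


def apply_view(triples, view):
    """Return the triples visible through a view, with predicates renamed.

    The stored triples are never copied into a new physical structure;
    the view is applied when the data is read.
    """
    return [(s, view[p], o) for (s, p, o) in triples if p in view]


reporting = apply_view(TRIPLES, REPORTING_VIEW)
org_chart = apply_view(TRIPLES, ORG_CHART_VIEW)
```

Both views read from the same stored triples, so adding a third view means writing another mapping, not building another data mart.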
Meaning is Inherent in the Infrastructure
Semantic technology includes a powerful ontological representation at its core. This key capability, constraining terms as a way to create precise definitions, allows us to represent information with clearly and unambiguously defined meaning. More to the point, by starting with precise definitions at an enterprise level, we can ensure that each system ties to that ontology through related system-level ontologies.
The “meaning” aspect of semantic technology is valuable in the Cognitive Corporation™ since the data from all systems is being integrated and used to inform our predictive models and ultimately our applications, through workflow, rules and data changes. Having a concise set of definitions that each system can tie to allows us to incorporate legacy systems as well as create new systems each sharing the concepts from our enterprise ontology.
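As a rough sketch of how system-level terms might tie back to an enterprise ontology, the Python below maps each system's own class names onto shared enterprise concepts. The systems, prefixes and concept names (`crm:Client`, `billing:AccountHolder`, `ent:Customer` and so on) are all hypothetical, and a plain dictionary stands in for what would normally be expressed in an ontology language.

```python
# Hypothetical mapping from system-level terms to enterprise-level concepts.
ENTERPRISE_TERM = {
    "crm:Client": "ent:Customer",
    "billing:AccountHolder": "ent:Customer",
    "erp:Vendor": "ent:Supplier",
}

# Triples as they might arrive from three separate (invented) systems.
SYSTEM_TRIPLES = [
    ("crm:1001", "rdf:type", "crm:Client"),
    ("crm:1001", "crm:name", "Acme Ltd"),
    ("billing:88", "rdf:type", "billing:AccountHolder"),
    ("erp:5", "rdf:type", "erp:Vendor"),
]


def to_enterprise(triples):
    """Rewrite type statements so they use enterprise concepts.

    Only the object of an rdf:type triple is translated; terms with no
    mapping, and all other triples, pass through unchanged.
    """
    result = []
    for s, p, o in triples:
        if p == "rdf:type":
            o = ENTERPRISE_TERM.get(o, o)
        result.append((s, p, o))
    return result


def instances_of(triples, concept):
    """List, in sorted order, every subject typed as the given enterprise concept."""
    return sorted(
        s for s, p, o in to_enterprise(triples)
        if p == "rdf:type" and o == concept
    )


customers = instances_of(SYSTEM_TRIPLES, "ent:Customer")
```

Here the CRM and billing systems keep their own vocabularies, yet a question asked in enterprise terms ("who are our customers?") finds records from both.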
The integration of operational data and behavioral data is vital to a learning organization. In order to learn, it is not enough to see that some value is changing; we have to be able to track down why it is changing. This means that operational data alone is insufficient to inform learning. Operational data allows us to build models representing what is known to be happening, but it is not particularly useful in helping to identify the root cause. Our broader behavioral data is required and must be correlated with the operational data. Semantic technology supports this requirement very well.
Flexibility in Data Representation through URIs
When integrating systems we may need to work with a variety of data, including rich data such as documents and media files. The structure used to represent data in semantic solutions gives us significant latitude in terms of the data being represented. We can create relationships to constants (string, integer, floating point and so forth) but of more interest is how we create relationships to other objects. That is done through URIs.
This simple approach to data representation, using URIs, removes a great deal of noise from the design of our database. It also allows us to take advantage of an ongoing trend to move unstructured information into Content Management Systems (CMS). Any modern CMS will permit referencing its information through a URL, which is itself a URI and so is easily integrated with our semantic data store. Further, integration between systems, even those from other businesses, is much more reasonably accomplished through URLs than by copying data or creating proprietary integrations.
Another benefit to representing objects through the use of URLs is the fact that the content represented by a given URL can differ – an image in one case, a video in another and an MP3 in a third. We do not need to bother our data tier with these details. When representing data we are interested in intent, the logical relationships. Rendering is not the job of the data tier – so why add that overhead to our data infrastructure? It is the data tier’s job to enable the logical structure so that we can find what we are looking for. What to do with the data once we find it is the job of other tiers in our solution stack.
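The sketch below illustrates the point in Python, under the assumption that an object in a triple is either a constant or a reference to something else by URI. The `URIRef` class, the `ex:` predicates and the example.com addresses are all made up for this illustration; they are not taken from any particular triple store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """A reference to another object, identified only by its URI."""
    value: str


# Hypothetical product record. Objects are either constants (str, float)
# or URIRefs pointing at content held elsewhere, for example in a CMS.
PRODUCT = URIRef("http://example.com/product/17")
TRIPLES = [
    (PRODUCT, "ex:name", "Widget"),
    (PRODUCT, "ex:unitPrice", 9.5),
    (PRODUCT, "ex:hasManual", URIRef("http://cms.example.com/docs/widget-manual")),
    (PRODUCT, "ex:hasDemo", URIRef("http://cms.example.com/media/widget-demo")),
]


def linked_resources(triples, subject):
    """Return (predicate, uri) pairs for everything the subject links to.

    The data tier records only that a relationship exists and where it
    points. Whether the target is a PDF, a video or an MP3 is left to
    whichever tier eventually fetches and renders it.
    """
    return [
        (p, o.value)
        for s, p, o in triples
        if s == subject and isinstance(o, URIRef)
    ]


links = linked_resources(TRIPLES, PRODUCT)
```

Nothing in the stored data says what kind of content sits behind either link, which is exactly the separation of concerns described above.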
I hope this post shines a little light on how semantics plays a key role in supporting a Cognitive Corporation™. As mentioned in an earlier post, we will be putting these concepts to the test with a set of enterprise products. The team looks forward to being able to share their insights and lessons learned.
Are you looking at semantic technology within the scope of your enterprise architecture? What do you see as its principal benefits, if any?