Taxonomies and Ontologies as Semantic Models

In describing what taxonomies and ontologies are and what they can do, we are hearing the word “semantics” more often. “Semantics” means “meaning,” which is nothing new, and taxonomies and ontologies are not new either. What is new is that taxonomies and ontologies are increasingly combined, so we need a way to describe them together, and the word “semantic” serves that purpose. Furthermore, taxonomies and ontologies are being implemented in new and expanded applications, where the word semantic(s) has significance.

Semantics in Taxonomies and Ontologies

Taxonomies have semantics in their concepts. A taxonomy is not just a term base or a term list, but rather is an organized set of concepts, each with its own unambiguous meaning. The concepts bring together different labels, such as synonyms, for the same thing, and their meaning and usage are further clarified by their arrangement in a hierarchy. It’s often said that a taxonomy comprises “things” (concepts), not mere “strings” (of text).
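To make the “things, not strings” idea concrete, here is a minimal sketch in Python. The Concept class, the concept ids, and the vehicle labels are all invented for illustration and do not come from any particular taxonomy tool. It shows several labels resolving to one concept, and the hierarchy adding context to that concept.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """One taxonomy concept: a single meaning that may carry several labels."""
    concept_id: str
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)
    broader: Optional[str] = None  # id of the parent concept, if any


def build_label_index(concepts: List[Concept]) -> Dict[str, str]:
    """Map every label (lowercased) to the id of the concept it names."""
    index: Dict[str, str] = {}
    for concept in concepts:
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.lower()] = concept.concept_id
    return index


def ancestors(concept_id: str, by_id: Dict[str, Concept]) -> List[str]:
    """Walk up the hierarchy and collect the preferred labels of broader concepts."""
    path: List[str] = []
    parent = by_id[concept_id].broader
    while parent is not None:
        path.append(by_id[parent].pref_label)
        parent = by_id[parent].broader
    return path


# Hypothetical sample data.
concepts = [
    Concept("c1", "Vehicles"),
    Concept("c2", "Automobiles", ["Cars", "Motor cars"], broader="c1"),
]
by_id = {c.concept_id: c for c in concepts}
label_index = build_label_index(concepts)
```

Looking up “cars” or “automobiles” in label_index leads to the same concept id, which is the difference between a concept and a text string.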

Ontologies have a higher level of semantics than taxonomies. Even if they don’t contain synonyms, the relationships between concepts (entities) and sets of concepts (classes) have additional semantics. The relationships in an ontology convey meanings beyond mere hierarchy or a generic “related term.” For example, relationships between entities may be “is located in,” “has customer,” and “sells product.” Furthermore, entities in an ontology may have various types of attributes, such as contact information for offices and people, which is another application of semantics.
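As a rough illustration, named relationships and attributes can be sketched as simple subject-predicate-object statements. The entity names, the phone number, and the helper function below are hypothetical; only the three relationship names come from the examples above.

```python
from typing import Dict, List, NamedTuple


class Relation(NamedTuple):
    """A named relationship between two entities."""
    subject: str
    predicate: str
    obj: str


# Hypothetical entities linked by the relationship types mentioned above.
relations = [
    Relation("Boston Office", "is located in", "Boston"),
    Relation("Acme Corp", "has customer", "Globex"),
    Relation("Acme Corp", "sells product", "Widget"),
]

# Attributes hold literal values about an entity rather than links to other entities.
attributes: Dict[str, Dict[str, str]] = {
    "Boston Office": {"phone": "+1-555-0100"},
}


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every entity linked from `subject` by the named relationship."""
    return [
        r.obj
        for r in relations
        if r.subject == subject and r.predicate == predicate
    ]
```

Because each relationship has its own name, a question such as “what does Acme Corp sell?” can be answered separately from “who are Acme Corp’s customers?”, which a generic “related term” link could not do.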

Bringing Together Taxonomies and Ontologies

Taxonomies and ontologies have different origins, but now they are increasingly based on shared Semantic Web data models and guidelines, which enables them to be integrated seamlessly. Taxonomies have their origins in library science structures, including thesauri, subject headings, and classification schemes. Ontologies have their origins in computer science and data science with a focus on data models.

Combining them brings the benefits of both: the linguistic aspect of controlled terminology, with synonyms and hierarchical structure, from taxonomies, and the custom semantic relationships and attribute properties from ontologies. This allows users to search for concepts/things, not just text strings, while also following links to related concepts and creating complex multi-step queries.

Taxonomies have been considered kinds of “controlled vocabularies” or “knowledge organization systems.” Ontologies are considered a kind of “knowledge model” and a knowledge representation system, rather than a knowledge organization system. When we combine them or speak of them collectively, it’s logical to use the word “semantic,” whether as semantic structures or semantic models, because they both involve semantics and both are usually based on Semantic Web guidelines.

Taxonomies are increasingly based on SKOS (Simple Knowledge Organization System), a Semantic Web recommendation published by the World Wide Web Consortium (W3C), which is itself based on RDF (Resource Description Framework). Most ontologies are based on RDF Schema (RDFS), an extension of RDF, and OWL (Web Ontology Language), another Semantic Web recommendation. The data models of SKOS, RDF, RDFS, and OWL may all be integrated into the same knowledge model for a combined taxonomy-ontology. Most software for dedicated taxonomy-ontology management uses these data models.
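The following sketch shows how statements from SKOS, RDFS, and OWL can sit side by side in one set of RDF triples. It uses only the Python standard library and writes a simplified N-Triples-style output. The SKOS, RDF, RDFS, and OWL namespace URIs are the published ones, but the example.org namespace, the concept names, and the sellsProduct property are assumptions made up for illustration. In practice, a dedicated RDF library or taxonomy-ontology tool would handle this.

```python
from typing import List, Tuple

# Published W3C namespaces.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Hypothetical namespace for this example only.
EX = "http://example.org/kg/"

Triple = Tuple[str, str, str]


def uri(namespace: str, local_name: str) -> str:
    """Format a full URI the way N-Triples writes it."""
    return f"<{namespace}{local_name}>"


def lit(text: str, lang: str = "en") -> str:
    """Format a language-tagged literal (no escaping; fine for plain labels)."""
    return f'"{text}"@{lang}'


triples: List[Triple] = [
    # Taxonomy part, expressed with SKOS.
    (uri(EX, "Automobiles"), uri(RDF, "type"), uri(SKOS, "Concept")),
    (uri(EX, "Automobiles"), uri(SKOS, "prefLabel"), lit("Automobiles")),
    (uri(EX, "Automobiles"), uri(SKOS, "altLabel"), lit("Cars")),
    (uri(EX, "Automobiles"), uri(SKOS, "broader"), uri(EX, "Vehicles")),
    # Ontology part, expressed with OWL and RDFS.
    (uri(EX, "sellsProduct"), uri(RDF, "type"), uri(OWL, "ObjectProperty")),
    (uri(EX, "sellsProduct"), uri(RDFS, "label"), lit("sells product")),
    # A statement that uses the custom relationship to point at a taxonomy concept.
    (uri(EX, "AcmeCorp"), uri(EX, "sellsProduct"), uri(EX, "Automobiles")),
]


def to_ntriples(rows: List[Triple]) -> str:
    """Serialize the triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


output = to_ntriples(triples)
```

The last triple is where the two models meet: a custom ontology relationship links an entity to a concept that is also managed, with its labels and hierarchy, as part of the taxonomy.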

Semantic Search and Semantic Tagging

Taxonomies support semantic search and tagging. “Semantic search” is the third-ranked autocomplete suggested search phrase in a Google search I did recently on “semantic,” so this is clearly a popular application of semantics. Semantic search refers to search that focuses on concepts and meaning rather than just strings of text. This is not new, but since search that is based on text strings and statistical algorithms is so common, improving search results with the focus on semantics is getting more attention.

Semantic search is best enabled with the tagging of taxonomy concepts, which we may call “semantic tagging.” (I first heard of “semantic tagging” when I was asked to write a conference article on it in 2008.) Advanced text analytics technologies, going beyond text extraction, entity recognition, and natural language processing to include natural language understanding, so as to analyze sentence structure, syntax, and sentiment, can also yield search results based somewhat on meaning and not just words.
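A toy version of search over semantically tagged content might look like the sketch below. The labels, concept ids, document names, and the expansion to narrower concepts are assumptions chosen for illustration, not a description of how any particular search product works.

```python
from typing import Dict, List, Set

# Every label, preferred or synonym, points to a concept id (hypothetical data).
LABELS: Dict[str, str] = {
    "vehicles": "c1",
    "automobiles": "c2",
    "cars": "c2",
    "trucks": "c3",
}

# Hierarchy: concept id -> ids of its narrower concepts.
NARROWER: Dict[str, List[str]] = {"c1": ["c2", "c3"], "c2": [], "c3": []}

# Semantic tags: document -> concept ids it was tagged with.
TAGS: Dict[str, Set[str]] = {
    "doc-a": {"c2"},
    "doc-b": {"c3"},
    "doc-c": set(),
}


def expand(concept_id: str) -> Set[str]:
    """Return the concept plus everything narrower than it."""
    found = {concept_id}
    for child in NARROWER.get(concept_id, []):
        found |= expand(child)
    return found


def semantic_search(query: str) -> List[str]:
    """Resolve the query to a concept, then return documents tagged with it."""
    concept_id = LABELS.get(query.strip().lower())
    if concept_id is None:
        return []  # no matching concept; a real system might fall back to text search
    wanted = expand(concept_id)
    return sorted(doc for doc, tags in TAGS.items() if tags & wanted)
```

Here a search on “Cars” finds the document tagged with the Automobiles concept even if the word “cars” never appears in it, and a search on “vehicles” also retrieves documents tagged with narrower concepts.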

Semantic Data

Taxonomies are traditionally for tagging and retrieving content, whereas ontologies are traditionally for exploring and retrieving data. The combination of a taxonomy and an ontology enables users to retrieve both content and data that are related to each other. Semantics for content is a given, because content (whether text, image, or other media), by its very nature, has meaning. Data by itself may not have much meaning, unless it is related to other data and that relationship has meaning, too. Thus, “semantic data” is significant. We hear reference to “semantic data” much more often than to “semantic content.”

You don’t need to add a taxonomy to content to make it “semantic” and understood (rather, a taxonomy helps you find the content). However, depending on how data is presented, you may need to add an ontology or at least a semantic data model (a method to describe objects in a database and their relationships to one another) to make data “semantic.” Experts can analyze raw data, but the data is more valuable if non-experts can understand it, too, and that’s why “semantic data” is important. There is also a lot of attention on “semantic data models.”

Semantic Layer

The idea of a “semantic layer” as a framework or approach to make an organization’s information, both data and content, more structured, findable, and actionable, has been gaining popularity recently. Whether the “semantic layer” is new or just a new way of describing something is arguable.

A semantic layer is a standardized framework that organizes and abstracts organizational data and serves as a connector for all knowledge assets. It’s a method to bridge content and data silos through a structured and consistent approach that connects data instead of consolidating it, as data warehouses do. The idea of a “layer” is that it is part of an enterprise-wide architecture of information, data and content, that connects horizontally across siloed content and data repositories. Taxonomies and ontologies, in addition to potentially other knowledge organization systems, such as a business glossary, are key components of a semantic layer.
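The “connect, don’t consolidate” idea can be sketched very roughly as follows. The two repositories, the glossary, the concept ids, and the lookup function are all hypothetical stand-ins. A real semantic layer would sit over actual content management and database systems rather than in-memory lists.

```python
from typing import Dict, List

# Two separate repositories that stay where they are (hypothetical data).
CONTENT_REPO: List[Dict[str, object]] = [
    {"title": "Widget user guide", "concepts": ["product:widget"]},
    {"title": "Travel policy", "concepts": ["topic:travel"]},
]
DATA_REPO: List[Dict[str, object]] = [
    {"sku": "W-100", "units_sold": 120, "concept": "product:widget"},
]

# A small business glossary shared by both repositories.
GLOSSARY: Dict[str, str] = {
    "product:widget": "Widget",
    "topic:travel": "Business travel",
}


def lookup(concept_id: str) -> Dict[str, object]:
    """Gather content and data for one concept without merging the repositories."""
    return {
        "label": GLOSSARY.get(concept_id, concept_id),
        "content": [
            doc["title"] for doc in CONTENT_REPO if concept_id in doc["concepts"]
        ],
        "data": [row for row in DATA_REPO if row["concept"] == concept_id],
    }


result = lookup("product:widget")
```

The shared concept id is the only thing the two repositories have in common, and that is enough to bring the user guide and the sales record together in one answer.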

More Talk of Semantics with Taxonomies and Ontologies

I’ve definitely been hearing of “semantics” more in the world of taxonomies and ontologies, and now I am bringing the word more into my own presentations. Following are some past and future examples.
