When I consulted for ecommerce
taxonomy clients years ago, and they would refer to the taxonomy facets
for products as “attributes,” I felt that might be confusing, because I
considered “attributes” something else: a characteristic like metadata
of a taxonomy term or a feature of an ontology. I have since come to
realize that facets in some cases, especially in ecommerce, can be
considered attributes, and they can even be managed in an ontology that
is combined with the taxonomy.
Facets in a faceted taxonomy are
various taxonomy term “types” that function as refinements or filters in
the user interface for limiting search results to content that shares
similar types of terms or attributes. Users refine or filter their
searches by selecting a term or value from each of two or more facets.
In a periodical article research database offered by a library, facets
might be subject, geographic place, named person, named organization,
article type, and publication name. Within an enterprise intranet or
enterprise content management system, facets might be topic, department,
office location, and document type. In a health information service,
facets might be symptom, body part, patient age, and patient gender. In
a corporate knowledge base for customer service, facets might be
product type, brand name, market, region, issue type, and customer type.
In most of these cases, a topical taxonomy is one of the facets, even
if that topical taxonomy is hierarchical. The primary taxonomy design
challenge in such cases is deciding what kind of information would be
useful if separated out in its own facet, and what can remain in the
generic topics facet. Using the SKOS (Simple Knowledge Organization
System) model, each concept scheme serves as a facet.
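To make the mechanics of faceted refinement concrete, here is a minimal Python sketch. The article records, the facet names, and the refine function are all invented for illustration; in a SKOS-based setup, each facet key would correspond to its own concept scheme.

```python
from typing import Dict, List

# Hypothetical articles, each tagged with one term per facet.
ARTICLES: List[Dict[str, str]] = [
    {"title": "Wind power expansion", "subject": "Energy",
     "place": "Germany", "article_type": "News"},
    {"title": "Solar subsidies reviewed", "subject": "Energy",
     "place": "Spain", "article_type": "Analysis"},
    {"title": "Rail strike update", "subject": "Transport",
     "place": "Germany", "article_type": "News"},
]


def refine(items: List[Dict[str, str]],
           selections: Dict[str, str]) -> List[Dict[str, str]]:
    """Keep only the items that match the selected term in every chosen facet."""
    return [
        item for item in items
        if all(item.get(facet) == term for facet, term in selections.items())
    ]


# The user selects one term from each of two facets.
results = refine(ARTICLES, {"subject": "Energy", "place": "Germany"})
print([item["title"] for item in results])
```

Each additional facet selection narrows the result set further, which is what distinguishes facets from a single hierarchical browse path.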
In a
product, ecommerce or marketplace taxonomy, the hierarchical taxonomy of
product types is not one of the facets. This large, hierarchical
taxonomy is typically the starting point for user browsing. While not
constituting a facet, this hierarchy is linked to the facets. The user
navigates or drills down through a hierarchical tree of product
categories, until a specific product type is found, and then the facets
(attributes) relevant to that product type are made available to the
user, allowing the user to refine the search further, by selecting from
each of several attributes, such as size, color, material, price, and
features. This requires a different approach to the taxonomy design than
for the faceted taxonomies described above, and it is these
post-hierarchical-browse refinements that are better known as, and more
appropriately called, attributes.
Attributes can serve as refinements/filters in taxonomies for
purposes other than ecommerce, such as job board taxonomies (attributes
for job location, skill level, salary range, job type, employer/company,
industry, date posted, etc.), an internal enterprise expert-finder
(attributes for job title, department, office/work location, skills,
subject areas of interest, academic degree, languages, etc.) or
taxonomies of movies (attributes for genre, production company,
production country, language, award winner type, release date, etc.).
The
attributes generally pertain to specific named entities, such as the
name of a product offered by a specific seller, the name of a job title
at a specific employer, or the title of a movie. There can be attributes
for more than one kind of named entity in the same set of taxonomies,
such as for employer name in addition to job title in a job board
taxonomy. Subjects, which are not named entities, usually do not have
attributes, but some do, especially in the fields of science and
medicine, where they would be attributes on the names of species,
chemicals, viruses, diseases, etc. I will discuss named entities in more
detail in my next blog post.
Issues to consider in creating attributes
In
a taxonomy where attributes are important, there are various issues to
consider. First, shall there be a hierarchical topical taxonomy
presented as an initial browse interface to the users? While this is
typical for product/ecommerce taxonomies, it is not usually the case for
job board taxonomies nor for a taxonomy for movies. However, it may be
desired for a taxonomy of nonfiction books or periodicals, which users
more often would categorize by subject. A producer or publisher of
educational content will likely have a hierarchical taxonomy of
disciplines or subject areas. A research-focused organization would also
likely have a hierarchical subject taxonomy in addition to the
facet-attributes dealing with location, type, funding source/sponsor,
researcher name, etc. Having a hierarchical taxonomy outside of the
attributes tends to be a user experience design decision, but it has an
important impact on how the overall taxonomy is designed and managed.
More
attributes may be created than are usable for filtering/refining results.
For example, products will likely have SKU numbers among their
attributes, which can be displayed and perhaps even made searchable, but
would not be one of the filtering-facets presented in the user
interface. In a taxonomy for finding internal experts, contact
information, such as an email address and phone number, would be
attributes on each person’s profile, but these would not be searchable
fields. Rather, they would display when the person profile is selected.
Thus, another issue when creating attributes is determining which will
display and function as filters/refinements and which will display only
as additional metadata on a selected item.
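One way to record this decision is to flag each attribute definition with how it may be used. The following Python sketch is only an illustration: the AttributeDef class, its flag names, and the sample attributes are my own assumptions, not the data model of any particular tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AttributeDef:
    """Hypothetical attribute definition with usage flags."""
    name: str
    filterable: bool   # offered as a filter/refinement in the user interface
    searchable: bool   # matched by keyword search
    # Every attribute is assumed to display on the selected item.


# Sample product attributes (illustrative only).
PRODUCT_ATTRIBUTES: List[AttributeDef] = [
    AttributeDef("color", filterable=True, searchable=True),
    AttributeDef("size", filterable=True, searchable=False),
    AttributeDef("sku", filterable=False, searchable=True),
]


def filter_names(attrs: List[AttributeDef]) -> List[str]:
    """Attributes that appear as filters/refinements."""
    return [a.name for a in attrs if a.filterable]


def display_only_names(attrs: List[AttributeDef]) -> List[str]:
    """Attributes shown only as metadata on a selected item."""
    return [a.name for a in attrs if not a.filterable]


print(filter_names(PRODUCT_ATTRIBUTES))
print(display_only_names(PRODUCT_ATTRIBUTES))
```

Keeping the flags on the attribute definition, rather than in the user interface code, means the same attribute list can drive filters, search, and item display.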
If an initial
hierarchical topical taxonomy is presented to the users, the question
arises of at what point the hierarchy of categories should end and
further details should instead be handled with attributes. This
question often comes up when designing ecommerce
taxonomies. For example, to distinguish gas and electric stoves, should
each of these types be a narrower term of stoves, or should energy
source be an attribute of stoves?
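The two options for stoves can be compared side by side in a small Python sketch. The product records and function names here are hypothetical and serve only to show what each modeling choice looks like in the data.

```python
from typing import Dict, List

# Option A: energy source expressed as narrower terms of "Stoves".
BROADER: Dict[str, str] = {
    "Gas stoves": "Stoves",
    "Electric stoves": "Stoves",
}
PRODUCTS_A: List[Dict[str, str]] = [
    {"name": "Stove 1", "category": "Gas stoves"},
    {"name": "Stove 2", "category": "Electric stoves"},
]

# Option B: a single "Stoves" category with an energy_source attribute.
PRODUCTS_B: List[Dict[str, str]] = [
    {"name": "Stove 1", "category": "Stoves", "energy_source": "Gas"},
    {"name": "Stove 2", "category": "Stoves", "energy_source": "Electric"},
]


def gas_stoves_by_category(products: List[Dict[str, str]]) -> List[str]:
    """Option A: the user browses down to the narrower category."""
    return [p["name"] for p in products if p["category"] == "Gas stoves"]


def gas_stoves_by_attribute(products: List[Dict[str, str]]) -> List[str]:
    """Option B: the user browses to Stoves, then filters on the attribute."""
    return [
        p["name"] for p in products
        if p["category"] == "Stoves" and p.get("energy_source") == "Gas"
    ]


print(gas_stoves_by_category(PRODUCTS_A))
print(gas_stoves_by_attribute(PRODUCTS_B))
```

Both approaches return the same products; the difference is whether the distinction appears to the user in the browse tree or among the filters.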
Another issue to resolve is
determining which attributes should be generic across all categories,
and which should be category specific. For example, on which product
categories in an ecommerce taxonomy is it appropriate to have an
attribute for gender (as for clothing or a gift for a woman or for a
man)? Related to that is the question of which categories should have
their own unique attributes. Are some attributes relevant to the major
(broadest) categories, others relevant only to the most specific
categories, and yet others applicable to various
miscellaneous products? For example, color might be relevant for
products in different parts of the hierarchy.
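One possible way to handle generic, broad-category, and specific-category attributes together is to let each category inherit the attributes of its broader categories. The Python sketch below assumes this inheritance approach; the category names, attribute lists, and the attributes_for function are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical hierarchy: category -> broader category (None for a top category).
PARENT: Dict[str, Optional[str]] = {
    "Clothing": None,
    "Shirts": "Clothing",
    "Furniture": None,
    "Tables": "Furniture",
}

# Attributes that apply to every category.
GENERIC_ATTRIBUTES: List[str] = ["price", "brand"]

# Attributes attached at a specific level of the hierarchy.
CATEGORY_ATTRIBUTES: Dict[str, List[str]] = {
    "Clothing": ["gender", "size", "color"],
    "Shirts": ["sleeve length"],
    "Furniture": ["material"],
    "Tables": ["shape", "color"],
}


def attributes_for(category: str) -> List[str]:
    """Collect generic attributes plus those of the category and its ancestors."""
    chain: List[str] = []
    node: Optional[str] = category
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    result = list(GENERIC_ATTRIBUTES)
    # Walk from the broadest category down to the selected one.
    for cat in reversed(chain):
        for attr in CATEGORY_ATTRIBUTES.get(cat, []):
            if attr not in result:
                result.append(attr)
    return result


print(attributes_for("Shirts"))
print(attributes_for("Tables"))
```

In this sample data, color is attached to Clothing as a whole but only to Tables within Furniture, so the same attribute shows up in different parts of the hierarchy at different levels.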
Attributes should
be managed as belonging to different types based on their values, such
as whether the values come from controlled vocabularies or are dates,
currency amounts, numbers, or simply a Boolean yes/no, such as Remote
being an attribute of jobs. If a hierarchical taxonomy resides outside of the attributes,
then controlled vocabulary attributes are an additional part of a larger
set of taxonomies/controlled vocabularies. How this is managed varies
based on the taxonomy/ontology management tool. For example, such term
lists might need to be managed as separate concept schemes in a SKOS
taxonomy, even though they are used in ontology-based attributes. It can
start getting complicated when an attribute type has different values
for different categories in the same hierarchical taxonomy
implementation. For example, the attribute of Material could have
different values for tables than for clothing, and both categories are
offered by the same ecommerce seller.
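As a rough sketch of typed attributes with category-specific values, the following Python example models the Material and Remote cases. The class names, the value lists, and the validation logic are assumptions made for illustration and do not reflect how any specific taxonomy/ontology tool stores attributes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ValueType(Enum):
    CONTROLLED_VOCABULARY = "controlled vocabulary"
    DATE = "date"
    CURRENCY = "currency"
    NUMBER = "number"
    BOOLEAN = "boolean"


@dataclass
class Attribute:
    """Hypothetical attribute with a value type and optional per-category values."""
    name: str
    value_type: ValueType
    # Only used for controlled vocabulary attributes.
    values_by_category: Dict[str, List[str]] = field(default_factory=dict)

    def allowed_values(self, category: str) -> List[str]:
        return self.values_by_category.get(category, [])


# Material has different allowed values for tables than for clothing.
material = Attribute(
    "Material",
    ValueType.CONTROLLED_VOCABULARY,
    {
        "Tables": ["Wood", "Glass", "Metal"],
        "Clothing": ["Cotton", "Wool", "Polyester"],
    },
)
remote = Attribute("Remote", ValueType.BOOLEAN)


def is_valid(attribute: Attribute, category: str, value: object) -> bool:
    """Check a value against the attribute's type; other types are not validated here."""
    if attribute.value_type is ValueType.BOOLEAN:
        return isinstance(value, bool)
    if attribute.value_type is ValueType.CONTROLLED_VOCABULARY:
        return value in attribute.allowed_values(category)
    return True


print(is_valid(material, "Tables", "Wood"))
print(is_valid(material, "Clothing", "Wood"))
```

Each per-category value list here plays the role that a separate term list or concept scheme might play in a SKOS-based tool.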
Attributes add another
level or layer of expressivity to a taxonomy or set of controlled
vocabularies, which brings it closer to an ontology. The distinction
between taxonomy and ontology is not necessarily clear. It’s fine to
have just some ontology-like features, such as attributes, but it is
recommended to use a taxonomy/ontology management tool, such as PoolParty,
which manages taxonomies and ontologies (and anything in between)
according to Semantic Web/World Wide Web Consortium (W3C) standards.