By Heather Hedden

October 2023

Governance of Metadata and Taxonomies

Metadata and taxonomies support the structure, control, and governance of content, but the metadata and taxonomies themselves also need governance.

What are metadata and taxonomies
Both metadata and taxonomies provide a way to organize, manage and retrieve content or files, whether documents, presentations, images, video recordings, technical drawings, etc.
Metadata is standardized data, in the form of shared attributes, about digital content that needs to be organized and retrieved. Metadata is organized into a set of properties, also called elements or fields. Examples are File name, Author, File format, Content type, Language, Source, Purpose, Location, and Subject. Each metadata property is filled with specific values. Metadata values are applied or tagged to content to serve various purposes, including targeted publishing, promotion, workflow definition, retrieval, comparison, and archiving.
Taxonomies are sets of controlled terms which are hierarchically arranged (linked to each other in broader/narrower relationships) or grouped by category or type. Taxonomies are usually implemented as values for certain metadata properties which require great detail, such as Subject, Activity, Product category, Industry, or Field of specialization. Taxonomy terms are used to describe what content is about. Content is tagged with taxonomy terms so that people can find it by its subjects.
Metadata provides structure to content, and taxonomies are a part of that structure that enables retrieval based on the meaning of content. Since taxonomies are part of metadata, taxonomy governance is a part of metadata governance.

The what and why of governance
Governance is the enforcement of authority over the management of something, such as data, metadata, or taxonomies. The goals include ensuring quality, consistency, and usability, not just in the metadata or taxonomies themselves but also in their effectiveness when implemented in content management and retrieval. Governance comprises policies and procedures for the use and continued management of metadata and taxonomies.
There are two components to governance of metadata and taxonomy:

– Setting policy on how it should be (the rules)
– Setting procedure on how it should be managed and maintained

Setting policy on how the metadata should be means designating the metadata properties, their descriptions, uses, rules, and the types of values for each property. Setting policy on how the taxonomy should be means designating the taxonomy type, levels of hierarchy depth, use of synonyms, use of notes or definitions on terms, and the editorial style (capitalization, etc.) for term names.
Setting the procedure on management and maintenance deals with changes and updates to metadata and taxonomy. This is necessary to preserve consistency in the face of change and to mitigate the risks associated with change. This part of governance also involves the proper stakeholders. Setting procedure for managing metadata and taxonomies includes addressing the roles, responsibilities, and processes. This includes the procedure for making and approving changes and additions, and the roles of those who make and approve decisions for different kinds of changes, including changes to metadata properties, changes to metadata rules, and changes to controlled vocabulary values.

Governance in metadata vs. taxonomies
Governance in metadata is somewhat built into the metadata design or scheme itself. Taxonomies, on the other hand, require more of an external governance plan.
It has become common practice when designing a set of metadata to create a full metadata schema specification, which includes the following determinations:

– The kinds of values managed in each metadata property, e.g. free text, numbers according
   to a designated format, dates according to a designated format, URLs, a choice of two
   values (Boolean “or”), or a controlled vocabulary of terms
– Any limits on the number of characters in each metadata property
– When values are from a controlled vocabulary, the kind of controlled vocabulary it is,
   whether a flat list, a hierarchical taxonomy, or a thesaurus
– Whether a given metadata property is required to have its values applied/tagged to content
   in all cases
– Whether only one or more than one value of a given metadata property may be
   applied/tagged to content
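The determinations listed above can also be recorded in machine-readable form, so that tagging tools can enforce them. The following is a minimal sketch in Python, not a reference to any particular content management system. The class PropertySpec, the function validate, and the vocabulary terms are hypothetical names chosen for illustration. Format checks for dates, numbers, and URLs are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertySpec:
    """One metadata property in a schema specification (hypothetical structure)."""
    name: str
    value_type: str                          # "text", "number", "date", "url", "boolean", or "controlled"
    max_chars: Optional[int] = None          # character limit, if any
    vocabulary: Optional[frozenset] = None   # allowed terms when value_type is "controlled"
    vocabulary_kind: Optional[str] = None    # "flat list", "hierarchical taxonomy", or "thesaurus"
    required: bool = False                   # must every content item carry a value?
    multi_valued: bool = False               # may more than one value be applied?


def validate(spec: PropertySpec, values: list) -> list:
    """Return a list of rule violations; the list is empty if the values conform."""
    problems = []
    if spec.required and not values:
        problems.append(f"{spec.name}: a value is required")
    if not spec.multi_valued and len(values) > 1:
        problems.append(f"{spec.name}: only one value is allowed")
    for value in values:
        if spec.max_chars is not None and len(str(value)) > spec.max_chars:
            problems.append(f"{spec.name}: '{value}' exceeds {spec.max_chars} characters")
        if spec.value_type == "controlled" and value not in (spec.vocabulary or frozenset()):
            problems.append(f"{spec.name}: '{value}' is not in the controlled vocabulary")
    return problems


# Illustrative schema entries. The property names are taken from the examples
# earlier in this article; the vocabulary terms and limits are assumptions.
SUBJECT = PropertySpec(
    name="Subject",
    value_type="controlled",
    vocabulary=frozenset({"Governance", "Metadata", "Taxonomies"}),
    vocabulary_kind="hierarchical taxonomy",
    required=True,
    multi_valued=True,
)
LANGUAGE = PropertySpec(name="Language", value_type="text", max_chars=2, required=True)

print(validate(SUBJECT, ["Metadata", "Ontologies"]))
print(validate(LANGUAGE, ["en", "fr"]))
```

The first call reports that "Ontologies" is not in the controlled vocabulary, and the second reports that Language accepts only one value. Keeping the rules in one specification means the same document serves both as policy and as the input to automated checks.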

As such, well-defined metadata provides good governance for content management, at least for describing how a metadata implementation should be. Governance that comprises procedures for updating the metadata, however, sometimes gets ignored. Adding or changing metadata properties is an important part of metadata governance. Adding or changing the values of controlled vocabularies in metadata properties may be considered part of taxonomy governance.
The part of taxonomy governance which describes how the taxonomy should be designed should be developed at the same time as the taxonomy itself. As questions come up in developing the taxonomy and get resolved, each resolution should be written down as part of the policy documentation.
The part of taxonomy governance which describes processes and procedures for how the taxonomy should be maintained, on the other hand, is of greater importance to taxonomies than it is to metadata in general. This is because taxonomies tend to be large and detailed and thus subject to change and updates. New terms (as metadata values) need to be added relatively frequently, as new trends and topics appear, new content for tagging gets added, new user types or markets are added, and user feedback suggests improvements.

Governance for taxonomy management procedures includes the following:
– The different types of changes that require different procedures and levels of approval
– Methods by which users may submit requests for changes
– The authoritative sources to consult when making a change or addition
– The criteria for adding new controlled vocabulary terms
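As an illustration of how such procedures might be written down precisely, the sketch below routes a change request to an approver according to its type and applies criteria for a new term. Everything specific in it is an assumption made for the example: the change types, the approver roles, the threshold of five content items, and the names ChangeRequest, required_approver, and meets_new_term_criteria. An organization would substitute its own rules.

```python
from dataclasses import dataclass

# Hypothetical mapping of change type to the role that must approve it.
APPROVAL_LEVEL = {
    "add_synonym": "taxonomist",
    "add_term": "taxonomist",
    "rename_term": "taxonomy owner",
    "delete_term": "governance committee",
}

# Assumed criterion: a new term must apply to at least this many content items.
MIN_CONTENT_ITEMS = 5


@dataclass
class ChangeRequest:
    change_type: str
    term: str
    requested_by: str
    source: str = ""          # authoritative source consulted, if any
    content_items: int = 0    # number of content items the term would tag


def required_approver(request: ChangeRequest) -> str:
    """Look up which role must approve this type of change."""
    try:
        return APPROVAL_LEVEL[request.change_type]
    except KeyError:
        raise ValueError(f"unknown change type: {request.change_type}") from None


def meets_new_term_criteria(request: ChangeRequest, existing_terms: set) -> bool:
    """Apply the assumed criteria for adding a new controlled vocabulary term."""
    is_duplicate = request.term.casefold() in {t.casefold() for t in existing_terms}
    return (
        not is_duplicate
        and bool(request.source)
        and request.content_items >= MIN_CONTENT_ITEMS
    )


request = ChangeRequest(
    change_type="add_term",
    term="Knowledge graphs",
    requested_by="content editor",
    source="industry glossary",
    content_items=12,
)
print(required_approver(request))
print(meets_new_term_criteria(request, {"Metadata", "Taxonomies"}))
```

Here the request is routed to the taxonomist and passes the criteria, because the term is not a duplicate, cites a source, and would tag enough content.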

Standards for metadata and taxonomies
Standards can also help support governance, but there are different kinds of standards for different purposes. Metadata standards and taxonomy standards are quite different.
“Metadata standards” generally refers to specific, published metadata schemas that have often been developed by a user community for a certain kind of resource and are encouraged for adoption by organizations which manage that kind of resource. The standard enumerates and defines all of the metadata properties/elements. Examples include MARC and MODS for libraries, IPTC (International Press Telecommunications Council) for photographs, and Dublin Core (DC) Metadata Element Set, which is a generic set of 15 metadata properties for managing digital text resources. By following metadata schema standards, organizations can easily share and exchange content with each other or import and integrate content from outside sources. It is not necessary to follow a published metadata standard if you don’t plan to share your data this way.
Standards followed for taxonomies are of two different kinds: (1) taxonomy design best practices, which are based on thesaurus creation guidelines published in ISO 25964-1 and ANSI/NISO Z39.19, and (2) taxonomy interoperability standards, which are SKOS and its underlying data model RDF. The ISO and ANSI/NISO standards provide guidance on the wording of terms and the structuring of relationships between terms, among other things such as taxonomy display format. SKOS (Simple Knowledge Organization System) and RDF (Resource Description Framework) are data models published by the World Wide Web Consortium (W3C) to support linked data and the interoperability of knowledge organization systems (controlled vocabularies, taxonomies, thesauri, etc.) on the web. As such, SKOS is more similar to a metadata standard.
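To give a sense of what SKOS looks like in practice, the following sketch writes a two-term taxonomy as SKOS in the Turtle syntax for RDF, using only the Python standard library. The SKOS namespace and the properties skos:prefLabel, skos:altLabel, and skos:broader are part of the W3C standard. The example.org namespace, the two terms, and the helper names slug and to_turtle are placeholders for illustration. The sketch does not escape quotation marks in labels, which a real export would have to handle.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/taxonomy/"  # placeholder namespace, not a real vocabulary


def slug(label: str) -> str:
    """Turn a term label into a simple identifier, e.g. 'Descriptive metadata' -> 'descriptive-metadata'."""
    return label.lower().replace(" ", "-")


def to_turtle(terms: dict) -> str:
    """Serialize terms as SKOS concepts in Turtle.

    terms maps each preferred label to a dict with an optional "broader"
    label and an optional list of "synonyms".
    """
    lines = [f"@prefix skos: <{SKOS_NS}> .", f"@prefix ex: <{BASE}> .", ""]
    for label, info in terms.items():
        statements = ["a skos:Concept", f'skos:prefLabel "{label}"@en']
        for synonym in info.get("synonyms", []):
            statements.append(f'skos:altLabel "{synonym}"@en')
        if info.get("broader"):
            broader_id = slug(info["broader"])
            statements.append(f"skos:broader ex:{broader_id}")
        lines.append(f"ex:{slug(label)} " + " ;\n    ".join(statements) + " .")
    return "\n".join(lines)


# Made-up terms: one top term and one narrower term with a synonym.
TERMS = {
    "Metadata": {"broader": None, "synonyms": []},
    "Descriptive metadata": {"broader": "Metadata", "synonyms": ["Descriptive data"]},
}

print(to_turtle(TERMS))
```

Each term becomes a skos:Concept with a preferred label, synonyms become alternative labels, and the hierarchy is expressed with skos:broader. Because these property names are standardized, another SKOS-aware system can import the result without a custom mapping.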

Conclusions
Whether a metadata schema comes from a published standard or is internally developed, it should include rules, definitions, and examples. Metadata properties with controlled vocabularies for their values additionally need policies for how the controlled vocabularies should be updated and who has the authority to make such decisions.