Controlled Vocabulary, Thesaurus, Ontology
It took about six months for our team to refine a list of over 925 company terms down to 173 controlled vocabulary terms. During the process, the words taxonomy, lexicon, dictionary and controlled vocabulary were tossed around nearly interchangeably.
“You keep using that word. I don’t think it means what you think it means.”
Inigo Montoya, The Princess Bride
Curious to uncover differences and relationships among these terms, I dug for answers and modeled the relationships.
Controlled Vocabulary
A controlled vocabulary, or authority file, is a restricted list of words used for labeling, indexing, or categorizing. It is controlled because:
- Only terms from the list may be used for the subject area.
- Defined policies delineate who may add words to the list, and when and how they may do so.
Relationships between terms may or may not exist in a controlled vocabulary.
Controlled vocabularies often have synonyms to point from incorrect variant terms to equivalent preferred terms in the controlled vocabulary, but this is not a requirement.
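The pointing from variant terms to preferred terms can be sketched in a few lines of Python. This is only a minimal illustration: the terms, the VARIANTS mapping, and the resolve function are hypothetical and are not taken from our actual list.

```python
# Hypothetical preferred terms; only these may be used for labeling.
PREFERRED_TERMS = {"invoice", "customer", "purchase order"}

# Hypothetical variant terms, each pointing to its preferred equivalent.
VARIANTS = {
    "bill": "invoice",
    "client": "customer",
    "po": "purchase order",
}


def resolve(term):
    """Return the preferred term for `term`, or None if it is not in the vocabulary."""
    key = term.strip().lower()
    if key in PREFERRED_TERMS:
        return key
    # Fall back to the synonym table; unknown terms yield None.
    return VARIANTS.get(key)
```

Returning None for an unknown term reflects the "controlled" part: a word that is neither preferred nor a registered variant is simply not allowed until someone adds it under the list's policies.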
Lexicon
A lexicon is a set of meanings for specific terms in a controlled vocabulary. A specific person or team may be responsible for maintaining a lexicon. For example, the Marketing department may maintain its own lexicon, but its terms are still part of the larger controlled vocabulary. A dictionary is a type of lexicon that also provides pronunciation rules.
Taxonomy
A taxonomy is a classification system. The taxonomy in a controlled vocabulary indicates a hierarchical structure. Terms within a taxonomy relate to other terms in the taxonomy.
Taxonomies are often displayed as a tree structure. Taxonomies allow for the creation and use of facets, which are particularly helpful for information retrieval. For example, a shoe store may provide a search system based on the classification of its shoes, allowing customers to search for women’s pink Nike athletic shoes, a query that combines four facets of a particular shoe (gender, color, brand, and type).
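The shoe store search can be sketched as a simple filter over facet values. The catalog, the facet names (gender, color, brand, category), and the facet_search function are all assumptions made for illustration.

```python
# Hypothetical catalog; every product carries one value per facet.
SHOES = [
    {"name": "Shoe A", "gender": "women", "color": "pink", "brand": "Nike", "category": "athletic"},
    {"name": "Shoe B", "gender": "women", "color": "black", "brand": "Nike", "category": "athletic"},
    {"name": "Shoe C", "gender": "men", "color": "pink", "brand": "Nike", "category": "athletic"},
    {"name": "Shoe D", "gender": "women", "color": "pink", "brand": "Nike", "category": "dress"},
]


def facet_search(items, **facets):
    """Return the items whose facet values match every requested facet."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in facets.items())
    ]


# Women's pink Nike athletic shoes: four facets narrow the catalog to one shoe.
matches = facet_search(SHOES, gender="women", color="pink", brand="Nike", category="athletic")
```

Each facet is independent of the others, so a customer can supply them in any order or leave some out to broaden the results.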
Thesaurus
A thesaurus is a controlled vocabulary that follows a standard structure, in which terms are connected to one another by three kinds of relationships:
- Hierarchical (broader term/narrower term)
- Associative (see also)
- Equivalent (use/used for or see/seen from)
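The three relationship types above can be sketched as a small data structure. The Thesaurus class, its method names, and the sample terms are hypothetical; real thesaurus tools follow published standards and are considerably richer.

```python
from collections import defaultdict


class Thesaurus:
    """Minimal sketch of hierarchical, associative, and equivalent links."""

    def __init__(self):
        self.broader = defaultdict(set)    # term -> broader terms
        self.narrower = defaultdict(set)   # term -> narrower terms
        self.related = defaultdict(set)    # term -> "see also" terms
        self.use = {}                      # variant -> preferred term
        self.used_for = defaultdict(set)   # preferred term -> variants

    def add_hierarchy(self, broader, narrower):
        # Hierarchical links are recorded in both directions.
        self.narrower[broader].add(narrower)
        self.broader[narrower].add(broader)

    def add_related(self, a, b):
        # Associative links are symmetric.
        self.related[a].add(b)
        self.related[b].add(a)

    def add_equivalent(self, preferred, variant):
        self.use[variant] = preferred
        self.used_for[preferred].add(variant)

    def preferred(self, term):
        """Follow a 'use' reference; preferred terms map to themselves."""
        return self.use.get(term, term)


# Illustrative terms only.
thesaurus = Thesaurus()
thesaurus.add_hierarchy("footwear", "shoes")
thesaurus.add_hierarchy("shoes", "sneakers")
thesaurus.add_related("shoes", "socks")
thesaurus.add_equivalent("sneakers", "trainers")
```

Storing each link in both directions keeps lookups cheap, at the cost of having to update two tables whenever a relationship changes.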
Ontology
For a controlled vocabulary, an ontology is the meaning of words and concepts, resulting in a complex thesaurus with a kind of taxonomy. In an ontology, relationships could include “located in” to relate a group to a place, “manufactures/is manufactured by” to relate a business and its goods, and “instructs/instructed by” to relate a school and its students.
Information and meaning are embedded in an ontology, both in the specific relationships in the controlled vocabulary and the broader world of the intended audience. The controlled vocabulary should be meaningful to the user, reflecting the ontology (or meaning) of terms in the user’s world.
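One common way to picture the named relationships described above is as subject, relationship, object triples. The sketch below is only an illustration under that assumption: the entities, the INVERSES table, and both helper functions are hypothetical.

```python
# Each statement is a (subject, relationship, object) triple; names are made up.
TRIPLES = [
    ("Rotary Club", "located in", "Springfield"),
    ("Acme Corp", "manufactures", "anvils"),
    ("Lincoln High", "instructs", "Dana"),
]

# Relationships that have a named inverse.
INVERSES = {
    "manufactures": "is manufactured by",
    "instructs": "instructed by",
}


def with_inverses(triples):
    """Return the triples plus the inverse statement for each invertible one."""
    result = list(triples)
    for subject, relation, obj in triples:
        inverse = INVERSES.get(relation)
        if inverse:
            result.append((obj, inverse, subject))
    return result


def objects_of(triples, subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]


all_triples = with_inverses(TRIPLES)
```

Unlike the three fixed relationship types of a thesaurus, the relationship here is an open label, which is what lets an ontology carry meaning specific to the user’s world.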
"Let me sum up."
Inigo Montoya, The Princess Bride
An ontology is the meaning of words and relationships of concepts in a thesaurus, which resides within a taxonomy, which resides within a controlled vocabulary, which resides within the ontology of the world.
Sources:
http://infogrid.org/trac/wiki/Reference/PidcockArticle
http://www.topquadrant.com/docs/whitepapers/cvtaxthes.pdf