Designing a Controlled Vocabulary for use with Digital Asset Libraries
What is a controlled vocabulary?
Controlled vocabularies are common in Digital Asset Management systems. They offer pre-selected words or phrases for users to choose from, rather than a free-form natural language vocabulary where any term can be supplied. This gives users an explicit way of finding out which terms are used within a repository of information.
Why are they useful?
Controlled vocabularies reduce the likelihood of inaccurate results when users search for assets. They help to ensure that assets are catalogued in a consistent and predictable manner, so that users obtain results that more closely match their needs.
Controlled vocabularies also help users to identify the nature of the repository they are searching, so they can decide more quickly whether it is appropriate. They remove ambiguities that result from varying usage of the same term. For example, the term 'pools' can relate to pools of liquid, pools of resources, a billiards/snooker related game, or football pools betting. A controlled vocabulary helps users identify the context of the terms more easily.
In a marketing communications environment, they can also help to enforce brand identity by reducing the likelihood of non-brand terminology or out-of-date phrases and names being used.
Styles of controlled vocabulary
There are two common styles of controlled vocabulary that are relevant for marketing-oriented Digital Asset Management purposes: wordlists and taxonomies.
A wordlist is usually a simple alphabetical listing of terms that have been used as keywords (or tags) in assets. Simple wordlists are helpful for general repositories where the subject matter has no unifying features. Users either pick a word from a list or enter the first few letters to be shown the closest matching terms.
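The "enter the first few letters" behaviour can be sketched as a prefix lookup over a sorted wordlist. This is a minimal illustration only: the terms in WORDLIST and the suggest function are hypothetical and are not taken from any particular DAM product.

```python
from bisect import bisect_left

# Hypothetical wordlist; in practice this would be the keywords used in the library.
WORDLIST = sorted(
    ["annual report", "architecture", "brand", "brochure", "building", "logo"]
)


def suggest(prefix, wordlist=WORDLIST, limit=5):
    """Return up to `limit` terms from the sorted wordlist that start with `prefix`."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    # Binary search to the first term that is >= the prefix.
    start = bisect_left(wordlist, prefix)
    matches = []
    for term in wordlist[start:]:
        if not term.startswith(prefix):
            break  # the list is sorted, so no later term can match
        matches.append(term)
        if len(matches) == limit:
            break
    return matches
```

Because users can only pick from the terms returned, a spelling that is not in the wordlist never reaches the catalogue.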
A taxonomy can be used for subject-related asset repositories such as corporate media libraries. Instead of picking from a simple wordlist, users select a broad term and then narrow it down through levels (or hierarchies) of detail to identify the specific one they require. The same approach is used for cataloguing as for searching.
It should be noted that the above is a simplification: there are overlaps between the two styles, and sub-divisions that can blur the definitions. In non-commercial fields such as preservation, cultural/heritage or scientific archives, more complex approaches may be essential. However, Daydream has rarely found them necessary for corporate marketing communications media libraries.
As well as the root term itself, there are often other terms that may be associated with it. These include:
- Active tenses
- Americanised terms
- Associated terms
- Common spelling errors
The above are of particular use when a controlled vocabulary is used to catalogue assets, but a free text keyword search is supplied for finding them.
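One way to picture this is a lookup table that maps each associated term back to its root term, so that a free-text search still lands on the catalogued keyword. The sketch below is illustrative only; the VARIANTS data and the resolve function are assumptions made for the example.

```python
# Hypothetical vocabulary: each root term lists variants that should resolve to it.
VARIANTS = {
    "colour": ["color", "colours", "colors", "coulor"],  # Americanised, plural, misspelling
    "swimming": ["swim", "swims", "swam"],               # other forms of the same verb
}


def build_lookup(variants):
    """Flatten the root -> variants mapping into a variant -> root lookup."""
    lookup = {}
    for root, alternatives in variants.items():
        lookup[root.lower()] = root
        for alternative in alternatives:
            lookup[alternative.lower()] = root
    return lookup


LOOKUP = build_lookup(VARIANTS)


def resolve(search_term):
    """Map a free-text search term to its root term, or None if it is not known."""
    return LOOKUP.get(search_term.strip().lower())
```

A search for "Color" or the misspelt "coulor" would then be run against assets catalogued under "colour".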
Structuring a controlled vocabulary
Assuming a taxonomy is used, structuring it is the hardest part of developing a controlled vocabulary for an asset library. Business controlled vocabularies are unlikely to be readily available as 'generic' taxonomies (especially for larger corporations, which have their own company culture and terminology). The following are some ideas for top-level concepts:
- Business units and/or departments
- Regions and countries
- Vertical markets
- Clients and customers
- Alliances and partners
- Buildings and structures
- History and culture
- Non-corporate activities (e.g. charitable work and CSR)
Although these are effective starting points for a controlled vocabulary, the concepts chosen will require continuous refinement and improvement, and will change over time to mirror the evolution of the business itself.
Granularity and the depth of the controlled vocabulary
An important consideration is how in-depth to go with a controlled vocabulary. For a taxonomy system, a hierarchy that is three levels deep is typical, e.g. Concept: Category: Sub-Category. Whether this is suitable or not depends on the quantity of assets and the nature of the business.
For asset libraries that are extensive and highly technical, many levels of hierarchy are appropriate because users need to drill down to highly specialised items. For a small range of assets that are more marketing-oriented, a highly detailed taxonomy is unsuitable: most of the terms will not produce any results, so users will get null searches and time will have been wasted developing a controlled vocabulary that is too detailed.
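A simple check for a taxonomy that has become too detailed is to list the terms that no asset has been catalogued under, since these are the ones that lead to null searches. The following is a rough sketch assuming a three-level Concept: Category: Sub-Category structure; the TAXONOMY and ASSETS data are invented for illustration.

```python
# Hypothetical three-level taxonomy: Concept -> Category -> Sub-Category.
TAXONOMY = {
    "Regions": {
        "Europe": ["United Kingdom", "France"],
        "Americas": ["United States"],
    },
    "Business units": {
        "Engineering": ["Rail", "Water"],
    },
}

# Hypothetical catalogue: asset id -> the taxonomy path it was catalogued under.
ASSETS = {
    "img-001": ("Regions", "Europe", "United Kingdom"),
    "img-002": ("Regions", "Europe", "United Kingdom"),
    "img-003": ("Business units", "Engineering", "Rail"),
}


def leaf_paths(taxonomy):
    """Yield every (concept, category, sub_category) path in the taxonomy."""
    for concept, categories in taxonomy.items():
        for category, sub_categories in categories.items():
            for sub_category in sub_categories:
                yield (concept, category, sub_category)


def unused_terms(taxonomy, assets):
    """Return the taxonomy paths that no asset is catalogued under."""
    used = set(assets.values())
    return [path for path in leaf_paths(taxonomy) if path not in used]
```

If a large share of the paths come back as unused, that suggests the hierarchy is deeper than the quantity of assets justifies.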
Note that just because the relationship between terms is hierarchical, it may not be beneficial to present it in this way to the user via the interface (i.e. the cascading 'folder' style method that is commonly employed). See the article "Digital Asset Catalogues That Reflect The Needs Of Your Business" for reasons why implicit hierarchies can often be more effective than explicit ones in controlled vocabularies for digital asset libraries.
Project or usage based controlled vocabularies
The key criterion for judging the design of a controlled vocabulary is whether or not it helps users to find the assets they are looking for. To help meet this challenge, the important questions to ask are:
- Why do users need these assets?
- What projects or situations do they use them for?
- How do their uses change across different areas of the business (e.g. sales presentations, training, marketing collateral etc.)?
To achieve this, it can be useful to record the specific project an asset was used for each time it is selected. This should not be a free-text description: a structured approach is needed to avoid ambiguity. When users subsequently look for assets, they can see which ones were used for particular kinds of project. The results can be interpreted positively or negatively. For example, if an asset has been used excessively for certain types of project, it may be beneficial to avoid it. Alternatively, it might save users time when searching for assets for a given type of usage if others have already done this work earlier.
Related to the above is the use of guide files. In this case, another usage context (e.g. an annual report or marketing brochure) can be used to help locate assets. Users browse examples of other uses and can obtain a list of the assets used within them. This helps where users can recall where an asset was used but not the keywords that would help them find it.
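As a sketch of what "structured rather than free text" could mean in practice, project types can be restricted to a fixed set of values and each selection logged against one of them. The ProjectType values, record_usage and usage_counts below are hypothetical names chosen for this example.

```python
from collections import Counter
from enum import Enum


class ProjectType(Enum):
    """Hypothetical fixed list of project types (not free text)."""
    SALES_PRESENTATION = "sales presentation"
    TRAINING = "training"
    MARKETING_COLLATERAL = "marketing collateral"


# Each entry is (asset_id, ProjectType); a real system would store this in a database.
usage_log = []


def record_usage(asset_id, project_type):
    """Record that an asset was selected for a given type of project."""
    if not isinstance(project_type, ProjectType):
        raise ValueError("project_type must be a ProjectType, not free text")
    usage_log.append((asset_id, project_type))


def usage_counts(asset_id):
    """Count how often an asset has been used for each project type."""
    return Counter(ptype for logged_id, ptype in usage_log if logged_id == asset_id)
```

A high count for one project type could be read either way: as a sign that the asset is overused, or as a shortcut for someone working on a similar project.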
Employing user feedback to identify terms and improve search quality
A good tactic for improving search quality is to build closed-loop controlled vocabularies that allow users to contribute feedback (subject to moderation by an administrator). For example, when users locate an asset, they can be offered the opportunity to suggest keywords that are not currently used and possibly should be included (or existing keywords that should be excluded). Another possibility is to let users vote on whether an asset is suitable or unsuitable for their project.
In scenarios where there is limited time or resources available to carry out an in-depth study into a suitable controlled vocabulary, this can be an effective strategy for improving the existing taxonomy.
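The feedback loop described above might be modelled as a queue of keyword suggestions that only take effect once an administrator approves them. This is a minimal sketch under that assumption; the Asset, Suggestion and ModerationQueue classes are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    keywords: set = field(default_factory=set)


@dataclass
class Suggestion:
    asset_id: str
    keyword: str
    action: str  # "add" or "remove"


class ModerationQueue:
    """Holds user suggestions until an administrator approves or rejects them."""

    def __init__(self, assets):
        self.assets = {asset.asset_id: asset for asset in assets}
        self.pending = []

    def suggest(self, asset_id, keyword, action="add"):
        if action not in ("add", "remove"):
            raise ValueError("action must be 'add' or 'remove'")
        if asset_id not in self.assets:
            raise KeyError(asset_id)
        self.pending.append(Suggestion(asset_id, keyword.strip().lower(), action))

    def review(self, suggestion, approve):
        """Apply an approved suggestion; a rejected one is simply discarded."""
        self.pending.remove(suggestion)
        if not approve:
            return
        keywords = self.assets[suggestion.asset_id].keywords
        if suggestion.action == "add":
            keywords.add(suggestion.keyword)
        else:
            keywords.discard(suggestion.keyword)
```

Keeping the administrator in the loop means the vocabulary can grow from real usage without losing the consistency that makes it controlled.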
Criticisms of controlled vocabularies and metadata
By this point it should be clear that controlled vocabularies offer a number of benefits to marketing managers who have been given the task of developing a digital asset management system for their company's brand assets. There are some, however, who criticise their use because they believe a controlled vocabulary is too rigid for practical use.
Apart from user searches, the biggest issue that most organisations find with this type of structured approach is getting those involved in the digital asset supply chain to use it when uploading material. The process of adding relevant metadata is time consuming and frequently inconsistent. In the widely quoted article "Metacrap: Putting the torch to seven straw-men of the meta-utopia", Cory Doctorow asserts that metadata in general lacks value because asset suppliers tend to be too lazy (or, more likely, too busy) to carry out the cataloguing process correctly.
While these are valid points in the context of the internet and web search engines, the situation is slightly different for corporate media libraries. With these types of repositories, there is a greater opportunity to control the environment and the way in which material is catalogued. The scope of the library is also relatively well defined and focussed on helping employees and suppliers to find suitable material for marketing campaigns. In those cases where controlled vocabularies are not delivering the search results users hoped for, it is more likely that the design is inappropriate for the context it will be used in. This can occur for one of the following reasons:
- The controlled vocabulary is a generic model derived from a solution vendor's standard 'off the shelf' system and does not model the structures and culture of the business.
- The taxonomy is composed of too many hierarchical levels - to the extent that ordinary users give up using it to either catalogue assets or search for them.
- The analysis and planning phase has either been carried out too quickly or been based on information that is inaccurate or incomplete.
- Asset suppliers have not been given adequate training or guidance about how to use the controlled vocabulary and in particular when it is appropriate to add terms or use existing ones.
Perhaps the best single piece of advice to avoid problems with controlled vocabularies is to get as much feedback from end-users as early as possible. Any taxonomy you develop will be constantly evolving as a result of more keywords being added, terms changing, old classifications being merged into new ones, and corporate changes such as mergers and acquisitions. Use a flexible and well designed solution that supports the ability to change your controlled vocabulary on demand. Make sure that the vendor of your Digital Asset Management system is sympathetic to the needs of your business and the culture, terminology and branding that is already in widespread use, rather than trying to get you to fit into a generic model they already have.
The following sites have more detailed information on controlled vocabularies, their use and how to apply them:
http://www.controlledvocabulary.com - mainly covers controlled vocabularies for image libraries, also contains links to a variety of tools and a Yahoo group.
http://www.boxesandarrows.com - a site about information architecture. This contains an article which gives an overview of controlled vocabularies from a more abstract perspective.
About the Author
Ralph Windsor is a senior partner at Daydream, a digital asset management implementation consultancy. He has eighteen years' experience of delivering DAM and content technology solutions, acquired as a developer, project manager and consultant working with global clients such as WS Atkins, Major League Baseball, BNP Paribas and The British Museum.
To find out more about Daydream and our service, please email firstname.lastname@example.org or telephone us on: +44 (0)20 7096 1471.