How can you solve your data quality areas of pain using the international data quality standard, ISO 8000?
What is a Data Specification?
Abstract: Taken together, the articles in this series explain how ISO 8000 can be part of your digital strategy. This International Standard can help to increase productivity in your organization and cut the cost of a data cleansing or data onboarding project. In this article, part two of the five-part series, we explain the role of the data specification in improving data quality.
The basic ISO 8000 data architecture is outlined in this model, and you will see that the data specification is at the heart of the architecture.
Figure 1 – Data Architecture – Specification
What is a data specification?
In asset and spare parts management, a data specification is used to ensure that the name of an item (the class name) is consistent and that the features (properties) of that class of item are recorded in a logical fashion. These properties may describe the form of the item, and they may provide other logistics information such as the WCO tariff code or a global classification code such as UNSPSC. There is often confusion over the differing terminology used to describe data specifications. In most cases, when data cleaners refer to their “dictionary”, they are in fact referring to a collection of specification templates. In the diagram above, a single template is referred to as a “data specification”. NOTE: in the standard for open technical dictionaries (OTD), a data specification is known as an identification guide. An identification guide group is therefore a collection of templates.
One issue common to a number of popular data cleaning “dictionaries” is that they do not have definitions for the terms they use; some use guidelines, but virtually none reference authoritative definitions for their terms.
In ISO 8000, each data specification that claims conformance to the standard must use references to concepts defined in a data dictionary. Most templates from a traditional data cleaning house are not referenced this way. To conform to ISO 8000, each term, whether a class name (noun / modifier) or a property (characteristic) name, is referenced to a concept defined in a data dictionary, as in the example property below. For that data dictionary to be ISO 8000 compliant, it must also include an international registration data identifier (IRDI), which is globally unique for each data entry in its metadata. IRDIs are assigned to class and property names, values and units of measure. The protocol for IRDIs is documented in ISO/IEC standard 11179, Part 3. An example of an IRDI for a concept is shown in the figure below (0194-1#02-05ZBLR#1). The IRDIs come from the identification scheme in the data architecture model above.
For ISO 8000 compliant data, an IRDI serves as the key when exchanging data among information systems, organizations, or other parties who wish to share a specific administered item, but who might not utilize the same names or contexts.
(bearing) bore diameter – Concept IRDI: 0194-1#02-05ZBLR#1
Concept Type: Property
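The identifier in the example above has a visible structure: three segments separated by "#", with hyphens inside the first two. The short Python sketch below splits it into those segments. The segment names (registration authority, code space, item code, version) are our reading of that layout and are assumptions for illustration, as are the names Irdi and parse_irdi; the sketch does not validate an identifier against the standard.

```python
from typing import NamedTuple


class Irdi(NamedTuple):
    """Segments of an identifier shaped like 0194-1#02-05ZBLR#1.

    The field names are illustrative assumptions, not quoted from the standard.
    """
    registration_authority: str  # text before the first '#', e.g. "0194-1"
    code_space: str              # e.g. "02"
    item_code: str               # e.g. "05ZBLR"
    version: str                 # text after the last '#', e.g. "1"


def parse_irdi(text: str) -> Irdi:
    """Split an IRDI-shaped string into its segments, or raise ValueError."""
    parts = text.strip().split("#")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three '#'-separated segments: {text!r}")
    authority, data_identifier, version = parts
    code_space, separator, item_code = data_identifier.partition("-")
    if not separator or not code_space or not item_code:
        raise ValueError(f"expected '<code space>-<item code>': {data_identifier!r}")
    return Irdi(authority, code_space, item_code, version)


bore_diameter = parse_irdi("0194-1#02-05ZBLR#1")
print(bore_diameter.item_code)  # 05ZBLR
print(bore_diameter.version)    # 1
```

Keeping the version as its own segment matters in practice: two parties can agree not only on which concept they mean, but on which revision of its definition.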
As outlined in part 1 of this series, a data dictionary can be: an open technical dictionary (OTD) as in ISO 22745; a concept dictionary as in ISO 29002; a parts library (PLIB) as in ISO 13584; a reference data library (RDL) as in ISO 15926; or any data dictionary that describes products and services by means of ontologies of classes and properties. To be a source for ISO 8000 compliant data, the data dictionary must include an international registration data identifier (IRDI) that is globally unique for each data entry in its metadata. IRDIs are assigned to class and property names, values and units of measure. An example of an IRDI for a concept is shown in the figure above (0194-1#02-05ZBLR#1). The IRDIs come from the identification scheme in the data architecture model above. This last element is key. It is the IRDI that allows ISO 8000 compliant data to meet the standard required of digital data: that it is both portable between systems and computer interpretable.
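To illustrate what "portable between systems" can mean in practice, the sketch below exchanges a property value between two systems that use different local names for the same concept. Only the IRDI is taken from the article; the field names ("inner_diameter", "Bore Dia"), the two mapping tables and both functions are hypothetical, and units of measure are left out for brevity.

```python
BORE_DIAMETER = "0194-1#02-05ZBLR#1"  # IRDI from the article's example

# Hypothetical local field names, each mapped to the dictionary concept it means.
SUPPLIER_FIELDS = {"inner_diameter": BORE_DIAMETER}
BUYER_FIELDS = {"Bore Dia": BORE_DIAMETER}


def to_exchange(record: dict, field_map: dict) -> dict:
    """Re-key a local record by IRDI; refuse fields with no dictionary reference."""
    unmapped = [name for name in record if name not in field_map]
    if unmapped:
        raise KeyError(f"no dictionary reference for: {unmapped}")
    return {field_map[name]: value for name, value in record.items()}


def from_exchange(payload: dict, field_map: dict) -> dict:
    """Translate an IRDI-keyed payload back into the receiver's local names."""
    irdi_to_name = {irdi: name for name, irdi in field_map.items()}
    return {irdi_to_name[irdi]: value for irdi, value in payload.items()}


supplier_record = {"inner_diameter": 25.0}
payload = to_exchange(supplier_record, SUPPLIER_FIELDS)
buyer_record = from_exchange(payload, BUYER_FIELDS)

print(payload)       # {'0194-1#02-05ZBLR#1': 25.0}
print(buyer_record)  # {'Bore Dia': 25.0}
```

Neither side needs to know the other's field names. Each only maintains its own mapping to the shared dictionary, which is the role the IRDI plays as an exchange key.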
The largest current example of this type of identification guide group is the NATO Allied Committee 135 set of item identification guides (IIGs), which describe the format and data requirements for exchanging catalogue data for approximately 40,000 “approved item names” (AINs) and 38 million part numbers. The complete set of NATO IIGs, representing the data requirements for all AINs, would be considered an identification guide group. The NATO system contains about 27,000 different properties. The properties can be used in multiple IIGs, and some of the properties (like those that apply to dimensions) are used in a large number of IIGs.
An essential process in the creation of the data specification is the assignment of data types to each property. As discussed in part 1, ISO 8000 makes it clear that the key to good quality data is managing data from the “bottom up”, i.e., from the smallest meaningful element, the property value. This is achieved by assigning the appropriate data types to each property in the data specification. The most common cause of poor data quality is the use of “string” fields. A string field is an uncontrolled data type.
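The difference between a controlled data type and a free "string" field can be sketched as follows. This is a minimal illustration under stated assumptions: the type names ("measure", "controlled", "string"), the PropertySpec class, the validate function and the permitted units are all hypothetical and are not taken from ISO 8000. Only the bore diameter IRDI comes from the article; the second identifier is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertySpec:
    """One property in a data specification (illustrative structure only)."""
    name: str
    irdi: str            # reference to the concept in the data dictionary
    data_type: str       # "measure", "controlled" or "string" (assumed names)
    allowed: tuple = ()  # permitted units (measure) or permitted values (controlled)


def validate(spec: PropertySpec, value) -> list:
    """Return a list of problems; an empty list means the value fits the type."""
    if spec.data_type == "measure":
        if not isinstance(value, tuple) or len(value) != 2:
            return ["expected a (number, unit) pair"]
        number, unit = value
        problems = []
        if isinstance(number, bool) or not isinstance(number, (int, float)):
            problems.append(f"not a number: {number!r}")
        if unit not in spec.allowed:
            problems.append(f"unit not permitted: {unit!r}")
        return problems
    if spec.data_type == "controlled":
        return [] if value in spec.allowed else [f"value not permitted: {value!r}"]
    if spec.data_type == "string":
        # Uncontrolled: any text passes, which is exactly the weakness.
        return [] if isinstance(value, str) else ["expected text"]
    raise ValueError(f"unknown data type: {spec.data_type!r}")


BORE = PropertySpec("bore diameter", "0194-1#02-05ZBLR#1", "measure", ("mm", "in"))
FREE_TEXT = PropertySpec("bore diameter", "PLACEHOLDER-IRDI", "string")  # placeholder id

print(validate(BORE, (25.0, "mm")))         # []
print(validate(BORE, "25mm approx"))        # ['expected a (number, unit) pair']
print(validate(FREE_TEXT, "25mm approx"))   # []
```

The string version accepts "25mm approx" without complaint, so the problem only surfaces later, during cleansing. The typed version rejects it at the point of entry.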
In part 1 of the series, we explained the role that data dictionaries play in improving data quality. In this part, part 2, we have explained the role of the data specification in improving data quality. In part 3, we will explain how to create a data specification in a way that ensures data quality is built into the final master data record. In part 4, we will explain how to create a catalogue item, and how to render short and long descriptions from that catalogue item. In part 5, the last part of this series, we will explain how cataloguing at source can simultaneously cut costs and increase the quality of the output of your data cleansing or data onboarding project.
To sum up parts one and two of this series, in order to improve the quality of your data:
- you shall use a data dictionary with an international registration data identifier (IRDI) for each data dictionary entry;
- your data specifications shall reference a data dictionary.
About the author
Chief Executive MRO Insyte
Peter Eales is a subject matter expert on MRO (maintenance, repair, and operations) material management and industrial data quality. Peter is an experienced consultant, trainer, writer, and speaker on these subjects. Peter is recognised by BSI and ISO as an expert in the subject of industrial data. Peter is a member of ISO/TC 184/SC 4/WG 13, the ISO standards development committee that develops standards for industrial data and industrial interfaces, including ISO 8000, ISO 29002, and ISO 22745, and is also a committee member of ISO/TC 184/WG 6, which is developing the standard for Oil and Gas Interoperability, ISO 18101.
Peter has previously held positions as the global technical authority for materials management at a global EPC, and as the global subject matter expert for master data at a major oil and gas owner/operator. Peter is currently chief executive of MRO Insyte, and chairman of KOIOS Master Data.
ECCMA is a membership organization and is the project leader for ISO 22745 and ISO 8000.
KOIOS Master Data is a world-leading cloud MDM solution enabling ISO 8000 compliant data exchange.
MRO Insyte is an MRO consultancy advising organizations in all aspects of materials management.