Insyte and Resources
How can you resolve your data quality pain points using the international data quality standard, ISO 8000?
Creating your own Data Specification
How do you create a data specification to ensure good quality master data?
The basic ISO 8000 data architecture is outlined in this model, and you will see that the data specification is at the heart of the architecture.
Figure 1 – Data Architecture – Specification
In asset and spare parts management, a data specification is used to ensure that the name of an item is consistent (the class name) and the features (properties) of that class of item are recorded in a logical fashion. These properties may describe the form of the item, and they may provide other logistics information such as the WCO tariff code, or a global classification code such as UNSPSC.
The composition of a data specification is defined by the organization that uses the data. There are two principal types of data specification for items:
item of supply, a specification defined by the buyer or end-user;
item of production, a specification defined by the manufacturer.
It is normal for an item of supply to contain fewer properties than an item of production describing the same item. An item of supply is constructed to meet the needs of the buying organization and should contain only those properties that the organization “needs to have”. It should not contain properties that are merely “nice to have”. “Need to have” properties are those that are fundamental for that organization, and that it needs to reference in order to perform its transactions. “Nice to have” properties are all the others.
Contrary to common practice and belief, you can create consistent short descriptions for your enterprise resource planning (ERP), warehouse or maintenance system from a series of items of production from different manufacturers of the same class of item. Take this into account if you are an end-user about to embark on a data cleaning project in which you need to create a series of data specifications for the products you purchase. We discuss this in more detail in part 4.
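As a rough illustration of the point above, the sketch below builds the same short description from two manufacturers' items of production by reading only the buyer's “need to have” properties in a fixed order. The property names, values and the `short_description` helper are hypothetical, chosen only for this example; they are not taken from ISO 8000 or from any particular system.

```python
# Hypothetical "need to have" properties for the buyer's item of supply,
# listed in the order they should appear in the short description.
SUPPLY_PROPERTIES = ["bore diameter", "outside diameter", "width"]


def short_description(class_name: str, item_of_production: dict) -> str:
    """Build a short description from the class name and the supply properties only."""
    values = [
        item_of_production[name]
        for name in SUPPLY_PROPERTIES
        if name in item_of_production
    ]
    return f"{class_name}: {', '.join(values)}"


# Two items of production for the same class of item. Each manufacturer
# records extra properties, in a different order, that the buyer ignores.
maker_a = {
    "bore diameter": "25 mm",
    "outside diameter": "52 mm",
    "width": "15 mm",
    "cage material": "steel",
}
maker_b = {
    "seal type": "sealed",
    "width": "15 mm",
    "outside diameter": "52 mm",
    "bore diameter": "25 mm",
}

description_a = short_description("DEEP GROOVE BALL BEARING", maker_a)
description_b = short_description("DEEP GROOVE BALL BEARING", maker_b)
print(description_a)
print(description_b)
```

Because the description is driven by the item of supply rather than by each manufacturer's layout, both records produce the same text.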
Whether you are creating an item of supply, or an item of production, the first choice you have to make is whether you are creating:
a generic concept: a specification with a narrower intension, for example, BEARING, BALL;
or a specific concept: a specification with a broader intension, for example, DEEP GROOVE BALL BEARING.
A generic concept has the disadvantage of requiring more properties per data specification to accommodate the different properties of the specific types of item in that class. For example, if you use the generic concept BEARING, BALL, a “must have” property for the bearing type angular contact ball bearing is the contact angle. This property is not required for any other type of BEARING, BALL. In practical terms, this means that when you use a generic concept, some properties will inevitably not apply to every item when you create the master data record. The data quality problem that arises is this: if a property has no associated value, how do you know whether the value is missing, or whether the property does not apply to that type of product? If you do not know this, how can you measure the completeness of your master data records?
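The completeness question can be made concrete with a small sketch. It assumes a hypothetical `NOT_APPLICABLE` marker that a record can carry for a property that does not apply, so that an empty value can only mean "missing". The property list and the `completeness` function are illustrative assumptions, not something defined by ISO 8000.

```python
# Hypothetical marker meaning "this property does not apply to this item".
NOT_APPLICABLE = object()

# Hypothetical generic specification for the class BEARING, BALL.
GENERIC_SPEC = ["bearing type", "bore diameter", "outside diameter", "contact angle"]


def completeness(record: dict, spec: list) -> float:
    """Share of the applicable properties in `spec` that carry a value."""
    applicable = [p for p in spec if record.get(p) is not NOT_APPLICABLE]
    if not applicable:
        return 1.0
    filled = [p for p in applicable if record.get(p) is not None]
    return len(filled) / len(applicable)


deep_groove = {
    "bearing type": "deep groove",
    "bore diameter": "25 mm",
    "outside diameter": "52 mm",
    "contact angle": NOT_APPLICABLE,  # explicitly marked, so not counted as missing
}
angular_contact = {
    "bearing type": "angular contact",
    "bore diameter": "25 mm",
    "outside diameter": "52 mm",
    # "contact angle" has no value: genuinely missing for this bearing type
}

deep_groove_score = completeness(deep_groove, GENERIC_SPEC)
angular_contact_score = completeness(angular_contact, GENERIC_SPEC)
print(deep_groove_score, angular_contact_score)
```

Without such a marker, both records would simply show an empty contact angle and the two cases could not be told apart.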
Generic concepts also share two other related data quality issues. Generic concepts are usually constructed using generic property names. A defined list of values for a generic property is very hard to collate and manage, so inevitably a string data type is used for the field instead of a controlled value data type. The most common cause of poor master data quality is the use of string fields. A string field is an uncontrolled data type, and it is the data type that controls the quality of the values that appear in the final master data record.
As outlined in part 2 of this series, an essential process in the creation of the data specification is the assignment of data types to each property. As discussed in part 1, ISO 8000 makes it clear that the key to good quality data is managing from the “bottom up”, i.e., from the smallest meaningful element, the property value. This is achieved by assigning the appropriate data type to each property in the data specification. The best data type to ensure data quality is the most appropriate controlled data type for the property-value pair you are creating.
What are the data type choices I need to make?
There are a number of different data types that you can use in your data specification, such as:
- controlled value: a data type whose members are selected from a list of values. These values can be recorded in your data dictionary. Controlled values are one of the preferred data types for ensuring quality data;
- measured value: a data type whose members consist of measurements. Measured values are one of the preferred data types for ensuring quality data;
- date: a data type whose members are day-in-month-in-year values. EXAMPLE: 2008-02-21 (February 21, 2008) in the ISO 8601 date format (YYYY-MM-DD). Dates are one of the preferred data types for ensuring quality data;
- string: a data type whose members are finite sequences of characters and which are independent of language. Not recommended, as this data type is difficult to control. String fields can be used sparingly where your MDM system allows “representation” to control values;
- boolean: a data type whose members are the values true and false. Not recommended, as searching or reporting on a value of true or false is of limited value.
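The preferred data types in the list above can be sketched as small validators attached to each property of a specification. This is a minimal sketch under stated assumptions: the class names, the property names, the lists of values and the units are all hypothetical, and the "number, space, unit" layout for a measured value is a simplification made for this example.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class ControlledValue:
    allowed: frozenset  # the list of values, e.g. held in the data dictionary

    def parse(self, raw: str) -> str:
        value = raw.strip().lower()
        if value not in self.allowed:
            raise ValueError(f"{raw!r} is not in the list of values")
        return value


@dataclass(frozen=True)
class MeasuredValue:
    units: frozenset  # accepted units of measure

    def parse(self, raw: str):
        number, _, unit = raw.strip().partition(" ")
        if unit not in self.units:
            raise ValueError(f"unknown unit of measure in {raw!r}")
        try:
            return Decimal(number), unit
        except InvalidOperation:
            raise ValueError(f"no numeric measurement in {raw!r}") from None


class DateValue:
    def parse(self, raw: str) -> date:
        # Raises ValueError if the text is not an ISO 8601 date.
        return date.fromisoformat(raw.strip())


# Hypothetical specification: each property is bound to one data type.
SPEC = {
    "seal type": ControlledValue(frozenset({"open", "shielded", "sealed"})),
    "bore diameter": MeasuredValue(frozenset({"mm", "in"})),
    "date of manufacture": DateValue(),
}


def validate(record: dict) -> dict:
    """Parse every value, raising ValueError on the first one that breaks its data type."""
    return {name: SPEC[name].parse(raw) for name, raw in record.items()}


parsed = validate({
    "seal type": "Sealed",
    "bore diameter": "25 mm",
    "date of manufacture": "2008-02-21",
})
print(parsed)
```

A free-text string field would accept "rubber seal", "25" or "21/02/08" without complaint; here each of those values is rejected when the record is created.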
In the figure below, you will see an example of an item of production as a specific concept, using mostly specific property names, and mostly list values. Please compare this figure to a similar concept in your system and see if using data specifications with defined data types would help you manage and improve the quality of your master data.
To sum up the first three parts of this series, in order to improve the quality of your data:
- you shall use a data dictionary with an international registration data identifier (IRDI) for each data dictionary entry;
- your data specifications shall reference a data dictionary;
- your data specifications shall have appropriate data types with which to control the values.
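The three requirements above can be pictured together in a short sketch: a data dictionary keyed by identifier, and a data specification that refers to its class and properties only through those identifiers, with a data type assigned to each property. The identifiers below are made-up placeholders, not real IRDIs, and the dictionary layout and function names are assumptions for illustration only.

```python
# Hypothetical data dictionary. The keys stand in for IRDIs; they are
# placeholders and do not follow any real registration scheme.
DICTIONARY = {
    "PLACEHOLDER-IRDI-0001": {"kind": "class", "name": "DEEP GROOVE BALL BEARING"},
    "PLACEHOLDER-IRDI-0002": {"kind": "property", "name": "bore diameter"},
    "PLACEHOLDER-IRDI-0003": {"kind": "property", "name": "seal type"},
}

# Hypothetical data specification: it names nothing itself, it only
# references dictionary entries and assigns a data type to each property.
SPECIFICATION = {
    "class": "PLACEHOLDER-IRDI-0001",
    "properties": {
        "PLACEHOLDER-IRDI-0002": "measured value",
        "PLACEHOLDER-IRDI-0003": "controlled value",
    },
}


def unresolved_references(spec: dict, dictionary: dict) -> list:
    """Return every identifier used by the specification that the dictionary lacks."""
    references = [spec["class"], *spec["properties"]]
    return [ref for ref in references if ref not in dictionary]


def render(spec: dict, dictionary: dict) -> dict:
    """Resolve property identifiers to their dictionary names for display."""
    return {
        dictionary[identifier]["name"]: data_type
        for identifier, data_type in spec["properties"].items()
    }


missing = unresolved_references(SPECIFICATION, DICTIONARY)
readable = render(SPECIFICATION, DICTIONARY)
print(missing)
print(readable)
```

Keeping names in the dictionary and only identifiers in the specification means a name can be corrected or translated in one place without touching the specifications that use it.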
About the author
Chief Executive MRO Insyte
Peter Eales is a subject matter expert on MRO (maintenance, repair, and operations) material management and industrial data quality. Peter is an experienced consultant, trainer, writer, and speaker on these subjects. Peter is recognised by BSI and ISO as an expert in the subject of industrial data. Peter is a member of ISO/TC 184/SC 4/WG 13, the ISO standards development committee that develops standards for industrial data and industrial interfaces, ISO 8000, ISO 29002, and ISO 22745, and is also a committee member of ISO/TC 184/WG 6, which is developing the standard for Oil and Gas Interoperability, ISO 18101.
Peter has previously held positions as the global technical authority for materials management at a global EPC, and as the global subject matter expert for master data at a major oil and gas owner/operator. Peter is currently chief executive of MRO Insyte, and chairman of KOIOS Master Data.
Peter also acts as a consultant for ECCMA, and is a member of the examination board for the ECCMA ISO 8000 MDQM certification.
ECCMA is a membership organization and is the project leader for ISO 22745 and ISO 8000.
KOIOS Master Data is a world leading cloud MDM solution enabling ISO 8000 compliant data exchange.
MRO Insyte is an MRO consultancy advising organizations in all aspects of materials management.