ISO/TS 8000-81:2021
Data quality - Part 81: Data quality assessment: Profiling
This document specifies a procedure for data profiling to generate the foundation for performing data quality assessment. This profiling is applicable to data sets that are either originally in a structure of tables and columns or are the output from a transformation to create such a structure.
NOTE 1 Data profiling is applicable to all types of database technology.
The following are within the scope of this document:
- performing structure analysis to determine data element concepts;
- performing column analysis to identify relevant data elements, including statistics about a data set;
- performing relationship analysis to identify dependencies in a data set.
The following are outside the scope of this document:
- methods for extracting and sampling data to be profiled from a data set;
- deriving data rules;
- measuring the extent of nonconformities in a data set.
NOTE 2 ISO 8000-8 specifies approaches to measuring data and information quality.
This document can be used in conjunction with, or independently of, quality management systems standards.
General Information
- Status: Published
- Publication Date: 19-May-2021
- Technical Committee: ISO/TC 184/SC 4 - Industrial data
- Drafting Committee: ISO/TC 184/SC 4/WG 13 - Industrial Data Quality
- Current Stage: 9093 - International Standard confirmed
- Start Date: 08-May-2025
- Completion Date: 13-Dec-2025
Overview
ISO/TS 8000-81:2021 - "Data quality - Part 81: Data quality assessment: Profiling" defines a standardized procedure for data profiling as the foundation for data quality assessment. The technical specification applies to data sets organized as tables and columns (or produced by transformations into that form) across all database technologies. The aim is to produce a data profile that helps organizations identify data quality improvement opportunities and support subsequent rule creation and governance.
Key topics and requirements
- Scope of profiling
- Structure analysis, column analysis and relationship analysis are the three mandatory processes.
- Applicable to tabular data and outputs of transformations into tables/columns.
- Structure analysis
- Inputs: data set and optional column metadata (names, descriptions).
- Output: a data element concept that captures the conceptual domain for subsequent analysis.
- Column analysis
- Inputs: data set + data element concept.
- Activities: extract data elements, compare elements to actual values, determine the value domain.
- Outputs: constraints of value domain including cardinalities (row counts, distinct counts, nulls), storage characteristics (data types, lengths, decimals), and valid-value definitions (lists, ranges, patterns). Methods include discovery, assertion testing and visual inspection; automation tools can assist.
- Relationship analysis
- Inputs: data set + data elements from column analysis.
- Activities: identify dependencies and correspondence between data structure and real-world items (requires collaboration with domain experts).
- Outputs: list of dependencies - primary/foreign keys, functional dependencies, derived columns, and synonym relationships (redundant or domain synonyms).
- Explicit exclusions
- The specification does not cover data extraction/sampling methods, deriving data rules, or measuring the extent of nonconformities.
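The column analysis outputs listed above (cardinalities, storage characteristics and value patterns) can be illustrated with a small sketch. This is not a procedure taken from the specification: the function names `profile_column` and `to_pattern`, the treatment of empty strings as nulls, and the `9`/`A` pattern notation are all assumptions made for this example.

```python
import re
from collections import Counter


def to_pattern(value: str) -> str:
    """Generalize a value into a shape: digits become 9, letters become A."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)


def profile_column(values):
    """Summarize one column of string values.

    Assumption for this sketch: None and the empty string both count as null.
    """
    non_null = [v for v in values if v is not None and v != ""]
    lengths = [len(v) for v in non_null]
    return {
        # Cardinalities
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        # Storage characteristics
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "all_integer": bool(non_null)
        and all(re.fullmatch(r"[+-]?[0-9]+", v) for v in non_null),
        # Candidate valid-value patterns, with how often each occurs
        "patterns": Counter(to_pattern(v) for v in non_null),
    }


# Hypothetical sample column
postal_codes = ["12345", "98765", None, "AB1 2CD", "12345", ""]
profile = profile_column(postal_codes)
print(profile)
```

A pattern that occurs only rarely (here `AA9 9AA`) is a candidate for review rather than proof of an error; deciding what the valid value domain actually is remains a judgment for people who know the data.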
Practical applications
- Establishing a repeatable data profiling baseline for data governance, data quality management, and data stewardship.
- Informing data migration, master data management (MDM), analytics, BI, and regulatory compliance efforts by revealing structural issues, value-domain constraints and inter-column dependencies.
- Supporting the design of validation rules and remediation plans (profiling provides inputs for rule derivation, though rule creation itself is outside the spec).
- Useful for automated tooling adoption: selection, configuration and evaluation of data profiling tools.
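To make the idea of inter-column dependencies concrete, the following sketch searches a small table for single-column candidate keys and single-column functional dependencies. It is an illustration only, not the method of the specification; the function names (`is_unique`, `holds_fd`, `find_dependencies`) and the sample rows are hypothetical.

```python
def is_unique(rows, columns):
    """True if the combination of columns never repeats (a candidate key)."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def holds_fd(rows, determinant, dependent):
    """True if every determinant value maps to exactly one dependent value."""
    mapping = {}
    for row in rows:
        # setdefault stores the first dependent value seen for this determinant
        if mapping.setdefault(row[determinant], row[dependent]) != row[dependent]:
            return False
    return True


def find_dependencies(rows):
    """Return (candidate key columns, functional dependencies as (a, b) pairs)."""
    columns = list(rows[0])
    keys = [c for c in columns if is_unique(rows, [c])]
    # A key trivially determines every column, so keys are skipped as determinants.
    fds = [
        (a, b)
        for a in columns
        for b in columns
        if a != b and a not in keys and holds_fd(rows, a, b)
    ]
    return keys, fds


# Hypothetical sample data
rows = [
    {"order_id": 1, "zip": "10115", "city": "Berlin"},
    {"order_id": 2, "zip": "10115", "city": "Berlin"},
    {"order_id": 3, "zip": "80331", "city": "Munich"},
    {"order_id": 4, "zip": "10117", "city": "Berlin"},
]
keys, fds = find_dependencies(rows)
print(keys, fds)
```

Dependencies found this way hold only for the rows examined. They are candidates to confirm with domain experts, which matches the collaboration that relationship analysis calls for.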
Who should use this standard
- Data quality managers, data stewards, data engineers, DBAs, business analysts, BI teams, auditors and architects responsible for data governance, MDM, ETL/ELT and analytics.
Related standards
- ISO 8000 series (context for data quality)
- ISO 8000-2 (Vocabulary)
- ISO 8000-8 (Measuring data and information quality)
- ISO 8000-61 and ISO/TS 8000-1 (data quality management and series overview)
Keywords: ISO/TS 8000-81, data profiling, data quality assessment, value domain, column analysis, relationship analysis, data governance, data stewardship.
Frequently Asked Questions
ISO/TS 8000-81:2021 is a technical specification published by the International Organization for Standardization (ISO). Its full title is "Data quality - Part 81: Data quality assessment: Profiling". It specifies a procedure for data profiling that provides the foundation for data quality assessment, covering structure analysis, column analysis and relationship analysis of data sets organized as tables and columns.
ISO/TS 8000-81:2021 is classified under the following ICS (International Classification for Standards) categories: 25.040.40 - Industrial process measurement and control. The ICS classification helps identify the subject area and facilitates finding related standards.
Standards Content (Sample)
TECHNICAL SPECIFICATION ISO/TS 8000-81
First edition, 2021-05
Data quality - Part 81: Data quality assessment: Profiling
© ISO 2021
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Data profiling
5 Structure analysis
5.1 Inputs
5.2 Scope of activities
5.3 Outputs
6 Column analysis
6.1 Inputs
6.2 Scope of activities
6.3 Outputs
7 Relationship analysis
7.1 Inputs
7.2 Scope of activities
7.3 Outputs
Annex A (informative) Document identification
Annex B (informative) Constraints of value domain
Annex C (informative) Dependency
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out
through ISO technical committees. Each member body interested in a subject for which a technical
committee has been established has the right to be represented on that committee. International
organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of
electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of
any patent rights identified during the development of the document will be in the Introduction and/or
on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 184, Automation systems and integration,
Subcommittee SC 4, Industrial data.
A list of all parts in the ISO 8000 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
Digital data delivers value by enhancing all aspects of organizational performance including:
— operational effectiveness and efficiency;
— safety;
— reputation with customers and the wider public;
— compliance with statutory regulations;
— consumer costs, revenues and stock prices.
The influence on performance originates from data being the formalized representation of information;
this information enables organizations to make reliable decisions. This decision making can be
performed by human beings directly and also by automated data processing including artificial
intelligence systems.
Through widespread adoption of digital computing and associated communication technologies,
organizations become dependent on digital data. This dependency amplifies the negative consequences
of lack of quality in this data. These consequences are the decrease of organizational performance.
The biggest impact of digital data comes from the data having a structure that reflects the nature of the
subject matter and from the data also being computer processable (machine readable) rather than just
being for a person to read and understand.
The content of ISO 9000 explains that quality is not an abstract concept of absolute perfection. Quality
is actually the conformance of characteristics to requirements and, thus, any item of data can be of high
quality for one use but not for another use that has differing requirements.
EXAMPLE 1 When storing start times for meetings, a calendar application requires less precision than a
control system would for storing the times at which to activate a propulsion unit during a spaceflight.
The nature of digital data is fundamental to establishing requirements that are relevant to the specific
decisions that are made by each organization.
EXAMPLE 2 ISO/TS 8000-1 identifies that data has syntactic (format), semantic (meaning) and pragmatic
(usefulness) characteristics.
To support the delivery of high-quality data, the ISO 8000 series addresses:
— data governance, data quality management and maturity assessment;
EXAMPLE 3 ISO 8000-61 specifies a process reference model for data quality management.
— creating and applying requirements for data and information;
EXAMPLE 4 ISO 8000-110 specifies how to exchange characteristic data that is master data.
— monitoring and measuring data and information quality;
EXAMPLE 5 ISO 8000-8 specifies approaches to measuring data and information quality.
— improving data and, consequently, information quality;
EXAMPLE 6 This document specifies an approach to data profiling, which identifies opportunities to
improve data quality.
— issues that are specific to the type of content in a data set.
EXAMPLE 7 ISO/TS 8000-311 specifies how to address quality considerations for product shape data.
Data quality management covers all aspects of data processing, including creating, collecting, storing,
maintaining, transferring, exploiting and presenting data to deliver information.
Effective data quality management is systemic and systematic, requiring an understanding of the
root causes of data quality issues. This understanding is the basis for not just correcting existing
nonconformities but also implementing solutions that prevent future reoccurrence of those
nonconformities.
EXAMPLE 8 If a data set includes dates in multiple formats including “yyyy-mm-dd”, “mm-dd-yy” and
“dd-mm-yy”, then data cleansing can correct the consistency of the values. However, such cleansing requires
additional information to resolve ambiguous entries (e.g. “04-05-20”) and cannot address any process issues and
people issues, including training, that have caused the inconsistency.
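The ambiguity described in EXAMPLE 8 can be demonstrated with a short sketch that tests each value against the three formats named in the example. The sketch is an illustration and not part of the specification; the `FORMATS` table and the function `matching_formats` are hypothetical names, and Python's `datetime.strptime` is used here as one possible way to test a format.

```python
from datetime import datetime

# The three formats from EXAMPLE 8, expressed as strptime directives.
FORMATS = {
    "yyyy-mm-dd": "%Y-%m-%d",
    "mm-dd-yy": "%m-%d-%y",
    "dd-mm-yy": "%d-%m-%y",
}


def matching_formats(value):
    """Return the names of all formats under which the value parses as a date."""
    matches = []
    for name, fmt in FORMATS.items():
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            continue
        matches.append(name)
    return matches


for value in ["2020-05-04", "12-25-20", "25-12-20", "04-05-20"]:
    print(value, matching_formats(value))
```

A value that matches more than one format, such as "04-05-20", cannot be resolved from the data alone, which is why the example notes that cleansing needs additional information.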
As a contribution to this overall capability of the ISO 8000 series, this document specifies an approach
to data profiling, which involves applying analysis techniques to data in actual use. This analysis
generates a profile consisting of the structure, columns and relationships of the data. The profile
provides the basis for identifying opportunities to improve data quality by establishing new explicit
rules for the data. The approach also typically produces greater effect from repeated application to
uncover issues progressively.
Organizations can use this document on its own or in conjunction with other parts of the ISO 8000
series.
This document supports activities that affect:
— one or more information systems;
— data flows within the organization and with external organizations;
— any phase of the data life cycle.
By implementing parts of the ISO 8000 series, an organization achieves the following benefits:
— establishing reliable foundations for digital transformation;
— recognizing how data in digital form has become a fundamental asset class that organizations rely
on to deliver value;
— securing evidence-based trustworthiness of data and information for all stakeholders;
— creating portable data that protects against the loss of intellectual property and that is reusable
across the organization and applications;
— achieving traceability of data back to original sources;
— ensuring all stakeholders work with common understanding of explicit data requirements.
ISO/TS 8000-1 provides a detailed explanation of the structure and scope of the ISO 8000 series.
Annex A contains an identifier that unambiguously identifies this document in an open information
system.
TECHNICAL SPECIFICATION ISO/TS 8000-81:2021(E)
Data quality —
Part 81:
Data quality assessment: Profiling
1 Scope
This document specifies a procedure for data profiling to generate the foundation for performing data
quality assessment. This profiling is applicable to data sets that are either originally in a structure of
tables and columns or are the output from a transformation to create such a structure.
NOTE 1 Data profiling is applicable to all types of database technology.
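As an illustration of a "transformation to create such a structure", the sketch below flattens nested records into columns and rows. The specification does not prescribe any particular transformation; the dotted column naming, the use of `None` for missing values, and the function names `flatten` and `to_table` are assumptions made for this example.

```python
def flatten(record, prefix=""):
    """Flatten a nested dict into a single-level dict with dotted column names."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        else:
            row[name] = value
    return row


def to_table(records):
    """Return (column names, rows); values missing from a record become None."""
    flat_rows = [flatten(r) for r in records]
    columns = []
    for row in flat_rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    return columns, [[row.get(c) for c in columns] for row in flat_rows]


# Hypothetical nested records
records = [
    {"id": 1, "address": {"city": "Oslo", "zip": "0150"}},
    {"id": 2, "address": {"city": "Bergen"}},
]
columns, table = to_table(records)
print(columns)
print(table)
```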
The following are within the scope of this document:
— performing structure analysis to determine data element concepts;
— performing column analysis to identify relevant data elements, including statistics about a data set;
— performing relationship analysis to identify dependencies in a data set.
The following are outside the scope of this document:
— methods for extracting and sampling data to be profiled from a data set;
— deriving data rules;
— measuring the extent of nonconformities in a data set.
NOTE 2 ISO 8000-8 specifies approaches to measuring data and information quality.
This docu
...
ISO/TS 8000-81:2021 specifies a procedure for data profiling that provides the foundation for data quality assessment. The procedure applies to data sets that are structured as tables and columns or that have been transformed into such a structure. It covers structure analysis to determine data element concepts, column analysis to identify relevant data elements and gather statistics about the data set, and relationship analysis to identify dependencies in the data set. It does not cover methods for extracting and sampling data, deriving data rules, or measuring the extent of nonconformities in a data set. ISO 8000-8 specifies approaches to measuring data and information quality, and ISO/TS 8000-81 can be used in conjunction with, or independently of, quality management systems standards.