FprCEN/CLC/TR 18115
Data governance and quality for AI within the European context
This document provides an overview of AI-related standards, with a focus on data and data life cycles, for organizations, agencies, enterprises, developers, universities, researchers, focus groups, users, and other stakeholders that are experiencing this era of digital transformation.
It describes the links among the many international standards and regulations published or under development, with the aim of promoting a common language and a greater culture of quality, and of providing an information framework.
It addresses the following areas:
- data governance;
- data quality;
- elements of data and data set properties that support unbiased evaluation and provide information for testing.
Datenmanagement und -qualität für KI im europäischen Kontext
Gouvernance et qualité des données pour l'IA dans le contexte européen
Upravljanje in kakovost podatkov za UI v evropskem okviru
Standards Content (Sample)
SLOVENIAN STANDARD
01 September 2024
Upravljanje in kakovost podatkov za UI v evropskem okviru
Data governance and quality for AI within the European context
Datenmanagement und -qualität für KI im europäischen Kontext
Gouvernance et qualité des données pour l'IA dans le contexte européen
This Slovenian standard is identical to: FprCEN/CLC/TR 18115
ICS:
35.240.01 Application of information technology in general
2003-01. Slovenian Institute for Standardization. Reproduction of this standard, in whole or in part, is not permitted.
TECHNICAL REPORT FINAL DRAFT
RAPPORT TECHNIQUE
TECHNISCHER REPORT
June 2024
ICS 35.240.01
English version
Data governance and quality for AI within the European context
This draft Technical Report is submitted to CEN members for Vote. It has been drawn up by the Technical Committee
CEN/CLC/JTC 21.
CEN and CENELEC members are the national standards bodies and national electrotechnical committees of Austria, Belgium,
Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy,
Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Republic of North Macedonia, Romania, Serbia,
Slovakia, Slovenia, Spain, Sweden, Switzerland, Türkiye and United Kingdom.
Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of which they are
aware and to provide supporting documentation.
Warning: This document is not a Technical Report. It is distributed for review and comments. It is subject to change without
notice and shall not be referred to as a Technical Report.
CEN-CENELEC Management Centre:
Rue de la Science 23, B-1040 Brussels
© 2024 CEN/CENELEC All rights of exploitation in any form and by any means reserved worldwide for CEN national Members and for CENELEC Members.
Ref. No. FprCEN/CLC/TR 18115:2024 E
Contents
European foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 General
3.2 Data governance
3.3 Data quality
4 Abbreviations
5 JRC research and data-related standards on AI
5.1 General
5.2 Research: Data quality requirements for inclusive, non-biased and trustworthy AI
5.3 Data-related standards on AI for data governance and data quality
5.3.1 General
5.3.2 A short description of the standards mentioned in Figure 4 (taken from www.iso.org)
6 Data governance
7 Data quality
8 Elements for data, data sets, information for testing and evaluation
9 Data governance and data quality for large European contexts
9.1 General
9.2 Italian government: Strategy program on Artificial Intelligence
9.3 Italian agency application of data quality model for public administrations
9.4 Spanish experience on data Governance: Data Office
9.5 European governance relating to the Directive on inclusivity and accessibility
10 General considerations on innovative technology: Ethics, Governance, AI Act
11 Potential challenges
11.1 General
11.2 Stakeholders’ engagement
11.3 Contextualization
11.4 Critical infrastructures
11.5 Ethics and regulatory challenges
11.6 Interoperability
11.7 Big volume of data
12 Best practices from organizations, industries and research activities
12.1 General
12.2 AI in healthcare: the MES-CoBraD approach
12.3 Overview of industries that stand out for their approach to data governance
Bibliography
Figures
Figure 1 — Connections of legislations, standards, guidelines, monitoring specifications
Figure 2 — Active organizations mentioned in JRC
Figure 3 — Standards and Technical reports mentioned in JRC
Figure 4 — Clusters of standards, TS, TR data-related
Figure 5 — Example of relationships among quality aspects of ISO/IEC 5259-2, ISO/IEC 25059 and AI Act
Figure 6 — European legal references and ISO standards for AI on data quality (or complementary)
Figure 7 — Data governance framework
Figure 8 — Data governance flow at European level
Figure 9 — Data managing integration and synthesis of experiences
Figure 10 — Data Governance summary
Figure 11 — Data Quality Measures and Data Life Cycle Model
Figure 12 — Relationship among quality models, characteristics, QM, QME, property, target entity
Figure 13 — Data life cycle framework
Figure 14 — Example of conceptual perspective visualization of data testing and evaluation
Figure 15 — Visualization of elements for governance of data, data sets, testing
Figure 16 — Example of ontological contextual schema of elements resulting in the conference online held in October 2020 with 100 speakers [13]
Tables
Table 1 — Main documents considered for data governance framework
Table 2 — Type of governance and multi-level point of view
Table 3 — Characteristics of the data quality model adapted from ISO/IEC 25012
Table 4 — Characteristics of data quality models from ISO 8000, ISO/IEC 25012 and ISO/IEC 5259-2
European foreword
This document (FprCEN/CLC/TR 18115:2024) has been prepared by Technical Committee CEN/JTC 21
“Artificial Intelligence”, the secretariat of which is held by DS.
This document is currently submitted to the vote on TR.
Introduction
This document aims to provide an overview of the relevant regulations in the European context and
connected international standards, paying particular attention to data governance and data quality
topics. Relevant regulations considered are:
— “Council of Europe” Ad hoc Committee on AI (CAI), which produced “Recommendation CM/Rec (2020)
of the Committee of Ministers to member States on the human rights impact of algorithmic systems”
and the deliverable “Possible elements of a legal framework on Artificial Intelligence, based on the
Council of Europe’s standards on human rights, democracy and the rule of law” (2021) [1].
— “European strategy for data” (2020), which is essential to govern new technologies and create
business opportunities.
— “Artificial Intelligence Act” (2024 upcoming final version), which aims to ensure that AI systems
placed on the market and used in the EU are safe and respect fundamental rights. Attention is given
specifically to:
— Article 10 “Data and data governance”, which describes quality criteria for training,
validation and testing data sets.
— Article 15 “Accuracy, robustness, and cybersecurity” describing essential quality
characteristics that can be extended to a general data quality model; consistency between
terms and definitions is a common goal of this document, as well as of future TS and EN
standards.
— Articles where standard quality characteristics are mentioned (see Figure 5).
— “Data Governance Act” (2022), which provides a framework that aims:
— to increase trust in data sharing across areas;
— to develop common European data spaces in strategic domains (e.g. health, environment,
energy, agriculture, mobility, finance, manufacturing, public administration);
— to strengthen mechanisms to increase data availability and overcome technical obstacles to the
reuse of data.
— “Data Act” (2023): key elements include reinforced data portability and data sharing, rules
governing the processing of shared data, model contracts, access to and use of data held by private
companies, data and cloud interoperability, databases containing data from IoT, and restrictions on
data sharing.
— “Open Data Directive” ((EU) 2019/1024): provides common rules for a European market for
government-held data, including the re-use of public sector information.
In addition, Regulation (EU) 2016/679 of the European Parliament and of the Council on the protection of
natural persons with regard to the processing of personal data and on the free movement of such data, and
repealing Directive 95/46/EC (GDPR), is also considered in this document. The General Data
Protection Regulation (GDPR), which entered into force in May 2016, creates a harmonized set of rules
applicable to the processing of all European personal data. The objective of the GDPR is to ensure that personal
data enjoys a high standard of protection everywhere in the EU, increasing legal certainty for both
individuals and organizations processing data, a
...