ISO/IEC 20547-3:2020
Information technology - Big data reference architecture - Part 3: Reference architecture
This document specifies the big data reference architecture (BDRA). The reference architecture includes concepts and architectural views.
The reference architecture specified in this document defines two architectural viewpoints:
- a user view defining roles/sub-roles, their relationships, and types of activities within a big data ecosystem;
- a functional view defining the architectural layers and the classes of functional components within those layers that implement the activities of the roles/sub-roles within the user view.
The BDRA is intended to:
- provide a common language for the various stakeholders;
- encourage adherence to common standards, specifications, and patterns;
- provide consistency of implementation of technology to solve similar problem sets;
- facilitate the understanding of the operational intricacies in big data;
- illustrate and understand the various big data components, processes, and systems, in the context of an overall big data conceptual model;
- provide a technical reference for government departments, agencies and other consumers to understand, discuss, categorize and compare big data solutions; and
- facilitate the analysis of candidate standards for interoperability, portability, reusability, and extendibility.
Technologies de l'information — Architecture de référence des mégadonnées — Partie 3: Architecture de référence
Overview
ISO/IEC 20547-3:2020 - "Information technology - Big data reference architecture - Part 3: Reference architecture" defines a vendor-neutral, high-level Big Data Reference Architecture (BDRA). It provides a standardized framework to describe big data ecosystems using two complementary viewpoints:
- User view - defines roles (for example BDAP, BDFP, BDSP, BDP, BDC), their sub-roles, and their activities and relationships within a big data ecosystem.
- Functional view - defines architectural layers (application, processing, platform, resource, and multi-layer functions) and classes of functional components that implement the user-view activities.
The BDRA is deliberately non-prescriptive: it is a common language and conceptual model to help design, compare, and evaluate big data solutions across organizations.
Key Topics
- Roles and responsibilities: standardized role taxonomy (big data application provider, big data framework provider, big data service partner, big data provider, big data consumer) and sub-roles for collection, preparation, analytics, visualization, access, infrastructure, platform and processing.
- Layered functional architecture: clear separation of application, processing, platform, resource layers and multi‑layer components to map capabilities and interfaces.
- Cross-cutting aspects: security and privacy, management, and data governance are integrated into the reference architecture as essential concerns.
- Interoperability and portability: framework supports analysis of candidate standards to improve interoperability, reusability and extendibility of big data solutions.
- Standards alignment: encourages adherence to existing standards, specifications and patterns; provides consistency for solving similar problem sets.
- Informative annexes: mappings to other system integration reference architectures and example role relationships to assist practical adoption.
Applications and Who Uses It
ISO/IEC 20547-3:2020 is useful for:
- Enterprise and solution architects designing or documenting big data systems.
- Government departments and agencies evaluating, procuring, or benchmarking big data platforms and services.
- Vendors and platform providers aligning product capabilities to a standard architecture.
- System integrators and consultants comparing solutions, ensuring interoperability and defining integration points.
- Data governance, security and compliance teams assessing controls, privacy, and management requirements across architectures.
Practical uses include architecture design, procurement criteria, gap analysis, interoperability assessments, and communicating requirements between technical and non‑technical stakeholders.
Related standards
- ISO/IEC 20547 series (Parts 1, 2, 4, 5) and ISO/IEC 20546 (vocabulary and concepts)
- ISO/IEC 17789 (Cloud computing - reference architecture)
- Governance and data quality standards referenced in the document (e.g., ISO/IEC 38500, ISO 8000)
Keywords: ISO/IEC 20547-3:2020, Big data reference architecture, BDRA, big data architecture, functional view, user view, data governance, interoperability, security and privacy.
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 20547-3
First edition
2020-03
Information technology — Big data reference architecture — Part 3: Reference architecture
Technologies de l'information — Architecture de référence des mégadonnées — Partie 3: Architecture de référence
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO/IEC 2020 – All rights reserved
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Conventions
6 Big data reference architecture concepts
6.1 General
6.2 Views
6.3 Overview of user view
6.4 Overview of functional view
6.5 Relationship between the user view and the functional view
6.6 Relationship of the user view and functional view to cross-cutting aspects
7 User view
7.1 Big data roles, sub-roles, and activities
7.2 Role: big data application provider (BDAP)
7.2.1 General
7.2.2 Sub-role: big data collection provider (BDCP)
7.2.3 Sub-role: big data preparation provider (BDPreP)
7.2.4 Sub-role: big data analytics provider (BDAnP)
7.2.5 Sub-role: big data visualization provider (BDVP)
7.2.6 Sub-role: big data access provider (BDAcP)
7.3 Role: big data framework provider (BDFP)
7.3.1 General
7.3.2 Sub-role: big data infrastructure provider (BDIP)
7.3.3 Sub-role: big data platform provider (BDPlaP)
7.3.4 Sub-role: big data processing provider (BDProP)
7.4 Role: big data service partner (BDSP)
7.4.1 General
7.4.2 Sub-role: big data service developer (BDSD)
7.4.3 Sub-role: big data auditor (BDA)
7.4.4 Sub-role: big data system orchestrator (BDSO)
7.5 Role: big data provider (BDP)
7.6 Role: big data consumer (BDC)
8 Cross-cutting aspects
8.1 General
8.2 Security and privacy
8.3 Management
8.4 Data governance
9 Functional view
9.1 Functional architecture
9.1.1 General
9.1.2 Layering architecture
9.1.3 Multi-layer functions
9.2 Functional components
9.2.1 General
9.2.2 Big data application layer functional components
9.2.3 Big data processing layer functional components
9.2.4 Big data platform layer functional components
9.2.5 Resource layer functional components
9.2.6 Multi-layer functional components
Annex A (informative) Mapping big data RA functional view to other system integration RA
Annex B (informative) Examples of the relationship of roles in big data ecosystem
Annex C (informative)
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 42, Artificial intelligence.
A list of all parts in the ISO/IEC 20547 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The ISO/IEC 20547 series is intended to provide users with a standardized approach to developing
and implementing big data architectures and provide references for approaches. ISO/IEC TR 20547-1
provides users with an overview of the reference architecture framework described in this document
and a process for applying that framework in developing an architecture. ISO/IEC TR 20547-2 provides
a collection of big data use cases and decomposes those use cases into technical considerations that
big data architects and system implementers can consider. This document describes the reference
architecture in terms of User and Functional views. Those views can be used by the big data architect
to describe their specific system. ISO/IEC 20547-4 describes the security and privacy aspects unique
to big data. ISO/IEC TR 20547-5 provides a list of standards and their relationship to the reference
architecture that architects and implementers can consider as part of the design and implementation of
their system.
Each of these parts is built on the common vocabulary and concepts described in ISO/IEC 20546.
In general terms, a reference architecture provides an authoritative source of information about a specific
subject area that guides and constrains the instantiations of multiple architectures and solutions
(see 3.2). Reference architectures generally serve as a reference foundation for solution architectures
and can also be used for comparison and alignment purposes.
The key goal of this reference architecture is to facilitate a shared understanding across multiple
products, organizations, and disciplines about current architectures and future direction.
The reference architecture presented in this document, named the big data reference architecture (BDRA),
provides an architecture framework for describing the big data components, processes, and systems in order
to establish a common language for the various stakeholders. It does not represent the system
architecture of a specific big data system. Instead, it is a tool for describing, discussing, and developing
system-specific architectures using an architecture framework of reference. It provides generic high-
level architectural views that are an effective tool for discussing the requirements, structures, and
operations inherent to big data. The model is not tied to any specific vendor products, services or
reference implementation, nor does it define prescriptive solutions that inhibit innovation.
INTERNATIONAL STANDARD ISO/IEC 20547-3:2020(E)
Information technology — Big data reference architecture — Part 3: Reference architecture
1 Scope
This document specifies the big data reference architecture (BDRA). The reference architecture
includes concepts and architectural views.
The reference architecture specified in this document defines two architectural viewpoints:
— a user view defining roles/sub-roles, their relationships, and types of activities within a big data
ecosystem;
— a functional view defining the architectural layers and the classes of functional components within
those layers that implement the activities of the roles/sub-roles within the user view.
The BDRA is intended to:
— provide a common language for the various stakeholders;
— encourage adherence to common standards, specifications, and patterns;
— provide consistency of implementation of technology to solve similar problem sets;
— facilitate the understanding of the operational intricacies in big data;
— illustrate and understand the various big data components, processes, and systems, in the context
of an overall big data conceptual model;
— provide a technical reference for government departments, agencies and other consumers to
understand, discuss, categorize and compare big data solutions; and
— facilitate the analysis of candidate standards for interoperability, portability, reusability, and
extendibility.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 8000-2, Data quality — Part 2: Vocabulary
ISO/TS 8000-60, Data quality — Part 60: Data quality management: Overview
ISO 8000-61, Data quality — Part 61: Data quality management: Process reference model
ISO/IEC 38500, Information technology — Governance of IT for the organization
ISO/IEC 38505-1, Information technology — Governance of IT — Governance of data — Part 1: Application
of ISO/IEC 38500 to the governance of data
ISO/IEC TR 38505-2, Information technology — Governance of IT — Governance of data — Part 2:
Implications of ISO/IEC 38505-1 for data management
ISO 55000, Asset management — Overview, principles and terminology
ISO 55001, Asset management — Management systems — Requirements
ISO 55002, Asset management — Management systems — Guidelines for the application of ISO 55001
ISO/IEC/IEEE 42010, Systems and software engineering — Architecture description
ISO/IEC 20546, Information technology — Big data — Overview and vocabulary
ISO/IEC 17789, Information technology — Cloud computing — Reference architecture
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 8000-2, ISO/TS 8000-60,
ISO 8000-61, ISO/IEC 38500, ISO/IEC 38505-1, ISO/IEC TR 38505-2, ISO 55000, ISO 55001, ISO 55002,
ISO/IEC/IEEE 42010, ISO/IEC 20546, ISO/IEC 17789 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
data
reinterpretable representation of information (3.3) in a formalized manner suitable for communication,
interpretation, or processing
[SOURCE: ISO/IEC 2382:2015, 2121272]
3.2
reference architecture
authoritative source of information about a specific subject area that guides and constrains the
instantiations of multiple architectures and solutions
Note 1 to entry: This document utilizes the definition of reference architecture from DoD "reference architecture
description" [7].
Note 2 to entry: Reference architectures generally serve as a foundation for solution architectures and can also
be used for comparison and alignment of instantiations of architectures and solutions.
3.3
information
data (3.1) that are processed, organized and correlated to produce meaning
Note 1 to entry: Information concerns facts, concepts, objects, events, ideas, processes, etc.
3.4
activity
specified pursuit or set of tasks
[SOURCE: ISO/IEC 17789:2014, 3.2.1]
3.5
knowledge
maintained, processed, and interpreted information (3.3)
[SOURCE: ISO 5127:2017, 3.1.1.17]
3.6
functional component
functional building block needed to engage in an activity (3.4), backed by an implementation
[SOURCE: ISO/IEC 17789:2014, 3.2.3]
3.7
data governance
property or ability that needs to be coordinated and implemented by a set of activities (3.4) aimed to
design, implement and monitor a strategic plan for data asset management
Note 1 to entry: Governance of data is described in ISO/IEC 38505-1.
Note 2 to entry: Data asset is understood as a set of data items, or data entities, that have a real or potential
benefit for an organization. Data asset is a subset of asset defined in ISO 55000. A benefit is an advantage to the
organization of the actionable knowledge derived from an analytic system. It is often ascribed to big data due to
the understanding that data has potential benefit that was typically not considered previously.
Note 3 to entry: A strategic plan for data asset management is a document specifying how data management (3.15)
is to be aligned to the organizational strategy. This term has the same meaning as strategic asset management
plan (SAMP) defined in ISO 55000, taken from a data point of view.
3.8
data quality
degree to which the characteristics of data satisfy stated and implied needs when used under specified
conditions
[SOURCE: ISO/IEC 25024:2015, 4.11]
3.9
data quality management
coordinated activities to direct and control an organization with regard to data quality
[SOURCE: ISO 8000-2:2018, 3.4.9]
3.10
party
natural person or legal person, whether or not incorporated, or a group of either
[SOURCE: ISO/IEC 17789:2014, 7.2.3]
3.11
policy
intention and direction of an organization as formally expressed by its top management
[SOURCE: ISO 55000:2014, 3.1.18, modified — The term has been changed to the singular form and the
final stop has been removed from the definition.]
3.12
role
set of activities (3.4) that serves a common purpose
[SOURCE: ISO/IEC 17789:2014, 3.2.7]
3.13
stream
list of flow objects attached to a port of a flow object
[SOURCE: ISO/IEC 10179:1996, 4.33, modified — by deleting leading article and trailing full stop.]
3.14
sub-role
subset of the activities (3.4) of a given role (3.12)
[SOURCE: ISO/IEC 17789:2014, 3.2.9]
3.15
data management
set of activities (3.4) aimed to implement the big data architecture that best meets business goals by
following the strategic plan for data management assessment
3.16
data lifecycle
stages in the management of data
Note 1 to entry: The target of lifecycle (defined in ISO 55000) is data in this document.
3.17
application programming interface
API
boundary across which application software uses facilities of programming languages to invoke
services
[SOURCE: ISO/IEC 18012-2:2012, 3.1.4, modified — Note 1 to entry has been removed and the final stop
has been deleted from the definition.]
4 Abbreviated terms
ACID atomicity, consistency, isolation, and durability
API application programming interface
CEP complex event processing
CPU central processing unit
BDA big data auditor
BDAP big data application provider
BDAcP big data access provider
BDAnP big data analytics provider
BDC big data consumer
BDCP big data collection provider
BDFP big data framework provider
BDIP big data infrastructure provider
BDP big data provider
BDPlaP big data platform provider
BDPreP big data preparation provider
BDProP big data processing provider
BDRA big data reference architecture
BDSD big data service developer
BDSO big data system orchestrator
BDSP big data service partner
BDVP big data visualization provider
DG data governance
DM data manager
DQM data quality manager
PII personally identifiable information
RA reference architecture
5 Conventions
The diagrams that appear in this document are presented using the conventions that are shown in
Table 1. This notation is used as described in ISO/IEC 17789.
Table 1 — Legend to the diagrams used throughout this document
Object Meaning
Party
Role
Sub-Role
Activity
Functional component
Cross-cutting aspect
6 Big data reference architecture concepts
6.1 General
This document defines a BDRA that serves as a fundamental reference point for big data standardization
and which provides an overall architecture framework for the basic concepts and principles of a big
data system.
This document describes the logical relationships between the roles/sub-roles, activities, and functional
components, and cross-cutting aspects that comprise a big data system architecture.
Standards can be relevant to some of these relationships. Standards associated with a relationship can
be used to:
— specify degrees of information flow or other types of interoperability; and/or
— ensure specified degrees of quality (e.g. security or service level).
Logical relationships defined in this architecture are a significant part of specifying the BDRA and its
behaviour. The relationship describes matters such as the categories of information flows between the
functional components in the BDRA.
6.2 Views
Big data can be described using a viewpoint approach. Four distinct viewpoints are used in the BDRA
(see Figure 1 and Table 2):
Key
1 user view
2 functional view
3 implementation view
4 deployment view
Figure 1 — Transformations between architectural views
Table 2 — BDRA views
BDRA view | Description of the BDRA view | Scope
User view | The ecosystem of big data with the stakeholders (used in ISO/IEC/IEEE 42010), the roles, the sub-roles and the big data activities | Within scope
Functional view | The functions necessary for the support of big data activities | Within scope
Implementation view | The functions necessary for the implementation of big data within service parts and/or infrastructure parts | Out of scope
Deployment view | How the functions of big data are technically implemented within already existing infrastructure elements or within new elements to be introduced in this infrastructure | Out of scope
NOTE While details of the user view and functional view are addressed within this document, the
implementation and deployment views are related to technology and vendor-specific big data implementations
and actual deployments, and are therefore out of the scope of this document.
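The scope split in Table 2 can be written down as a small lookup, which may help when tagging architecture documents by view. The following Python sketch is illustrative only; the names `View`, `IN_SCOPE` and `views_in_scope` are assumptions made for this example and are not defined by the standard.

```python
from enum import Enum


class View(Enum):
    """The four BDRA views listed in Table 2."""
    USER = "user view"
    FUNCTIONAL = "functional view"
    IMPLEMENTATION = "implementation view"
    DEPLOYMENT = "deployment view"


# Scope column of Table 2: True means the view is addressed by this document.
IN_SCOPE = {
    View.USER: True,
    View.FUNCTIONAL: True,
    View.IMPLEMENTATION: False,
    View.DEPLOYMENT: False,
}


def views_in_scope():
    """Return the views that Table 2 marks as within scope, in table order."""
    return [view for view in View if IN_SCOPE[view]]


print([view.value for view in views_in_scope()])
```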
6.3 Overview of user view
The user view addresses the ecosystem of big data with the following concepts:
— parties: a party is a natural person or legal person, whether or not incorporated, or a group of
either or both. Parties in a big data ecosystem are its stakeholders;
— roles and sub-roles: a role is a set of big data activities that serves a common purpose. A sub-role
is a subset of the big data activities for a given role, and different sub-roles can share the big data
activities associated with a given role;
— activities: an activity is defined as a specified pursuit or set of tasks. Big data activities need to have
a purpose and deliver one or more outcomes, and these are conducted using functional components;
— cross-cutting aspects: cross-cutting aspects can be shared and can impact multiple roles, and
big data activities. Cross-cutting aspects may map to multi-layer functions and their associated
functional components which implement the activities within the cross-cutting aspect.
NOTE A party can assume more than one role at any given point in time and can engage in a specific subset of
activities of that role. Examples of parties include, but are not limited to, large corporations, small- and medium-
sized enterprises, government departments, academic institutions and private citizens.
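As a rough illustration of how the user view entities relate, the Python sketch below models a role as a set of activities, a sub-role as a subset of its parent role's activities, and a party that assumes more than one role. The class names, the party name "Example Corp" and the activity name "provide data" are assumptions chosen for this example; the other activity names are taken from 7.2.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    name: str
    activities: frozenset  # activities that serve the role's common purpose

    def sub_role(self, name, activities):
        """Create a sub-role; its activities must be a subset of this role's."""
        activities = frozenset(activities)
        if not activities <= self.activities:
            raise ValueError(f"{name}: activities outside role {self.name}")
        return SubRole(name, self, activities)


@dataclass(frozen=True)
class SubRole:
    name: str
    parent: Role
    activities: frozenset


@dataclass
class Party:
    name: str
    roles: list = field(default_factory=list)  # a party can hold several roles

    def assume(self, role):
        self.roles.append(role)


bdap = Role(
    "big data application provider",
    frozenset({"find data source", "capture data", "transform data", "validate data"}),
)
bdcp = bdap.sub_role("big data collection provider", {"find data source", "capture data"})
bdp = Role("big data provider", frozenset({"provide data"}))  # illustrative activity name

party = Party("Example Corp")  # hypothetical party
party.assume(bdap)
party.assume(bdp)
print([role.name for role in party.roles])
```

The subset check in `sub_role` mirrors the definition of a sub-role; it does not prevent two sub-roles from sharing activities, which the text above allows.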
Figure 2 illustrates the entities that are defined for the user view.
Key
1 party
2 role
3 sub-role
4 activity
5 cross-cutting aspect
Figure 2 — User view entities
6.4 Overview of functional view
The functional view is a technology-neutral view of the functions necessary to form a big data system.
The functional view describes the distribution of functions necessary for the support of big data
activities.
The functional architecture also defines the dependencies between functions.
The functional view addresses the following big data concepts:
— functional components: a functional component is a functional building block needed to engage in
an activity, backed by an implementation;
— functional layers: a layer is a set of functional components that provide similar capabilities or
serve a common purpose;
— multi-layer functions: the multi-layer functions include functional components that provide
capabilities that are used across multiple functional layers, and they are grouped into subsets.
NOTE Not all layers or functional components are necessarily instantiated in a specific big data system.
Figure 3 illustrates the concepts of functional components, layers and multi-layer functions.
Figure 3 — Functional layering
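The layering concept can be sketched as a grouping of functional components by layer. The Python example below is a minimal sketch: the layer names follow the clause titles in 9.2, while the component names and the helper `group_by_layer` are hypothetical. It also shows that a given system need not instantiate every layer.

```python
from dataclasses import dataclass

LAYERS = ("application", "processing", "platform", "resource")
MULTI_LAYER = "multi-layer"


@dataclass(frozen=True)
class FunctionalComponent:
    name: str
    layer: str  # one of LAYERS, or MULTI_LAYER for capabilities used across layers

    def __post_init__(self):
        if self.layer not in LAYERS + (MULTI_LAYER,):
            raise ValueError(f"unknown layer: {self.layer}")


def group_by_layer(components):
    """Group component names by layer, keeping layers with no components."""
    grouped = {layer: [] for layer in LAYERS + (MULTI_LAYER,)}
    for component in components:
        grouped[component.layer].append(component.name)
    return grouped


# Hypothetical components of one specific big data system.
system = [
    FunctionalComponent("batch analytics job", "processing"),
    FunctionalComponent("distributed file store", "platform"),
    FunctionalComponent("access control", MULTI_LAYER),
]

grouped = group_by_layer(system)
not_instantiated = [layer for layer, names in grouped.items() if not names]
print(not_instantiated)  # layers this particular system does not instantiate
```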
6.5 Relationship between the user view and the functional view
Figure 4 illustrates how the user view provides the set of big data activities that are represented within
the functional view.
Figure 4 — From user view to functional view
6.6 Relationship of the user view and functional view to cross-cutting aspects
Cross-cutting aspects, as their name implies, apply both across the user view and across the functional
view of big data.
Cross-cutting aspects apply to roles and sub-roles in the user view and they directly or indirectly affect
the activities that those roles perform.
Cross-cutting aspects also apply to the functional components within the functional view which are
used when performing the activities described in the user view (See Figure 4).
Cross-cutting aspects of big data described in Clause 8 include:
— security and privacy;
— management;
— data governance.
7 User view
7.1 Big data roles, sub-roles, and activities
Given that distributed services and their delivery are at the core of big data, all big data related activities
can be categorized into three main groups: activities that use big data, activities that provide big data
analytics services and activities that provide data.
This clause contains descriptions of some of the common roles and sub-roles associated with big data.
It is important to note that a party can play more than one role at any given point in time. When playing
a role, the party can restrict itself to playing one or more sub-roles. Sub-roles are a subset of the big
data activities of a given role.
As shown in Figure 5, the roles of big data are:
— big data application provider (BDAP) (see 7.2);
— big data framework provider (BDFP) (see 7.3);
— big data service partner (BDSP) (see 7.4);
— big data provider (BDP) (see 7.5);
— big data consumer (BDC) (see 7.6).
NOTE Big data provider is any data provider to the BDRA.
Figure 5 — Big data roles
Annex B provides examples of the relationship of roles in big data ecosystems.
Each of the sub-roles shown in Figure 5 is described in more detail in 7.2 to 7.6.
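For quick reference, the role and sub-role taxonomy can be held in a simple mapping. The Python sketch below uses the abbreviations from Clause 4 and the sub-role clause titles 7.2.2 to 7.4.4; BDP and BDC are given empty lists because no sub-role clauses are listed for them. The names `ROLES` and `parent_role` are assumptions made for this example.

```python
# Role abbreviation -> sub-role abbreviations, following clauses 7.2 to 7.6.
ROLES = {
    "BDAP": ["BDCP", "BDPreP", "BDAnP", "BDVP", "BDAcP"],
    "BDFP": ["BDIP", "BDPlaP", "BDProP"],
    "BDSP": ["BDSD", "BDA", "BDSO"],
    "BDP": [],
    "BDC": [],
}


def parent_role(sub_role):
    """Return the role that a sub-role abbreviation belongs to."""
    for role, sub_roles in ROLES.items():
        if sub_role in sub_roles:
            return role
    raise KeyError(sub_role)


print(parent_role("BDVP"))
```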
7.2 Role: big data application provider (BDAP)
7.2.1 General
The BDAP executes the manipulations of the big data lifecycle. This is where the general capabilities
within the user view of the big data reference architecture, as shown in Figure 5, are combined to produce
the specific data system.
NOTE 1 While the activities of an application provider are the same whether the solution being built concerns
big data or not, the methods and techniques have changed because the data and data processing are parallelized
across resources.
NOTE 2 As data propagates through the ecosystem, it is being processed and transformed in different ways
in order to extract the value from the information. Each activity of the big data application provider can be
implemented by independent stakeholders and deployed as stand-alone services.
NOTE 3 The BDAP can be a single instance or a collection of more granular big data application providers, each
implementing different steps in the big data lifecycle. Each of the activities of the big data application provider
can be a general service invoked by the data provider or big data consumer, such as a web server, a file server, a
collection of one or more application programs, or a combination.
NOTE 4 The BDAP is in charge of the implementation, testing and validation of the data quality business
rules, requirements and metrics that assure the correct management of data in the big data system. Any big data
application provider can apply the data quality requirements throughout the big data lifecycle.
The BDAP is composed of the following five sub-roles as shown in Figure 6:
— big data collection provider (BDCP) (see 7.2.2);
— big data preparation provider (BDPreP) (see 7.2.3);
— big data analytics provider (BDAnP) (see 7.2.4);
— big data visualization provider (BDVP) (see 7.2.5);
— big data access provider (BDAcP) (see 7.2.6).
Figure 6 — Big data activities relating to big data application provider sub-roles
7.2.2 Sub-role: big data collection provider (BDCP)
The BDCP is a sub-role of the BDAP which is responsible for the collection of big data from the data provider.
This can be a general service, such as a file server or web server to accept or perform specific collections
of data, or it can be an application specific service designed to pull data or receive pushes of data from
the data provider.
The BDCP activities are as follows:
— the find data source activity is focused on searching and storing data source information as a form
of metadata which can be used for capturing and/or storing data;
— the capture data activity is focused on converting available data (e.g. web documents or blog
data) into a form that can be handled by the system;
— the register and buffer data activity is focused on storing data in a data registry or holding data
before transferring it to other tasks or processes.
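The three BDCP activities above can be sketched as small functions. This Python example is a simplified illustration under stated assumptions: the catalogue entries, the record shape and the names `find_data_source`, `capture_data` and `Buffer` are all hypothetical, and a real collection service would involve far more than an in-memory queue.

```python
from collections import deque


def find_data_source(catalog, keyword):
    """Find data source: select source metadata whose description matches."""
    return [meta for meta in catalog if keyword in meta["description"]]


def capture_data(raw_document):
    """Capture data: convert raw input into a form the system can handle."""
    text = raw_document.strip()
    return {"text": text, "length": len(text)}


class Buffer:
    """Register and buffer data: hold records before passing them on."""

    def __init__(self):
        self._queue = deque()

    def register(self, record):
        self._queue.append(record)

    def drain(self):
        """Yield buffered records in arrival order, emptying the buffer."""
        while self._queue:
            yield self._queue.popleft()


# Hypothetical metadata catalogue describing two data sources.
catalog = [
    {"name": "blog-feed", "description": "public blog posts"},
    {"name": "sensor-log", "description": "factory sensor readings"},
]
sources = find_data_source(catalog, "blog")

buffer = Buffer()
for raw in ["  first post \n", "second post"]:
    buffer.register(capture_data(raw))
records = list(buffer.drain())
print(sources, records)
```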
7.2.3 Sub-role: big data preparation provider (BDPreP)
The BDPreP is a sub-role of the BDAP which is responsible for preparing raw data so that it is ready for
analysis.
The BDPreP activities are as follows:
— the transform data activity is focused on converting data or information from one format to
another;
— the validate data activity is focused on ensuring that the data is correct based on the validation
constraints such as correctness, meaningfulness, security and privacy, etc.;
— the cleanse data activity is focused on detecting inaccurate parts of the data and correcting them by
replacing, modifying or deleting them;
— the aggregate data activity is focused on combining two or more pieces of data into one dataset in
summary form.
Data validation and data cleansing should be guided by the application of the data quality management.
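The four BDPreP activities can be pictured as stages applied in sequence. The following is a minimal sketch under assumed conditions: the input is a list of comma-separated sensor readings, the only validation constraint is that the value is numeric, and cleansing corrects invalid records by deletion. All function names and the sample data are hypothetical.

```python
from collections import defaultdict


def transform(row):
    """Transform data: convert a CSV-style string into a dictionary."""
    sensor, value = row.split(",")
    return {"sensor": sensor.strip(), "value": value.strip()}


def validate(record):
    """Validate data: check the record against simple constraints."""
    try:
        float(record["value"])
    except ValueError:
        return False
    return bool(record["sensor"])


def cleanse(records):
    """Cleanse data: in this sketch, invalid records are corrected by deletion."""
    return [
        {"sensor": r["sensor"], "value": float(r["value"])}
        for r in records
        if validate(r)
    ]


def aggregate(records):
    """Aggregate data: combine records into one summary (mean per sensor)."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["sensor"]].append(r["value"])
    return {sensor: sum(vals) / len(vals) for sensor, vals in grouped.items()}


raw_rows = ["a, 1.0", "a, 3.0", "b, n/a", "b, 4.0"]
summary = aggregate(cleanse([transform(r) for r in raw_rows]))
```

Here the record `"b, n/a"` fails validation and is dropped, so the summary is computed from the three remaining readings.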
7.2.4 Sub-role: big data analytics provider (BDAnP)
The BDAnP is a sub-role of BDAP which is responsible for analysing big data in order to meet the
requirements of the data processing algorithms for processing the data to produce insights that address
the technical goal.
The BDAnP activity includes an associate analytic logic activity which involves modelling data
processes with associated logic for extracting information from the data based on the requirements of
the application.
7.2.5 Sub-role: big data visualization provider (BDVP)
The BDVP is a sub-role of BDAP which is responsible for presenting data source information or analysis
results to the big data consumer. The objective of these activities is to format and present data in such a way
as to optimally communicate meaning and knowledge.
The BDVP activities are as follows:
— the manifest data status activity involves describing data status in data storage. This can include
various visualizations, classification criteria, etc.;
— the format analysis result activity involves formatting processed data for clear and efficient
communication. This can include visual representation, overlaying, etc.
7.2.6 Sub-role: big data access provider (BDAcP)
The BDAcP is a sub-role of BDAP which is responsible for exchanging big data between the big data
application and the data provider or the big data consumer.
The BDAcP activity includes a transfer data activity focused on passing or moving big data from one
system to another system with data transmission integrity, continuity, security and privacy.
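One common way to check transmission integrity is to send a digest alongside the payload and recompute it on receipt. The sketch below illustrates only the integrity aspect of the transfer data activity; continuity, security and privacy are not addressed. The `package` and `receive` functions and the message layout are assumptions made for this example.

```python
import hashlib


def package(payload: bytes) -> dict:
    """Sender side: attach a SHA-256 digest to the payload."""
    return {"payload": payload, "sha256": hashlib.sha256(payload).hexdigest()}


def receive(message: dict) -> bytes:
    """Receiver side: recompute the digest and reject altered payloads."""
    digest = hashlib.sha256(message["payload"]).hexdigest()
    if digest != message["sha256"]:
        raise ValueError("integrity check failed")
    return message["payload"]


message = package(b"sensor readings")
received = receive(message)

# Simulate corruption in transit.
tampered = dict(message, payload=b"sensor readingz")
try:
    receive(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

A plain digest detects accidental corruption only; protecting against deliberate modification would need a keyed mechanism such as an HMAC or a signature.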
7.3 Role: big data framework provider (BDFP)
7.3.1 General
The BDFP consists of one or more hierarchically organized instances of the components. There is no
requirement that all instances at a given level in the hierarchy be of the same technology.
NOTE In fact, most big data implementations are hybrids that combine multiple technology approaches
in order to provide flexibility or meet the complete range of requirements, which are driven by the big data
application provider.
The BDFP is comprised of the following three sub-roles as shown in Figure 7:
— big data infrastructure provider (BDIP) (see 7.3.2);
— big data platform provider (BDPlaP) (see 7.3.3);
— big data processing provider (BDProP) (see 7.3.4).
Figure 7 — Big data activities relating to big data framework provider sub-roles
7.3.2 Sub-role: big data infrastructure provider (BDIP)
The BDIP is a sub-role of BDFP which is responsible for providing system resources including system
facilities (e.g. networking, computing, storage, etc.) and the physical environment (e.g. computer rooms,
electric power, air conditioning, etc.).
The BDIP activities are as follows:
— the manipulate resources activity is focused on handling or controlling physical or virtual
resources;
— the store/retrieve data activity involves persisting and recalling data from storage (manipulating
data at rest);
— the transmit/receive data activity is focused on transferring data via network (putting data in
motion).
7.3.3 Sub-role: big data platform provider (BDPlaP)
The BDPlaP is a sub-role of BDFP which is responsible for providing platforms to organize and distribute
big data on big data infrastructure.
The BDPlaP activities are as follows:
— the organize data activity involves arranging, indexing and linking the data in ways that are
suitable for the specific applications and analytics;
— the distribute data activity involves allocating data across infrastructure resources to maximize
data locality for distributed computation performance.
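Hash partitioning is one widely used way to allocate data across resources, and it is shown here only as an example of the distribute data activity, not as the method the standard prescribes. Records with the same key always land on the same node, so per-key computation can run where the data lives. The functions `partition_for` and `distribute` and the sample records are hypothetical.

```python
import hashlib


def partition_for(key: str, node_count: int) -> int:
    """Map a key to a node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def distribute(records, node_count):
    """Distribute data: allocate (key, value) records across nodes."""
    nodes = [[] for _ in range(node_count)]
    for key, value in records:
        nodes[partition_for(key, node_count)].append((key, value))
    return nodes


records = [("user-1", 10), ("user-2", 20), ("user-1", 30), ("user-3", 40)]
nodes = distribute(records, node_count=3)
```

A cryptographic hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would place the same key on different nodes across runs.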
7.3.4 Sub-role: big data processing provider (BDProP)
The BDProP is a sub-role of BDFP which is responsible for supporting computing and analytic processes
for BDAP activities.
The BDProP activities are as follows:
— process data in batches: process data in large increments and on a non-continuous basis. Batch
processing is used when response time is not critical and is most often associated with
the volume of the data or complexity of the analysis;
— process data in streams: process data continuously in small increments (typically individual
records or data elements). Stream processing is used when response time is critical and is most
often associated with the velocity of the data.
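The difference between the two processing styles can be illustrated with the same computation, a mean, done both ways. This is a sketch with hypothetical function names: the batch version needs the complete data set before it produces a result, while the stream version updates its result after each individual record.

```python
def process_batch(records):
    """Batch: process a complete, bounded set of records in one pass."""
    return sum(records) / len(records)


def process_stream(records):
    """Stream: update the running mean as each individual record arrives."""
    count, mean = 0, 0.0
    for value in records:
        count += 1
        mean += (value - mean) / count
        yield mean


readings = [2.0, 4.0, 6.0]
batch_mean = process_batch(readings)
running_means = list(process_stream(iter(readings)))
```

The stream version emits 2.0, 3.0 and 4.0 in turn, so a consumer has a usable answer after every record; the batch version returns 4.0 once, after all records are available.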
7.4 Role: big data service partner (BDSP)
7.4.1 General
The BDSP is a role which is engaged in support of, or auxiliary to, activities of the big data application
provider, the big data framework provider, the big data provider or the big data consumer, or all of them.
A BDSP’s big data activities vary depending on the type of partner and their relationship with other
roles in the big data ecosystem.
The BDSP is comprised of the following three sub-roles as shown in Figure 8:
— big data service developer (BDSD) (see 7.4.2);
— big data auditor (BDA) (see 7.4.3);
— big data system orchestrator (BDSO) (see 7.4.4).
Figure 8 — Big data activities relating to big data service partner sub-roles
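For tooling that needs to reason about the user view, the role and sub-role relationships listed in 7.2 to 7.4 can be held in a simple lookup table. The abbreviations below are taken from the text; the dictionary layout and the `parent_role` helper are assumptions made for this sketch.

```python
# Sub-roles per role, as listed in 7.2, 7.3 and 7.4.
SUB_ROLES = {
    "BDAP": ["BDCP", "BDPreP", "BDAnP", "BDVP", "BDAcP"],
    "BDFP": ["BDIP", "BDPlaP", "BDProP"],
    "BDSP": ["BDSD", "BDA", "BDSO"],
}


def parent_role(sub_role: str) -> str:
    """Return the role that a given sub-role belongs to."""
    for role, sub_roles in SUB_ROLES.items():
        if sub_role in sub_roles:
            return role
    raise KeyError(sub_role)


auditor_parent = parent_role("BDA")
platform_parent = parent_role("BDPlaP")
```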
7.4.2 Sub-role: big data service developer (BDSD)
The BDSD is a sub-role of BDSP which is responsible for designing, developing, testing and maintaining
the implementation of a big data service. This can involve composing the service implementation from
existing service implementations.
The BDSD activities are as follows:
— the design, create and maintain service components activity involves designing and creating
software components that are part of the implementation of a big data service, and providing fixes
or enhancements to service implementations;
— the compose services activity is focused on composing services from existing services by
intermediation or aggregation;
— the test services activity focuses on testing the components and services created by the big data
service developer.
7.4.3 Sub-role: big data auditor (BDA)
The BDA is a sub-role of BDSP with the responsibility of conducting an audit of the provision and use
of big data services. A big data audit covers veracity of data sources, operations, performance, security
and privacy, and examines whether a specified set of audit criteria are met.
NOTE 1 There is a variety of specifications for the audit criteria; for example, Reference [8] addresses security
considerations.
The BDA activities are as follows:
— the perform audit activity involves requesting or obtaining audit evidence, conducting any required
tests on the system or data being audited and obtaining evidence programmatically;
— the report audit results activity involves providing a documented report of the results of
performing an audit.
NOTE 2 The BDA is responsible for the assessment of data quality, the definition and evaluation of data quality
service levels, and the continuous measurement and surveillance of data quality.
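The two BDA activities can be sketched as evaluating a set of audit criteria against gathered evidence and then producing a documented report. The evidence fields, thresholds and criterion names below are invented for illustration and do not come from the standard or from any audit specification.

```python
def perform_audit(evidence, criteria):
    """Perform audit: evaluate each audit criterion against the evidence."""
    return {name: bool(check(evidence)) for name, check in criteria.items()}


def report_audit_results(findings):
    """Report audit results: produce a documented summary of the findings."""
    lines = [
        f"{name}: {'met' if met else 'NOT met'}"
        for name, met in sorted(findings.items())
    ]
    overall = "PASS" if all(findings.values()) else "FAIL"
    return "\n".join(lines + [f"overall: {overall}"])


# Hypothetical evidence obtained programmatically from the audited system.
evidence = {"encrypted_at_rest": True, "failed_logins": 12, "null_ratio": 0.01}
criteria = {
    "security": lambda e: e["encrypted_at_rest"],
    "data quality": lambda e: e["null_ratio"] <= 0.05,
    "operations": lambda e: e["failed_logins"] < 10,
}
findings = perform_audit(evidence, criteria)
report = report_audit_results(findings)
```

Keeping criteria as named predicates separates the definition of what is audited from the mechanics of running and reporting the audit.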
7.4.4 Sub-role: big data system orchestrator (BDSO)
The BDSO is a sub-role of BDSP which provides the overarching requirements that the system should
fulfil, including policy, governance, architecture, resources, and business requirements, as well as
monitoring activities to ensure the system complies with those requirements.
The BDSO activities are as follows:
— the define application requirements activity deals with the overarching requirements that the big
data application should fulfil;
— the define business process activity deals with a partially ordered set of enterprise activities that
can be executed to realise a given objective of an enterprise or a part of an enterprise to achieve
some desired end-result;
— the define system architecture requirements activity deals with conceptual requirements for
defining the structure, behaviour, and view of a big data system;
— the define security and privacy requirements activity is focused on defining security and privacy
requirements from a governance point of view;
— the define da
...
Frequently Asked Questions
ISO/IEC 20547-3:2020 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Big data reference architecture - Part 3: Reference architecture". This standard covers: This document specifies the big data reference architecture (BDRA). The reference architecture includes concepts and architectural views. The reference architecture specified in this document defines two architectural viewpoints: - a user view defining roles/sub-roles, their relationships, and types of activities within a big data ecosystem; - a functional view defining the architectural layers and the classes of functional components within those layers that implement the activities of the roles/sub-roles within the user view. The BDRA is intended to: - provide a common language for the various stakeholders; - encourage adherence to common standards, specifications, and patterns; - provide consistency of implementation of technology to solve similar problem sets; - facilitate the understanding of the operational intricacies in big data; - illustrate and understand the various big data components, processes, and systems, in the context of an overall big data conceptual model; - provide a technical reference for government departments, agencies and other consumers to understand, discuss, categorize and compare big data solutions; and - facilitate the analysis of candidate standards for interoperability, portability, reusability, and extendibility.
ISO/IEC 20547-3:2020 is classified under the following ICS (International Classification for Standards) categories: 35.020 - Information technology (IT) in general. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 20547-3:2020 has the following relationships with other standards: it has inter-standard links to ISO 7218:2024. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.