ETSI GS PDL 013 V1.1.1 (2022-10)
Permissioned Distributed Ledger (PDL); Supporting Distributed Data Management
DGS/PDL-0013_Sup_Dis_Data_Mgmt
GROUP SPECIFICATION
Disclaimer
The present document has been produced and approved by the Permissioned Distributed Ledger (PDL) ETSI Industry
Specification Group (ISG) and represents the views of those members who participated in this ISG.
It does not necessarily represent the views of the entire ETSI membership.
Reference
DGS/PDL-0013_Sup_Dis_Data_Mgmt
Keywords
data management, PDL
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - APE 7112B
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° w061004871
Important notice
The present document can be downloaded from:
http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the prevailing version of an ETSI
deliverable is the one made publicly available in PDF format at www.etsi.org/deliver.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx
If you find a security vulnerability in the present document, please report it through our
Coordinated Vulnerability Disclosure Program:
https://www.etsi.org/standards/coordinated-vulnerability-disclosure
Notice of disclaimer & limitation of liability
The information provided in the present deliverable is directed solely to professionals who have the appropriate degree of
experience to understand and interpret its content in accordance with generally accepted engineering or
other professional standard and applicable regulations.
No recommendation as to products and services or vendors is made or should be implied.
No representation or warranty is made that this deliverable is technically accurate or sufficient or conforms to any law
and/or governmental rule and/or regulation and further, no representation or warranty is made of merchantability or fitness
for any particular purpose or against infringement of intellectual property rights.
In no event shall ETSI be held liable for loss of profits or any other incidental or consequential damages.
Any software contained in this deliverable is provided "AS IS" with no warranties, express or implied, including but not
limited to, the warranties of merchantability, fitness for a particular purpose and non-infringement of intellectual property
rights and ETSI shall not be held liable in any event for any damages whatsoever (including, without limitation, damages
for loss of profits, business interruption, loss of information, or any other pecuniary loss) arising out of or related to the use
of or inability to use the software.
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and
microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.
© ETSI 2022.
All rights reserved.
Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
Executive summary
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definition of terms, symbols and abbreviations
3.1 Terms
3.2 Symbols
3.3 Abbreviations
4 PDL Reference Architecture
4.1 Introduction
4.2 Platform Services Layer
5 Distributed Data Management
5.1 Introduction
5.2 Distributed Data Discovery
5.3 Distributed Data Collection
5.4 Distributed Data Storage
5.5 Distributed Data Sharing
5.6 Distributed Data Computation
5.7 DDM Requirements
5.7.1 Introduction
5.7.2 Decentralization
5.7.3 Trust
5.7.4 Incentivization
5.7.5 Data Provenance
5.7.6 Data Privacy
5.7.7 Data Integrity
5.7.8 Data Control and Sovereignty
5.7.9 Data Management Automation
6 Architectural Requirements for PDL-based DDM
6.1 Distributed Data Applications
6.2 Platform Services Layer
6.3 Underlying DLT Networks
7 PDL-based Distributed Data Management Architecture
7.1 Introduction
7.2 Application Registration Platform Service
7.3 Registration Platform Service
7.4 Messaging Platform Service
7.5 Storage Platform Service
7.6 Transaction Management Platform Service
7.7 Discovery Platform Service
History
Intellectual Property Rights
Essential patents
IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The declarations
pertaining to these essential IPRs, if any, are publicly available for ETSI members and non-members, and can be
found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to
ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the
ETSI Web server (https://ipr.etsi.org/).
Pursuant to the ETSI Directives including the ETSI IPR Policy, no investigation regarding the essentiality of IPRs,
including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not
referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become,
essential to the present document.
Trademarks
The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners.
ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no
right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does
not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.
DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its
Members. 3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP
Organizational Partners. oneM2M™ logo is a trademark of ETSI registered for the benefit of its Members and of the
oneM2M Partners. GSM® and the GSM logo are trademarks registered and owned by the GSM Association.
Foreword
This Group Specification (GS) has been produced by ETSI Industry Specification Group (ISG) Permissioned
Distributed Ledger (PDL).
Modal verbs terminology
In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and
"cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of
provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.
Executive summary
The present document defines the requirements and functional architecture for supporting distributed data management
based on the Permissioned Distributed Ledger (PDL) reference architecture. This includes expanded ETSI ISG-PDL
platform services for supporting distributed data management.
Introduction
The present document specifies PDL-based distributed data management. The organization of the present document is
as follows. Clause 1 defines the scope of the present document. Clauses 2 and 3 provide normative and informative
references and definition of terms, respectively. Clause 4 provides an overview of PDL reference architecture.
Clause 5 describes distributed data management use cases and requirements. Clause 6 lists architectural requirements of
PDL-based distributed data management. Clause 7 defines expanded ETSI ISG-PDL platform services for PDL-based
distributed data management.
1 Scope
The present document specifies distributed data management based on PDL reference architecture. This includes:
• defining architectural requirements that are derived from distributed data management use cases including
related use cases such as those described in ETSI GR PDL 009 [i.1] and ETSI GR PDL 002 [i.2];
• defining PDL-based distributed data management architecture according to PDL reference architecture as
defined in ETSI GS PDL 012 [1]; and
• defining expanded ETSI ISG-PDL platform services for PDL-based distributed data management.
2 References
2.1 Normative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
Referenced documents which are not found to be publicly available in the expected location might be found at
https://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are necessary for the application of the present document.
[1] ETSI GS PDL 012: "Permissioned Distributed Ledger (PDL); Reference Architecture".
[2] ETSI GS PDL 011: "Permissioned Distributed Ledger (PDL); Specification of Requirements for
Smart Contracts' architecture and security".
2.2 Informative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are not necessary for the application of the present document but they assist the
user with regard to a particular subject area.
[i.1] ETSI GR PDL 009 (V1.1.1): "Permissioned Distributed Ledger (PDL); Federated Data
Management".
[i.2] ETSI GR PDL 002 (V1.1.1): "Permissioned Distributed Ledger (PDL); Applicability and
compliance to data processing requirements".
3 Definition of terms, symbols and abbreviations
3.1 Terms
Void.
3.2 Symbols
Void.
3.3 Abbreviations
For the purposes of the present document, the following abbreviations apply:
API Application Programming Interface
ARPS Application Registration Platform Service
DC Data Collector
DCC Data Computation Controller
DCN Data Computation Node
DCS Data Consumer
DD Data Discoverer
DDAPP Distributed Data Application
DDM Distributed Data Management
DH Data Host
DLT Distributed Ledger Technology
DO Data Owner
DP Data Provider
DPS Discovery Platform Service
DS Data Source
ETSI European Telecommunications Standards Institute
ETSI ISG-PDL ETSI Industry Specification Group for Permissioned Distributed Ledger
FCAPS Fault, Configuration, Accounting, Performance and Security
FL Federated Learning
GDPR General Data Protection Regulation
GR Group Report
GS Group Specification
IRP Interface Reference Point
ISG Industry Specification Group
MARL Multi-Agent Reinforcement Learning
MPS Messaging Platform Service
PDL Permissioned Distributed Ledger
RD Requirement on Decentralization
RDCS Requirement on Data Control and Sovereignty
RDDA Requirement on Distributed Data Application
RDI Requirement on Data Integrity
RDMA Requirement on Data Management Automation
RDP Requirement on Data Privacy
RDPV Requirement on Data Provenance
RI Requirement on Incentivization
RL Reinforcement Learning
RPS Registration Platform Service
RPSL Requirement on Platform Service Layer
RT Requirement on Trust
RUDLTN Requirement on Underlying DLT Networks
SPS Storage Platform Service
TCI Transaction Creation Indication
TMPS Transaction Management Platform Service
4 PDL Reference Architecture
4.1 Introduction
ETSI GS PDL 012 [1] develops a layered PDL reference architecture, which consists of five layers as illustrated in
Figure 4.1-1. Each layer is designed in a manner that allows abstraction, such that it can be operated regardless of the
implementation specifics of the other layers. In addition, Interface Reference Points (IRPs) are defined between
different layers:
• PDL Applications: Various PDL-based applications leverage PDL services as provided by the below
described Service Layer to interact with different DLT networks. For example, a PDL-based data sharing
application utilizes a DLT network as a distributed infrastructure to enable distributed sharing. An application
may also interact with external storage to store certain data that requires better privacy control or to reduce the
overhead to DLT networks.
• Application Abstraction Layer: This layer utilizes Data Model Brokers/Gateways, enabling applications that
use different data models to communicate with ETSI ISG-PDL compliant platforms. This layer is located
between the PDL Applications and Platform Services Layers and is implemented through the Data Model Broker
Platform Service where necessary.
• Platform Services Layer, which provides useful services that support various types of
applications using PDL technology. As a result, an application can leverage services from the Platform
Services Layer rather than embed such services within the application itself. This reduces applications'
complexity, accelerates application development and deployment, and increases interoperability. For example,
the Platform Services Layer may include a Transaction Management Platform Service to facilitate transaction
creation in a manner transparent to a specific PDL type (i.e. a specific deployed DLT network) and in a
manner uniform across applications using such platform; this is an example of layer abstraction in its essence.
Such a Transaction Management Platform Service can perform transaction transformation/adaptation between
applications running on different PDL types to facilitate application operations in a complex environment.
• DLT Abstraction Layer, which consists of a Data Model Broker/Gateway enabling Platform services to
communicate with ETSI ISG-PDL compliant PDL types regardless of the specific type of the underlying PDL.
An additional functionality of such an abstraction layer is to allow interoperability between different DLT types,
which may differ not only in data model structure but also in consensus mechanism and smart-contract
functionality. Such abstraction layer hides the differences between PDL types and provides a unified service-
facing interface on the services side and a PDL specific interface on the PDL side. This layer is located
between the "Techno" and the "Disco" IRPs and implemented through the Data-Model Broker Platform
Service where applicable.
• DLT Layer, which includes various DLT networks (e.g. an implementation of a specific DLT type) and
potentially the abstraction of DLT networks. While DLT networks and chains may vary in terms of consensus
mechanism and smart contract format, the abstract functionality of a chain is very similar across most DLTs:
Storing a distributed chain of data blocks in a tamper-resistant manner, and performing pre-programmed
actions based on rules (i.e. "Smart Contracts") on all copies of the distributed chain. Yet, not all DLT types are
necessarily compliant with the ETSI ISG-PDL layered architecture approach, thus the DLT layer can only
include and accommodate DLT types that are compliant with said architecture.
• Interface Reference Points (IRPs), which define communication channels through which the functional
blocks defined above communicate with each other. The IRPs are given names for reference purposes (e.g.
Debka, Tango, etc.).
Figure 4.1-1: ETSI ISG-PDL Reference Architecture (Source: ETSI GS PDL 012 [1])
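As an informal illustration of the layer abstraction described above (not part of ETSI GS PDL 012 [1]), the following minimal Python sketch shows how a Transaction Management Platform Service could expose one uniform call to applications while per-PDL-type adapters produce PDL-specific structures. All class names, method names, and the two ledger formats are hypothetical and chosen only for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Application-level transaction, independent of any PDL type."""
    sender: str
    payload: dict


class DltAdapter(ABC):
    """Hypothetical stand-in for the DLT Abstraction Layer: one adapter per PDL type."""

    @abstractmethod
    def encode(self, tx: Transaction) -> dict:
        """Translate a uniform transaction into a PDL-specific structure."""


class LedgerTypeXAdapter(DltAdapter):
    # Assumed format of an imaginary PDL type "x".
    def encode(self, tx: Transaction) -> dict:
        return {"from": tx.sender, "data": tx.payload}


class LedgerTypeYAdapter(DltAdapter):
    # Assumed format of an imaginary PDL type "y".
    def encode(self, tx: Transaction) -> dict:
        return {"creator": tx.sender, "args": sorted(tx.payload.items())}


class TransactionManagementService:
    """Uniform, application-facing entry point in the Platform Services Layer."""

    def __init__(self, adapters: dict[str, DltAdapter]):
        self._adapters = adapters

    def create_transaction(self, pdl_type: str, tx: Transaction) -> dict:
        # The application never sees the PDL-specific format directly.
        if pdl_type not in self._adapters:
            raise ValueError(f"unsupported PDL type: {pdl_type}")
        return self._adapters[pdl_type].encode(tx)
```

In this sketch, supporting a further PDL type would only require registering another adapter; application code calling `create_transaction` would stay unchanged.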
Four ETSI ISG-PDL platform categories (namely: Alpha, Bravo, Charlie, and Delta) are defined in ETSI
GS PDL 012 [1]. The differences among those four categories lie in a few factors:
1) the number of involved vendors;
2) the number of supported underlying DLT technologies; and
3) the number of supported applications.
• Alpha Platforms: that are designed, developed, delivered, and integrated to all users of the said platform by a
single vendor using a single DLT technology.
• Bravo Platforms: that are designed, developed, delivered, and integrated to all users of the said platform by a
single vendor, but can operate using two or more underlying DLT technologies.
• Charlie Platforms: that can operate using two or more underlying DLT technologies and are designed and
developed towards a specification of an application abstraction layer so that any application that supports such
an abstraction layer can interface with the said platform. Moreover, the Platform Services in a Charlie
Platform can be developed by multiple vendors towards the specification defined in ETSI GS PDL 012 [1].
• Delta Platforms: that use a single DLT technology and are designed and developed towards a specification of
an application abstraction layer so that any application that supports such an abstraction layer can interface
with the said platform. This is, in essence, a simplified Charlie Platform that uses a single DLT type thus
eliminating the DLT abstraction layer and eliminating the overheads associated with DLT interoperability.
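The four categories can be read as a two-way split: how many underlying DLT technologies are used, and whether the platform is built towards an application abstraction layer specification. The sketch below encodes that reading in Python. It is an interpretation for illustration only; the function name and the `open_abstraction_layer` flag are hypothetical, and platforms without such a specification are assumed to be single-vendor (Alpha or Bravo).

```python
from enum import Enum


class PlatformCategory(Enum):
    ALPHA = "Alpha"
    BRAVO = "Bravo"
    CHARLIE = "Charlie"
    DELTA = "Delta"


def classify_platform(dlt_count: int, open_abstraction_layer: bool) -> PlatformCategory:
    """Map platform properties to a category, following the descriptions above.

    dlt_count: number of underlying DLT technologies the platform operates on.
    open_abstraction_layer: True if the platform is designed towards an
        application abstraction layer specification (assumed flag).
    """
    if dlt_count < 1:
        raise ValueError("a platform needs at least one DLT technology")
    multi_dlt = dlt_count >= 2
    if open_abstraction_layer:
        # Charlie: two or more DLTs; Delta: the simplified single-DLT variant.
        return PlatformCategory.CHARLIE if multi_dlt else PlatformCategory.DELTA
    # Single-vendor platforms: Bravo with two or more DLTs, Alpha with one.
    return PlatformCategory.BRAVO if multi_dlt else PlatformCategory.ALPHA
```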
4.2 Platform Services Layer
The Platform Services Layer hosts several types of services; details of each service are described in ETSI
GS PDL 012 [1] where each such Platform Service is defined:
• The PDL Platform Services can be Atomic services or Composite services. Each such service could be
Mandatory or Optional. Atomic services are self-sufficient and do not rely on other Platform services for their
proper operation, while Composite services use one or more other Platform services to operate. A platform
cannot function properly unless all Mandatory Platform Services are implemented therein, while Optional
services may only be required for specific purposes or use-cases.
• PDL Platform Services are services and functionalities provided by the PDL platform that all applications may
use. Platform Services may reuse or be built upon other Platform Services. Examples of Platform Services
include: namespace, identity, location, discovery, messaging, policy, governance, security, composition,
access control, concurrency, storage, modelling, distributed processing, resource management, service
management, transaction management, etc.
• In addition, PDL Platform Services Layer has Application Specific Platform Services that are services used by
specific applications or specific groups of applications and are not needed or cannot be made useful for other
applications (e.g. measurement of precipitation is useful for agriculture and weather applications but has no
use for data storage applications). Such services may be implemented within the application itself, however the
developer may want to contribute them and install them on the platform so they can be re-used by other
applications in the future if the need arises.
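The Mandatory/Optional and Atomic/Composite distinctions lend themselves to a simple consistency check. The following Python sketch is one possible way to express it and is not a normative procedure. The M/O and A/C flags of the four sample entries follow ETSI GS PDL 012 [1]; the dependency of Messaging on Identity is an assumption made purely for the example, as are all function and field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformService:
    name: str
    mandatory: bool
    # Names of other Platform Services this one relies on; empty means Atomic.
    depends_on: frozenset = frozenset()

    @property
    def is_atomic(self) -> bool:
        return not self.depends_on


# Small excerpt for illustration; the Messaging -> Identity dependency is assumed.
SAMPLE_CATALOGUE = [
    PlatformService("Namespace", mandatory=True),
    PlatformService("Identity", mandatory=True),
    PlatformService("Location", mandatory=False),
    PlatformService("Messaging", mandatory=True, depends_on=frozenset({"Identity"})),
]


def missing_mandatory(catalogue, implemented):
    """Mandatory services that the platform does not implement."""
    return sorted(s.name for s in catalogue if s.mandatory and s.name not in implemented)


def unmet_dependencies(catalogue, implemented):
    """For each implemented Composite service, the services it needs but lacks."""
    result = {}
    for service in catalogue:
        if service.name in implemented:
            lacking = service.depends_on - implemented
            if lacking:
                result[service.name] = sorted(lacking)
    return result
```

For a platform that implements only Namespace and Messaging, `missing_mandatory` would report Identity, and `unmet_dependencies` would flag Messaging as lacking Identity.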
Table 4.2-1 lists the Platform Services as defined in ETSI GS PDL 012 [1].
Table 4.2-1: ETSI ISG-PDL Platform Services [1]
PDL Platform Service name | Mandatory (M) or Optional (O) | Atomic (A) or Composite (C) | Short description
Namespace | M | A | Ensures that all of a given set of objects for a particular function have unique names.
Identity | M | A | Unambiguously identifies an instance of an entity from all other instances of this and other objects.
Location | O | A | Associates an object with a location.
Registration | O | A | List a managed object with authorities or registries.
Discovery | O | A | Discovery of services offered by the services layer and discovery of PDL networks.
Messaging | M | C | Enables communication between a group of entities.
Policy | O | C | Manage and control the changing and/or maintaining of the state of managed objects.
Security | M | C | A collection of services that assess, reduce, protect, and manage security risks.
Authentication | M | C | Verifies that a subject requesting to perform an operation on a target is who they say they are.
Authorization | O | C | Permitting or denying access to a target by a subject.
Cryptography | O | C | Managing protocols that prevent third parties from reading private communications.
Encryption | O | C | Encoding information using a key into an unintelligible form.
Identity Management | O | C | Access control based on the identity of an entity.
Key Management | O | C | Management of cryptographic keys.
Logging | O | C | Dynamic ingestion and collection of logs.
Governance | M | C | Rules and tools that control the behaviour and function of a PDL.
Implementation Agreements | O | C | Rules and agreements that describe how ETSI ISG-PDL Services are implemented and control the behaviour of a PDL platform.
Governing Entity | M | C | Defines the rules and implementation agreements. Ensures compliance. Resolves conflicts where needed.
Composition | O | C | Defines who can compose new services and how such new services are composed.
Access Control | M | C | Defines who can perform which operations on which set of target entities.
Fault Tolerance | O | C | Defines how to handle faulty instructions.
Distribution Transparency | O | C | Defines how to maintain transparency when distributing information to target entities.
Publish and Subscribe | O | C | Defines how entities publish services and subscribe to services.
Concurrency | O | C | Defines how entities handle concurrency.
Storage | M | C | A group of services related to Storage.
In Memory Storage | M | C | Data that is stored in the random access memory of a computer running an application.
File System Storage | M | C | Storage on a directly connected storage device.
On-Chain Storage | M | C | Application data that is stored in blocks on all nodes using the chain.
Off-Chain Storage | O | C | Information in a digital, machine-readable medium that is not stored on the main chain.
Distributed Blockchain Storage | M | C | Storage on a Distributed Blockchain ledger.
Modelling | M | C | A group of services related to Modelling.
Information Model | M | C | Presentation of concepts of interest to platform management environment in a technology-neutral form as objects and relationships between objects.
Data Model | M | C | Representation of applicable concepts in a technology-specific concrete form.
Model Search | O | C | Enables search for specific or generic models within existing information and data models.
Model Stitching | O | C | Enables integrating multiple models or parts of models into a single model.
Topology | M | C | Allows a node to identify other nodes on the PDL and identify which nodes to communicate with when performing PDL related tasks.
Event Processing | M | C | Processes node-specific and platform-wide events as they occur.
Distributed Data Collection | O | C | Performs tasks related to collection of data that are location-independent.
Distributed Secret Sharing | O | C | Sharing of confidential data between nodes in a manner that maintains confidentiality of the data.
Resource Management | M | C | Defines how to administer and manage Resources.
Resource Discovery | O | C | Enables discovery of resources available to applications and nodes.
Resource Virtualization | O | C | Creating a virtual resource that mimics the behaviour of a physical resource.
Resource Inventory Management | O | C | Management of node-specific and platform-wide resource inventory.
Resource Admin and Management | M | C | Administration and management of node-specific and platform-wide resources.
Resource FCAPS | O | C | Resource management tasks defined by the ISO model.
Resource Composition | O | C | Creation and management of composite resources.
Platform Services Management | M | C | Defines how to administer and manage Platform Services.
Platform Service Discovery | M | C | Provides means to discover services available to applications and nodes.
Platform Service Virtualization | O | C | Creating a service using virtual resources.
Platform Service Inventory Management | O | C | Keeping track of inventory and serviceability of Platform services.
Platform Service Admin and Management | M | C | Administration and management of Platform Services through governance.
Platform Service FCAPS | O | C | Platform Service management tasks defined by the ISO model.
Platform Service Composition | O | C | Creation and management of the composition of Composite Platform Services.
Application Management | M | C | Creation and management of Applications.
Application Composition | M | C | Composing an Application from two or more managed objects.
Application and Service Orchestration | O | C | Orchestrating multiple managed objects so they provide a desired set of behaviours.
Orchestration | O | C | Orchestration of objects, resources, services, and/or applications so that they collectively provide the desired functionality and behaviour.
Platform Exploration | O | C | Allows an application to indicate its requirements and explore whether the platform offers such service capabilities.
Application Registration | O | C | Registers and lists all applications operated on a platform.
Transaction Management | O | C | Facilitates transaction related interactions between applications/services and underlying PDL networks.
Data Model Gateway/Broker | O | C | Defines tools that enable two systems with different data models to interact.
API Presentation | O | C | A specific Data Model Gateway/Broker implementation for environments that use APIs to exchange data between objects.
Application Specific Services | O | C | Serve a specific application or a group of applications but not required or used by other applications using the platform.
5 Distributed Data Management
5.1 Introduction
Distributed Data Management (DDM) refers to the operation and manipulation of data in distributed manners such as
those illustrated in Figure 5.1-1. For each scenario in Figure 5.1-1, there are multiple distributed parties (referred to as
data nodes); each data node has a Distributed Data Application (DDAPP), which supports a specific distributed data
management task among those distributed data nodes:
• Distributed Data Discovery: Data is discovered from multiple distributed parties. Data discovery is required
for distributed data collection, distributed data storage, distributed data sharing, and distributed data
computation.
• Distributed Data Collection: Data is collected from multiple distributed parties.
• Distributed Data Storage: Data is stored in multiple distributed parties.
• Distributed Data Sharing: Data is distributed and shared among multiple parties.
• Distributed Data Computation: Multiple parties perform data computation in a distributed and collaborative
way, for example, federated learning, distributed machine learning, and multi-party computation.
[Figure: five panels, one per scenario (Distributed Data Discovery, Distributed Data Collection, Distributed Data Storage, Distributed Data Sharing, Distributed Data Computation). Each panel shows several DDAPP (Distributed Data Application) instances. Further labels: PDL Nodes; Distributed PDL Infrastructure.]
Figure 5.1-1: Distributed Data Management
5.2 Distributed Data Discovery
Data discovery occurs before performing other data management operations such as data collection, data sharing, and
data computation. For example, before original data is collected, it needs to be discovered. In addition, data discovery
may also be needed for data storage such as moving a specific type of data from one place to another. Two types of data
nodes are involved in distributed data discovery: Data Hosts (DHs) and Data Discoverers (DDs). An entity which hosts
data to be discovered can be referred to as a DH. DDs discover expected data from DHs. Data discovery applications
exist in both DDs and DHs for jointly performing distributed data discovery.
Figure 5.2-1 illustrates distributed data discovery, where one or more DDs discover data from multiple distributed DHs.
For example, the data discoverer DD-B discovers its expected data from three DHs (i.e. DH-2, DH-3, and DH-4).
Distributed DHs can register their data to one or more data repositories, from which DDs can discover expected data.
When a DD knows the address of a DH, it can also discover data directly from such DH. In some cases the data
required/expected by a DD is not fully available on a single DH and may require discovery of multiple DHs.
• If data discovery is used as a precursor for distributed data collection, DHs are Data Sources (DSs) while a DD
can be a Data Collector (DC).
[Figure: Data Discoverers (DD-A, DD-B, DD-C) and Data Hosts (DH-1, DH-2, DH-3, DH-4, DH-n), each with a Data Discovery Application. Arrow labels: Discover Data.]
Figure 5.2-1: Distributed Data Discovery
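The repository-based discovery path described in this clause can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `DataRepository` class, its methods, and the data type labels are hypothetical, and a real deployment would rely on the platform services rather than an in-memory dictionary.

```python
from collections import defaultdict


class DataRepository:
    """Hypothetical repository where Data Hosts (DHs) register the data they host."""

    def __init__(self):
        # Maps a data type label to the set of DH identifiers hosting it.
        self._hosts_by_type = defaultdict(set)

    def register(self, dh_id, data_types):
        """Called by a DH to announce which data types it hosts."""
        for data_type in data_types:
            self._hosts_by_type[data_type].add(dh_id)

    def discover(self, wanted_types):
        """Called by a Data Discoverer (DD).

        Returns a mapping from each wanted data type that is registered
        to the sorted list of DHs hosting it. Types nobody hosts are omitted.
        """
        return {
            data_type: sorted(self._hosts_by_type[data_type])
            for data_type in wanted_types
            if data_type in self._hosts_by_type
        }


# Example mirroring the case of DD-B discovering data from DH-2, DH-3 and DH-4.
repository = DataRepository()
repository.register("DH-2", ["temperature"])
repository.register("DH-3", ["humidity"])
repository.register("DH-4", ["temperature", "pressure"])
found = repository.discover(["temperature", "humidity", "pressure"])
```

Here no single DH hosts everything the DD expects, so the result spans several DHs, which corresponds to the case where discovery of multiple DHs is required.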
5.3 Distributed Data Collection
Two types of data nodes are involved in distributed data collection: Data Collectors (DCs) and Data Sources (DSs). For
the purpose of data collection, a DC is responsible for actively retrieving or passively receiving data from one or more
DSs. In the passive receipt scenario, a DS transmits its original data to one or multiple DCs. In the active retrieval
scenario, a DC has to retrieve the data from the DS. DCs may maintain collected data locally and may store or forward
the collected data to other DCs or external data storage systems.
In a distributed data collection scenario, data is collected in a distributed manner, as illustrated in Figure 5.3-1. In such a
scenario each DC may collect data from different DSs while the data collected by each DC is distributed to all DCs.
Data collection applications exist in both DSs and DCs to jointly perform distributed data collection. The resulting data
collected by all DCs from all DSs is then stored in a distributed manner on all the nodes that are used for storage. Note
that some nodes may be used to both collect and store data, some may be used for data collection only and
not for data storage, while other nodes may be used for data storage only and will not participate in data
collection:
• Distributed DCs: Data collection will be performed by multiple distributed DCs. Those DCs could be fully
decentralized or form a hierarchical structure. For example, Figure 5.3-1 shows four distributed DCs
(i.e. DC-A, DC-B, DC-C, and DC-D). Each DC maintains the collected data from a group of DSs. The
collected data can be replicated and/or moved among multiple distributed DCs.
• Distributed DSs: Many DSs are distributed (e.g. due to geographical spread or due to operational
circumstances). A set of DSs (e.g. DS-A1, DS-A2, and DS-An) can be logically grouped together and transmit
their original data to a DC (e.g. DC-A).
• Distributed Data Transmission: The original data is transmitted from DSs to DCs in a distributed way. There
could be three different transmission modes:
1) Each DS (e.g. DS-A1) transmits its original data directly to a DC (e.g. DC-A).
2) One DS (e.g. DS-Bn) can send its data to another DS (e.g. DS-B2), which then forwards the data to a DC
(e.g. DC-B).
3) One DS (e.g. DS-Cn) can send its original data to multiple DCs (e.g. DC-B and DC-C) for the purpose of
either load balancing or diversity.
[Figure 5.3-1 diagram. Legend: Distributed Data Collection (DDC); Data Collector (DC); Data Source (DS); Data Collection Application; Data Transmission of Original Data; Data Transmission of Collected Data; one transmission type is marked Optional. Nodes shown: DC-A, DC-B, DC-C, DC-D; DS-A1, DS-A2, DS-An; DS-B1, DS-B2, DS-Bn; DS-C1, DS-C2, DS-Cn.]
Figure 5.3-1: Distributed Data Collection
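The three transmission modes can be expressed as a small routing table. The sketch below is illustrative only: the deliver() function, the table layout and the payload strings are assumptions for this example, while the node names follow Figure 5.3-1.

```python
from collections import defaultdict


def deliver(routes, payloads):
    """Return {dc: [(path, payload), ...]} for every DC reached.

    routes maps an originating DS to (relay_ds_or_None, [target DCs]).
    """
    collected = defaultdict(list)
    for ds, (relay, targets) in routes.items():
        for dc in targets:
            # The path records every hop from the originating DS to the DC.
            path = [ds] + ([relay] if relay else []) + [dc]
            collected[dc].append((path, payloads[ds]))
    return dict(collected)


routes = {
    "DS-A1": (None, ["DC-A"]),           # mode 1: direct to one DC
    "DS-Bn": ("DS-B2", ["DC-B"]),        # mode 2: relayed by another DS
    "DS-Cn": (None, ["DC-B", "DC-C"]),   # mode 3: sent to multiple DCs
}
payloads = {"DS-A1": "a1", "DS-Bn": "bn", "DS-Cn": "cn"}

result = deliver(routes, payloads)
for dc, items in result.items():
    print(dc, items)
```

Here DC-B receives data through both mode 2 and mode 3, which shows that a single DC can be the target of several modes at once.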
5.4 Distributed Data Storage
Two types of data nodes are involved in distributed data storage: Data Providers (DPs) and Data Hosts (DHs). A DP
generates the data to be stored and transmits it for storage on one or more DHs. Examples of DPs include devices
(e.g. a thermometer), applications (e.g. a video streamer), and Data Collectors (DCs). A DH could be a cloud server, an
edge server, or even a device with adequate storage such as a vehicle. Data storage applications exist in both DPs and
DHs for jointly performing distributed data storage.
Figure 5.4-1 illustrates distributed data storage, where data from DPs is stored on a distributed data storage system
consisting of multiple distributed DHs:
• Scenario 1: A DP submits its data to a single DH at a given time. For example, DP-A submits its data to DH-1.
All the data from DP-A can be stored on DH-1. Alternatively, DH-1 can split the data into multiple parts, store
some parts locally and transmit some parts to other DHs (e.g. DH-2). DP-A may submit its data to different
DHs at different times but only one DH at any given time.
• Scenario 2: DP-B submits its original data to multiple distributed DHs. In one case, DP-B splits its data into
multiple parts and submits each of those parts to a different DH. The data split may be based on volume
(e.g. data blocks of the same size, regardless of content, are submitted to different DHs sequentially) or on
content (e.g. DP-B submits image files to DH-2 and text files to DH-n). In another case, DP-B may submit the
same data to multiple DHs in parallel for purposes such as resiliency.
[Figure 5.4-1 diagram. Legend: Distributed Data Storage (DDS); Data Provider (DP); Data Host (DH); Data Storage Application; Data Transmission of Original Data; Data Transmission of Stored Data; one transmission type is marked Optional. Nodes shown: DP-A, DP-B; DH-1, DH-2, DH-n.]
Figure 5.4-1: Distributed Data Storage
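The placement options described in Scenario 2 (split by volume, split by content, and parallel replication) can be sketched as follows. The function names, the round-robin assignment of blocks, and the use of file extensions to classify content are assumptions made for this illustration; the present document does not prescribe a placement algorithm.

```python
def split_by_volume(data, block_size, hosts):
    """Cut data into fixed-size blocks and assign them to DHs in turn.

    Round-robin order is an assumption; the final block may be shorter.
    """
    placement = {h: [] for h in hosts}
    for offset in range(0, len(data), block_size):
        host = hosts[(offset // block_size) % len(hosts)]
        placement[host].append(data[offset:offset + block_size])
    return placement


def split_by_content(files, rules, default_host):
    """Assign each file to a DH chosen by its extension (hypothetical rule)."""
    placement = {}
    for name, content in files.items():
        ext = name.rsplit(".", 1)[-1].lower()
        host = rules.get(ext, default_host)
        placement.setdefault(host, {})[name] = content
    return placement


def replicate(data, hosts):
    """Submit the same data to every DH, e.g. for resiliency."""
    return {h: data for h in hosts}


hosts = ["DH-1", "DH-2", "DH-n"]
by_volume = split_by_volume(b"abcdefghij", 4, hosts)
by_content = split_by_content(
    {"cat.png": b"\x89PNG", "notes.txt": b"hello"},
    {"png": "DH-2", "txt": "DH-n"},
    default_host="DH-1",
)
copies = replicate(b"abcdefghij", hosts)
print(by_volume)
print(by_content)
```

Reassembling volume-split data requires the DP (or a DH) to remember the block order, which this sketch leaves out.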
5.5 Distributed Data Sharing
Data sharing involves three types of data nodes: Data Providers (DPs), Data Consumers (DCSs), and Data
Owners (DOs). A DP provides data, which is shared with and accessed by one or more DCSs. The data in a DP could
originate from one or more DOs. A DO can share its data with DCSs through one or more DPs. Data
sharing applications exist in both DPs and DCSs for jointly performing distributed data sharing.
In a distributed data sharing scenario, data is provided by multiple distributed DPs. DCSs consume the data directly from
DPs. Figure 5.5-1 illustrates several distributed data sharing scenarios:
• Scenario 1: A DCS consumes data from different DPs at different times. For example, DCS-A consumes data
from DP-1 at time t1; then, it changes to consume data from DP-2 at time t2 (e.g. a person watching a movie
offered by one content provider then switching to watch another movie offered by another content provider).
• Scenario 2: A DCS simultaneously consumes data from multiple distributed DPs. For example, DCS-B
consumes data from distributed DP-2 and DP-n (e.g. a financial application reading stock values from multiple
stock exchanges around the globe).
• Scenario 3: A DP accesses data from other DPs. For example, DP-2 accesses data from DP-1 and DP-n. Such
data is then available for DCSs to consume from DP-2 without need for such DCSs to establish consumption
arrangements with DP-1/DP-n (e.g. a travel booking application that collects data from multiple airlines and
allows users to book multi-leg flights operated by multiple airlines using a single booking environment).
[Figure 5.5-1 diagram. Legend: Distributed Data Sharing (DDSh); Data Consumer (DCS); Data Provider (DP); Data Owner (DO); Data Sharing Application; Data Transmission of Data to be Shared; Data Transmission marked Optional. Nodes shown: DCS-A, DCS-B; DP-1, DP-2, DP-n; DO-1, DO-2, DO-n.]
Figure 5.5-1: Distributed Data Sharing
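The three sharing scenarios can be sketched with a DP that may fall back to other DPs. The class names, the get() and consume() methods, and the record keys and values are hypothetical and serve only to illustrate the scenarios; access control and DO consent are deliberately left out.

```python
class DataProvider:
    def __init__(self, name, records=None, upstream=()):
        self.name = name
        self.records = dict(records or {})  # data this DP offers itself
        self.upstream = list(upstream)      # other DPs it may access (Scenario 3)

    def get(self, key):
        if key in self.records:
            return self.records[key]
        # Scenario 3: fetch from other DPs on behalf of the consumer.
        for dp in self.upstream:
            value = dp.get(key)
            if value is not None:
                return value
        return None


class DataConsumer:
    def __init__(self, name):
        self.name = name

    def consume(self, key, providers):
        # Queries each DP in turn; a real DCS could issue these in parallel.
        return {dp.name: dp.get(key) for dp in providers}


dp_1 = DataProvider("DP-1", {"leg-1": "operated by airline X"})
dp_n = DataProvider("DP-n", {"leg-2": "operated by airline Y"})
dp_2 = DataProvider("DP-2", upstream=[dp_1, dp_n])

dcs_a, dcs_b = DataConsumer("DCS-A"), DataConsumer("DCS-B")

# Scenario 1: DCS-A uses DP-1 at time t1, then DP-2 at time t2.
at_t1 = dcs_a.consume("leg-1", [dp_1])
at_t2 = dcs_a.consume("leg-2", [dp_2])

# Scenario 2: DCS-B consumes from DP-2 and DP-n in the same request.
both = dcs_b.consume("leg-1", [dp_2, dp_n])
print(at_t1, at_t2, both)
```

In the last call DP-2 answers by accessing DP-1, while DP-n holds no matching record and returns None, so DCS-B needs no arrangement with DP-1.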
5.6 Distributed Data Computation
Two types of data nodes are involved in distributed data computation: Data Computation Nodes (DCNs) and Data
Computation Controllers (DCCs). In a distributed data computation scenario, as illustrated in Figure 5.6-1, data
computation is executed at DCNs, which may be coordinated by a DCC. Each DCN performs certain computation tasks
over the designated data and generates computation results. DCNs may exchange computation tasks, data, and
computation results with other DCNs. A DCC may assign computation tasks to DCNs and collect the computation
results. Each DCN may perform computation on local data, data received from the DCC, and/or data received from
external data storage systems. Typical examples of distributed data computation include federated learning, federated
analytics, and decentralized machine learning such as Multi-Agent Reinforcement Learning (MARL). Data computation
applications exist in both DCNs and DCCs to jointly perform distributed data computation:
• Scenario 1: A DCC and two or more DCNs collaboratively train an artificial intelligence model, referred to as
federated learning. In this scenario, the DCC is the parameter server and the DCNs are the federated learning
clients. The
...
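The interaction between a DCC acting as parameter server and DCNs acting as federated learning clients can be sketched as below. This is an illustration under stated assumptions, not a procedure defined by the present document: it uses sample-weighted averaging of local results (commonly called federated averaging), a single scalar model y = w * x, and hypothetical function names and data.

```python
def local_update(weight, samples, lr):
    """DCN side: one gradient step of least-squares fitting on local samples."""
    grad = sum(2 * (weight * x - y) * x for x, y in samples) / len(samples)
    return weight - lr * grad


def federated_round(global_weight, dcn_datasets, lr):
    """DCC side: distribute the weight, collect local results, average them.

    Each local result is weighted by the number of samples it was trained on.
    """
    updates = [(local_update(global_weight, d, lr), len(d)) for d in dcn_datasets]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Local data never leaves its DCN; only weights are exchanged with the DCC.
dcn_datasets = [
    [(1.0, 2.0), (2.0, 4.0)],  # samples held by a first DCN
    [(3.0, 6.0)],              # samples held by a second DCN
]

w = 0.0
for _ in range(20):
    w = federated_round(w, dcn_datasets, lr=0.05)
print(round(w, 4))
```

Both datasets are drawn from y = 2x, so the averaged weight approaches 2 over the rounds.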







