ISO/IEC 23001-7:2016
Information technology - MPEG systems technologies - Part 7: Common encryption in ISO base media file format files
ISO/IEC 23001-7:2016 specifies common encryption formats for use in any file format based on ISO/IEC 14496‑12. File, track, and track fragment metadata is specified to enable multiple digital rights and key management systems (DRMs) to access the same common encrypted file or stream. This part of ISO/IEC 23001 does not define a DRM system. The AES-128 symmetric block cipher is incorporated by reference to encrypt elementary stream data contained in media samples. Both AES counter mode (CTR) and Cipher Block Chaining (CBC) are specified in separate protection schemes. Partial encryption using a pattern of encrypted and clear blocks is also specified in separate protection schemes. The identification of encryption keys, Initialization Vector storage and processing is specified for each scheme. Subsample encryption is specified for NAL structured video, such as AVC and HEVC, to enable normal processing and editing of video elementary streams prior to decryption. An XML representation is specified for important common encryption information so that it can be included in XML files as standard elements and attributes to enable interoperable license and key management prior to media file download.
Technologies de l'information — Technologies des systèmes MPEG — Partie 7: Cryptage commun des fichiers au format de fichier de médias de la base ISO
Frequently Asked Questions
ISO/IEC 23001-7:2016 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - MPEG systems technologies - Part 7: Common encryption in ISO base media file format files".
ISO/IEC 23001-7:2016 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 23001-7:2016 is linked to the following standards: ISO/IEC 23001-7:2016/Amd 1:2019, ISO/IEC 23001-7:2023, and ISO/IEC 23001-7:2015. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
FINAL DRAFT INTERNATIONAL STANDARD
ISO/IEC FDIS 23001-7
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2015-10-27
Voting terminates on: 2015-12-27
Information technology — MPEG systems technologies —
Part 7:
Common encryption in ISO base media file format files
Technologies de l’information — Technologies des systèmes MPEG —
Partie 7: Cryptage commun des fichiers au format de fichier de médias de la base ISO
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number
ISO/IEC FDIS 23001-7:2015(E)
© ISO/IEC 2015
© ISO/IEC 2015, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions, and abbreviated terms
3.1 Terms and definitions
3.2 Abbreviated terms
4 Protection schemes
4.1 Scheme type signaling
4.2 Common encryption scheme types
5 Overview of encryption metadata
6 Encryption parameters shared by groups of samples
7 Common encryption sample auxiliary information
7.1 Definition
7.2 Sample Encryption Information box for storage of sample auxiliary information
7.2.1 Sample Encryption Box (‘senc’)
7.2.2 Syntax
7.2.3 Semantics
8 Box definitions
8.1 Protection system specific header box
8.1.1 Definition
8.1.2 Syntax
8.1.3 Semantics
8.2 Track Encryption box
8.2.1 Definition
8.2.2 Syntax
8.2.3 Semantics
9 Encryption of media data
9.1 Field semantics
9.2 Initialization Vectors
9.3 AES-CTR mode counter operation
9.4 Full sample encryption
9.4.1 General
9.4.2 Full sample encryption using AES-CTR mode
9.4.3 Full sample encryption using AES-CBC mode
9.5 Subsample encryption
9.5.1 Definition (normative)
9.5.2 Subsample encryption of NAL Structured Video tracks
9.6 Pattern encryption
9.6.1 Definition
9.6.2 Example of pattern encryption applied to a video NAL unit
9.7 Whole-block full sample encryption
10 Protection scheme definitions
10.1 ‘cenc’ AES-CTR scheme
10.2 ‘cbc1’ AES-CBC scheme
10.3 ‘cens’ AES-CTR subsample pattern encryption scheme
10.4 ‘cbcs’ AES-CBC subsample pattern encryption scheme
10.4.1 Definition
10.4.2 ‘cbcs’ AES-CBC mode pattern encryption scheme application (informative)
11 XML representation of Common Encryption parameters
11.1 General
11.2 Definition of the XML cenc:default_KID attribute and cenc:pssh element
11.3 Use of the cenc:default_KID attribute and cenc:pssh element in DASH ContentProtection Descriptor elements
11.3.1 General
11.3.2 Addition of cenc:default_KID attributes in DASH ContentProtection Descriptors
11.3.3 Addition of the cenc:pssh element in Protection System Specific UUID ContentProtection Descriptors
11.3.4 Example of two Content Protection Descriptors in an MPD
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical
Barriers to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/IEC JTC 1, Information technology, Subcommittee
SC 29, Coding of audio, picture, multimedia and hypermedia information.
This third edition cancels and replaces the second edition (ISO/IEC 23001-7:2015), which has been
technically revised.
ISO/IEC 23001 consists of the following parts, under the general title Information technology — MPEG
systems technologies:
— Part 1: Binary MPEG format for XML
— Part 2: Fragment request units
— Part 3: XML IPMP messages
— Part 4: Codec configuration representation
— Part 5: Bitstream Syntax Description Language (BSDL)
— Part 7: Common encryption in ISO base media file format files
— Part 8: Coding-independent code points
— Part 9: Common encryption of MPEG-2 transport streams
— Part 10: Carriage of timed metadata metrics of media in ISO base media file format
— Part 11: Energy-efficient media consumption (green metadata)
— Part 12: Sample variants in the ISO base media file format
Introduction
Common Encryption specifies standard encryption and key mapping methods that can be utilized
to enable decryption of the same file using different Digital Rights Management (DRM) and key
management systems. It operates by defining encryption algorithms and encryption-related metadata
necessary to decrypt the protected streams, yet it leaves the details of rights mappings, key acquisition
and storage, DRM content protection compliance rules, etc., up to the DRM system or systems.
For instance, DRM systems are intended to support identifying the decryption key via stored key
identifiers (KIDs), but how each DRM system protects and locates the KID-identified decryption key is
left to a DRM-specific method.
DRM-specific information such as licenses, rights, and license acquisition information can be stored
in an ISO Base Media file using a Protection System Specific Header box (‘pssh’). Each instance of this
box stored in the file corresponds to one applicable DRM system identified by a well-known SystemID.
DRM licenses or license acquisition information need not be stored in the file in order to look up a
separately delivered key using a KID stored in the file and decrypt media samples using the encryption
parameters stored in each track.
The second edition of this part of ISO/IEC 23001 added XML representations of Common Encryption
parameters for delivery in XML documents, such as MPEG DASH Media Presentation Description
(MPD) documents. The second edition also defined the ‘cbc1’ protection scheme using AES-CBC
mode encryption.
The third edition added ‘cbcs’ and ‘cens’ protection schemes for pattern encryption, which encrypt
only a fraction of the data Blocks within each protected video Subsample. Pattern encryption reduces
the computational power required by devices to decrypt video tracks.
Information technology — MPEG systems technologies —
Part 7:
Common encryption in ISO base media file format files
1 Scope
This part of ISO/IEC 23001 specifies common encryption formats for use in any file format based on
ISO/IEC 14496-12. File, track, and track fragment metadata is specified to enable multiple digital rights
and key management systems (DRMs) to access the same common encrypted file or stream. This part
of ISO/IEC 23001 does not define a DRM system.
The AES-128 symmetric block cipher is incorporated by reference to encrypt elementary stream
data contained in media samples. Both AES counter mode (CTR) and Cipher Block Chaining (CBC)
are specified in separate protection schemes. Partial encryption using a pattern of encrypted and
clear blocks is also specified in separate protection schemes. The identification of encryption keys,
Initialization Vector storage and processing is specified for each scheme.
Subsample encryption is specified for NAL structured video, such as AVC and HEVC, to enable normal
processing and editing of video elementary streams prior to decryption.
An XML representation is specified for important common encryption information so that it can be
included in XML files as standard elements and attributes to enable interoperable license and key
management prior to media file download.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496-12, Information technology — Coding of audio-visual objects — Part 12: ISO Base
Media File Format
ISO/IEC 14496-15, Information technology — Coding of audio-visual objects — Part 15: Carriage of NAL
unit structured video in the ISO Base Media File Format
3 Terms, definitions, and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
NOTE Words used as defined terms and normative terms (SHALL, SHOULD and MAY) are written in upper
case to distinguish them from the same word intending its dictionary definition.
3.1.1
constant IV
initialization vector (3.1.3) specified in a sample entry or sample group description that applies to all
samples and subsamples (3.1.8) under that sample entry or mapped to that sample group
3.1.2
block
16-byte extent of sample data that may be encrypted or decrypted by the AES-128 block cipher, in
which case it is referred to as a cipher block
3.1.3
initialization vector
8-byte or 16-byte value used in combination with a key and a 16-byte block (3.1.2) of content to create
the first cipher block in a chain and derive subsequent cipher blocks in a cipher block chain
3.1.4
ISO Base Media File
file conforming to the file format described in ISO/IEC 14496-12 in which the techniques in
ISO/IEC 23001-7 may be used
3.1.5
NAL unit
syntax structure containing an indication of the type of data to follow and bytes containing that data in
the form of an RBSP interspersed as necessary with emulation prevention bytes
3.1.6
NAL structured video
video streams composed of NAL units (3.1.5) of which the carriage is specified by ISO/IEC 14496-15
3.1.7
protection scheme
encryption algorithm and information defined in this part of ISO/IEC 23001 and identified by a four
character code in an ISO Media track’s Scheme Type Box (‘schm’)
3.1.8
subsample
byte range within a sample consisting of an unprotected byte range followed by a protected byte range
3.2 Abbreviated terms
AES Advanced Encryption Standard as specified in Federal Information Processing Standards Publication 197, FIPS-197
AES-CTR AES Counter Mode as specified in Recommendation for Block Cipher Modes of Operation, NIST, NIST Special Publication 800-38A
AES-CBC AES Cipher-Block Chaining Mode as specified in Recommendation for Block Cipher Modes of Operation, NIST, NIST Special Publication 800-38A
AVC Advanced Video Coding as specified in ISO/IEC 14496-10
HEVC High Efficiency Video Coding as specified in ISO/IEC 23008-2
IV Initialization Vector
NAL Network Abstraction Layer, as specified in ISO/IEC 14496-10 and ISO/IEC 23008-2
URN Uniform Resource Name
UUID Universally Unique Identifier
4 Protection schemes
4.1 Scheme type signaling
Scheme signaling SHALL conform to ISO/IEC 14496-12. As defined in ISO/IEC 14496-12, the sample
entry is transformed and a Protection Scheme Information Box (‘sinf’) is added to the standard
sample entry in the Sample Description Box to denote that a stream is protected. The Protection
Scheme Information Box SHALL contain a Scheme Type Box (‘schm’) so that the scheme is identifiable.
The Scheme Type Box SHALL have the following additional constraints:
— the scheme_type field SHALL be set to a value equal to a four character code defined in Clause 10;
— the scheme_version field SHALL be set to 0x00010000 (Major version 1, Minor version 0).
The Protection Scheme Information Box SHALL also contain a Scheme Information Box (‘schi’).
The Scheme Information Box SHALL contain a Track Encryption Box (‘tenc’), describing the default
encryption parameters for the track.
4.2 Common encryption scheme types
Four protection schemes are specified in this edition of Common Encryption. Each scheme uses syntax
and algorithms specified in Clause 5 to Clause 9, as constrained in Clause 10. They are the following:
a) ‘cenc’ – AES-CTR mode full sample and video NAL Subsample encryption, see 10.1;
b) ‘cbc1’ – AES-CBC mode full sample and video NAL Subsample encryption, see 10.2;
c) ‘cens’ – AES-CTR mode partial video NAL pattern encryption, see 10.3;
d) ‘cbcs’ – AES-CBC mode partial video NAL pattern encryption, see 10.4.
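The signaling constraints in 4.1 and the four scheme types above can be checked mechanically. The following Python sketch is illustrative only: the function name check_schm_payload is hypothetical, and the byte layout (version, flags, scheme_type, scheme_version) is assumed from the FullBox-based Scheme Type Box of ISO/IEC 14496-12 rather than defined in this document.

```python
import struct

# Four-character codes listed in 4.2.
COMMON_ENCRYPTION_SCHEMES = {b"cenc", b"cbc1", b"cens", b"cbcs"}
REQUIRED_SCHEME_VERSION = 0x00010000  # major version 1, minor version 0


def check_schm_payload(payload: bytes) -> str:
    """Validate the body of a 'schm' box (the bytes after box size and type).

    Assumed layout: version (8 bits), flags (24 bits),
    scheme_type (32 bits), scheme_version (32 bits).
    Returns the scheme type as text, or raises ValueError.
    """
    if len(payload) < 12:
        raise ValueError("schm payload too short")
    # version and flags are read together as one 32-bit word and not inspected.
    _version_and_flags, scheme_type, scheme_version = struct.unpack(
        ">I4sI", payload[:12]
    )
    if scheme_type not in COMMON_ENCRYPTION_SCHEMES:
        raise ValueError(f"not a Common Encryption scheme: {scheme_type!r}")
    if scheme_version != REQUIRED_SCHEME_VERSION:
        raise ValueError(f"unexpected scheme_version 0x{scheme_version:08X}")
    return scheme_type.decode("ascii")
```

A reader would typically run such a check before looking for the Track Encryption Box inside the Scheme Information Box.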
5 Overview of encryption metadata
The encryption metadata defined by Common Encryption can be categorized as follows.
— Protection System Specific Data – this data is opaque to Common Encryption. This gives protection
systems (i.e. key and digital rights management “DRM” systems) a place to store their own data using
a common mechanism. This data is contained in the ProtectionSystemSpecificHeaderBox
described in 8.1.
— Common encryption information for a media track – this includes default values for the key identifier
(KID), Initialization Vector and vector size, protection pattern, and protection flag. This data is
contained in the TrackEncryptionBox described in 8.2.
— Common encryption information for groups of media samples – this includes overrides to the track
level defaults defined above. This allows groups of samples within the track to use different keys,
a mix of clear and protected content, share a Constant Initialization Vector (for some schemes),
etc. This data is contained in a SampleGroupDescriptionBox (‘sgpd’) that is referenced by a
SampleToGroupBox (‘sbgp’). See Clause 6 for further details.
— Encryption information for individual media samples – this includes Initialization Vectors and
Subsample encryption data. This data is sample auxiliary information, referenced by using a Sample
Auxiliary Information Sizes Box (‘saiz’) and a Sample Auxiliary Information Offsets Box (‘saio’).
See Clause 7 for further details.
6 Encryption parameters shared by groups of samples
Each sample in a protected track SHALL be associated with an isProtected flag, Per_Sample_IV_Size,
KID, optional Block pattern information, and an optional constant_IV. This can be
accomplished by relying on the default values in the Track Encryption Box (‘tenc’) (see 8.2), and
optionally specifying parameters by sample group. Encryption parameters specified in a sample group
SHALL override the corresponding default parameter values for the samples in that group defined in
the Track Encryption Box. Samples not mapped to any sample group SHALL use the defaults established
in the Track Encryption Box.
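The override rule in the paragraph above can be pictured as a simple lookup. This Python sketch is an assumption-laden illustration: EncryptionParams and effective_params are hypothetical names, and the sample-to-group mapping is modelled as a plain dictionary rather than as a parsed Sample To Group Box.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EncryptionParams:
    """Per-sample encryption parameters named in Clause 6."""
    is_protected: int
    per_sample_iv_size: int
    kid: bytes
    crypt_byte_block: int = 0
    skip_byte_block: int = 0
    constant_iv: Optional[bytes] = None


def effective_params(
    track_defaults: EncryptionParams,
    sample_groups: List[EncryptionParams],
    sample_to_group: Dict[int, int],
    sample_index: int,
) -> EncryptionParams:
    """Return the parameters that apply to one sample.

    sample_to_group (hypothetical simplification) maps a sample index to an
    index into sample_groups. Samples absent from the mapping fall back to
    the track-level defaults from the Track Encryption Box.
    """
    group_index = sample_to_group.get(sample_index)
    if group_index is None:
        return track_defaults
    # A sample group entry carries a full parameter set, so it replaces
    # the defaults for the samples mapped to it.
    return sample_groups[group_index]
```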
When specifying the parameters by sample group, the Sample To Group Box (‘sbgp’) in the sample
table or track fragment specifies which samples use which sample group description from the Sample
Group Description Box (‘sgpd’). The format of the sample group description is uniform across all track
types (as indicated by the handler type for the track). For fragmented files, it may be necessary to store
both the Sample To Group Box and Sample Group Description Box in each track fragment to make them
accessible for decryption of the samples they describe, e.g. when movie fragments are separately stored
and delivered by streaming.
Tracks of all types SHALL use the CencSampleEncryptionInformationGroupEntry sample
group description structure, which has the following syntax.
aligned(8) class CencSampleEncryptionInformationGroupEntry
    extends SampleGroupEntry( ‘seig’ )
{
    unsigned int(8) reserved = 0;
    unsigned int(4) crypt_byte_block = 0;
    unsigned int(4) skip_byte_block = 0;
    unsigned int(8) isProtected;
    unsigned int(8) Per_Sample_IV_Size;
    unsigned int(8)[16] KID;
    if (isProtected == 1 && Per_Sample_IV_Size == 0) {
        unsigned int(8) constant_IV_size;
        unsigned int(8)[constant_IV_size] constant_IV;
    }
}
These structures use a common semantic for their fields as follows:
— isProtected is the flag which indicates the encryption state of the samples in the sample group.
See the isProtected field in 9.1 for further details.
— Per_Sample_IV_Size is the Initialization Vector size in bytes for samples in the sample group.
See the Per_Sample_IV_Size field in 9.1 for further details.
— KID is the key identifier used for samples in the sample group. See the KID field in 9.1 for
further details.
— constant_IV_size is the size of a possible Initialization Vector used for all samples associated
with this group (when per-sample Initialization Vectors are not used).
— constant_IV, if present, is the Initialization Vector used for all samples associated with this
group. See the constant_IV field in 9.1 for further details.
— crypt_byte_block specifies the count of the encrypted Blocks in the protection pattern, where
each Block is 16 bytes in size. See 9.1 for further details.
— skip_byte_block specifies the count of the unencrypted Blocks in the protection pattern. See
9.1 for further details.
In order to facilitate the addition of future optional fields, clients SHALL ignore additional bytes after
the fields defined in the CencSampleEncryption group entry structures.
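As a rough illustration of the syntax and semantics above, the following Python sketch decodes one ‘seig’ sample group description entry from raw bytes. The names SeigEntry and parse_seig_entry are hypothetical. The sketch assumes the usual bit order of the syntax description, in which the first declared 4-bit field (crypt_byte_block) occupies the high nibble.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeigEntry:
    crypt_byte_block: int
    skip_byte_block: int
    is_protected: int
    per_sample_iv_size: int
    kid: bytes
    constant_iv: Optional[bytes] = None


def parse_seig_entry(data: bytes) -> SeigEntry:
    """Decode one CencSampleEncryptionInformationGroupEntry payload."""
    if len(data) < 20:
        raise ValueError("seig entry too short")
    pattern = data[1]  # data[0] is the reserved byte
    entry = SeigEntry(
        crypt_byte_block=pattern >> 4,    # high nibble
        skip_byte_block=pattern & 0x0F,   # low nibble
        is_protected=data[2],
        per_sample_iv_size=data[3],
        kid=data[4:20],
    )
    if entry.is_protected == 1 and entry.per_sample_iv_size == 0:
        if len(data) < 21:
            raise ValueError("missing constant_IV_size")
        size = data[20]
        if len(data) < 21 + size:
            raise ValueError("truncated constant_IV")
        entry.constant_iv = data[21:21 + size]
    # Any bytes after the defined fields are ignored, as required above.
    return entry
```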
7 Common encryption sample auxiliary information
7.1 Definition
Each protected sample in a protected track SHALL have an Initialization Vector associated with it. Both
Initialization Vectors and Subsample encryption information MAY be provided as Sample Auxiliary
Information with aux_info_type equal to the scheme and aux_info_type_parameter equal to 0.
For example, for tracks protected using the ‘cenc’ scheme, the default value for aux_info_type
is ‘cenc’ and the default value for the aux_info_type_parameter is 0, so content SHOULD be
created omitting these optional fields. Storage of sample auxiliary information SHALL conform to
ISO/IEC 14496-12.
The format of the sample auxiliary information for samples with this type SHALL be as follows:
aligned(8) class CencSampleAuxiliaryDataFormat
{
    unsigned int(Per_Sample_IV_Size*8) InitializationVector;
    if (sample_info_size > Per_Sample_IV_Size )
    {
        unsigned int(16) subsample_count;
        {
            unsigned int(16) BytesOfClearData;
            unsigned int(32) BytesOfProtectedData;
        } [subsample_count ]
    }
}
where
sample_info_size
is the size of the sample auxiliary information for this sample from the Sample Auxiliary Information Sizes Box (‘saiz’);
InitializationVector
is the Initialization Vector for the sample, unless a constant_IV is present in the Track Encryption Box (‘tenc’) (see the InitializationVector field in 9.1 for further details);
subsample_count
is the count of Subsamples for this sample (see the subsample_count field in 9.1 for further details);
BytesOfClearData
is the number of bytes of clear data in this Subsample (see the BytesOfClearData field in 9.1 for further details);
BytesOfProtectedData
is the number of bytes of protected data in this Subsample (see the BytesOfProtectedData field in 9.1 for further details).
If Subsample encryption is not used (the size of the sample auxiliary information equals Per_Sample_IV_Size),
then the entire sample is protected (see 9.4 for further details). In this case, all auxiliary
information will have the same size and hence the default_sample_info_size of the Sample
Auxiliary Information Sizes box (‘saiz’) will be equal to the Per_Sample_IV_Size of the Initialization
Vectors. If Per_Sample_IV_Size is also zero (because constant IVs are in use) then the sample
auxiliary information would be empty and should be omitted.
NOTE Even if Subsample encryption is used, the size of the sample auxiliary information may be the same
for all of the samples (if all of the samples have the same number of Subsamples) and the
default_sample_info_size may be used.
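The CencSampleAuxiliaryDataFormat structure of 7.1 can be read with a few lines of Python. This is a minimal sketch, not a reference implementation: parse_sample_aux_info is a hypothetical name, and the caller is assumed to pass exactly sample_info_size bytes for one sample together with the applicable Per_Sample_IV_Size.

```python
import struct
from typing import List, Tuple


def parse_sample_aux_info(
    data: bytes, per_sample_iv_size: int
) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Split one sample's auxiliary information into IV and Subsample ranges.

    Returns (InitializationVector, [(BytesOfClearData, BytesOfProtectedData), ...]).
    An empty list means Subsample encryption is not used for this sample.
    """
    if len(data) < per_sample_iv_size:
        raise ValueError("auxiliary information shorter than the IV")
    iv = data[:per_sample_iv_size]
    subsamples: List[Tuple[int, int]] = []
    # Subsample data is present only when sample_info_size > Per_Sample_IV_Size.
    if len(data) > per_sample_iv_size:
        offset = per_sample_iv_size
        (count,) = struct.unpack_from(">H", data, offset)
        offset += 2
        for _ in range(count):
            clear, protected = struct.unpack_from(">HI", data, offset)
            subsamples.append((clear, protected))
            offset += 6  # 16-bit clear count plus 32-bit protected count
    return iv, subsamples
```

When the returned list is empty, the whole sample is treated as protected, which matches the full sample case described above.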
7.2 Sample Encryption Information box for storage of sample auxiliary information
7.2.1 Sample Encryption Box (‘senc’)
Box Type: ‘senc’
Container: Track Fragment Box (‘traf’) or Track Box (‘trak’)
Mandatory: No
Quantity: Zero or one
An optional storage location for Sample Auxiliary Information is the Sample Encryption Box (‘senc’),
specified here.
The Sample Encryption Box contains sample auxiliary information and may contain a per sample
Initialization Vector for each sample, and clear and protected byte ranges of partially protected video
samples (“Subsample encryption”). It MAY be used when samples in a track or track fragment are
protected. Storage of ‘senc’ in a Track Fragment Box makes the necessary Sample Auxiliary Information
accessible within the movie fragment for all contained samples in order to make each track fragment
independently decryptable; for instance, when movie fragments are delivered as DASH Media Segments.
7.2.2 Syntax
aligned(8) class SampleEncryptionBox
    extends FullBox(‘senc’, version=0, flags)
{
    unsigned int(32) sample_count;
    {
        unsigned int(Per_Sample_IV_Size*8) InitializationVector;
        if (flags & 0x000002)
        {
            unsigned int(16) subsample_count;
            {
                unsigned int(16) BytesOfClearData;
                unsigned int(32) BytesOfProtectedData;
            } [ subsample_count ]
        }
    }[ sample_count ]
}
7.2.3 Semantics
— flags is inherited from the FullBox structure. The SampleEncryptionBox currently supports
the following bit values:
— 0x2 – UseSubSampleEncryption
— If the UseSubSampleEncryption flag is set, then the track fragment that contains this
Sample Encryption Box SHALL use Subsample encryption as described in 9.5. When this flag
is set, Subsample mapping data follows each InitializationVector. The Subsample
mapping data consists of the number of Subsamples for each sample, followed by an array of
values describing the number of bytes of clear data and the number of bytes of encrypted data
for each Subsample.
— sample_count is the number of protected samples in the containing track or track fragment. This
value SHALL be either zero (0) or the total number of samples in the track or track fragment.
— InitializationVector SHALL conform to the definition specified in 9.2. Only one
Per_Sample_IV_Size SHALL be used within a file or Per_Sample_IV_Size SHALL be zero when
a sample is unencrypted or a Constant IV is in use. Selection of InitializationVector values
SHOULD follow the recommendations of 9.2.
— subsample_count SHALL conform to the definition specified in 9.1.
— BytesOfClearData SHALL conform to the definition specified in 9.1.
— BytesOfProtectedData SHALL conform to the definition specified in 9.1.
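To make the ‘senc’ layout concrete, the following Python sketch walks the box body described in 7.2.2. The function name parse_senc_payload is hypothetical. Per_Sample_IV_Size is not stored in the box, so the sketch assumes the caller has already obtained it from the Track Encryption Box or the applicable sample group entry.

```python
import struct
from typing import List, Tuple

USE_SUBSAMPLE_ENCRYPTION = 0x000002  # flag value from 7.2.3

Subsample = Tuple[int, int]  # (BytesOfClearData, BytesOfProtectedData)


def parse_senc_payload(
    payload: bytes, per_sample_iv_size: int
) -> List[Tuple[bytes, List[Subsample]]]:
    """Decode the body of a 'senc' box (the bytes after box size and type)."""
    version_and_flags, sample_count = struct.unpack_from(">II", payload, 0)
    flags = version_and_flags & 0xFFFFFF  # low 24 bits of the FullBox header
    offset = 8
    samples: List[Tuple[bytes, List[Subsample]]] = []
    for _ in range(sample_count):
        iv = payload[offset:offset + per_sample_iv_size]
        if len(iv) != per_sample_iv_size:
            raise ValueError("truncated InitializationVector")
        offset += per_sample_iv_size
        subsamples: List[Subsample] = []
        if flags & USE_SUBSAMPLE_ENCRYPTION:
            (count,) = struct.unpack_from(">H", payload, offset)
            offset += 2
            for _ in range(count):
                subsamples.append(struct.unpack_from(">HI", payload, offset))
                offset += 6
        samples.append((iv, subsamples))
    return samples
```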
8 Box definitions
8.1 Protection system specific header box
8.1.1 Definition
Box Type: ‘pssh’
Container: Movie (‘moov’) or Movie Fragment (‘moof’)
Mandatory: No
Quantity: Zero or more
This box contains information needed by a Content Protection System to play back the content. The
data format is specified by the system identified by the ‘pssh’ parameter SystemID and is considered
opaque for the purposes of this part of ISO/IEC 23001. The collection of Protection System Specific
Header boxes from the initial movie box, together with those in a movie fragment, SHALL provide all
the required Content Protection System information to decode that fragment.
The data encapsulated in the Data field MAY be read by the identified Content Protection System client
to enable decryption key acquisition and decryption of media data. For license/rights-based systems,
the header information MAY include data such as the URL of license server(s) or rights issuer(s) used,
embedded licenses/rights, embedded key(s), and/or other protection system specific metadata.
A single file MAY be constructed to be playable by multiple key and digital rights management (DRM)
systems, by including Protection System Specific Header boxes for each system supported. In order to find
all of the Protection System Specific data that is relevant to a sample in the presentation, readers SHALL
— examine all Protection System Specific Header boxes in the Movie Box and in the Movie Fragment
Box associated with the sample (but not those in other Movie Fragment Boxes),
— match the SystemID field in this box to the SystemID(s) of the DRM System(s) they support, and
— match the KID associated with the sample (either from the default_KID field of the Track
Encryption Box or the KID field of the appropriate sample group description entry) with one of
the KID values in the Protection System Specific Header Box. Boxes without a list of applicable KID
values, or with an empty list, SHALL be considered to apply to all KIDs in the file or movie fragment.
Protection System Specific Header data SHALL be associated with a sample based on a matching KID
value in the ‘pssh’ and sample group description or default ‘tenc’ describing the sample. If a sample
or set of samples is moved due to file defragmentation or refragmentation or removed by editing, then
the associated Protection System Specific Header boxes for the remaining samples SHALL be stored
following the above requirements.
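The lookup rules above can be sketched as follows. This is an illustrative sketch only, not part of the specification: it assumes the caller has already gathered the ‘pssh’ boxes from the Movie Box and from the Movie Fragment Box containing the sample, and has parsed each into a dictionary with the hypothetical keys system_id and kids.

```python
def applicable_pssh(boxes, supported_system_ids, sample_kid):
    """Select the parsed 'pssh' boxes that apply to one sample.

    boxes: dicts with hypothetical keys "system_id" (16 bytes) and
           "kids" (list of 16-byte values, possibly empty or missing),
           taken from 'moov' and from the sample's own 'moof' only.
    """
    selected = []
    for box in boxes:
        # Rule 2: the SystemID must belong to a DRM system the reader supports.
        if box["system_id"] not in supported_system_ids:
            continue
        kids = box.get("kids") or []
        # Rule 3: an absent or empty KID list applies to all KIDs;
        # otherwise the sample's KID must be listed.
        if not kids or sample_kid in kids:
            selected.append(box)
    return selected
```

The sample KID passed in is the one resolved from the sample group description entry, or from default_KID in the Track Encryption Box when the sample is not mapped to a group.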
8.1.2 Syntax
aligned(8) class ProtectionSystemSpecificHeaderBox extends FullBox(‘pssh’, version, flags=0)
{
   unsigned int(8)[16] SystemID;
   if (version > 0)
   {
      unsigned int(32) KID_count;
      {
         unsigned int(8)[16] KID;
      } [KID_count];
   }
   unsigned int(32) DataSize;
   unsigned int(8)[DataSize] Data;
}
8.1.3 Semantics
— SystemID specifies a UUID that uniquely identifies the content protection system that this
header belongs to.
— KID_count specifies the number of KID entries in the following table. The value MAY be zero.
— KID identifies a key identifier that the Data field applies to. If not set, then the Data array SHALL
apply to all KIDs in the movie or movie fragment containing this box.
— DataSize specifies the size in bytes of the Data member.
— Data holds the content protection system specific data.
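As an illustration of the syntax in 8.1.2, the following minimal sketch reads one ‘pssh’ box from a byte string. It is not a normative parser: it assumes the box uses the compact 32-bit size field (no 64-bit largesize), and the function name and the returned dictionary keys are hypothetical.

```python
import struct
import uuid


def parse_pssh(box: bytes) -> dict:
    """Parse one complete 'pssh' box, including its size/type header."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"pssh" or size != len(box):
        raise ValueError("not a single well-formed 'pssh' box")
    version = box[8]            # FullBox: 1-byte version, then 3 bytes of flags
    pos = 12
    system_id = uuid.UUID(bytes=box[pos:pos + 16])
    pos += 16
    kids = []
    if version > 0:             # the KID table is only present for version > 0
        (kid_count,) = struct.unpack_from(">I", box, pos)
        pos += 4
        for _ in range(kid_count):
            kids.append(box[pos:pos + 16])
            pos += 16
    (data_size,) = struct.unpack_from(">I", box, pos)
    pos += 4
    data = box[pos:pos + data_size]
    if len(data) != data_size:
        raise ValueError("truncated Data field")
    return {"version": version, "system_id": system_id,
            "kids": kids, "data": data}
```

The Data field is returned untouched because its format is opaque to Common Encryption and is defined by the system that SystemID identifies.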
8.2 Track Encryption box
8.2.1 Definition
Box Type: ‘tenc’
Container: Scheme Information Box (‘schi’)
Mandatory: No (Yes, for protected tracks)
Quantity: Zero or one
The Track Encryption Box contains default values for the isProtected flag, Per_Sample_IV_Size,
and KID for the entire track. In the case where pattern-based encryption is in effect, it supplies the
pattern and when Constant IVs are in use, it supplies the Constant IV. These values are used as the
encryption parameters for the samples in this track unless over-ridden by the sample group description
associated with a group of samples. For files with only one key per track, this box allows the basic
encryption parameters to be specified once per track instead of being repeated per sample.
If default_isProtected is 1 and default_Per_Sample_IV_Size is 0, then
default_constant_IV_size SHALL be present for all samples that use these settings. A Constant
IV SHALL NOT be used with counter-mode encryption. A sample group description may supply keys or
keys and Constant IVs for sample groups that override these default values for those samples mapped
to the group.
NOTE The version field of the Track Encryption Box is set to a value greater than zero when the pattern
encryption defined in 9.6 is used and to zero otherwise.
8.2.2 Syntax
aligned(8) class TrackEncryptionBox extends FullBox(‘tenc’, version, flags=0)
{
   unsigned int(8) reserved = 0;
   if (version==0) {
      unsigned int(8) reserved = 0;
   }
   else { // version is 1 or greater
      unsigned int(4) default_crypt_byte_block;
      unsigned int(4) default_skip_byte_block;
   }
   unsigned int(8) default_isProtected;
   unsigned int(8) default_Per_Sample_IV_Size;
   unsigned int(8)[16] default_KID;
   if (default_isProtected ==1 && default_Per_Sample_IV_Size == 0) {
      unsigned int(8) default_constant_IV_size;
      unsigned int(8)[default_constant_IV_size] default_constant_IV;
   }
}
8.2.3 Semantics
— version SHALL be zero unless pattern-based encryption is in use, whereupon it SHALL be 1.
— default_isProtected is the protection flag which indicates the default protection state of the
samples in the track. See the isProtected field in 9.1 for further details.
— default_Per_Sample_IV_Size is the default Initialization Vector size in bytes. See the
Per_Sample_IV_Size field in 9.1 for further details.
— default_KID is the default key identifier used for samples in this track. See the KID field in 9.1 for
further details.
— default_constant_IV_size is the size of a possible default Initialization Vector for all samples.
— default_constant_IV, if present, is the default Initialization Vector for all samples. See the
constant_IV field in 9.1 for further details.
— default_crypt_byte_block specifies the count of the encrypted Blocks in the protection
pattern, where each Block is of size 16-bytes. See 9.1 for further details.
— default_skip_byte_block specifies the count of the unencrypted Blocks in the protection
pattern. See the skip_byte_block field in 9.1 for further details.
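The field layout in 8.2.2 can be read as in the following sketch. It is an illustration under stated assumptions rather than a reference implementation: the input is assumed to be the box body starting at the FullBox version byte (the 8-byte size/type header already stripped), and the function name and dictionary keys are hypothetical.

```python
def parse_tenc_payload(payload: bytes) -> dict:
    """Parse a 'tenc' box body that starts at the FullBox version byte."""
    if len(payload) < 24:
        raise ValueError("'tenc' body too short")
    version = payload[0]        # payload[1:4] are flags, payload[4] is reserved
    pattern_byte = payload[5]
    if version == 0:
        crypt, skip = 0, 0      # the byte is reserved when version is 0
    else:
        # First 4 bits (high nibble) are default_crypt_byte_block,
        # last 4 bits (low nibble) are default_skip_byte_block.
        crypt, skip = pattern_byte >> 4, pattern_byte & 0x0F
    is_protected = payload[6]
    iv_size = payload[7]
    kid = payload[8:24]
    constant_iv = None
    if is_protected == 1 and iv_size == 0:
        if len(payload) < 25:
            raise ValueError("missing default_constant_IV_size")
        civ_size = payload[24]
        constant_iv = payload[25:25 + civ_size]
        if len(constant_iv) != civ_size:
            raise ValueError("truncated default_constant_IV")
    return {"version": version, "crypt_byte_block": crypt,
            "skip_byte_block": skip, "is_protected": is_protected,
            "per_sample_iv_size": iv_size, "kid": kid,
            "constant_iv": constant_iv}
```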
9 Encryption of media data
9.1 Field semantics
Within the sample groups and sample auxiliary information used by the common encryption scheme,
these fields have the following semantics:
— isProtected is the identifier of the protection state of the samples in the track or group of
samples. This flag takes the following values:
— 0x0: Not protected;
— 0x1: protected (as signalled by the scheme_type field of the scheme type box ‘schm’, e.g. for
scheme_type of ‘cenc’, the track default is AES-CTR encrypted using the ‘cenc’ scheme);
— 0x02 – 0xFF: Reserved.
— Per_Sample_IV_Size is the size in bytes of the InitializationVector field. The following
are supported values:
— 0 if the isProtected flag is 0x0 (Not Protected) or Constant IVs are in use;
— 8 specifies 64-bit Initialization Vectors;
— 16 specifies 128-bit Initialization Vectors.
— constant_IV_size is the size in bytes of the constant_IV field. The following are
supported values:
— 8 specifies 64-bit Initialization Vectors;
— 16 specifies 128-bit Initialization Vectors.
— KID is a key identifier that uniquely identifies the key needed to decrypt the associated samples
within the scope of an application so that KID is sufficient to identify a separately stored license
containing the key that was used to encrypt the content. This allows the identification of multiple
encryption keys per file or track. Unprotected samples in a protected track SHALL be identified by
having an isProtected flag of 0x0, a Per_Sample_IV_Size of 0x0, and a KID value of 0x0. It
is strongly recommended to use UUIDs [2] as KIDs in order to satisfy the uniqueness requirement
across all applications.
— InitializationVector specifies the Initialization Vector (IV) needed for decryption of a
sample. For an isProtected flag of 0x0, no Initialization Vectors are needed and the auxiliary
information SHOULD have a size of 0, i.e. not be present.
For an isProtected flag of 0x1:
— IVs shall be supplied using Per_Sample IVs or Constant IVs.
— If the Per_Sample_IV_Size field is 16, then InitializationVector specifies the entire
128-bit IV value
— If the Per_Sample_IV_Size field is 8, then its value is copied to bytes 0 to 7 of the Initialization
Vector and bytes 8 to 15 of the Initialization Vector are set to zero.
— subsample_count specifies the number of Subsample encryption entries present for this sample.
If present, this field SHALL be greater than 0.
— BytesOfClearData specifies the number of bytes of clear data at the beginning of this Subsample
encryption entry.
NOTE This value may be zero if no clear bytes exist for this Subsample.
— BytesOfProtectedData specifies the number of bytes of protected data following the clear data.
NOTE This value may be zero if no protected bytes exist for this Subsample.
The Subsample encryption entries SHALL NOT include an entry with a zero value in both the
BytesOfClearData field and in the BytesOfProtectedData field. The total length of all
BytesOfClearData and BytesOfProtectedData in a sample SHALL equal the length of
the sample. Subsample encryption entries SHOULD be as compactly represented as possible. For
example, instead of two entries with {15 clear, 0 protected}, {17 clear, 500 protected}, use one entry
of {32 clear, 500 protected}. If pattern-based encryption is used, then the pattern applies to the
protected byte range, BytesOfProtectedData; otherwise, all protected bytes are encrypted.
— crypt_byte_block shall be zero unless pattern-based encryption is in effect. See 9.6 for
further details.
— skip_byte_block shall be zero unless pattern-based encryption is in effect. See 9.6 for
further details.
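Several of the field rules above are mechanical and can be sketched in code. The following is a minimal illustration, assuming Subsample entries are represented as (BytesOfClearData, BytesOfProtectedData) tuples; the function names are hypothetical. The compaction helper only merges an entry that has no protected bytes into the entry that follows it, which is the case given in the example above.

```python
def expand_iv(iv: bytes) -> bytes:
    """Return the 16-byte IV: an 8-byte value fills bytes 0 to 7, rest zero."""
    if len(iv) == 16:
        return iv
    if len(iv) == 8:
        return iv + bytes(8)
    raise ValueError("Per_Sample_IV_Size must be 8 or 16 for protected samples")


def check_subsamples(entries, sample_size):
    """Raise ValueError if the Subsample entries break the rules in 9.1."""
    for clear, protected in entries:
        if clear == 0 and protected == 0:
            raise ValueError("entry with zero clear and zero protected bytes")
        if not (0 <= clear <= 0xFFFF and 0 <= protected <= 0xFFFFFFFF):
            raise ValueError("value does not fit the 16-bit/32-bit fields")
    if sum(c + p for c, p in entries) != sample_size:
        raise ValueError("entries do not add up to the sample length")


def compact_subsamples(entries):
    """Merge a clear-only entry into the next one when the sum fits 16 bits."""
    out = []
    for clear, protected in entries:
        if out and out[-1][1] == 0 and out[-1][0] + clear <= 0xFFFF:
            out[-1] = (out[-1][0] + clear, protected)
        else:
            out.append((clear, protected))
    return out
```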
9.2 Initialization Vectors
The Initialization Vector (IV) values for each sample SHALL be either a Constant IV and located in the
sample entry or a sample group descrip
...
INTERNATIONAL STANDARD
ISO/IEC 23001-7
Third edition
2016-02-15
Information technology — MPEG
systems technologies —
Part 7:
Common encryption in ISO base media
file format files
Technologies de l’information — Technologies des systèmes MPEG —
Partie 7: Cryptage commun des fichiers au format de fichier de médias
de la base ISO
© ISO/IEC 2016, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions, and abbreviated terms
3.1 Terms and definitions
3.2 Abbreviated terms
4 Protection schemes
4.1 Scheme type signaling
4.2 Common encryption scheme types
5 Overview of encryption metadata
6 Encryption parameters shared by groups of samples
7 Common encryption sample auxiliary information
7.1 Definition
7.2 Sample Encryption Information box for storage of sample auxiliary information
7.2.1 Sample Encryption Box (‘senc’)
7.2.2 Syntax
7.2.3 Semantics
8 Box definitions
8.1 Protection system specific header box
8.1.1 Definition
8.1.2 Syntax
8.1.3 Semantics
8.2 Track Encryption box
8.2.1 Definition
8.2.2 Syntax
8.2.3 Semantics
9 Encryption of media data
9.1 Field semantics
9.2 Initialization Vectors
9.3 AES-CTR mode counter operation
9.4 Full sample encryption
9.4.1 General
9.4.2 Full sample encryption using AES-CTR mode
9.4.3 Full sample encryption using AES-CBC mode
9.5 Subsample encryption
9.5.1 Definition (normative)
9.5.2 Subsample encryption of NAL Structured Video tracks
9.6 Pattern encryption
9.6.1 Definition
9.6.2 Example of pattern encryption applied to a video NAL unit
9.7 Whole-block full sample encryption
10 Protection scheme definitions
10.1 ‘cenc’ AES-CTR scheme
10.2 ‘cbc1’ AES-CBC scheme
10.3 ‘cens’ AES-CTR subsample pattern encryption scheme
10.4 ‘cbcs’ AES-CBC subsample pattern encryption scheme
10.4.1 Definition
10.4.2 ‘cbcs’ AES-CBC mode pattern encryption scheme application (informative)
11 XML representation of Common Encryption parameters
11.1 General
11.2 Definition of the XML cenc:default_KID attribute and cenc:pssh element
11.3 Use of the cenc:default_KID attribute and cenc:pssh element in DASH ContentProtection Descriptor elements
11.3.1 General
11.3.2 Addition of cenc:default_KID attributes in DASH ContentProtection Descriptors
11.3.3 Addition of the cenc:pssh element in Protection System Specific UUID ContentProtection Descriptors
11.3.4 Example of two Content Protection Descriptors in an MPD
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical
Barriers to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/IEC JTC 1, Information technology, Subcommittee
SC 29, Coding of audio, picture, multimedia and hypermedia information.
This third edition cancels and replaces the second edition (ISO/IEC 23001-7:2015), which has been
technically revised.
ISO/IEC 23001 consists of the following parts, under the general title Information technology — MPEG
systems technologies:
— Part 1: Binary MPEG format for XML
— Part 2: Fragment request units
— Part 3: XML IPMP messages
— Part 4: Codec configuration representation
— Part 5: Bitstream Syntax Description Language (BSDL)
— Part 7: Common encryption in ISO base media file format files
— Part 8: Coding-independent code points
— Part 9: Common encryption of MPEG-2 transport streams
— Part 10: Carriage of timed metadata metrics of media in ISO base media file format
— Part 11: Energy-efficient media consumption (green metadata)
— Part 12: Sample variants in the ISO base media file format
Introduction
Common Encryption specifies standard encryption and key mapping methods that can be utilized
to enable decryption of the same file using different Digital Rights Management (DRM) and key
management systems. It operates by defining encryption algorithms and encryption-related metadata
necessary to decrypt the protected streams, yet it leaves the details of rights mappings, key acquisition
and storage, DRM content protection compliance rules, etc., up to the DRM system or systems.
For instance, DRM systems are intended to support identifying the decryption key via stored key
identifiers (KIDs), but how each DRM system protects and locates the KID-identified decryption key is
left to a DRM-specific method.
DRM-specific information such as licenses, rights, and license acquisition information can be stored
in an ISO Base Media file using a Protection System Specific Header box (‘pssh’). Each instance of this
box stored in the file corresponds to one applicable DRM system identified by a well-known SystemID.
DRM licenses or license acquisition information need not be stored in the file in order to look up a
separately delivered key using a KID stored in the file and decrypt media samples using the encryption
parameters stored in each track.
The second edition of this part of ISO/IEC 23001 added XML representations of Common Encryption
parameters for delivery in XML documents, such as MPEG DASH Media Presentation Description
(MPD) documents. The second edition also defined the ‘cbc1’ protection scheme using AES-CBC
mode encryption.
The third edition added ‘cbcs’ and ‘cens’ protection schemes for pattern encryption, which encrypt
only a fraction of the data Blocks within each video Subsample protected. Pattern encryption reduces
the computational power required by devices to decrypt video tracks.
INTERNATIONAL STANDARD ISO/IEC 23001-7:2016(E)
Information technology — MPEG systems technologies —
Part 7:
Common encryption in ISO base media file format files
1 Scope
This part of ISO/IEC 23001 specifies common encryption formats for use in any file format based on
ISO/IEC 14496-12. File, track, and track fragment metadata is specified to enable multiple digital rights
and key management systems (DRMs) to access the same common encrypted file or stream. This part
of ISO/IEC 23001 does not define a DRM system.
The AES-128 symmetric block cipher is incorporated by reference to encrypt elementary stream
data contained in media samples. Both AES counter mode (CTR) and Cipher Block Chaining (CBC)
are specified in separate protection schemes. Partial encryption using a pattern of encrypted and
clear blocks is also specified in separate protection schemes. The identification of encryption keys,
Initialization Vector storage and processing is specified for each scheme.
Subsample encryption is specified for NAL structured video, such as AVC and HEVC, to enable normal
processing and editing of video elementary streams prior to decryption.
An XML representation is specified for important common encryption information so that it can be
included in XML files as standard elements and attributes to enable interoperable license and key
management prior to media file download.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 14496–12, Information technology — Coding of audio-visual objects — Part 12: ISO Base
Media File Format
ISO/IEC 14496–15, Information technology — Coding of audio-visual objects — Part 15: Carriage of NAL
unit structured video in the ISO Base Media File Format
3 Terms, definitions, and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
NOTE Words used as defined terms and normative terms (SHALL, SHOULD and MAY) are written in upper
case to distinguish them from the same word intending its dictionary definition.
3.1.1
constant IV
initialization vector (3.1.3) specified in a sample entry or sample group description that applies to all
samples and subsamples (3.1.8) under that sample entry or mapped to that sample group
3.1.2
block
16-byte extent of sample data that may be encrypted or decrypted by the AES-128 block cipher, in
which case, a cipher block
3.1.3
initialization vector
8-byte or 16-byte value used in combination with a key and a 16-byte block (3.1.2) of content to create
the first cipher block in a chain and derive subsequent cipher blocks in a cipher block chain
3.1.4
ISO Base Media File
file conforming to the file format described in ISO/IEC 14496-12 in which the techniques in
ISO/IEC 23001-7 may be used
3.1.5
NAL unit
syntax structure containing an indication of the type of data to follow and bytes containing that data in
the form of an RBSP interspersed as necessary with emulation prevention bytes
3.1.6
NAL structured video
video streams composed of NAL units (3.1.5) of which the carriage is specified by ISO/IEC 14496-15
3.1.7
protection scheme
encryption algorithm and information defined in this part of ISO/IEC 23001 and identified by a four
character code in an ISO Media track’s Scheme Type Box (‘schm’)
3.1.8
subsample
byte range within a sample consisting of an unprotected byte range followed by a protected byte range
3.2 Abbreviated terms
AES Advanced Encryption Standard as specified in Federal Information Processing Standards
Publication 197, FIPS-197
AES-CTR AES Counter Mode as specified in Recommendation of Block Cipher Modes of Operation,
NIST, NIST Special Publication 800-38A
AES-CBC AES Cipher-Block Chaining Mode as specified in Recommendation of Block Cipher Modes
of Operation, NIST, NIST Special Publication 800-38A
AVC Advanced Video Coding as specified in ISO/IEC 14496-10
HEVC High Efficiency Video Coding as specified in ISO/IEC 23008-2
IV Initialization Vector
NAL Network Abstraction Layer, as specified in ISO/IEC 14496-10 and ISO/IEC 23008-2
URN Uniform Resource Name
UUID Universally Unique Identifier
4 Protection schemes
4.1 Scheme type signaling
Scheme signaling SHALL conform to ISO/IEC 14496-12. As defined in ISO/IEC 14496-12, the sample
entry is transformed and a Protection Scheme Information Box (‘sinf’) is added to the standard
sample entry in the Sample Description Box to denote that a stream is protected. The Protection
Scheme Information Box SHALL contain a Scheme Type Box (‘schm’) so that the scheme is identifiable.
The Scheme Type Box SHALL have the following additional constraints:
— the scheme_type field SHALL be set to a value equal to a four character code defined in Clause 10;
— the scheme_version field SHALL be set to 0x00010000 (Major version 1, Minor version 0).
The Protection Scheme Information Box SHALL also contain a Scheme Information Box (‘schi’).
The Scheme Information Box SHALL contain a Track Encryption Box (‘tenc’), describing the default
encryption parameters for the track.
4.2 Common encryption scheme types
Four protection schemes are specified in this edition of Common Encryption. Each scheme uses syntax
and algorithms specified in Clause 5 to Clause 9, as constrained in Clause 10. They are the following:
a) ‘cenc’ – AES-CTR mode full sample and video NAL Subsample encryption, see 10.1;
b) ‘cbc1’ – AES-CBC mode full sample and video NAL Subsample encryption, see 10.2;
c) ‘cens’ – AES-CTR mode partial video NAL pattern encryption, see 10.3;
d) ‘cbcs’ – AES-CBC mode partial video NAL pattern encryption, see 10.4.
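For illustration, the four scheme types and the Scheme Type Box constraints in 4.1 can be captured in a small lookup. This is a sketch only; the table layout and function name are hypothetical, and the details of each scheme remain as specified in Clause 10.

```python
# scheme_type -> (cipher mode, uses pattern encryption), as listed in 4.2
COMMON_ENCRYPTION_SCHEMES = {
    b"cenc": ("AES-CTR", False),
    b"cbc1": ("AES-CBC", False),
    b"cens": ("AES-CTR", True),
    b"cbcs": ("AES-CBC", True),
}

SCHEME_VERSION = 0x00010000  # major version 1, minor version 0


def describe_scheme(scheme_type: bytes, scheme_version: int):
    """Return (mode, uses_pattern) for a 'schm' box, or raise ValueError."""
    if scheme_type not in COMMON_ENCRYPTION_SCHEMES:
        raise ValueError("scheme_type is not a Common Encryption scheme")
    if scheme_version != SCHEME_VERSION:
        raise ValueError("scheme_version must be 0x00010000")
    return COMMON_ENCRYPTION_SCHEMES[scheme_type]
```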
5 Overview of encryption metadata
The encryption metadata defined by Common Encryption can be categorized as follows.
— Protection System Specific Data – this data is opaque to Common Encryption. This gives protection
systems (i.e. key and digital rights management “DRM” systems) a place to store their own data using
a common mechanism. This data is contained in the ProtectionSystemSpecificHeaderBox
described in 8.1.
— Common encryption information for a media track – this includes default values for the key identifier
(KID), Initialization Vector and vector size, protection pattern, and protection flag. This data is
contained in the TrackEncryptionBox described in 8.2.
— Common encryption information for groups of media samples – this includes overrides to the track
level defaults defined above. This allows groups of samples within the track to use different keys,
a mix of clear and protected content, share a Constant Initialization Vector (for some schemes),
etc. This data is contained in a SampleGroupDescriptionBox (‘sgpd’) that is referenced by a
SampleToGroupBox (‘sbgp’). See Clause 6 for further details.
— Encryption information for individual media samples – this includes Initialization Vectors and
Subsample encryption data. This data is sample auxiliary information, referenced by using a Sample
Auxiliary Information Sizes Box (‘saiz’) and a Sample Auxiliary Information Offsets Box (‘saio’).
See Clause 7 for further details.
6 Encryption parameters shared by groups of samples
Each sample in a protected track SHALL be associated with an isProtected flag, Per_Sample_IV_Size,
KID, optional Block pattern information, and an optional constant_IV. This can be
accomplished by relying on the default values in the Track Encryption Box (‘tenc’) (see 8.2), and
optionally specifying parameters by sample group. Encryption parameters specified in a sample group
SHALL override the corresponding default parameter values for the samples in that group defined in
the Track Encryption Box. Samples not mapped to any sample group SHALL use the defaults established
in the Track Encryption Box.
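The override rule can be sketched as a simple selection between the track defaults and a sample group entry. This is an illustrative sketch with hypothetical names. It assumes the sample's group description index has already been resolved from the Sample To Group Box, with 0 meaning "not mapped to any group" and other values being 1-based indices into the list of entries supplied by the caller; how indices in track fragments map to a particular Sample Group Description Box is governed by ISO/IEC 14496-12 and is not modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EncryptionParams:
    is_protected: int
    per_sample_iv_size: int
    kid: bytes
    crypt_byte_block: int = 0
    skip_byte_block: int = 0
    constant_iv: Optional[bytes] = None


def effective_params(track_default: EncryptionParams,
                     group_entries: List[EncryptionParams],
                     group_index: int) -> EncryptionParams:
    """Pick the parameters that apply to one sample."""
    if group_index == 0:
        # Not mapped to any sample group: use the 'tenc' defaults.
        return track_default
    if not 1 <= group_index <= len(group_entries):
        raise ValueError("group index does not match any 'seig' entry")
    # A 'seig' entry carries a full parameter set and overrides the defaults.
    return group_entries[group_index - 1]
```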
When specifying the parameters by sample group, the Sample To Group Box (‘sbgp’) in the sample
table or track fragment specifies which samples use which sample group description from the Sample
Group Description Box (‘sgpd’). The format of the sample group description is uniform across all track
types (as indicated by the handler type for the track). For fragmented files, it may be necessary to store
both the Sample To Group Box and Sample Group Description Box in each track fragment to make them
accessible for decryption of the samples they describe, e.g. when movie fragments are separately stored
and delivered by streaming.
Tracks of all types SHALL use the CencSampleEncryptionInformationGroupEntry sample
group description structure, which has the following syntax.
aligned(8) class CencSampleEncryptionInformationGroupEntry
   extends SampleGroupEntry( ‘seig’ )
{
   unsigned int(8) reserved = 0;
   unsigned int(4) crypt_byte_block = 0;
   unsigned int(4) skip_byte_block = 0;
   unsigned int(8) isProtected;
   unsigned int(8) Per_Sample_IV_Size;
   unsigned int(8)[16] KID;
   if (isProtected ==1 && Per_Sample_IV_Size == 0) {
      unsigned int(8) constant_IV_size;
      unsigned int(8)[constant_IV_size] constant_IV;
   }
}
These structures use a common semantic for their fields as follows:
— isProtected is the flag which indicates the encryption state of the samples in the sample group.
See the isProtected field in 9.1 for further details.
— Per_Sample_IV_Size is the Initialization Vector size in bytes for samples in the sample group.
See the Per_Sample_IV_Size field in 9.1 for further details.
— KID is the key identifier used for samples in the sample group. See the KID field in 9.1 for
further details.
— constant_IV_size is the size of a possible Initialization Vector used for all samples associated
with this group (when per-sample Initialization Vectors are not used).
— constant_IV, if present, is the Initialization Vector used for all samples associated with this
group. See the constant_IV field in 9.1 for further details.
— crypt_byte_block specifies the count of the encrypted Blocks in the protection pattern, where
each Block is of size 16-bytes. See 9.1 for further details.
— skip_byte_block specifies the count of the unencrypted Blocks in the protection pattern. See
9.1 for further details.
In order to facilitate the addition of future optional fields, clients SHALL ignore additional bytes after
the fields defined in the CencSampleEncryption group entry structures.
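A reader of the ‘seig’ entry might look like the following sketch, which also shows the requirement to ignore bytes that follow the defined fields. The function name and dictionary keys are hypothetical, and the input is assumed to be exactly one group description entry as delimited by the Sample Group Description Box.

```python
def parse_seig_entry(entry: bytes) -> dict:
    """Parse one CencSampleEncryptionInformationGroupEntry."""
    if len(entry) < 20:
        raise ValueError("'seig' entry too short")
    # entry[0] is reserved; entry[1] packs the two 4-bit pattern counts.
    crypt, skip = entry[1] >> 4, entry[1] & 0x0F
    is_protected = entry[2]
    iv_size = entry[3]
    kid = entry[4:20]
    constant_iv = None
    if is_protected == 1 and iv_size == 0:
        if len(entry) < 21:
            raise ValueError("missing constant_IV_size")
        civ_size = entry[20]
        constant_iv = entry[21:21 + civ_size]
        if len(constant_iv) != civ_size:
            raise ValueError("truncated constant_IV")
    # Any bytes after the fields above are deliberately not examined,
    # so that future optional fields do not break this reader.
    return {"crypt_byte_block": crypt, "skip_byte_block": skip,
            "is_protected": is_protected, "per_sample_iv_size": iv_size,
            "kid": kid, "constant_iv": constant_iv}
```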
7 Common encryption sample auxiliary information
7.1 Definition
Each protected sample in a protected track SHALL have an Initialization Vector associated with it. Both
Initialization Vectors and Subsample encryption information MAY be provided as Sample Auxiliary
Information with aux_info_type equal to the scheme and aux_info_type_parameter equal to 0.
For example, for tracks protected using the ‘cenc’ scheme, the default value for aux_info_type
is ‘cenc’ and the default value for the aux_info_type_parameter is 0, so content SHOULD be
created omitting these optional fields. Storage of sample auxiliary information SHALL conform to
ISO/IEC 14496-12.
The format of the sample auxiliary information for samples with this type SHALL be as follows:
aligned(8) class CencSampleAuxiliaryDataFormat
{
   unsigned int(Per_Sample_IV_Size*8) InitializationVector;
   if (sample_info_size > Per_Sample_IV_Size )
   {
      unsigned int(16) subsample_count;
      {
         unsigned int(16) BytesOfClearData;
         unsigned int(32) BytesOfProtectedData;
      } [subsample_count ]
   }
}
where
— sample_info_size is the size of the sample auxiliary information for this sample from the
Sample Auxiliary Information Sizes Box (‘saiz’);
— InitializationVector is the Initialization Vector for the sample, unless a constant_IV is
present in the Track Encryption Box (‘tenc’) (see the InitializationVector field in 9.1 for
further details);
— subsample_count is the count of Subsamples for this sample (see the subsample_count field
in 9.1 for further details);
— BytesOfClearData is the number of bytes of clear data in this Subsample (see the
BytesOfClearData field in 9.1 for further details);
— BytesOfProtectedData is the number of bytes of protected data in this Subsample (see the
BytesOfProtectedData field in 9.1 for further details).
If Subsample encryption is not used (the size of the sample auxiliary information equals
Per_Sample_IV_Size), then the entire sample is protected (see 9.4 for further details). In this case, all auxiliary
information will have the same size and hence the default_sample_info_size of the Sample
Auxiliary Information Sizes box (‘saiz’) will be equal to the Per_Sample_IV_Size of the Initialization
Vectors. If Per_Sample_IV_Size is also zero (because constant IVs are in use) then the sample
auxiliary information would then be empty and should be omitted.
NOTE Even if Subsample encryption is used, the size of the sample auxiliary information may be the same
for all of the samples (if all of the samples have the same number of Subsamples) and the
default_sample_info_size may be used.
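The auxiliary data format in this subclause can be read as in the following sketch. It is illustrative only: the function name is hypothetical, the buffer is assumed to hold exactly the auxiliary information of one sample (so its length plays the role of sample_info_size), and Per_Sample_IV_Size is assumed to have been resolved already from the Track Encryption Box or the applicable sample group.

```python
import struct


def parse_sample_aux_info(buf: bytes, per_sample_iv_size: int) -> dict:
    """Parse one sample's CencSampleAuxiliaryDataFormat."""
    sample_info_size = len(buf)
    if sample_info_size < per_sample_iv_size:
        raise ValueError("auxiliary information shorter than the IV")
    iv = buf[:per_sample_iv_size]
    subsamples = []
    # Subsample data is present only when bytes remain after the IV.
    if sample_info_size > per_sample_iv_size:
        (count,) = struct.unpack_from(">H", buf, per_sample_iv_size)
        pos = per_sample_iv_size + 2
        for _ in range(count):
            # 16-bit BytesOfClearData, then 32-bit BytesOfProtectedData
            clear, protected = struct.unpack_from(">HI", buf, pos)
            subsamples.append((clear, protected))
            pos += 6
    return {"iv": iv, "subsamples": subsamples}
```

An empty subsamples list in the result corresponds to full sample protection. With Subsample encryption, each sample's auxiliary information occupies Per_Sample_IV_Size + 2 + 6 × subsample_count bytes.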
7.2 Sample Encryption Information box for storage of sample auxiliary information
7.2.1 Sample Encryption Box (‘senc’)
Box Type: ‘senc’
Container: Track Fragment Box (‘traf’) or Track Box (‘trak’)
Mandatory: No
Quantity: Zero or one
An optional storage location for Sample Auxiliary Information is the Sample Encryption Box (‘senc’),
specified here.
The Sample Encryption Box contains sample auxiliary information and may contain a per sample
Initialization Vector for each sample, and clear and protected byte ranges of partially protected video
samples (“Subsample encryption”). It MAY be used when samples in a track or track fragment are
protected. Storage of ‘senc’ in a Track Fragment Box makes the necessary Sample Auxiliary Information
accessible within the movie fragment for all contained samples in order to make each track fragment
independently decryptable; for instance, when movie fragments are delivered as DASH Media Segments.
7.2.2 Syntax
aligned(8) class SampleEncryptionBox
   extends FullBox(‘senc’, version=0, flags)
{
   unsigned int(32) sample_count;
   {
      unsigned int(Per_Sample_IV_Size*8) InitializationVector;
      if (flags & 0x000002)
      {
         unsigned int(16) subsample_count;
         {
            unsigned int(16) BytesOfClearData;
            unsigned int(32) BytesOfProtectedData;
         } [ subsample_count ]
      }
   }[ sample_count ]
}
7.2.3 Semantics
— flags is inherited from the FullBox structure. The SampleEncryptionBox currently supports
the following bit values:
— 0x2 – UseSubSampleEncryption
— If the UseSubSampleEncryption flag is set, then the track fragment that contains this
Sample Encryption Box SHALL use Subsample encryption as described in 9.5. When this flag
is set, Subsample mapping data follows each InitializationVector. The Subsample
mapping data consists of the number of Subsamples for each sample, followed by an array of
values describing the number of bytes of clear data and the number of bytes of encrypted data
for each Subsample.
— sample_count is the number of protected samples in the containing track or track fragment. This
value SHALL be either zero (0) or the total number of samples in the track or track fragment.
— InitializationVector SHALL conform to the definition specified in 9.2. Only one
Per_Sample_IV_Size SHALL be used within a file, or Per_Sample_IV_Size SHALL be zero when
a sample is unencrypted or a Constant IV is in use. Selection of InitializationVector values
SHOULD follow the recommendations of 9.2.
— subsample_count SHALL conform to the definition specified in 9.1.
— BytesOfClearData SHALL conform to the definition specified in 9.1.
— BytesOfProtectedData SHALL conform to the definition specified in 9.1.
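The syntax in 7.2.2 can be read with a short parser. The following is a minimal sketch in Python, not part of this specification: the names `parse_senc` and `SencSample` are hypothetical, the payload is assumed to start at the FullBox version byte (after the box size and type), and `per_sample_iv_size` is assumed to have been obtained already from the Track Encryption Box or a sample group description.

```python
import struct
from dataclasses import dataclass, field
from typing import List, Tuple

USE_SUBSAMPLE_ENCRYPTION = 0x000002  # flag bit defined in 7.2.3


@dataclass
class SencSample:
    """Hypothetical container for one sample's auxiliary information."""
    iv: bytes
    # Each entry is (BytesOfClearData, BytesOfProtectedData).
    subsamples: List[Tuple[int, int]] = field(default_factory=list)


def parse_senc(payload: bytes, per_sample_iv_size: int) -> List[SencSample]:
    """Parse a 'senc' payload that starts at the FullBox version byte."""
    # Byte 0 is the version; bytes 1..3 hold the 24-bit flags field.
    flags = int.from_bytes(payload[1:4], "big")
    (sample_count,) = struct.unpack_from(">I", payload, 4)
    pos = 8
    samples = []
    for _ in range(sample_count):
        iv = payload[pos:pos + per_sample_iv_size]
        pos += per_sample_iv_size
        subsamples = []
        if flags & USE_SUBSAMPLE_ENCRYPTION:
            (subsample_count,) = struct.unpack_from(">H", payload, pos)
            pos += 2
            for _ in range(subsample_count):
                # 16-bit clear length followed by 32-bit protected length.
                clear, protected = struct.unpack_from(">HI", payload, pos)
                pos += 6
                subsamples.append((clear, protected))
        samples.append(SencSample(iv, subsamples))
    return samples
```

When `per_sample_iv_size` is zero (Constant IV in use), each `iv` is simply an empty byte string and only the Subsample mapping data is read.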
8 Box definitions
8.1 Protection system specific header box
8.1.1 Definition
Box Type: ‘pssh’
Container: Movie (‘moov’) or Movie Fragment (‘moof’)
Mandatory: No
Quantity: Zero or more
This box contains information needed by a Content Protection System to play back the content. The
data format is specified by the system identified by the ‘pssh’ parameter SystemID and is considered
opaque for the purposes of this part of ISO/IEC 23001. The collection of Protection System Specific
Header boxes from the initial movie box, together with those in a movie fragment, SHALL provide all
the required Content Protection System information to decode that fragment.
The data encapsulated in the Data field MAY be read by the identified Content Protection System client
to enable decryption key acquisition and decryption of media data. For license/rights-based systems,
the header information MAY include data such as the URL of license server(s) or rights issuer(s) used,
embedded licenses/rights, embedded key(s), and/or other protection system specific metadata.
A single file MAY be constructed to be playable by multiple key and digital rights management (DRM)
systems, by including Protection System Specific Header boxes for each system supported. In order to find
all of the Protection System Specific data that is relevant to a sample in the presentation, readers SHALL
— examine all Protection System Specific Header boxes in the Movie Box and in the Movie Fragment
Box associated with the sample (but not those in other Movie Fragment Boxes),
— match the SystemID field in this box to the SystemID(s) of the DRM System(s) they support, and
— match the KID associated with the sample (either from the default_KID field of the Track
Encryption Box or the KID field of the appropriate sample group description entry) with one of
the KID values in the Protection System Specific Header Box. Boxes without a list of applicable KID
values, or with an empty list, SHALL be considered to apply to all KIDs in the file or movie fragment.
Protection System Specific Header data SHALL be associated with a sample based on a matching KID
value in the ‘pssh’ and sample group description or default ‘tenc’ describing the sample. If a sample
or set of samples is moved due to file defragmentation or refragmentation or removed by editing, then
the associated Protection System Specific Header boxes for the remaining samples SHALL be stored
following the above requirements.
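The lookup rules above can be sketched as a simple filter. This is an illustrative Python sketch under stated assumptions: the `Pssh` record and the `select_pssh` function are hypothetical names, and the boxes are assumed to have been parsed already from the Movie Box and from the one Movie Fragment Box associated with the sample.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Set


@dataclass
class Pssh:
    """Hypothetical parsed form of a Protection System Specific Header box."""
    system_id: bytes        # 16-byte SystemID
    kids: Sequence[bytes]   # empty for version 0 boxes or KID_count == 0
    data: bytes             # opaque system-specific Data


def select_pssh(moov_boxes: Iterable[Pssh],
                moof_boxes: Iterable[Pssh],
                supported_system_ids: Set[bytes],
                kid: bytes) -> List[Pssh]:
    """Return the boxes relevant to a sample protected with `kid`.

    Only boxes from the Movie Box and from the Movie Fragment Box that
    contains the sample are passed in; boxes from other movie fragments
    are deliberately excluded by the caller.
    """
    selected = []
    for box in list(moov_boxes) + list(moof_boxes):
        if box.system_id not in supported_system_ids:
            continue  # not a DRM system this reader supports
        # A box with no KID list, or an empty one, applies to all KIDs.
        if not box.kids or kid in box.kids:
            selected.append(box)
    return selected
```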
8.1.2 Syntax
aligned(8) class ProtectionSystemSpecificHeaderBox extends FullBox(‘pssh’, version, flags=0)
{
   unsigned int(8)[16] SystemID;
   if (version > 0)
   {
      unsigned int(32) KID_count;
      {
         unsigned int(8)[16] KID;
      } [KID_count];
   }
   unsigned int(32) DataSize;
   unsigned int(8)[DataSize] Data;
}
8.1.3 Semantics
— SystemID specifies a UUID that uniquely identifies the content protection system that this
header belongs to.
— KID_count specifies the number of KID entries in the following table. The value MAY be zero.
— KID identifies a key identifier that the Data field applies to. If not set, then the Data array SHALL
apply to all KIDs in the movie or movie fragment containing this box.
— DataSize specifies the size in bytes of the Data member.
— Data holds the content protection system specific data.
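As an illustration of the syntax in 8.1.2, the following Python sketch reads the fields of a ‘pssh’ box. It is a non-normative example: the function name `parse_pssh` is hypothetical, and the payload is assumed to begin at the FullBox version byte, after the box size and type have been consumed.

```python
import struct
from typing import List, Tuple


def parse_pssh(payload: bytes) -> Tuple[bytes, List[bytes], bytes]:
    """Return (SystemID, list of KIDs, Data) from a 'pssh' payload."""
    version = payload[0]
    pos = 4  # skip the version byte and the 24-bit flags field
    system_id = payload[pos:pos + 16]
    pos += 16
    kids = []
    if version > 0:
        # The KID table is only present in version 1 and later.
        (kid_count,) = struct.unpack_from(">I", payload, pos)
        pos += 4
        for _ in range(kid_count):
            kids.append(payload[pos:pos + 16])
            pos += 16
    (data_size,) = struct.unpack_from(">I", payload, pos)
    pos += 4
    data = payload[pos:pos + data_size]
    if len(data) != data_size:
        raise ValueError("truncated pssh Data field")
    return system_id, kids, data
```

The returned `data` is left untouched because its format is opaque to this part of ISO/IEC 23001 and is defined by the system identified by SystemID.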
8.2 Track Encryption box
8.2.1 Definition
Box Type: ‘tenc’
Container: Scheme Information Box (‘schi’)
Mandatory: No (Yes, for protected tracks)
Quantity: Zero or one
The Track Encryption Box contains default values for the isProtected flag, Per_Sample_IV_Size,
and KID for the entire track. In the case where pattern-based encryption is in effect, it supplies the
pattern and when Constant IVs are in use, it supplies the Constant IV. These values are used as the
encryption parameters for the samples in this track unless over-ridden by the sample group description
associated with a group of samples. For files with only one key per track, this box allows the basic
encryption parameters to be specified once per track instead of being repeated per sample.
If default_isProtected is 1 and default_Per_Sample_IV_Size is 0, then
default_constant_IV_size and default_constant_IV SHALL be present for all samples that
use these settings. A Constant
IV SHALL NOT be used with counter-mode encryption. A sample group description may supply keys or
keys and Constant IVs for sample groups that override these default values for those samples mapped
to the group.
NOTE The version field of the Track Encryption Box is set to a value greater than zero when the pattern
encryption defined in 9.6 is used and to zero otherwise.
8.2.2 Syntax
aligned(8) class TrackEncryptionBox extends FullBox(‘tenc’, version, flags=0)
{
   unsigned int(8) reserved = 0;
   if (version==0) {
      unsigned int(8) reserved = 0;
   }
   else { // version is 1 or greater
      unsigned int(4) default_crypt_byte_block;
      unsigned int(4) default_skip_byte_block;
   }
   unsigned int(8) default_isProtected;
   unsigned int(8) default_Per_Sample_IV_Size;
   unsigned int(8)[16] default_KID;
   if (default_isProtected ==1 && default_Per_Sample_IV_Size == 0) {
      unsigned int(8) default_constant_IV_size;
      unsigned int(8)[default_constant_IV_size] default_constant_IV;
   }
}
8.2.3 Semantics
— version SHALL be zero unless pattern-based encryption is in use, whereupon it SHALL be 1.
— default_isProtected is the protection flag which indicates the default protection state of the
samples in the track. See the isProtected field in 9.1 for further details.
— default_Per_Sample_IV_Size is the default Initialization Vector size in bytes. See the
Per_Sample_IV_Size field in 9.1 for further details.
— default_KID is the default key identifier used for samples in this track. See the KID field in 9.1 for
further details.
— default_constant_IV_size is the size of a possible default Initialization Vector for all samples.
— default_constant_IV, if present, is the default Initialization Vector for all samples. See the
constant_IV field in 9.1 for further details.
— default_crypt_byte_block specifies the count of the encrypted Blocks in the protection
pattern, where each Block is of size 16-bytes. See 9.1 for further details.
— default_skip_byte_block specifies the count of the unencrypted Blocks in the protection
pattern. See the skip_byte_block field in 9.1 for further details.
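The field layout in 8.2.2 can be illustrated with a small reader. The Python sketch below is an assumption-laden example rather than a reference implementation: `Tenc` and `parse_tenc` are hypothetical names, the payload is assumed to start at the FullBox version byte, and the two 4-bit pattern fields are read most significant bits first, as is usual for ISO base media file format bit fields.

```python
from dataclasses import dataclass


@dataclass
class Tenc:
    """Hypothetical parsed form of a Track Encryption Box."""
    version: int
    crypt_byte_block: int
    skip_byte_block: int
    is_protected: int
    per_sample_iv_size: int
    kid: bytes
    constant_iv: bytes  # empty when no Constant IV is present


def parse_tenc(payload: bytes) -> Tenc:
    """Parse a 'tenc' payload that starts at the FullBox version byte."""
    version = payload[0]
    pos = 4   # skip the version byte and the 24-bit flags field
    pos += 1  # first reserved byte
    pattern = payload[pos]
    pos += 1
    if version == 0:
        crypt, skip = 0, 0  # second byte is reserved in version 0
    else:
        crypt, skip = pattern >> 4, pattern & 0x0F
    is_protected = payload[pos]
    per_sample_iv_size = payload[pos + 1]
    pos += 2
    kid = payload[pos:pos + 16]
    pos += 16
    constant_iv = b""
    if is_protected == 1 and per_sample_iv_size == 0:
        # Constant IV fields are present only under this condition.
        size = payload[pos]
        pos += 1
        constant_iv = payload[pos:pos + size]
    return Tenc(version, crypt, skip, is_protected,
                per_sample_iv_size, kid, constant_iv)
```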
9 Encryption of media data
9.1 Field semantics
Within the sample groups and sample auxiliary information used by the common encryption scheme,
these fields have the following semantics:
— isProtected is the identifier of the protection state of the samples in the track or group of
samples. This flag takes the following values:
— 0x0: Not protected;
— 0x1: protected (as signalled by the scheme_type field of the scheme type box ‘schm’, e.g. for
scheme_type of ‘cenc’, the track default is AES-CTR encrypted using the ‘cenc’ scheme);
— 0x02 – 0xFF: Reserved.
— Per_Sample_IV_Size is the size in bytes of the InitializationVector field. The following
are supported values:
— 0 if the isProtected flag is 0x0 (Not Protected) or Constant IVs are in use;
— 8 specifies 64-bit Initialization Vectors;
— 16 specifies 128-bit Initialization Vectors.
— constant_IV_size is the size in bytes of the constant_IV field. The following are
supported values:
— 8 specifies 64-bit Initialization Vectors;
— 16 specifies 128-bit Initialization Vectors.
— KID is a key identifier that uniquely identifies the key needed to decrypt the associated samples
within the scope of an application so that KID is sufficient to identify a separately stored license
containing the key that was used to encrypt the content. This allows the identification of multiple
encryption keys per file or track. Unprotected samples in a protected track SHALL be identified by
having an isProtected flag of 0x0, a Per_Sample_IV_Size of 0x0, and a KID value of 0x0. It
is strongly recommended to use UUIDs [2] as KIDs in order to satisfy the uniqueness requirement
across all applications.
— InitializationVector specifies the Initialization Vector (IV) needed for decryption of a
sample. For an isProtected flag of 0x0, no Initialization Vectors are needed and the auxiliary
information SHOULD have a size of 0, i.e. not be present.
For an isProtected flag of 0x1:
— IVs SHALL be supplied using Per_Sample IVs or Constant IVs.
— If the Per_Sample_IV_Size field is 16, then InitializationVector specifies the entire
128-bit IV value.
— If the Per_Sample_IV_Size field is 8, then its value is copied to bytes 0 to 7 of the Initialization
Vector and bytes 8 to 15 of the Initialization Vector are set to zero.
— subsample_count specifies the number of Subsample encryption entries present for this sample.
If present, this field SHALL be greater than 0.
— BytesOfClearData specifies the number of bytes of clear data at the beginning of this Subsample
encryption entry.
NOTE This value may be zero if no clear bytes exist for this Subsample.
— BytesOfProtectedData specifies the number of bytes of protected data following the clear data.
NOTE This value may be zero if no protected bytes exist for this Subsample.
The Subsample encryption entries SHALL NOT include an entry with a zero value in both the
BytesOfClearData field and in the BytesOfProtectedData field. The total length of all
BytesOfClearData and BytesOfProtectedData in a sample SHALL equal the length of
the sample. Subsample encryption entries SHOULD be as compactly represented as possible. For
example, instead of two entries with {15 clear, 0 protected}, {17 clear, 500 protected}, use one entry
of {32 clear, 500 protected}. If pattern-based encryption is used, then the pattern applies to the
protected byte range, BytesOfProtectedData; otherwise, all protected bytes are encrypted.
— crypt_byte_block shall be zero unless pattern-based encryption is in effect. See 9.6 for
further details.
— skip_byte_block shall be zero unless pattern-based encryption is in effect. See 9.6 for
further details.
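Two of the rules in the field semantics above lend themselves to a short illustration: the expansion of an 8-byte InitializationVector to 128 bits, and the constraints on Subsample encryption entries. The following Python sketch is non-normative, and the function names `expand_iv`, `validate_subsamples` and `compact_subsamples` are hypothetical. The compaction step only merges an entry that has no protected bytes into the entry that follows it, which is the case given in the example above, and it assumes the merged clear length must still fit the 16-bit BytesOfClearData field.

```python
from typing import List, Sequence, Tuple

Entry = Tuple[int, int]  # (BytesOfClearData, BytesOfProtectedData)


def expand_iv(iv: bytes) -> bytes:
    """Form the 128-bit IV: an 8-byte value fills bytes 0 to 7, rest zero."""
    if len(iv) == 16:
        return iv
    if len(iv) == 8:
        return iv + bytes(8)
    raise ValueError("Per_Sample_IV_Size must be 8 or 16 for protected samples")


def validate_subsamples(entries: Sequence[Entry], sample_size: int) -> None:
    """Check the entry constraints stated for Subsample encryption."""
    for clear, protected in entries:
        if clear == 0 and protected == 0:
            raise ValueError("entry with zero clear and zero protected bytes")
    if sum(clear + protected for clear, protected in entries) != sample_size:
        raise ValueError("entries do not add up to the sample length")


def compact_subsamples(entries: Sequence[Entry]) -> List[Entry]:
    """Merge clear-only entries into the entry that follows them."""
    out: List[Entry] = []
    for clear, protected in entries:
        if out and out[-1][1] == 0 and out[-1][0] + clear <= 0xFFFF:
            # Previous entry had no protected bytes: extend its clear run.
            out[-1] = (out[-1][0] + clear, protected)
        else:
            out.append((clear, protected))
    return out
```

Entries that carry protected bytes are never merged with one another here, since doing so could change how pattern-based encryption is applied to each protected byte range.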
9.2 Initialization Vectors
The Initialization Vector (IV) values for each sample SHALL be either a Constant IV and located in the
sample entry or a sample group description or SHALL be signaled per sample and be located in the
Sample Auxiliary Information associated with each protected sample. See 9.1 for additional details on
how Initialization Vectors are formed and stored.
It is recommended that applications applying encryption generate a random number for the first
Initialization Vector in a sequence.
— For 8-byte Per_Sample_IV_Size, Initialization Vectors for subsequent samples SHOULD be
created by incrementing the 8-byte Initialization Vector and padding the least significant bits with
ze
...









