Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; 5G; Study on video enhancements in 3GPP multimedia services (3GPP TR 26.948 version 17.0.0 Release 17)

RTR/TSGS-0426948vh00

General Information

Status
Not Published
Technical Committee
Current Stage
12 - Completion
Completion Date
04-May-2022
Ref Project
Standard
ETSI TR 126 948 V17.0.0 (2022-05) - Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; 5G; Study on video enhancements in 3GPP multimedia services (3GPP TR 26.948 version 17.0.0 Release 17)
English language
63 pages

Standards Content (Sample)


TECHNICAL REPORT
Digital cellular telecommunications system (Phase 2+) (GSM);
Universal Mobile Telecommunications System (UMTS);
LTE;
5G;
Study on video enhancements in 3GPP multimedia services
(3GPP TR 26.948 version 17.0.0 Release 17)

3GPP TR 26.948 version 17.0.0 Release 17 1 ETSI TR 126 948 V17.0.0 (2022-05)

Reference
RTR/TSGS-0426948vh00
Keywords
5G,GSM,LTE,UMTS
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - APE 7112B
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° w061004871

Important notice
The present document can be downloaded from:
http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the prevailing version of an ETSI
deliverable is the one made publicly available in PDF format at www.etsi.org/deliver.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx
If you find a security vulnerability in the present document, please report it through our
Coordinated Vulnerability Disclosure Program:
https://www.etsi.org/standards/coordinated-vulnerability-disclosure
Notice of disclaimer & limitation of liability
The information provided in the present deliverable is directed solely to professionals who have the appropriate degree of
experience to understand and interpret its content in accordance with generally accepted engineering or
other professional standard and applicable regulations.
No recommendation as to products and services or vendors is made or should be implied.
No representation or warranty is made that this deliverable is technically accurate or sufficient or conforms to any law
and/or governmental rule and/or regulation and further, no representation or warranty is made of merchantability or fitness
for any particular purpose or against infringement of intellectual property rights.
In no event shall ETSI be held liable for loss of profits or any other incidental or consequential damages.

Any software contained in this deliverable is provided "AS IS" with no warranties, express or implied, including but not
limited to, the warranties of merchantability, fitness for a particular purpose and non-infringement of intellectual property
rights and ETSI shall not be held liable in any event for any damages whatsoever (including, without limitation, damages
for loss of profits, business interruption, loss of information, or any other pecuniary loss) arising out of or related to the use
of or inability to use the software.
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and
microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2022.
All rights reserved.
Intellectual Property Rights
Essential patents
IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The declarations
pertaining to these essential IPRs, if any, are publicly available for ETSI members and non-members, and can be
found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to
ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the
ETSI Web server (https://ipr.etsi.org/).
Pursuant to the ETSI Directives including the ETSI IPR Policy, no investigation regarding the essentiality of IPRs,
including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not
referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become,
essential to the present document.
Trademarks
The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners.
ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no
right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does
not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.
DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its
Members. 3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and of the 3GPP
Organizational Partners. oneM2M™ logo is a trademark of ETSI registered for the benefit of its Members and of the
oneM2M Partners. GSM® and the GSM logo are trademarks registered and owned by the GSM Association.
Legal Notice
This Technical Report (TR) has been produced by ETSI 3rd Generation Partnership Project (3GPP).
The present document may refer to technical specifications or reports using their 3GPP identities. These shall be
interpreted as being references to the corresponding ETSI deliverables.
The cross reference between 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.
Modal verbs terminology
In the present document "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be
interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.
Contents
Intellectual Property Rights
Legal Notice
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Overview of video codecs specified for existing 3GPP multimedia services
5 Overview of SHVC
5.0 General
5.1 Basic SHVC architecture
5.2 Systems and transport interfaces of SHVC
5.2.1 Introduction
5.2.2 Parameter Set and Slice Segment Header Extensions
5.2.3 Layer and Scalability Identification
5.2.4 Layer sets
5.2.5 Profile, Tier, and Level (PTL)
5.2.6 RPS and Reference Picture List Construction
5.2.7 Random Access, Layer Switching, and Bitstream Splicing
5.2.8 Hybrid Codec Scalability and Multiview/3D Support
5.2.9 Hypothetical Reference Decoder (HRD)
5.2.10 SEI Messages
5.3 A comparison of SHVC and SVC
5.4 SHVC decoder and encoder complexity analyses
5.4.1 Introduction
5.4.2 HEVC simulcast encoder and decoder
5.4.3 SHVC encoder and decoder
5.4.4 Upsampling filter
5.4.5 Inter-layer texture prediction
5.4.6 Inter-layer motion prediction
5.4.7 Conclusion
6 Use cases
6.1 Multi-stream Multiparty Video Conferencing (MMVC)
6.1.1 The heterogeneous-device MMVC use case
6.1.2 The heterogeneous-bandwidth MMVC use case
6.1.3 Solutions for the MMVC use cases
6.1.3.0 General
6.1.3.1 HEVC simulcast
6.1.3.2 SHVC
6.1.4 Comparison of the solutions
6.1.4.1 Uplink bandwidth
6.1.4.2 Downlink bandwidth
6.1.4.3 Decoding complexity
6.1.4.4 Encoding complexity
6.2 MBMS
6.2.1 The differentiated-service MBMS use case
6.2.2 Solutions for the MBMS use case
6.2.2.1 HEVC simulcast
6.2.2.2 SHVC
6.2.3 Comparison of the solutions
6.2.3.1 Transmission bandwidth
6.2.3.2 Decoding complexity
6.2.3.3 Encoding complexity
6.3 3GP-DASH
6.3.1 The 3GP-DASH use case
6.3.2 Solutions for the 3GP-DASH use case
6.3.2.1 HEVC simulcast
6.3.2.2 SHVC
6.3.3 Comparison of the solutions
6.3.3.0 General
6.3.3.1 Outgoing transmission bandwidth
6.3.3.1.0 General
6.3.3.1.1 From origin server to caches
6.3.3.1.2 From origin server to UEs
6.3.3.2 Incoming transmission bandwidth
6.3.3.3 Decoding complexity
6.3.3.4 Encoding complexity
7 Test cases, conditions, and results
7.1 Test cases and conditions
7.1.1 General conditions
7.1.2 MMVC
7.1.3 MBMS
7.1.4 3GP-DASH
7.2 Test results
7.2.1 MMVC
7.2.1.1 General
7.2.1.2 Results for aligned IRAP pictures case
7.2.1.3 Results for IRAP non-aligned test case
7.2.1.4 First set of additional results
7.2.1.5 Second set of additional results
7.2.1.6 Additional analysis for comparing SHVC and HEVC simulcast
7.2.1.6.1 Introduction
7.2.1.6.2 Uplink vs downlink transmission cost
7.2.1.6.3 Case by case cost/benefit analysis of SHVC vs HEVC simulcast
7.2.2 MBMS
7.2.2.0 General
7.2.2.1 Results for aligned IRAP pictures case
7.2.2.2 Results for non-aligned IRAP pictures case
7.2.3 3GP-DASH
7.2.3.0 General
7.2.3.1 Results for aligned IRAP pictures case
7.2.3.2 Results for non-aligned IRAP pictures case
8 Conclusions
8.1 Introduction
8.2 MMVC and IMS based telepresence
8.3 MBMS
8.4 3GP-DASH
Annex A: Change history
History

Foreword
This Technical Report has been produced by the 3rd Generation Partnership Project (3GPP).
The contents of the present document are subject to continuing work within the TSG and may change following formal
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an
identifying change of release date and an increase in version number as follows:
Version x.y.z
where:
x the first digit:
1 presented to TSG for information;
2 presented to TSG for approval;
3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates,
etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.
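As an illustration of the numbering scheme above, the following minimal Python sketch splits a version string into its three fields and reports the status implied by the first field. The function name and the wording of the returned status strings are assumptions made for this example; they are not defined by the present document.

```python
def describe_version(version: str) -> dict:
    """Split an 'x.y.z' version string and interpret the first field (x)."""
    x, y, z = (int(part) for part in version.split("."))
    if x == 1:
        status = "presented to TSG for information"
    elif x == 2:
        status = "presented to TSG for approval"
    elif x >= 3:
        status = "TSG approved document under change control"
    else:
        # Values below 1 are not described by the scheme above.
        status = "not covered by the scheme above"
    return {"x": x, "y": y, "z": z, "status": status}


info = describe_version("17.0.0")
print(info["status"])
```

Under this reading, version 17.0.0 of the present document falls in the "3 or greater" branch.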
1 Scope
The present document reports the study on video enhancements in 3GPP multimedia services. It first provides an
overview of the video codecs and their configurations specified for existing 3GPP multimedia services, namely
3GP-DASH (TS 26.247 [1]), PSS (TS 26.234 [2]), MBMS (TS 26.346 [3]), MTSI (TS 26.114 [4], including multi-stream
multiparty video conferencing), MMS (TS 26.140 [5]), and IMS Messaging and Presence (TS 26.141 [6]). Use
cases on video enhancements for existing 3GPP multimedia services are then discussed, including potential
codec solutions for each of the use cases. To enable drawing conclusions, simulation conditions and simulation results
for comparisons of different codecs and their configurations are provided. Performance is evaluated in typical 3GPP
service environments taking into account bandwidth, quality and complexity. Based on the performance results,
conclusions are made in terms of recommendations for support of enhanced video capabilities for 3GPP multimedia
services.
2 References
The following documents contain provisions which, through reference in this text, constitute provisions of the present
document.
- References are either specific (identified by date of publication, edition number, version number, etc.) or
non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including
a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same
Release as the present document.
[1] 3GPP TS 26.247: "Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive
Download and Dynamic Adaptive Streaming over HTTP (3GP-DASH)".
[2] 3GPP TS 26.234: "Transparent end-to-end Packet-switched Streaming Service (PSS); Protocols
and codecs".
[3] 3GPP TS 26.346: "Multimedia Broadcast/Multicast Service (MBMS); Protocols and codecs".
[4] 3GPP TS 26.114: "IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and
interaction".
[5] 3GPP TS 26.140: "Multimedia Messaging Service (MMS); Media formats and codecs".
[6] 3GPP TS 26.141: "IP Multimedia System (IMS) Messaging and Presence; Media formats and
codecs".
[7] 3GPP TR 21.905: "Vocabulary for 3GPP Specifications".
[8] ITU-T Recommendation H.263 (01/2005): "Video coding for low bit rate communication".
[9] ITU-T Recommendation H.264 (V9) (02/2014): "Advanced video coding for generic audiovisual
services".
[10] ITU-T Recommendation H.265 (V3) (04/2015): "High efficiency video coding".
[11] J. Boyce, Y. Yan, J. Chen, and A. K. Ramasubramonian, "Overview of SHVC: Scalable
Extensions of the High Efficiency Video Coding (HEVC) Standard," IEEE Trans. Circuits Syst.
Video Technol., August 2015, to be published.
[12] R. Sjöberg, Y. Chen, A. Fujibayashi, M. M. Hannuksela, J. Samuelsson, T. K. Tan, Y.-K. Wang,
and S. Wenger, "Overview of HEVC High-Level Syntax and Reference Picture Management,"
IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1858‒1870, Dec. 2012.
[13] 3GPP TR 26.906: "Evaluation of HEVC for 3GPP Services".
[14] G. Tech, Y. Chen, K. Müller, J.-R. Ohm, A. Vetro, and Y.-K. Wang, "Overview of the Multiview
and 3D Extensions of High Efficiency Video Coding," IEEE Trans. Circuits Syst. Video Technol.,
August 2015, to be published.
[15] ITU-R Recommendation BT.709-6 (06/2015): "Parameter values for the HDTV standards for
production and international programme exchange".
[16] ITU-R Recommendation BT.2020-1 (06/2014): "Parameter values for ultra-high definition
television systems for production and international programme exchange".
[17] 3GPP TR 26.923: "Study on Media Handling Aspects of IMS-based Telepresence".
[18] 3GPP TS 26.948: "Video enhancements for 3GPP Multimedia Services".
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the terms and definitions given in TR 21.905 [7] apply.
3.2 Abbreviations
For the purposes of the present document, the abbreviations given in TR 21.905 [7] and the following apply.
3D-HEVC Three-Dimension High Efficiency Video Coding
3GPP 3rd Generation Partnership Project
API Application Program Interface
AU Access Unit
AVC Advanced Video Coding
BD Bjontegaard Delta
BL Base Layer
BLA Broken Link Access
BM-SC Broadcast-Multicast - Service Centre
BP Bitstream Partition
CABAC Context Adaptive Binary Arithmetic Coding
CPB Coded Picture Buffer
CRA Clean Random Access
CTU Coding Tree Unit
DASH Dynamic Adaptive Streaming over HTTP
DPB Decoded Picture Buffer
EL Enhancement Layer
GOP Group of Pictures
HDTV High-Definition TeleVision
HEVC High Efficiency Video Coding
HLS High-Level Syntax
HRD Hypothetical Reference Decoder
IDR Instantaneous Decoding Refresh
IMS IP Multimedia Subsystem
INBLD Independent Non-Base Layer Decoding
ILP Inter-Layer Prediction
IRAP Intra Random Access Point
MANE Media Aware Network Element
MBMS Multimedia Broadcast/Multicast Service
MBMS-GW MBMS Gateway
MMVC Multi-stream Multiparty Video Conferencing
MMS Multimedia Messaging Service
MRFP Multimedia Resource Function Processor
MSB Most Significant Bits
MTSI Multimedia Telephony Service for IMS
MTU Maximum Transmission Unit
MV Motion Vector
MV-HEVC MultiView High Efficiency Video Coding
NAL Network Abstraction Layer
OLS Output Layer Set
POC Picture Order Count
PPS Picture Parameter Set
PSNR Peak Signal-to-Noise Ratio
PSS Packet-switched Streaming Service
PTL Profile, Tier, and Level
QP Quantization Parameter
RAP Random Access Point
RPL Reference Picture List
RPS Reference Picture Set
SAO Sample Adaptive Offset
SEI Supplemental Enhancement Information
SPS Sequence Parameter Set
SVC Scalable Video Coding
SHVC Scalable High efficiency Video Coding
TMVP Temporal Motion Vector Prediction
UE User Equipment
UHDTV Ultra High-Definition TeleVision
VPS Video Parameter Set
VUI Video Usability Information
WPP Wavefront Parallel Processing
4 Overview of video codecs specified for existing
3GPP multimedia services
The video support in 3GPP multimedia services in Release-12 is provided in Table 1.
Table 1: Video support in 3GPP multimedia services in Release-12
|  | H.263 [8] | H.264/AVC [9] | HEVC/H.265 [10] |
|---|---|---|---|
| DASH and PSS | Profile 0 Level 45 | Constrained Baseline Profile, Level 1.3; Progressive High Profile Level 3.1; Frame-packed stereoscopic 3D video (H.264 Constrained Baseline Profile Level 1.3 or Progressive High Profile Level 3.1); Multiview stereoscopic 3D video (H.264 Stereo High Profile Level 3.1), but not for RTP based transmission | Main Profile, Main Tier, Level 3.1 |
| MBMS | - | Constrained Baseline Profile, Level 1.3; Progressive High Profile Level 3.1; Frame-packed stereoscopic 3D video (H.264 Constrained Baseline Profile Level 1.3 or Progressive High Profile Level 3.1) | Main Profile, Main Tier, Level 3.1 |
| MTSI and IMS Messaging and Presence | - | H.264 Constrained Baseline Profile, Level 1.2; Constrained Baseline Profile, Level 3.1 | Main Profile, Main Tier, Level 3.1 |
| MMS | Profile 0 Level 45 | Constrained Baseline Profile, Level 1.3; Progressive High Profile Level 3.1; Frame-packed stereoscopic 3D video (H.264 Constrained Baseline Profile Level 1.3 or Progressive High Profile Level 3.1) | Main Profile, Main Tier, Level 3.1 |
5 Overview of SHVC
5.0 General
Scalable High efficiency Video Coding (SHVC) refers to the scalable extension of H.265/HEVC, specified in Annex H
of the H.265/HEVC specification [10]. This clause provides an overview of SHVC, including the basic SHVC
architecture, SHVC systems and transport interfaces, a comparison of SHVC and Scalable Video Coding (SVC), the
scalable extension of H.264/AVC [9], and SHVC decoder and encoder complexity analyses.
5.1 Basic SHVC architecture
Inter-layer prediction is employed in a scalable system to improve the coding efficiency of the enhancement layers. In
addition to the spatial and temporal motion-compensated predictions that are available in a single-layer codec,
inter-layer prediction (ILP) in SHVC uses the reconstructed video signal from a reference layer to predict the current
enhancement layer. Inter-layer prediction in SHVC is built upon the so-called "reference index" framework. With this
framework, the collocated reconstructed picture from the reference layer is treated as a long-term reference picture, and
is assigned a reference index (or reference indices) in the reference picture list(s) along with other temporal reference
pictures in the current layer. Then, ILP is achieved at the block-level (Prediction Unit-level) by setting the value of the
ref_idx syntax element to correspond to the inter-layer reference picture(s) in the reference picture list(s).
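The "reference index" idea described above can be sketched in a few lines of Python. This is a simplified illustration only: the class and function names are hypothetical, and placing the inter-layer reference picture last in the list is an assumption made for the example (the actual list construction is specified in H.265 [10]).

```python
from dataclasses import dataclass


@dataclass
class RefPicture:
    poc: int                   # picture order count
    layer_id: int              # layer the picture belongs to
    long_term: bool            # marked as a long-term reference picture
    inter_layer: bool = False  # True for the processed reference-layer picture


def build_ref_list(temporal_refs, inter_layer_ref):
    """Return a reference picture list holding temporal and inter-layer references.

    Assumption: the inter-layer reference is appended after the temporal ones.
    """
    return list(temporal_refs) + [inter_layer_ref]


def uses_inter_layer_prediction(ref_list, ref_idx):
    """A prediction unit uses ILP when its ref_idx points at the inter-layer picture."""
    return ref_list[ref_idx].inter_layer


temporal = [RefPicture(poc=8, layer_id=1, long_term=False),
            RefPicture(poc=4, layer_id=1, long_term=False)]
# The collocated reference-layer picture is treated as a long-term reference.
ilr = RefPicture(poc=12, layer_id=0, long_term=True, inter_layer=True)
ref_list = build_ref_list(temporal, ilr)
print(uses_inter_layer_prediction(ref_list, ref_idx=2))
```

The point of the sketch is that no new block-level tool is needed: selecting inter-layer prediction is just a choice of `ref_idx`.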
Figure 1 shows the SHVC codec architecture from the decoder's perspective. SHVC supports more layers, but for ease
of explanation, Figure 1 only describes a two-layer scalable system consisting of the base layer (BL) and one
enhancement layer (EL).
As will be discussed later in clause 8.3, one fundamental benefit of the "reference index" based SHVC architecture is
that it allows the EL codec to maintain the same block-level logic as a single-layer HEVC decoder. The EL codec
differs from a single-layer HEVC decoder only in the high-level syntax, i.e. at or above the slice header level.
Hence, the EL decoder is labeled as HEVC* in Figure 1. To achieve efficient inter-layer prediction,
inter-layer processing is applied to the reconstructed BL pictures retrieved from the BL Decoded Picture Buffer (BL DPB);
afterwards, the processed pictures are put into the EL Decoded Picture Buffer (EL DPB) and used as inter-layer
reference pictures for predictive coding of the EL pictures. SHVC applies different forms of inter-layer processing
depending on the types of scalability between the two layers. For example, for spatial scalability, resampling of texture
and/or motion information from the reference layer is applied. By adjusting the sample bit depth during resampling,
SHVC also supports bit depth scalability. For color gamut scalability, a color mapping process is applied. Further
detailed discussion of the inter-layer processing modules supported by SHVC can be found in [11].
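A toy Python sketch of inter-layer processing for combined spatial and bit depth scalability is given below. It is not the SHVC resampling process: nearest-neighbour sample repetition stands in for the normative upsampling filters, the bit depth adjustment is a plain left shift, and all function names are assumptions made for this example.

```python
def upsample_nearest(samples, factor):
    """Repeat each sample 'factor' times horizontally and vertically.

    Placeholder for the SHVC upsampling filter (assumption for illustration).
    """
    out = []
    for row in samples:
        wide = [s for s in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def adjust_bit_depth(samples, bl_depth, el_depth):
    """Scale samples from the BL bit depth to the EL bit depth by a left shift."""
    shift = el_depth - bl_depth
    if shift < 0:
        raise ValueError("EL bit depth is assumed to be >= BL bit depth")
    return [[s << shift for s in row] for row in samples]


def inter_layer_process(bl_picture, factor, bl_depth, el_depth):
    """Turn a reconstructed BL picture into an inter-layer reference picture."""
    return adjust_bit_depth(upsample_nearest(bl_picture, factor), bl_depth, el_depth)


# 1x2 picture, 8-bit base layer, 2x spatial ratio, 10-bit enhancement layer.
ilr_picture = inter_layer_process([[1, 2]], factor=2, bl_depth=8, el_depth=10)
print(ilr_picture)
```

In a decoder following Figure 1, the output of such a step would be placed in the EL DPB and used as an inter-layer reference picture.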
As shown in Figure 1, the base layer bitstream can be sent either as part of the SHVC bitstream "in-band", or obtained
via "external means" in an "out-of-band" manner. In the former case when the base layer is embedded within the SHVC
bitstream, the input bitstream is de-multiplexed into two separate layers. The base layer (BL) bitstream is sent to the
base layer decoder and the enhancement layer (EL) bitstream is sent to the EL decoder. The BL decoder is an HEVC
decoder; in the Scalable Main and Scalable Main 10 profiles currently defined in SHVC, the BL decoder conforms to
either the HEVC Main or Main 10 profile. Additionally, SHVC also allows the base layer bitstream to be provided via
external means, for example, through other system-level multiplexing methods. This latter function can be used to
support the use case when the base layer bitstream is coded using a non-HEVC single-layer codec, for example, using
H.264/AVC, MPEG-2, or even non-standardized codecs. Accordingly, this is also referred to as hybrid codec
scalability. For hybrid codec scalability, the BL decoding operations are outside of the scope of the SHVC decoder.
After decoding, the reconstructed BL pictures are provided to the SHVC decoder, along with some information
associated with the BL pictures. The remaining SHVC decoding operations are the same as the former case with the
embedded BL bitstream. The SHVC decoder applies inter-layer processing to the reconstructed BL pictures to obtain
the inter-layer reference pictures for predictive coding of the EL video pictures. It is worth noting that, although BL
bitstreams provided via external means are generally expected to be non-HEVC coded, an HEVC-coded BL bitstream
can be provided via external means as well.
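The in-band case can be illustrated with a minimal de-multiplexing sketch in Python. It assumes the NAL units have already been parsed into (layer ID, payload) pairs and that the base layer uses layer ID 0, as in H.265 [10]; the function name and the input representation are hypothetical.

```python
def demultiplex(nal_units):
    """Split (nuh_layer_id, payload) pairs into base-layer and enhancement-layer lists."""
    base_layer, enhancement_layer = [], []
    for layer_id, payload in nal_units:
        if layer_id == 0:
            base_layer.append(payload)         # goes to the BL (HEVC) decoder
        else:
            enhancement_layer.append(payload)  # goes to the EL decoder
    return base_layer, enhancement_layer


stream = [(0, b"bl-0"), (1, b"el-0"), (0, b"bl-1"), (1, b"el-1")]
bl, el = demultiplex(stream)
print(len(bl), len(el))
```

With a base layer provided by external means, the base-layer list would simply stay empty and the reconstructed BL pictures would be handed to the SHVC decoder separately.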

Figure 1: SHVC decoder architecture
5.2 Systems and transport interfaces of SHVC
5.2.1 Introduction
The systems and transport interfaces of a video codec, also referred to as high-level syntax (HLS), are an integral part of
a video codec. An important part is the network abstraction layer (NAL), providing a (generic) interface of a video
codec to (various) networks/systems. HLS topics include (but are not limited to) bitstream structure and coded data
units structures; parameter sets signalling; support of random access and stream adaptation; error resilience; coded and
decoded picture buffer management and buffering model (a.k.a. hypothetical reference decoder or HRD); scalability;
byte stream format; profile and level signalling; signalling of supplemental enhancement information (SEI) and video
usability information (VUI); extensibility and backward compatibility.
HEVC (single-layer coding) HLS was designed with significant consideration of extensibility mechanisms. These are
also referred to as hooks, which basically allow future extensions that would be backward compatible to earlier versions
of the standard. Important HLS hooks in HEVC include: a) Inclusion of layer identifier (ID) in the NAL unit header,
whereby the same NAL unit header syntax applies to both HEVC single-layer coding and its multi-layer extensions; b)
Introduction of the video parameter set (VPS), which was introduced mainly for use with multi-layer extensions, as
VPS contains cross-layer information; c) Introduction of the layer set concept and the associated signalling of multi-
layer HRD parameters; d) Addition of extensibility for all types of parameter sets and slice header, which allows the
same syntax structures to be used for both the base layer and enhancement layers without defining new NAL unit types
and to be further extended in the future when needed.
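Hook a) can be made concrete with a short Python sketch that extracts the layer identifier from the two-byte NAL unit header. The field widths used here (1-bit forbidden_zero_bit, 6-bit nal_unit_type, 6-bit nuh_layer_id, 3-bit nuh_temporal_id_plus1) follow the NAL unit header syntax of H.265 [10]; the function name and the returned dictionary are assumptions made for this example.

```python
def parse_nal_unit_header(header: bytes) -> dict:
    """Decode the 16-bit HEVC NAL unit header into its four fields."""
    if len(header) < 2:
        raise ValueError("the NAL unit header is two bytes long")
    value = (header[0] << 8) | header[1]
    return {
        "forbidden_zero_bit": value >> 15,
        "nal_unit_type": (value >> 9) & 0x3F,
        "nuh_layer_id": (value >> 3) & 0x3F,
        "nuh_temporal_id_plus1": value & 0x07,
    }


# Same parsing code for a base-layer and an enhancement-layer NAL unit.
print(parse_nal_unit_header(bytes([0x40, 0x01])))
print(parse_nal_unit_header(bytes([0x02, 0x09])))
```

Because the header syntax is identical for single-layer and multi-layer bitstreams, a network element can route or drop NAL units per layer without parsing anything beyond these two bytes.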
A common HLS framework has been jointly developed for SHVC and MV-HEVC (which is largely applicable to 3D-
HEVC as well). This clause focuses on the new HLS features developed for the three multi-layer HEVC extensions
compared to HEVC single layer coding HLS, for which an overview can be found in [12] and TR 26.906 [13]. More
details of SHVC systems and transport interfaces can be found in [11] and [14].
5.2.2 Parameter Set and Slice Segment Header Extensions
The VPS has been extended by adding the VPS extension structure to the end, which mainly includes information on: a)
Scalability type and division of NAL unit header layer ID to scalability IDs; b) Layer dependency, dependency type,
and independent layers; c) Layer sets and output layer sets; d) Sub-layers and inter-layer dependency of sub-layers; e)
Profile, tier, and level (PTL); f) Representation format (resolution, bit depth, color format, etc.); g) Decoded picture
buffer (DPB) size; h) cross-layer video usability information (VUI), which includes information on cross-layer picture
type alignment, cross-layer intra random access point (IRAP) picture alignment, bit rate and picture rate of layer sets,
video signal format (color primaries, transfer characteristics, etc.), usage of tiles and wavefronts and other enabled
parallel processing capabilities, and additional HRD parameters.
It should be noted that the VPS applies to all layers, while in the AU decoding order dimension it applies from the first
AU where it is activated up to the AU where it is deactivated. Different layers (including the base layer and a non-base
layer) may either share the same sequence parameter set (SPS) or use different SPSs. Pictures of different layers or AUs
can also share the same picture parameter set (PPS) or use different PPSs. To enable this sharing of SPSs and PPSs
across layers, all SPSs share the same value space of their SPS IDs, regardless of the layer ID values in their NAL unit
headers; the same is true for PPSs.
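The shared ID value space can be pictured as a parameter set table keyed by the parameter set ID alone. The Python sketch below is a simplified illustration with hypothetical class and method names; it ignores the activation rules of H.265 [10] and only shows that the layer ID of the carrying NAL unit does not open a separate ID space.

```python
class ParameterSetStore:
    """Holds SPSs in a single ID space shared by all layers."""

    def __init__(self):
        self._sps = {}

    def store_sps(self, sps_id, nuh_layer_id, payload):
        # The key is the SPS ID only; the layer ID is kept as information
        # but does not take part in the lookup.
        self._sps[sps_id] = {"carried_in_layer": nuh_layer_id, "payload": payload}

    def get_sps(self, sps_id):
        return self._sps[sps_id]


store = ParameterSetStore()
store.store_sps(sps_id=0, nuh_layer_id=0, payload="sps-for-both-layers")
# A picture in layer 1 that refers to SPS ID 0 finds the SPS sent with layer ID 0.
print(store.get_sps(0)["payload"])
```

The same structure would apply to PPSs.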
Among other smaller extensions, the slice segment header has been extended in a backward compatible manner by
adding the following information: a) The discardable flag that indicates whether the picture is used for at least one of
inter prediction and inter-layer prediction or neither (when neither applies the picture can be discarded without affecting
the decoding of any other pictures, in the same layer or other layers); b) A flag that indicates whether an IDR picture is
a bitstream splicing point (if yes, then pictures from earlier AUs would be unavailable as references for pictures of any
layer starting from the current AU); c) Information on lower-layer pictures used by the current picture for inter-layer
prediction; and d) POC resetting and POC most significant bits (MSB) information. The latter two sets of information
are used as the basis for derivation of the inter-layer reference picture set (RPS) and for guaranteeing cross-layer POC
alignment, both of which are discussed later.
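One possible use of the discardable flag in item a) is stream thinning in a network element. The Python sketch below drops pictures that are marked discardable and belong to a layer that is not output; the dictionary keys, the function name, and the restriction to non-output layers are assumptions made for this example, not behaviour mandated by the present document.

```python
def thin_stream(pictures, output_layers):
    """Remove discardable pictures of layers that are not target output layers."""
    kept = []
    for pic in pictures:
        droppable = pic["discardable_flag"] and pic["layer_id"] not in output_layers
        if not droppable:
            kept.append(pic)
    return kept


pictures = [
    {"poc": 0, "layer_id": 0, "discardable_flag": False},
    {"poc": 1, "layer_id": 0, "discardable_flag": True},   # not referenced, not output
    {"poc": 1, "layer_id": 1, "discardable_flag": True},   # not referenced, but output
]
print([(p["poc"], p["layer_id"]) for p in thin_stream(pictures, output_layers={1})])
```

Since a discardable picture is used for neither inter prediction nor inter-layer prediction, removing it does not affect the decoding of the remaining pictures.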
5.2.3 Layer and Scalability Identification
Each layer is associated with a unique layer ID, for which the value will be increasing across pictures of different layers
in decoding order within an AU. In addition, a layer is associated with scalability IDs specifying its content, which are
derived from the VPS extension and denoted as view order index and auxiliary ID.
All layers of a view have the same view order index. The view order index is required to be increasing in decoding
order of views. Furthermore, a view ID value is signalled for each view order index, which can be chosen without
constraints, but should indicate the view's camera position (e.g. in a linear setup).
The auxiliary ID signals whether a layer is an auxiliary picture layer carrying depth, alpha or other user defined
auxiliary data. By design choice, auxiliary picture layers have no normative impact on the decoding of non-auxiliary
picture layers (denoted as primary picture layers).
5.2.4 Layer sets
The concept of layer sets was already introduced in HEVC version 1. A layer set is a set of layers that contains the base
layer and can be decoded independently of any layer outside the set. Layer sets are signalled in the base part of the VPS. During the development of the
common multi-layer HLS, two related concepts, namely output layer sets (OLSs) and additional layer sets, were further
introduced. An OLS is a layer set for which the target output layers are specified (non-target-output layers are for
example those layers that are used only for inter-layer prediction but not for output/display). For example, an OLS can
have two layers for output (e.g. stereoscopic viewing) but contain three layers. An HEVC single-layer decoder would
only process one target output layer, i.e. the base layer, regardless of how many layers the layer set contains. This is the
reason why the OLS concept was not needed in HEVC version 1.
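The relation between a layer set and its target output layers can be modelled with a small Python data structure. The class and attribute names are hypothetical and chosen only for this illustration; the example values mirror the three-layer, two-output case mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputLayerSet:
    layers: tuple         # all layer IDs contained in the layer set
    output_layers: tuple  # the subset of layers that are target output layers

    def __post_init__(self):
        if not set(self.output_layers) <= set(self.layers):
            raise ValueError("output layers must belong to the layer set")

    def reference_only_layers(self):
        """Layers that are decoded (e.g. for inter-layer prediction) but not output."""
        return tuple(l for l in self.layers if l not in self.output_layers)


# Three layers in the set, two of them output (e.g. stereoscopic viewing).
ols = OutputLayerSet(layers=(0, 1, 2), output_layers=(1, 2))
print(ols.reference_only_layers())
```

For a single-layer HEVC decoder the output layer is always the base layer, which is why no such structure was needed in HEVC version 1.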
An additional layer set is a set of layers that does not contain the base layer and can be decoded independently of any layer outside the set. For example, if a
bitstream contains two simulcast (i.e. independently coded) layers, then the non-base layer itself can be included in an
additional layer set. This concept can also be used for signalling the PTL for auxiliary picture layers, which are usually
coded independently from the primary picture layers. For example, a depth or alpha (i.e. transparency) auxiliary picture
layer can be included in an additional layer set and indicated to conform to the Monochrome (8 bit) profile, regardless
of which single-layer profile the base (primary picture) layer conforms to. Without such a design, many more profiles
would need to be defined to handle all the combinations of auxiliary picture layers with single-layer profiles. To realize
the benefits of this design, an independent non-base layer rewriting process was specified, which "transcodes"
independent non-base layers to a bitstream that conforms to a single-layer profile.
By design choice, an additional layer set is allowed to contain more than one layer, e.g. three layers with layer ID values
equal to 3, 4, and 5, where the layer with layer ID equal to 3 is an independent non-base layer. Along with this, a
bitstream extraction process for additional layer sets was specified. While the extr
...
