Higher Order Ambisonics (HOA) Transport Format

DTS/JTC-047

General Information

Status: Published
Publication Date: 11-Jun-2018
Current Stage: 12 - Completion
Due Date: 10-Jun-2018
Completion Date: 12-Jun-2018
Standard
ETSI TS 103 589 V1.1.1 (2018-06) - Higher Order Ambisonics (HOA) Transport Format
English language
33 pages

Standards Content (Sample)

ETSI TS 103 589 V1.1.1 (2018-06)






TECHNICAL SPECIFICATION
Higher Order Ambisonics (HOA) Transport Format




Reference: DTS/JTC-047
Keywords: audio, broadcasting, TV, UHDTV

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
The present document can be downloaded from:
http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the
print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying
and microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018.
© European Broadcasting Union 2018.
All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members.
3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and
of the 3GPP Organizational Partners.
oneM2M logo is protected for the benefit of its Members.
GSM® and the GSM logo are trademarks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Higher Order Ambisonics (HOA) Transport Format
4.1 Introduction
4.2 Generic HOA Transport Format
4.3 ISO/IEC 23008-3-based HOA Transport Format (HoaTransportType = 1)
4.3.1 Introduction
4.3.2 HOA Transport Format defined in ISO/IEC 23008-3
4.3.3 Implementation of HOA Transport Encoder (TE) and HOA Emission Encoder (EE)
4.4 ISO/IEC 23008-3-based HOA Transport Format modified for SN3D Normalization (HoaTransportType = 2)
4.5 V-vector based HOA Transport Format (HoaTransportType = 3)
5 HOA Transport Format Audio Stream
5.1 Introduction
5.2 Syntax of HOA Transport Format Audio Stream
5.3 Application Examples of HOA Transport Format Audio Stream
Annex A (informative): Example guidelines for implementing HOA transport over SDI utilizing communications modem technologies
Annex B (informative): Example guidelines for HOA production
History


Intellectual Property Rights
Essential patents
IPRs essential or potentially essential to normative deliverables may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (https://ipr.etsi.org/).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Trademarks
The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners.
ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no
right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does
not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.
Foreword
This Technical Specification (TS) has been produced by Joint Technical Committee (JTC) Broadcast of the European
Broadcasting Union (EBU), Comité Européen de Normalisation ELECtrotechnique (CENELEC) and the European
Telecommunications Standards Institute (ETSI).
NOTE: The EBU/ETSI JTC Broadcast was established in 1990 to co-ordinate the drafting of standards in the
specific field of broadcasting and related fields. In 1995 the JTC Broadcast became a tripartite body
when CENELEC, which is responsible for the standardization of radio and television receivers, was
also included in the Memorandum of Understanding. The EBU is a professional association of broadcasting
organizations whose work includes the co-ordination of its members' activities in the technical, legal,
programme-making and programme-exchange domains. The EBU has active members in about
60 countries in the European broadcasting area; its headquarters is in Geneva.
European Broadcasting Union
CH-1218 GRAND SACONNEX (Geneva)
Switzerland
Tel: +41 22 717 21 11
Fax: +41 22 717 24 81
Modal verbs terminology
In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and
"cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of
provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

1 Scope
Higher Order Ambisonics (HOA) signals are able to deliver a significantly enhanced immersive sound compared to
conventional stereo or 5.1 channel audio signals. However, there are some use cases where HOA signals cannot be
transported because of the large number of HOA input channels. The present document provides an HOA transport
format which allows unrestricted HOA order signals to be transported.
2 References
2.1 Normative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
Referenced documents which are not found to be publicly available in the expected location might be found at
https://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are necessary for the application of the present document.
[1] ISO/IEC 23008-3:2015/AMD 1:2016: "Information technology - High efficiency coding and
media delivery in heterogeneous environments - Part 3: 3D audio, 3D Audio Profile and Levels".
NOTE: Available at https://www.iso.org/standard/67953.html.
[2] ISO/IEC 23008-3:2015/DAM 5: "Information technology - High efficiency coding and media
delivery in heterogeneous environments - Part 3: 3D audio, Audio Metadata Enhancements".
NOTE: Available at https://www.iso.org/standard/74433.html.
[3] ISO/IEC 23008-3:2015: "Information technology - High efficiency coding and media delivery in
heterogeneous environments - Part 3: 3D audio".
NOTE: Available at https://www.iso.org/standard/63878.html.
[4] ISO/IEC 23008-3:2015/AMD 3:2017: "Information technology - High efficiency coding and
media delivery in heterogeneous environments - Part 3: 3D audio, MPEG-H 3D Audio Phase 2".
NOTE: Available at https://www.iso.org/standard/69561.html.
[5] ISO/IEC 13818-1:2015: "Information technology - Generic coding of moving pictures and
associated audio information - Part 1: Systems".
NOTE: Available at https://www.iso.org/standard/67331.html.
2.2 Informative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are not necessary for the application of the present document but they assist the
user with regard to a particular subject area.
[i.1] SMPTE Motion Imaging Journal: "Building The World's Most Complex TV Network: A Test Bed
for Broadcasting Immersive and Interactive Audio" R. L. Bleidt et al.: pp. 26-34, 2017.
NOTE: Available at http://ieeexplore.ieee.org/document/7963945/.
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the following terms and definitions apply:
MPEG-H Audio Stream (MHAS): self-contained stream format to transport ISO/IEC 23008-3 data
MPEG-H 3DA: MPEG-H 3D Audio standard defined in ISO/IEC 23008-3 [1] to [4].
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
ACN Ambisonic Channel Number
AGC Adaptive Gain Control
AU Access Unit
BG Background (audio channel)
CRC Cyclic Redundancy Check
DAW Digital Audio Workstation
FG Foreground (audio channel)
HDMI High-Definition Multimedia Interface
HD-SDI High-Definition Serial Digital Interface
HOA Higher Order Ambisonics
HTF HOA Transport Format
HTFAS HOA Transport Format Audio Stream
ISO International Organization for Standardization
MHAS MPEG-H Audio Stream
MMT MPEG media transport
MPEG Moving Pictures Experts Group
MPEG-H LC MPEG-H Audio Low Complexity profile
NOTE: As defined in ISO/IEC 23008-3 [1].
NOC Network Operation Centre
OTA Over The Air (media)
OTT Over The Top (media)
PCM Pulse Code Modulation
SDI Serial Digital Interface
SID Single Index Designation
SMPTE Society of Motion Picture & Television Engineers
VHTF Vector based HOA Transport Format
4 Higher Order Ambisonics (HOA) Transport Format
4.1 Introduction
Higher Order Ambisonics (HOA) signals are able to deliver a significantly enhanced immersive sound compared to
conventional stereo or 5.1 channel audio signals. However, there are some use cases where HOA signals cannot be
transported because of the large number of HOA input channels.
One use case is mobile devices, where the number of input channels is limited to N Pulse-Code Modulation (PCM)
channels. As shown in Figure 1 (a), if N is 8, at most First Order Ambisonics (FOA, which requires 4 PCM
channels) can be transported, because 2nd order HOA already requires 9 PCM channels.
Another use case is a typical broadcast workflow as shown in Figure 1 (b). Here, a contribution encoder can transmit
16 PCM channels from the remote truck to the Network Operation Centre (NOC) or local affiliate(s). However, a
single High-Definition Serial Digital Interface (HD-SDI) link can transport only 16 PCM
channels. This restricts the transport to a maximum of 3rd order HOA signals (requiring 16 PCM channels), and only
when no additional discrete audio elements are to be transported. If additional audio elements
are to be transported, at most 2nd order HOA (requiring 9 PCM channels) can be transported.
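The channel limits quoted above follow from the relation NumOfHoaCoeffs = (HoaOrder + 1)^2 used later in Table 1. The following is a minimal sketch of that arithmetic; the function names and the `reserved` parameter are illustrative assumptions and are not defined by the present document.

```python
import math


def hoa_channel_count(order: int) -> int:
    """Number of coefficient channels of a full HOA signal of the given order."""
    return (order + 1) ** 2


def max_hoa_order(pcm_channels: int, reserved: int = 0) -> int:
    """Highest HOA order that fits into the PCM channels left over after
    `reserved` channels are set aside for discrete audio elements.
    Returns -1 when not even a 0th order signal fits."""
    available = pcm_channels - reserved
    if available < 1:
        return -1
    return math.isqrt(available) - 1


print(max_hoa_order(8))               # mobile case, N = 8 -> 1 (FOA, 4 channels)
print(max_hoa_order(16))              # single HD-SDI link -> 3 (16 channels)
print(max_hoa_order(16, reserved=1))  # one discrete element -> 2 (9 channels)
```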
The present document aims to specify an HOA transport format which allows unrestricted HOA order signals to be
transported. This not only includes the above two cases, but also any other cases with limitations in bandwidth and the
number of transport channels. Other examples include High-Definition Multimedia Interface (HDMI) or other wired or
wireless connectivity interfaces.
[Figure 1 diagram not reproduced. Panel (a) shows Mobile Device A (HOA input, Encoder) and Mobile Device B (Decoder, output) linked by 8 channels. Panel (b) shows the Remote Truck (HOA input of 1, 4, 9 or 16 channels), the Network Operation Center (NOC), the Local Affiliate and the Customer's Living Room, connected through contribution encoders and decoders, 16-channel SDI links and an emission encoder and decoder, using MPEG-2 or MMT or DASH/ROUTE.]
Figure 1: (a) conventional mobile devices and
(b) conventional broadcast chain for order-restricted Ambisonics transport
4.2 Generic HOA Transport Format
To transport higher than 1st order HOA over the mobile device as shown in Figure 1 (a), an HOA Transport Encoder is
used in the production devices, such as a microphone array or a digital audio workstation (DAW). As shown in Figure 2
(a), the HOA transport encoder encodes the input HOA of any order into the HOA transport format (HTF) which
contains I transport audio signals along with the HOA Side-info data. The number I of transport audio signals is usually
much lower than the number of HOA input coefficients.
To transport higher than 3rd order HOA over the SDI framework, an HOA Transport Encoder is placed in front of the
contribution encoder as shown in Figure 2 (b). For example, the HOA transport encoder converts 49 input HOA
coefficients (6th order HOA signal) to the HOA transport format which contains 13 transport audio channels along with
a single HOA Side-info channel. The 16 channel HD-SDI can carry these (13+1) channels with 2 empty channels.
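The 6th order example can be checked with a short worked calculation. This is only a sketch: `LINK_CHANNELS` and `spare_channels` are hypothetical names, and the 13 + 1 channel split is the one quoted in the example above rather than a fixed property of the format.

```python
LINK_CHANNELS = 16  # PCM channels on one HD-SDI link, as stated in the text


def spare_channels(audio_channels: int, side_info_channels: int = 0) -> int:
    """Channels left free on the link; a negative result means the signal
    does not fit on a single link."""
    return LINK_CHANNELS - (audio_channels + side_info_channels)


order = 6
direct_channels = (order + 1) ** 2      # 49 HOA coefficient channels

print(spare_channels(direct_channels))  # -33: direct transport does not fit
print(spare_channels(13, 1))            # 2: HTF leaves two channels empty
```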
For error protection in SDI transmission, the HOA Side-info can be modulated with communications modem
technologies into a PCM control track signal that fits into the audio signal bandwidth [i.1].
[Figure 2 diagram not reproduced. Panel (a) shows a Production Device in which the HOA Transport Encoder takes an HOA input of 1, 4, 9, 16, 25, 36 or 49 channels and produces the HOA Transport Format (7 transport audio channels and 1 side-info channel, 8 channels in total) for the Encoder in Mobile Device A and the Decoder in Mobile Device B. Panel (b) shows the Remote Truck, where the HOA Transport Encoder produces the HOA Transport Format (15 transport audio channels and 1 side-info channel) carried over 16-channel SDI to the contribution encoder, followed by the Network Operation Center (NOC), the Local Affiliate and the Customer's Living Room, connected through contribution and emission encoders and decoders using MPEG-2 or MMT or DASH/ROUTE.]
Figure 2: HOA Transport Format for (a) mobile devices and (b) broadcast chain
Annex A presents an example guideline about HOA transport over Serial Digital Interface (SDI) utilizing
communications modem technologies [i.1].
Annex B shows the HOA content production workflow where the HOA transport encoder is placed outside the
broadcast chain.
In Table 1, the syntax of the configuration of Generic HOA transport format is defined as a binary representation
format. In Table 2, the corresponding semantics of the configuration of Generic HOA transport format is defined.
Table 1: Syntax of HOATransportFormatConfig()
Syntax                                                        No. of bits   Mnemonic
HOATransportFormatConfig(HoaTransportType)
{
  if (HoaTransportType == 0) {
    InputSamplingFrequency;                                   4             uimsbf
    InputAudioBitDepth = (InputAudioBitDepthIdx+1)*8;         2             uimsbf
    HoaFrameLengthIdx;                                        3             uimsbf
    NumOfHoaCoeffs = ( HoaOrder + 1 )^2;                      5             uimsbf
    NumOfTransportChannels = NumOfHoaCoeffs;
    HoaNormalization;                                         2             uimsbf
    HoaCoeffOrdering;                                         2             uimsbf
    IsScreenRelative;                                         1             uimsbf
  } else if (HoaTransportType == 1) {
    HoaNormalization = 1;
    HoaCoeffOrdering = 0;
    NumOfTransportChannels = CodedNumOfTransportChannels+1;   5             uimsbf
    HOAConfig();
  } else if (HoaTransportType == 2) {
    HoaNormalization = 0;
    HoaCoeffOrdering = 0;
    NumOfTransportChannels = CodedNumOfTransportChannels+1;   5             uimsbf
    HOAConfig_SN3D();
    isScreenRelative = isScreenRelative_E;
  } else if (HoaTransportType == 3) {
    InputSamplingFrequency;                                   4             uimsbf
    InputAudioBitDepth = (InputAudioBitDepthIdx+1)*8;         2             uimsbf
    HoaFrameLengthIdx;                                        3             uimsbf
    NumOfHoaCoeffs = ( HoaOrder + 1 )^2;                      5             uimsbf
    HoaNormalization = 0;
    HoaCoeffOrdering = 0;
    IsScreenRelative;                                         1             uimsbf
    NumOfTransportChannels = CodedNumOfTransportChannels+1;   5             uimsbf
  }
  if (IsScreenRelative) {
    if (hasNonStandardScreenSize) {                           1             bslbf
      if (isCenteredInAzimuth) {                              1             bslbf
        bsScreenSizeAz;                                       9             uimsbf
      } else {
        bsScreenSizeLeftAz;                                   10            uimsbf
        bsScreenSizeRightAz;                                  10            uimsbf
      }
      bsScreenSizeTopEl;                                      9             uimsbf
      bsScreenSizeBottomEl;                                   9             uimsbf
    }
  }
}
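As an illustration of how the syntax in Table 1 can be read, the following sketch parses the HoaTransportType == 0 branch and the screen-size block from a byte buffer. It is not a reference implementation: the `BitReader` class and the function names are hypothetical, fields are assumed to be packed back to back with the most significant bit first (as the uimsbf mnemonic suggests), and the way HoaTransportType itself reaches the parser is left out because Table 1 takes it as a parameter. The branches that call HOAConfig() and HOAConfig_SN3D() are not covered, since those structures are defined elsewhere.

```python
class BitReader:
    """Reads unsigned integers, most significant bit first, from bytes."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            if self.pos >= 8 * len(self.data):
                raise EOFError("bitstream exhausted")
            byte = self.data[self.pos // 8]
            value = (value << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return value


def parse_screen_size(reader: BitReader) -> dict:
    """Reads the block guarded by 'if (IsScreenRelative)' in Table 1."""
    out = {"hasNonStandardScreenSize": reader.read(1)}
    if out["hasNonStandardScreenSize"]:
        out["isCenteredInAzimuth"] = reader.read(1)
        if out["isCenteredInAzimuth"]:
            out["bsScreenSizeAz"] = reader.read(9)
        else:
            out["bsScreenSizeLeftAz"] = reader.read(10)
            out["bsScreenSizeRightAz"] = reader.read(10)
        out["bsScreenSizeTopEl"] = reader.read(9)
        out["bsScreenSizeBottomEl"] = reader.read(9)
    return out


def parse_config_type0(reader: BitReader) -> dict:
    """Reads the HoaTransportType == 0 branch of Table 1."""
    cfg = {"InputSamplingFrequency": reader.read(4)}
    cfg["InputAudioBitDepth"] = (reader.read(2) + 1) * 8  # InputAudioBitDepthIdx
    cfg["HoaFrameLengthIdx"] = reader.read(3)
    cfg["HoaOrder"] = reader.read(5)
    cfg["NumOfHoaCoeffs"] = (cfg["HoaOrder"] + 1) ** 2
    cfg["NumOfTransportChannels"] = cfg["NumOfHoaCoeffs"]
    cfg["HoaNormalization"] = reader.read(2)
    cfg["HoaCoeffOrdering"] = reader.read(2)
    cfg["IsScreenRelative"] = reader.read(1)
    if cfg["IsScreenRelative"]:
        cfg.update(parse_screen_size(reader))
    return cfg


# Example: 48 kHz (code 3), bit depth index 2, frame length index 4,
# HOA order 3, SN3D, ACN, not screen relative.
bits = "0011" + "10" + "100" + "00011" + "00" + "00" + "0"
padded = bits.ljust(24, "0")  # pad to whole bytes for this example only
cfg = parse_config_type0(BitReader(int(padded, 2).to_bytes(3, "big")))
print(cfg["InputAudioBitDepth"], cfg["NumOfTransportChannels"])  # 24 16
```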



Table 2: Semantics of HOATransportConfig()
HoaTransportType
This element contains information about HOA transport mode.
0: HOA coefficients (as defined in this clause)
1: ISO/IEC 23008-3-based HOA Transport Format as defined in
clause 4.3
2: Modified ISO/IEC 23008-3-based HOA Transport Format for SN3D
normalization as defined in clause 4.4
3: V-vector based HOA Transport Format as defined in clause 4.5
InputSamplingFrequency
This element contains information about input sampling frequency.
0: 24 kHz
1: 32 kHz
2: 44,1 kHz
3: 48 kHz
4: 96 kHz
5: 192 kHz
6 - 15: reserved
InputAudioBitDepthIdx
This element determines the input audio bit depth by
InputAudioBitDepth = (InputAudioBitDepthIdx+1)*8.
HoaOrder
This element determines the HOA order of the coded signal.
HoaNormalization
This element contains information about HOA coefficient normalization.
0: SN3D normalization
1: N3D normalization
2: FuMa normalization
3: reserved
HoaCoeffOrdering
This element contains information about HOA coefficient ordering.
0: ACN
1: SID
2-3: reserved
IsScreenRelative
This element contains information about whether the content is:
0: not screen related
1: screen related
hasNonStandardScreenSize
This flag specifies whether the defined production screen size is different
from the default screen size. The definition is done via viewing angles (in
degrees) corresponding to the screen edges. The default screen size is
defined with the following values (a 4K display at an optimal viewing
distance):
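$\varphi_{\text{left}} = 29.0^{\circ}, \quad \varphi_{\text{right}} = -29.0^{\circ}$
$\theta_{\text{top}} = 17.5^{\circ}, \quad \theta_{\text{bottom}} = -17.5^{\circ}$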
isCenteredInAzimuth
This flag defines whether the production screen is frontal and centered in
azimuth (absolute values of the azimuth angles of the left and right screen
edge are identical) or not.
bsScreenSizeAz
This field defines the azimuth angles (in degree) corresponding to the left
and right screen edge:
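$\varphi_{\text{left}} = 0{,}5 \cdot \text{bsScreenSizeAz}$
$\varphi_{\text{left}} = \min(\max(\varphi_{\text{left}}, 0), 180)$
$\varphi_{\text{right}} = -0{,}5 \cdot \text{bsScreenSizeAz}$
$\varphi_{\text{right}} = \min(\max(\varphi_{\text{right}}, -180), 0)$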
bsScreenSizeLeftAz
This field defines the azimuth angle (in degree) corresponding to the left
screen edge:
$\varphi_{\text{left}} = 0{,}5 \cdot (\text{bsScreenSizeLeftAz} - 511)$
$\varphi_{\text{left}} = \min(\max(\varphi_{\text{left}}, -180), 180)$
bsScreenSizeRightAz
This field defines the azimuth angle (in degree) corresponding to the right
screen edge:
$\varphi_{\text{right}} = 0{,}5 \cdot (\text{bsScreenSizeRightAz} - 511)$
$\varphi_{\text{right}} = \min(\max(\varphi_{\text{right}}, -180), 180)$
bsScreenSizeTopEl
This field defines the elevation angle (in degree) corresponding to the top
screen edge:
$\theta_{\text{top}} = 0{,}5 \cdot (\text{bsScreenSizeTopEl} - 255)$
$\theta_{\text{top}} = \min(\max(\theta_{\text{top}}, -90), 90)$
bsScreenSizeBottomEl
This field defines the elevation angle (in degree) corresponding to the
bottom screen edge:
$\theta_{\text{bottom}} = 0{,}5 \cdot (\text{bsScreenSizeBottomEl} - 255)$
$\theta_{\text{bottom}} = \min(\max(\theta_{\text{bottom}}, -90), 90)$
HoaFrameLengthIdx
This element contains information about the HOA frame length L. See also
Table 5.
CodedNumOfTransportChannels
This element contains information about the coded number of transport
channels.
NumOfTransportChannels
This element contains information about the number of transport channels.
HOAConfig()
This element contains information about the configuration for HOA spatial
encoding as defined in ISO/IEC 23008-3 [1] to [4], clause 12.3.
HOAConfig_SN3D()
This element contains information about the configuration for HOA spatial
encoding as defined in clause 4.4.
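The screen-size semantics in Table 2 map the coded fields to angles in degrees and then clamp them. A minimal sketch of that mapping follows. The function names are illustrative, and the grouping 0,5 · (value − offset) for the edge fields is an assumption inferred from the field widths (10-bit fields offset by 511, 9-bit fields offset by 255), so it should be checked against the published equations.

```python
def clamp(value: float, low: float, high: float) -> float:
    """min(max(value, low), high), as used throughout Table 2."""
    return min(max(value, low), high)


def azimuth_centered(bs_screen_size_az: int) -> tuple:
    """Left and right azimuth (degrees) from bsScreenSizeAz."""
    left = clamp(0.5 * bs_screen_size_az, 0.0, 180.0)
    right = clamp(-0.5 * bs_screen_size_az, -180.0, 0.0)
    return left, right


def azimuth_edge(bs_value: int) -> float:
    """Azimuth (degrees) from bsScreenSizeLeftAz or bsScreenSizeRightAz.
    Assumes the offset is applied before scaling."""
    return clamp(0.5 * (bs_value - 511), -180.0, 180.0)


def elevation_edge(bs_value: int) -> float:
    """Elevation (degrees) from bsScreenSizeTopEl or bsScreenSizeBottomEl.
    Assumes the offset is applied before scaling."""
    return clamp(0.5 * (bs_value - 255), -90.0, 90.0)


# Coded values that reproduce the default screen size quoted in Table 2.
print(azimuth_centered(58))   # (29.0, -29.0)
print(elevation_edge(290))    # 17.5
print(elevation_edge(220))    # -17.5
```

Under this reading, the extreme coded values fall outside the valid range and are clamped, which is why the min/max step is needed.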

In Table 3, the syntax of the frame data of Generic HOA transport format is defined as a binary representation format.
In Table 4, the corresponding semantics of the frame data of Generic HOA transport format is defined.
Table 3: Syntax of HOATransportFormatFrame()
Syntax                                              No. of bits          Mnemonic
HOATransportFormatFrame(HoaTransportType)
{
  if (HoaTransportType == 1) {
    HOAFrame();
  } else if (HoaTransportType == 2) {
    HOAFrame_SN3D();
  } else if (HoaTransportType == 3) {
    HOAFrame_VvecTransportFormat();
  }
  for (j=0; j<HoaFrameLength; j++) {
    for (i=0; i<NumOfTransportChannels; i++) {
      htfCoreAudioChannels[i][j];                   InputAudioBitDepth   bslbf
    }
  }
}

Table 4: Semantics of HOATransportFormatFrame ()
HOAFrame()
The HOAFrame() holds the information that is required to decode the L
samples of an HOA frame of N3D normalization as described in clause 4.3.
HOAFrame_SN3D()
The HOAFrame_SN3D() holds the information that is required to decode the L
samples of an HOA frame of SN3D normalization as described in clause 4.4.
HOAFrame_VvecTransportFormat()
The HOAFrame_VvecTransportFormat() holds the information that is required to decode the L
samples of an HOA frame based on the V-vectors as described in clause 4.5.
NumOfTransportChannels
This element contains information about the number of transport channels
defined in Table 1.
HoaFrameLength
This element contains information about the HOA frame length L defined in
Table 5.
htfCoreAudioChannels[i][j]
This element contains information about the audio data of a j-th sample in an
i-th transport channel.
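The loop order in Table 3 places the sample index j outside the channel index i, so the audio data is interleaved sample by sample. The sketch below separates such a payload into per-channel lists. It rests on several assumptions not stated in the present document: the payload starts on a byte boundary, InputAudioBitDepth is a multiple of 8 (which the (Idx+1)*8 rule gives), and each sample is a big-endian two's complement integer. The function name is hypothetical.

```python
def deinterleave_frame(payload: bytes, num_channels: int,
                       frame_length: int, bit_depth: int) -> list:
    """Returns channels[i][j] for the interleaved audio data of one frame."""
    if bit_depth % 8 != 0:
        raise ValueError("bit depth must be a multiple of 8")
    width = bit_depth // 8
    if len(payload) < num_channels * frame_length * width:
        raise ValueError("payload shorter than one frame")

    channels = [[0] * frame_length for _ in range(num_channels)]
    offset = 0
    for j in range(frame_length):          # outer loop: sample index
        for i in range(num_channels):      # inner loop: transport channel
            raw = payload[offset:offset + width]
            # Assumption: signed, big-endian PCM samples.
            channels[i][j] = int.from_bytes(raw, "big", signed=True)
            offset += width
    return channels


# Two channels, three 16-bit samples each, interleaved as in Table 3.
interleaved = [1, -1, 2, -2, 3, -3]
payload = b"".join(v.to_bytes(2, "big", signed=True) for v in interleaved)
print(deinterleave_frame(payload, num_channels=2, frame_length=3, bit_depth=16))
# [[1, 2, 3], [-1, -2, -3]]
```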

Table 5: Value of HOA frame length in samples, HoaFrameLength (L), depending on
InputSamplingFrequency and HoaFrameLengthIdx
InputSamplingFrequency (kHz)   HoaFrameLengthIdx
                               0       1       2       3       4       5       6       7
24                             192     256     384     480     512     768     960     1 024
32                             256     384     512     640     832     1 024   1 280   1 366
44,1                           384     512     768     960     1 024   1 536   1 920   2 048
48                             384     512     768     960     1 024   1 536   1 920   2 048
96                             768     1 024   1 536   1 920   2 048   3 072   3 840   4 096
192                            1 536   2 048   3 072   3 840   4 096   6 144   7 680   8 192
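Table 5 can be expressed as a lookup from the coded InputSamplingFrequency (Table 2) and HoaFrameLengthIdx to the frame length L. The sketch below also derives the frame duration, which is not given in the present document and is added only as an illustration. All names are hypothetical.

```python
# Index = coded InputSamplingFrequency from Table 2; codes 6 to 15 are reserved.
SAMPLING_FREQUENCY_HZ = (24000, 32000, 44100, 48000, 96000, 192000)

# Rows of Table 5, indexed by HoaFrameLengthIdx 0..7.
HOA_FRAME_LENGTH = {
    24000: (192, 256, 384, 480, 512, 768, 960, 1024),
    32000: (256, 384, 512, 640, 832, 1024, 1280, 1366),
    44100: (384, 512, 768, 960, 1024, 1536, 1920, 2048),
    48000: (384, 512, 768, 960, 1024, 1536, 1920, 2048),
    96000: (768, 1024, 1536, 1920, 2048, 3072, 3840, 4096),
    192000: (1536, 2048, 3072, 3840, 4096, 6144, 7680, 8192),
}


def hoa_frame_length(sampling_code: int, frame_length_idx: int) -> int:
    """HoaFrameLength (L) in samples for the given coded fields."""
    if not 0 <= sampling_code < len(SAMPLING_FREQUENCY_HZ):
        raise ValueError("reserved InputSamplingFrequency code")
    if not 0 <= frame_length_idx <= 7:
        raise ValueError("HoaFrameLengthIdx is a 3-bit field")
    return HOA_FRAME_LENGTH[SAMPLING_FREQUENCY_HZ[sampling_code]][frame_length_idx]


def frame_duration_ms(sampling_code: int, frame_length_idx: int) -> float:
    """Derived quantity (not in Table 5): frame duration in milliseconds."""
    length = hoa_frame_length(sampling_code, frame_length_idx)
    return 1000.0 * length / SAMPLING_FREQUENCY_HZ[sampling_code]


print(hoa_frame_length(3, 4))             # 1024 samples at 48 kHz
print(round(frame_duration_ms(3, 4), 2))  # 21.33
```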

4.3 ISO/IEC 23008-3-based HOA Transport Format (HoaTransportType = 1)
4.3.1 Introduction
This clause defines Type 1 of the HOA Transport Format (HoaTransportType = 1) which is based on
ISO/IEC 23008-3 (MPEG-H 3D Audio) [1] to [4].
4.3.2 HOA Transport Format defined in ISO/IEC 23008-3
In [2], the HOA input signal is analysed and encoded into the spatial coding parameters and the directional and ambient
signals. The number of signals is usually lower than the number of HOA input coefficients. The HOA Frame Creator
converts the resulting HOA spatial coding parameters to the HOA payloads HOAConfig() and HOAFrame().
In some environments (see e.g. Figure 2), the HOA spatial encoder is separated from the MPEG-H 3D Audio Core
encoder. In this case, the HOA Transport Format consists of spatial coding parameters and the predominant and ambient
signals. This HOA Transport Format can be transmitted from the HOA spatial encoder to the MPEG-H 3D Audio Core
encoder. Compared with the input HOA, the HOA Transport Format usually requires a significantly reduced number of
transport channels.
4.3.3 Implementation of HOA Transport Encoder (TE) and HOA Emission Encoder (EE)
Based on [2], the following terms are defined for simplicity:
- The combination of the Spatial HOA Encoder and the HOA Frame Creator is defined as the HOA
Encoder.
- The predominant and ambient signals are defined as HOA Transport Audio Signals.
- The combination of the HOAConfig and HOAFrame is defined as the HOA Side-info.
- The combination of the HOA Transport Audio Signals and the HOA Side-info is defined as HOA
Transport Format.
As shown in annexes A and B, there are several ways to design the broadcast chain. To make these systems work, it
is beneficial to design the HOA Transport Encoder (TE) and the Emission Encoder (EE) such that:
• The bit-rate, the number of transport channels, and hoaIndependencyFlag are determined at EE. An
hoaIndependencyFlag indicates whether a frame is independently decodable.
• Delay and complexity increase at TE and EE should be minimized.
Thus, three design criteria are defined:
1) Predominant audio channels (also called foreground (FG) audio channels): A V-vector represents the spatial
distribution of the sound field for a particular vector-based predominant signal. FG audio channels and full V-vector
elements are transmitted from TE to EE. At EE, a subset of FG audio channels and a subset of V-vector
elements are selected and transmitted without any modification (no delay is required). If the EE modifies any
FG audio channel, one frame delay is required for the adaptive gain control (AGC) lookahead.
2) Ambient audio channels (also called background (BG) audio channels): As BG channels, H_BG, the original HOA
coefficients, H, are transmitted from TE to EE without applying any decorrelation or energy compensation:
H_BG=H. At EE, a subset of BG channels is selected and transmitted without any modification (no delay is
required). If the EE modifies any BG audio channel, one frame delay is required for the AGC lookahead.
3) To create random access points at EE, all HOA Side-info parameters shall be encoded independently at TE.
Predictive coding is not allowed at TE.
In the TE implementation, the total numbers of FG and BG channels are set to 4 and 9, respectively. At EE, a subset of
FG and BG channels is selected based on the EE bit rate. ChannelType is set to be
...
