Digital Audio Compression (AC-4) Standard; Part 2: Immersive and personalized audio

RTS/JTC-043-2

General Information

Status: Published
Publication Date: 05-Feb-2018
Current Stage: 12 - Completion
Due Date: 31-Jan-2018
Completion Date: 06-Feb-2018

Standard
ETSI TS 103 190-2 V1.2.1 (2018-02) - Digital Audio Compression (AC-4) Standard; Part 2: Immersive and personalized audio
English language
250 pages

Standards Content (Sample)

ETSI TS 103 190-2 V1.2.1 (2018-02)






TECHNICAL SPECIFICATION
Digital Audio Compression (AC-4) Standard;
Part 2: Immersive and personalized audio






Reference
RTS/JTC-043-2
Keywords
audio, broadcasting, codec, content, digital,
distribution, object audio, personalization

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE

Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16

Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
The present document can be downloaded from:
http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the
print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying
and microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018.
All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members.
3GPP™ and LTE™ are trademarks of ETSI registered for the benefit of its Members and
of the 3GPP Organizational Partners.
oneM2M logo is protected for the benefit of its Members.
GSM® and the GSM logo are trademarks registered and owned by the GSM Association.
Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
Introduction
Motivation
Structure of the present document
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Definitions, symbols, abbreviations and conventions
3.1 Definitions
3.2 Symbols
3.3 Abbreviations
3.4 Conventions
4 Decoding the AC-4 bitstream
4.1 Introduction
4.2 Channels and objects
4.3 Immersive audio
4.4 Personalized Audio
4.5 AC-4 bitstream
4.5.1 Bitstream structure
4.5.2 Data dependencies
4.5.3 Frame rates
4.6 Decoder compatibilities
4.7 Decoding modes
4.7.1 Introduction
4.7.2 Full decoding mode
4.7.3 Core decoding mode
4.8 Decoding process
4.8.1 Overview
4.8.2 Selecting an audio presentation
4.8.3 Decoding of substreams
4.8.3.1 Introduction
4.8.3.2 Identification of substream type
4.8.3.3 Substream decoding overview
4.8.3.4 Decoding of object properties
4.8.3.4.1 Introduction
4.8.3.4.2 Object audio metadata location
4.8.3.5 Spectral frontends
4.8.3.6 Stereo and multichannel processing (SMP)
4.8.3.7 Inverse modified discrete cosine transformation (IMDCT)
4.8.3.8 Simple coupling (S-CPL)
4.8.3.9 QMF analysis
4.8.3.10 Companding
4.8.3.10.1 Introduction
4.8.3.10.2 Channel audio substream
4.8.3.10.3 Channel audio substream with an immersive channel element
4.8.3.10.4 Object audio substream
4.8.3.11 A-SPX
4.8.3.11.1 Introduction
4.8.3.11.2 Core decoding mode with ASPX_SCPL codec mode
4.8.3.11.3 Full decoding mode with ASPX_SCPL codec mode
4.8.3.12 Advanced joint channel coding (A-JCC)
4.8.3.13 Advanced joint object coding (A-JOC)
4.8.3.14 Advanced coupling (A-CPL)
4.8.3.15 Dialogue enhancement
4.8.3.16 Direct dynamic range control bitstream gain application
4.8.3.17 Substream gain application for operation with associated audio
4.8.3.18 Substream gain application for operation with dialogue substreams
4.8.3.19 Substream rendering
4.8.4 Mixing of decoded substreams
4.8.5 Loudness correction
4.8.5.1 Introduction
4.8.5.2 Dialnorm location
4.8.5.3 Downmix loudness correction
4.8.5.4 Alternative audio presentation loudness correction
4.8.5.5 Real-time loudness correction data
4.8.6 Dynamic range control
4.8.7 QMF synthesis
4.8.8 Sample rate conversion
5 Algorithmic details
5.1 Bitstream processing
5.1.1 Introduction
5.1.2 Elementary stream multiplexing tool
5.1.3 Efficient high frame rate mode
5.2 Stereo and multichannel processing (SMP) for immersive audio
5.2.1 Introduction
5.2.2 Interface
5.2.2.1 Inputs
5.2.2.2 Outputs
5.2.2.3 Controls
5.2.3 Processing the immersive_channel_element
5.2.3.1 Introduction
5.2.3.2 immersive_codec_mode ∈ {SCPL, ASPX_SCPL, ASPX_ACPL_1}
5.2.3.3 immersive_codec_mode = ASPX_ACPL_2
5.2.3.4 immersive_codec_mode = ASPX_AJCC
5.2.4 Processing the 22_2_channel_element
5.3 Simple coupling (S-CPL)
5.3.1 Introduction
5.3.2 Interface
5.3.2.1 Inputs
5.3.2.2 Outputs
5.3.3 Reconstruction of the output channels
5.3.3.1 Full decoding
5.3.3.2 Core decoding
5.4 Advanced spectral extension (A-SPX) postprocessing tool
5.4.1 Introduction
5.4.2 Interface
5.4.2.1 Inputs
5.4.2.2 Outputs
5.4.3 Processing
5.5 Advanced coupling (A-CPL) for immersive audio
5.5.1 Introduction
5.5.2 Processing the immersive_channel_element
5.6 Advanced joint channel coding (A-JCC)
5.6.1 Introduction
5.6.2 Interface
5.6.2.1 Inputs
5.6.2.2 Outputs
5.6.2.3 Controls
5.6.3 Processing
5.6.3.1 Parameter band to QMF subband mapping
5.6.3.2 Differential decoding and dequantization
5.6.3.3 Interpolation
5.6.3.4 Decorrelator and transient ducker
5.6.3.5 Reconstruction of the output channels
5.6.3.5.1 Input channels
5.6.3.5.2 A-JCC full decoding mode
5.6.3.5.3 A-JCC core decoding mode
5.7 Advanced joint object coding (A-JOC)
5.7.1 Introduction
5.7.2 Interface
5.7.2.1 Inputs
5.7.2.2 Outputs
5.7.2.3 Controls
5.7.3 Processing
5.7.3.1 Parameter band to QMF subband mapping
5.7.3.2 Differential decoding
5.7.3.3 Dequantization
5.7.3.4 Parameter time interpolation
5.7.3.5 Decorrelator and transient ducker
5.7.3.6 Signal reconstruction using matrices
5.7.3.6.1 Processing
5.7.3.6.2 Decorrelation input matrix
5.8 Dialogue enhancement for immersive audio
5.8.1 Introduction
5.8.2 Processing
5.8.2.1 Dialogue enhancement for core decoding of A-JCC coded 9.X.4 content
5.8.2.2 Dialogue enhancement for core decoding of parametric A-CPL coded 9.X.4 content
5.8.2.3 Dialogue enhancement for full decoding of A-JOC coded content
5.8.2.4 Dialogue enhancement for core decoding of A-JOC coded content
5.8.2.5 Dialogue enhancement for non A-JOC coded object audio content
5.9 Object audio metadata timing
5.9.1 Introduction
5.9.2 Synchronization of object properties
5.10 Rendering
5.10.1 Introduction
5.10.2 Channel audio renderer
5.10.2.1 Introduction
5.10.2.2 General rendering matrix
5.10.2.3 Panning of a stereo or mono signal
5.10.2.4 Substream downmix or upmix for full decoding
5.10.2.5 Matrix coefficients for channel-based renderer for full decoding
5.10.2.6 Substream downmix or upmix for core decoding
5.10.2.7 Matrix coefficients for channel-based renderer for core decoding
5.10.3 Intermediate spatial format rendering
5.10.3.1 Introduction
5.10.3.2 Conventions
5.10.3.3 Interface
5.10.3.3.1 Inputs
5.10.3.3.2 Outputs
5.10.3.3.3 Controls
5.10.3.4 Processing
5.11 Accurate frame rate control
6 Bitstream syntax
6.1 Introduction
6.2 Syntax specification
6.2.1 AC-4 frame info
6.2.1.1 ac4_toc
6.2.1.2 ac4_presentation_info
6.2.1.3 ac4_presentation_v1_info
6.2.1.4 frame_rate_fractions_info
...
