Information technology -- High efficiency coding and media delivery in heterogeneous environments -- Part 3: 3D audio

This document specifies technology that supports the efficient transmission of immersive audio signals and flexible rendering for the playback of immersive audio in a wide variety of listening scenarios. These include home theatre setups with 3D loudspeaker configurations, 22.2 loudspeaker systems, automotive entertainment systems and playback over headphones connected to a tablet or smartphone.

Technologies de l'information -- Codage à haute efficacité et livraison des medias dans des environnements hétérogènes -- Partie 3: Audio 3D

General Information

Status: Published
Publication Date: 27-Feb-2019
Current Stage: 90.92 - International Standard to be revised
Start Date: 30-Jul-2021

Standard
ISO/IEC 23008-3:2019 - Information technology -- High efficiency coding and media delivery in heterogeneous environments -- Part 3: 3D audio
English language
798 pages

Standards Content (sample)

INTERNATIONAL STANDARD
ISO/IEC 23008-3
Second edition
2019-02
Information technology -- High efficiency coding and media delivery in heterogeneous environments -- Part 3: 3D audio
Technologies de l'information -- Codage à haute efficacité et livraison des medias dans des environnements hétérogènes -- Partie 3: Audio 3D
Reference number: ISO/IEC 23008-3:2019(E)
© ISO/IEC 2019
COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2019

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions, symbols, abbreviations and mnemonics
3.1 Terms, definitions, symbols and abbreviated terms
3.2 Mnemonics
4 Technical overview
4.1 Decoder block diagram
4.2 Overview over the codec building blocks
4.3 Efficient combination of decoder processing blocks in the time domain and QMF domain
4.4 Rule set for determining processing domains
4.4.1 Audio core codec processing domain
4.4.2 Mixing
4.4.3 DRC-1 Operation domains (DRC in rendering context)
4.4.4 Audio core codec interface domain to rendering
4.4.5 Rendering context
4.4.6 Post-processing context
4.4.7 End-of-chain context
4.5 Sample rate converter
4.6 Decoder delay
4.7 Contribution mode of MPEG-H 3D audio
4.8 MPEG-H 3D audio profiles and levels
4.8.1 General
4.8.2 Profiles
5 MPEG-H 3D audio core decoder
5.1 Definitions
5.1.1 Joint stereo
5.1.2 MPEG surround based stereo (MPS 212)
5.2 Syntax
5.2.1 General
5.2.2 Decoder configuration
5.2.3 MPEG-H 3D audio core bitstream payloads
5.3 Data structure
5.3.1 General
5.3.2 General configuration data elements
5.3.3 Loudspeaker configuration data elements
5.3.4 Core decoder configuration data elements
5.3.5 Downmix matrix data elements
5.3.6 HOA rendering matrix data elements
5.3.7 Signal group information elements
5.3.8 Low frequency enhancement (LFE) channel element, mpegh3daLfeElement()
5.4 Configuration element descriptions
5.4.1 General
5.4.2 Downmix configuration
5.4.3 HOA rendering matrix configuration
5.5 Tool descriptions
5.5.1 General
5.5.2 Quad channel element
5.5.3 Transform splitting
5.5.4 MPEG surround for mono to stereo upmixing
5.5.5 Enhanced noise filling
5.5.6 Audio pre-roll
5.5.7 Fullband LPD
5.5.8 Time-domain bandwidth extension
5.5.9 LPD stereo coding
5.5.10 Multichannel coding tool
5.5.11 Filterbank and block switching
5.5.12 Frequency domain prediction
5.5.13 Long-term postfilter
5.5.14 Tonal component coding
5.5.15 Internal channel on MPS212 for low complexity format conversion
5.5.16 High resolution envelope processing (HREP) tool
5.6 Buffer requirements
5.6.1 Minimum decoder input buffer
5.6.2 Bit reservoir
5.6.3 Maximum bit rate
5.7 Stream access point requirements and inter-frame dependency
6 Dynamic range control and loudness processing
6.1 General
6.2 Description
6.3 Syntax
6.3.1 Loudness metadata
6.3.2 Dynamic range control metadata
6.3.3 Data elements
6.4 Decoding process
6.4.1 General
6.4.2 Dynamic range control
6.4.3 Usage of downmixId in MPEG-H
6.4.4 DRC set selection process
6.4.5 DRC-1 for SAOC 3D Content
6.4.6 DRC-1 for HOA content
6.4.7 Loudness normalization
6.4.8 Peak limiter
6.4.9 Time-synchronization of DRC gains
6.4.10 Default parameters
7 Object metadata decoding
7.1 General
7.2 Description
7.3 Syntax
7.3.1 Object metadata configuration
7.3.2 Top level object metadata syntax
7.3.3 Subsidiary payloads for efficient object metadata decoding
7.3.4 Subsidiary payloads for object metadata decoding with low delay
7.3.5 Enhanced object metadata configuration
7.4 Data structure
7.4.1 Definition of ObjectMetadataConfig() payloads
7.4.2 Efficient object metadata decoding
7.4.3 Object metadata decoding with low delay
7.4.4 Enhanced object metadata
8 Object rendering
8.1 Description
8.2 Terms and definitions
8.3 Input data
8.4 Processing
8.4.1 General remark
8.4.2 Imaginary loudspeakers
8.4.3 Dividing the loudspeaker setup into a triangle mesh
8.4.4 Rendering algorithm
9 SAOC 3D
9.1 Description
9.2 Definitions
9.3 Delay and synchronization
9.4 Syntax
9.4.1 Payloads for SAOC 3D
9.4.2 Definition of SAOC 3D payloads
9.5 SAOC 3D processing
9.5.1 Compressed data stream decoding and dequantization of SAOC 3D data
9.5.2 Time/frequency transforms
9.5.3 Signals and parameters
9.5.4 SAOC 3D decoding
9.5.5 Dual mode
10 Generic loudspeaker rendering/format conversion
10.1 Description
10.2 Definitions
10.2.1 General remarks
10.2.2 Variable definitions
10.3 Processing
10.3.1 Application of transmitted downmix matrices
10.3.2 Application of transmitted equalizer settings
10.3.3 Downmix processing involving multiple channel groups
10.3.4 Initialization of the format converter
10.3.5 Audio signal processing
11 Immersive loudspeaker rendering/format conversion
11.1 Description
11.2 Syntax
11.3 Definitions
11.3.1 General remarks
11.3.2 Variable definitions
11.4 Processing
11.4.1 Initialization of the format converter
11.4.2 Audio signal processing
12 Higher order ambisonics (HOA)
12.1 Technical overview
12.1.1 Block diagram
12.1.2 Overview of the decoder tools
12.2 Syntax
12.2.1 Configuration of HOA elements
12.2.2 Payloads of HOA elements
12.3 Data structure
12.3.1 Definitions of HOA Config
12.3.2 Syntax of getSubbandBandwidths()
12.3.3 Definitions of HOA payload
12.4 HOA tool description
12.4.1 HOA frame converter
12.4.2 Spatial HOA decoding
12.4.3 HOA renderer
12.4.4 Layered coding for HOA
13 Binaural renderer
13.1 General
13.2 Frequency-domain binaural renderer
13.2.1 General
13.2.2 Definitions
13.2.3 Parameterization of binaural room impulse responses
13.2.4 Frequency-domain binaural processing
13.3 Time-domain binaural renderer
13.3.1 General
13.3.2 Definitions
13.3.3 Parameterization of binaural room impulse responses
13.3.4 Time-domain binaural processing
14 MPEG-H 3D audio stream (MHAS)
14.1 Overview
14.2 Syntax
14.2.1 Main MHAS syntax elements
14.2.2 Subsidiary MHAS syntax elements
14.3 Semantics
14.3.1 mpeghAudioStreamPacket()
14.3.2 MHASPacketPayload()
14.3.3 Subsidiary MHAS packets
14.4 Description of MHASPacketTypes
14.4.1 PACTYP_FILLDATA
14.4.2 PACTYP_MPEGH3DACFG
14.4.3 PACTYP_MPEGH3DAFRAME
14.4.4 PACTYP_SYNC
14.4.5 PACTYP_SYNCGAP
14.4.6 PACTYP_MARKER
14.4.7 PACTYP_CRC16 and PACTYP_CRC32
14.4.8 PACTYP_DESCRIPTOR

...
