ISO/IEC 14496-2:1999/Amd 1:2000
(Amendment) Information technology - Coding of audio-visual objects - Part 2: Visual - Amendment 1: Visual extensions
Technologies de l'information — Codage des objets audiovisuels — Partie 2: Codage visuel — Amendement 1: Extensions visuelles
ISO/IEC 14496-2:1999/Amd 1:2000 is an amendment published by the International Organization for Standardization (ISO). Its full title is "Information technology - Coding of audio-visual objects - Part 2: Visual - Amendment 1: Visual extensions".
ISO/IEC 14496-2:1999/Amd 1:2000 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 14496-2:1999/Amd 1:2000 has the following relationships with other standards: it has inter-standard links to ISO/IEC 14496-2:1999 and ISO/IEC 14496-2:2001, and it is an amendment to ISO/IEC 14496-2:1999. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
You can purchase ISO/IEC 14496-2:1999/Amd 1:2000 directly from iTeh Standards. The document is available in PDF format and is delivered instantly after payment. Add the standard to your cart and complete the secure checkout process. iTeh Standards is an authorized distributor of ISO standards.
Standards Content (Sample)
INTERNATIONAL STANDARD
ISO/IEC 14496-2
First edition
1999-12-01
AMENDMENT 1
2000-07-15
Information technology — Coding of
audio-visual objects —
Part 2:
Visual
AMENDMENT 1: Visual extensions
Technologies de l'information — Codage des objets audiovisuels —
Partie 2: Codage visuel
AMENDEMENT 1: Extensions visuelles
Reference number
ISO/IEC 14496-2:1999/Amd.1:2000(E)
© ISO/IEC 2000
PDF disclaimer
This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not
be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this
file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this
area.
Adobe is a trademark of Adobe Systems Incorporated.
Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters
were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event
that a problem relating to it is found, please inform the Central Secretariat at the address given below.
© ISO/IEC 2000
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body
in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland
Contents
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations and symbols
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bitwise operators
4.5 Conditional operators
4.6 Assignment
4.7 Mnemonics
4.8 Constants
5 Conventions
5.1 Method of describing bitstream syntax
5.2 Definition of functions
5.2.1 Definition of next_bits() function
5.2.2 Definition of bytealigned() function
5.2.3 Definition of nextbits_bytealigned() function
5.2.4 Definition of next_start_code() function
5.2.5 Definition of next_resync_marker() function
5.2.6 Definition of transparent_mb() function
5.2.7 Definition of transparent_block() function
5.2.8 Definition of byte_align_for_upstream() function
5.3 Reserved, forbidden and marker_bit
5.4 Arithmetic precision
6 Visual bitstream syntax and semantics
6.1 Structure of coded visual data
6.1.1 Visual object sequence
6.1.2 Visual object
6.1.3 Video object
6.1.4 Mesh object
6.1.5 FBA object
6.1.6 3D Mesh Object
6.2 Visual bitstream syntax
6.2.1 Start codes
6.2.2 Visual Object Sequence and Visual Object
6.2.3 Video Object Layer
6.2.4 Group of Video Object Plane
6.2.5 Video Object Plane and Video Plane with Short Header
6.2.6 Macroblock
6.2.7 Block
6.2.8 Still Texture Object
6.2.9 Mesh Object
6.2.10 FBA Object
6.2.11 3D Mesh Object
6.2.12 Upstream message
6.3 Visual bitstream semantics
6.3.1 Semantic rules for higher syntactic structures
6.3.2 Visual Object Sequence and Visual Object
6.3.3 Video Object Layer
6.3.4 Group of Video Object Plane
6.3.5 Video Object Plane and Video Plane with Short Header
6.3.6 Macroblock related
6.3.7 Block related
6.3.8 Still texture object
6.3.9 Mesh object
6.3.10 FBA object
6.3.11 3D Mesh Object
6.3.12 Upstream message
7 The visual decoding process
7.1 Video decoding process
7.2 Higher syntactic structures
7.3 VOP reconstruction
7.4 Texture decoding
7.4.1 Variable length decoding
7.4.2 Inverse scan
7.4.3 DC and AC prediction for intra macroblocks
7.4.4 Inverse quantisation
7.4.5 Inverse DCT
7.4.6 Upsampling of the Inverse DCT output for Reduced Resolution VOP
7.5 Shape decoding
7.5.1 Higher syntactic structures
7.5.2 Macroblock decoding
7.5.3 Arithmetic decoding
7.5.4 Spatial scalable binary shape decoding
7.5.5 Grayscale Shape Decoding
7.5.6 Multiple Auxiliary Component Decoding
7.6 Motion compensation decoding
7.6.1 Padding process
7.6.2 Sample interpolation for non-integer motion vectors
7.6.3 General motion vector decoding process
7.6.4 Unrestricted motion compensation
7.6.5 Vector decoding processing and motion-compensation in progressive P- and S(GMC)-VOP
7.6.6 Overlapped motion compensation
7.6.7 Temporal prediction structure
7.6.8 Vector decoding process of non-scalable progressive B-VOPs
7.6.9 Motion compensation in non-scalable progressive B-VOPs
7.6.10 Motion Compensation Decoding of Reduced Resolution VOP
7.7 Interlaced video decoding
7.7.1 Field DCT and DC and AC Prediction
7.7.2 Motion compensation
7.8 Sprite decoding
7.8.1 Higher syntactic structures
7.8.2 Sprite Reconstruction
7.8.3 Low-latency sprite reconstruction
7.8.4 Sprite reference point decoding
7.8.5 Warping
7.8.6 Sample reconstruction
7.8.7 GMC decoding
7.9 Generalized scalable decoding
7.9.1 Temporal scalability
7.9.2 Spatial scalability
7.10 Still texture object decoding
7.10.1 Decoding of the DC subband
7.10.2 ZeroTree Decoding of the Higher Bands
7.10.3 Inverse Quantisation
7.10.4 Still Texture Error Resilience
7.10.5 Wavelet Tiling
7.10.6 Scalable binary shape object decoding
7.11 Mesh object decoding
7.11.1 Mesh geometry decoding
7.11.2 Decoding of mesh motion vectors
7.12 FBA object decoding
7.12.1 Frame based face object decoding
7.12.2 DCT based face object decoding
7.12.3 Decoding of the viseme parameter fap 1
7.12.4 Decoding of the viseme parameter fap 2
7.12.5 Fap masking
7.12.6 Frame Based Body Decoding
7.12.7 DCT based body object decoding
7.13 3D Mesh Object Decoding
7.13.1 Start codes and bit stuffing
7.13.2 The Topological Surgery decoding process
7.13.3 The Forest Split decoding process
7.13.4 Header decoder
7.13.5 partition type
7.13.6 Vertex Graph Decoder
7.13.7 Triangle Tree Decoder
7.13.8 Triangle Data Decoder
7.13.9 Forest Split decoder
7.13.10 Arithmetic decoder
7.14 NEWPRED mode decoding
7.14.1 Decoder Definition
7.14.2 Upstream message
7.15 Output of the decoding process
7.15.1 Video data
7.15.2 2D Mesh data
7.15.3 Face animation parameter data
8 Visual-Systems Composition Issues
8.1 Temporal Scalability Composition
8.2 Sprite Composition
8.3 Mesh Object Composition
8.4 Spatial Scalability composition
9 Profiles and Levels
9.1 Visual Object Types
9.2 Visual Profiles
9.3 Visual Profiles@Levels
9.3.1 Natural Visual
9.3.2 Synthetic Visual
9.3.3 Synthetic/Natural Hybrid Visual
Annex A (normative) Coding transforms
A.1 Discrete cosine transform for video texture
A.2 Discrete wavelet transform for still texture
A.2.1 Adding the mean
A.2.2 Wavelet filter
A.2.3 Symmetric extension
A.2.4 Decomposition level
A.2.5 Shape adaptive wavelet filtering and symmetric extension
A.3 Shape-Adaptive DCT (SA-DCT)
A.3.1 Definition of Forward SA-DCT
A.3.2 Definition of Inverse SA-DCT
A.4 SA-DCT with DC Separation and ΔDC Correction (ΔDC-SA-DCT)
A.4.1 Definition of Forward ΔDC-SA-DCT
A.4.2 Definition of Inverse ΔDC-SA-DCT
Annex B (normative) Variable length codes and arithmetic decoding
B.1 Variable length codes
B.1.1 Macroblock type
B.1.2 Macroblock pattern
B.1.3 Motion vector
B.1.4 DCT coefficients
B.1.5 Shape Coding
B.1.6 Sprite Coding
B.1.7 DCT based facial object decoding
B.1.8 Shape decoding for still texture object
B.2 Arithmetic Decoding
B.2.1 Arithmetic decoding for still texture object
B.2.2 Arithmetic decoding for shape decoding
B.2.3 FBA Object Decoding
Annex C (normative) Face and body object decoding tables and definitions
Annex D (normative) Video buffering verifier
D.1 Introduction
D.2 Video Rate Buffer Model Definition
D.3 Comparison between ISO/IEC 14496-2 VBV and the ISO/IEC 13818-2 VBV (Informative)
D.4 Video Complexity Model Definition
D.5 Video Reference Memory Model Definition
D.6 Interaction between VBV, VCV and VMV (informative)
D.7 Video Presentation Model Definition (informative)
Annex E (informative) Features supported by the algorithm
E.1 Error resilience
E.1.1 Resynchronization
E.1.2 Data Partitioning
E.1.3 Reversible VLC
E.1.4 Decoder Operation
E.1.5 Adaptive Intra Refresh (AIR) Method
E.1.6 NEWPRED
E.2 Complexity Estimation
E.3 Resynchronization in Case of Unknown Video Header Format
Annex F (informative) Preprocessing and postprocessing
F.1 VOP Generation Tools: Automatic and Semi-automatic Segmentations
F.1.1 Automatic Segmentation
F.1.2 Semi-automatic Segmentation
F.1.3 References
F.2 Bounding Rectangle of VOP Formation
F.3 Postprocessing for Coding Noise Reduction
F.3.1 Deblocking filter
F.3.2 Deringing filter
F.3.3 Further issues
F.4 Chrominance Decimation and Interpolation Filtering for Interlaced Object Coding
Annex G (normative) Profile and level indication and restrictions
Annex H (informative) Patent statements
H.1 Patent statements for ISO/IEC 14496 Version 1
H.2 Patent statements for the extensions provided in ISO/IEC 14496 Version 2
Annex I (informative) Encoder Complexity Reduction Based on Intelligent Pre-Quantisation
I.1 Introduction
I.2 Feature Selection and Pre-quantisation
I.3 Model Verification and Threshold Setting
I.3.1 H.263 Quantiser
I.3.2 MPEG-4 Quantiser
Annex J (normative) View dependent object scalability
J.1 Introduction
J.2 Decoding Process of a View-Dependent Object
J.2.1 General Decoding Scheme
J.2.2 Computation of the View-Dependent Scalability parameters
J.2.3 VD mask computation
J.2.4 Differential mask computation
J.2.5 DCT coefficients decoding
J.2.6 Texture update
J.2.7 IDCT
Annex K (normative) Decoder Configuration Information
K.1 Introduction
K.2 Description of the set up of a visual decoder (informative)
K.2.1 Processing of decoder configuration information
K.3 Specification of decoder configuration information
K.3.1 VideoObject
K.3.2 StillTextureObject
K.3.3 MeshObject
K.3.4 FaceObject
K.3.5 3DMeshObject
Annex L (informative) Rate control
L.1 Frame Rate Control
L.1.1 Introduction
L.1.2 Description
L.1.3 Summary
L.2 Multiple Video Object Rate Control
L.2.1 Initialization
L.2.2 Quantisation Level Calculation for I-frame and first P-frame
L.2.3 Update Rate-Distortion Model
L.2.4 Post-Frameskip Control
L.3 Macroblock Rate Control
L.3.1 Rate-Distortion Model
L.3.2 Target Number of Bits for Each Macroblock
L.3.3 Macroblock Rate Control
Annex M (informative) Binary shape coding
M.1 Introduction
M.2 Context-Based Arithmetic Shape Coding
M.2.1 Intra Mode
M.2.2 Inter Mode
M.3 Texture Coding of Boundary Blocks
M.4 Encoder Architecture
M.5 Encoding Guidelines
M.5.1 Lossy Shape Coding
M.5.2 Coding Mode Selection
M.6 Conclusions
M.7 References
Annex N (normative) Visual profiles@levels
Annex O (informative) 3D Mesh Coding
O.1 Introduction
O.2 Topological Surgery Representation
O.2.1 Simple Polygon Representation
O.2.2 Vertex Graph representation
O.3 Encoding guidelines for 3D Mesh Coding
O.3.1 Topological Surgery Encoding
O.3.2 Support for non-manifolds and Non-orientable manifolds
O.3.3 Support for Error Resilience
O.4 Encoder considerations for efficient compression of Vertex Properties
O.5 Progressive Forest Split Representation
O.5.1 Encoding the Forest
O.5.2 Support for meshes with polygonal faces
O.5.3 Method for generating of a PFS Representation of a Triangular 3D Mesh
O.5.4 Topological Tests
O.5.5 Geometric Tests
O.6 Complexity estimation for Computational Graceful Degradation
O.7 QoS for SNHC through upstream
O.8 References
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission)
form the specialized system for
...







