Information technology - Generic coding of moving pictures and associated audio information: Video

Technologies de l'information — Codage générique des images animées et du son associé: Données vidéo

General Information

Status
Withdrawn
Publication Date
15-May-1996
Withdrawal Date
15-May-1996
Current Stage
9599 - Withdrawal of International Standard
Start Date
21-Dec-2000
Completion Date
30-Oct-2025
Ref Project

Relations

Standard
ISO/IEC 13818-2:1996 - Information technology -- Generic coding of moving pictures and associated audio information: Video
English language
201 pages
Standard
ISO/IEC 13818-2:1996 - Information technology — Generic coding of moving pictures and associated audio information: Video Released:10/10/1996
French language
201 pages

Frequently Asked Questions

ISO/IEC 13818-2:1996 is a standard published by the International Organization for Standardization (ISO). Its full title is "Information technology - Generic coding of moving pictures and associated audio information: Video". This standard covers: Information technology - Generic coding of moving pictures and associated audio information: Video


ISO/IEC 13818-2:1996 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.040.40 - Coding of audio, video, multimedia and hypermedia information. The ICS classification helps identify the subject area and facilitates finding related standards.

ISO/IEC 13818-2:1996 has the following relationships with other standards: it has inter-standard links to ISO 14673-3:2004, ISO/IEC 13818-2:1996/Amd 5:2000, ISO/IEC 13818-2:1996/Amd 1:1997, ISO/IEC 13818-2:1996/Amd 3:1998, ISO/IEC 13818-2:1996/Amd 2:1997, ISO/IEC 13818-2:1996/Amd 4:1999, ISO/IEC 13818-2:1996/Cor 2:1997, ISO/IEC 13818-2:1996/Cor 1:1997, ISO/IEC 13818-2:2000; and it is excused to ISO/IEC 13818-2:1996/Amd 3:1998, ISO/IEC 13818-2:1996/Cor 2:1997, ISO/IEC 13818-2:1996/Amd 2:1997, ISO/IEC 13818-2:1996/Amd 1:1997, ISO/IEC 13818-2:1996/Cor 1:1997, ISO/IEC 13818-2:1996/Amd 4:1999. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

You can purchase ISO/IEC 13818-2:1996 directly from iTeh Standards. The document is available in PDF format and is delivered instantly after payment. Add the standard to your cart and complete the secure checkout process. iTeh Standards is an authorized distributor of ISO standards.

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 13818-2
First edition
1996-05-15
Information technology - Generic coding
of moving pictures and associated audio
information: Video
Technologies de l'information - Codage des images animées et du son
associé: Vidéo
Reference number
ISO/IEC 13818-2:1996(E)
ISO/IEC 13818-2: 1996(E)
CONTENTS
1 Scope
2 Normative references
3 Definitions
4 Abbreviations and symbols
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bitwise operators
4.5 Assignment
4.6 Mnemonics
4.7 Constants
5 Conventions
5.1 Method of describing bitstream syntax
5.2 Definition of functions
5.3 Reserved, forbidden and marker_bit
5.4 Arithmetic precision
6 Video bitstream syntax and semantics
6.1 Structure of coded video data
6.2 Video bitstream syntax
6.3 Video bitstream semantics
7 The video decoding process
7.1 Higher syntactic structures
7.2 Variable length decoding
7.3 Inverse scan
7.4 Inverse quantisation
7.5 Inverse DCT
7.6 Motion compensation
7.7 Spatial scalability
7.8 SNR scalability
7.9 Temporal scalability
7.10 Data partitioning
7.11 Hybrid scalability
7.12 Output of the decoding process
8 Profiles and levels
8.1 ISO/IEC 11172-2 compatibility
8.2 Relationship between defined profiles
8.3 Relationship between defined levels
8.4 Scalable layers
8.5 Parameter values for defined profiles, levels and layers
© ISO/IEC 1996
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or
utilized in any form or by any means, electronic or mechanical, including photocopying and
microfilm, without permission in writing from the publisher.
ISO/IEC Copyright Office • Case postale 56 • CH-1211 Genève 20 • Switzerland
Printed in Switzerland
Annex A - Discrete cosine transform
Annex B - Variable length code tables
B.1 Macroblock addressing
B.2 Macroblock type
B.3 Macroblock pattern
B.4 Motion vectors
B.5 DCT coefficients
Annex C - Video buffering verifier
Annex D - Features supported by the algorithm
D.1 Overview
D.2 Video formats
D.3 Picture quality
D.4 Data rate control
D.5 Low delay mode
D.6 Random access/channel hopping
D.7 Scalability
D.8 Compatibility
D.9 Differences between this Specification and ISO/IEC 11172-2
D.10 Complexity
D.11 Editing encoded bitstreams
D.12 Trick modes
D.13 Error resilience
D.14 Concatenated sequences
Annex E - Profile and level restrictions
E.1 Syntax element restrictions in profiles
E.2 Permissible layer combinations
Annex F - Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the
International Electrotechnical Commission) form the specialized system for
worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields
of technical activity. ISO and IEC technical committees collaborate in fields of
mutual interest. Other international organizations, governmental and
non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint
technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the
joint technical committee are circulated to national bodies for voting. Publication
as an International Standard requires approval by at least 75 % of the national
bodies casting a vote.
International Standard ISO/IEC 13818-2 was prepared by Joint Technical
Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29,
Coding of audio, picture, multimedia and hypermedia information, in
collaboration with ITU-T. The identical text is published as ITU-T
Recommendation H.262.
ISO/IEC 13818 consists of the following parts, under the general title Information
technology - Generic coding of moving pictures and associated audio
information:
- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Compliance testing
- Part 6: Extensions for DSM-CC
- Part 9: Extension for real time interface for systems decoders
Annexes A to C form an integral part of this part of ISO/IEC 13818. Annexes D to
F are for information only.
Introduction
Intro. 1 Purpose
This Part of this Specification was developed in response to the growing need for a generic coding method of moving
pictures and of associated sound for various applications such as digital storage media, television broadcasting and
communication. The use of this Specification means that motion video can be manipulated as a form of computer data
and can be stored on various storage media, transmitted and received over existing and future networks and distributed
on existing and future broadcasting channels.
Intro. 2 Application
The applications of this Specification cover, but are not limited to, such areas as listed below:
BSS Broadcasting Satellite Service (to the home)
CATV Cable TV Distribution on optical networks, copper, etc.
CDAD Cable Digital Audio Distribution
DSB Digital Sound Broadcasting (terrestrial and satellite broadcasting)
DTTB Digital Terrestrial Television Broadcasting
EC Electronic Cinema
ENG Electronic News Gathering (including SNG, Satellite News Gathering)
FSS Fixed Satellite Service (e.g. to head ends)
HTT Home Television Theatre
IPC Interpersonal Communications (videoconferencing, videophone, etc.)
ISM Interactive Storage Media (optical disks, etc.)
MMM Multimedia Mailing
NCA News and Current Affairs
NDB Networked Database Services (via ATM, etc.)
RVS Remote Video Surveillance
SSM Serial Storage Media (digital VTR, etc.)
Intro. 3 Profiles and levels
This Specification is intended to be generic in the sense that it serves a wide range of applications, bitrates, resolutions,
qualities and services. Applications should cover, among other things, digital storage media, television broadcasting and
communications. In the course of creating this Specification, various requirements from typical applications have been
considered, necessary algorithmic elements have been developed, and they have been integrated into a single syntax.
Hence, this Specification will facilitate the bitstream interchange among different applications.
Considering the practicality of implementing the full syntax of this Specification, however, a limited number of subsets
of the syntax are also stipulated by means of “profile” and “level”. These and other related terms are formally defined in
clause 3.
A “profile” is a defined subset of the entire bitstream syntax that is defined by this Specification. Within the bounds
imposed by the syntax of a given profile it is still possible to require a very large variation in the performance of
encoders and decoders depending upon the values taken by parameters in the bitstream. For instance, it is possible to
specify frame sizes as large as (approximately) 2^14 samples wide by 2^14 lines high. It is currently neither practical nor
economic to implement a decoder capable of dealing with all possible frame sizes.
In order to deal with this problem, “levels” are defined within each profile. A level is a defined set of constraints
imposed on parameters in the bitstream. These constraints may be simple limits on numbers. Alternatively they may take
the form of constraints on arithmetic combinations of the parameters (e.g. frame width multiplied by frame height
multiplied by frame rate).
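The two kinds of constraint described above can be sketched as a small check. This is only an illustration: the `LevelLimits` container, the `fits_level` function and the numbers in `EXAMPLE_LEVEL` are assumptions made for this sketch and are not quoted from clause 8, which holds the normative parameter values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    """Hypothetical container for one level's constraints (names are this sketch's own)."""
    max_width: int         # samples per line
    max_height: int        # lines per frame
    max_frame_rate: float  # frames per second
    max_sample_rate: int   # samples per second (width * height * frame rate)


# Illustrative numbers only; they are assumptions of this sketch.
EXAMPLE_LEVEL = LevelLimits(720, 576, 30.0, 10_368_000)


def fits_level(width, height, frame_rate, limits):
    """Return True if the parameters satisfy every constraint of the level."""
    # Simple limits on individual numbers.
    simple_ok = (
        width <= limits.max_width
        and height <= limits.max_height
        and frame_rate <= limits.max_frame_rate
    )
    # Constraint on an arithmetic combination of the parameters.
    combined_ok = width * height * frame_rate <= limits.max_sample_rate
    return simple_ok and combined_ok


print(fits_level(720, 576, 25.0, EXAMPLE_LEVEL))
print(fits_level(720, 576, 30.0, EXAMPLE_LEVEL))
```

With these illustrative numbers, the second call passes each simple limit but fails the combined one, which is why a level may need both kinds of constraint.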
Bitstreams complying with this Specification use a common syntax. In order to achieve a subset of the complete syntax,
flags and parameters are included in the bitstream that signal the presence or otherwise of syntactic elements that occur
later in the bitstream. In order to specify constraints on the syntax (and hence define a profile) it is thus only necessary to
constrain the values of these flags and parameters that specify the presence of later syntactic elements.

Intro. 4 The scalable and the non-scalable syntax
The full syntax can be divided into two major categories: One is the non-scalable syntax, which is structured as a
superset of the syntax defined in ISO/IEC 11172-2. The main feature of the non-scalable syntax is the extra compression
tools for interlaced video signals. The second is the scalable syntax, the key property of which is to enable the
reconstruction of useful video from pieces of a total bitstream. This is achieved by structuring the total bitstream in two
or more layers, starting from a standalone base layer and adding a number of enhancement layers. The base layer can use
the non-scalable syntax, or in some situations conform to the ISO/IEC 11172-2 syntax.
Intro. 4.1 Overview of the non-scalable syntax
The coded representation defined in the non-scalable syntax achieves a high compression ratio while preserving good
image quality. The algorithm is not lossless as the exact sample values are not preserved during coding. Obtaining good
image quality at the bitrates of interest demands very high compression, which is not achievable with intra picture
coding alone. The need for random access, however, is best satisfied with pure intra picture coding. The choice of the
techniques is based on the need to balance a high image quality and compression ratio with the requirement to make
random access to the coded bitstream.
A number of techniques are used to achieve high compression. The algorithm first uses block-based motion
compensation to reduce the temporal redundancy. Motion compensation is used both for causal prediction of the current
picture from a previous picture, and for non-causal, interpolative prediction from past and future pictures. Motion
vectors are defined for each 16-sample by 16-line region of the picture. The prediction error is further compressed using
the Discrete Cosine Transform (DCT) to remove spatial correlation before it is quantised in an irreversible process that
discards the less important information. Finally, the motion vectors are combined with the quantised DCT information,
and encoded using variable length codes.
Intro. 4.1.1 Temporal processing
Because of the conflicting requirements of random access and highly efficient compression, three main picture types are
defined. Intra Coded Pictures (I-Pictures) are coded without reference to other pictures. They provide access points to
the coded sequence where decoding can begin, but are coded with only moderate compression. Predictive Coded
Pictures (P-Pictures) are coded more efficiently using motion compensated prediction from a past intra or predictive
coded picture and are generally used as a reference for further prediction. Bidirectionally-predictive Coded Pictures
(B-Pictures) provide the highest degree of compression but require both past and future reference pictures for motion
compensation. Bidirectionally-predictive coded pictures are never used as references for prediction (except in the case
that the resulting picture is used as a reference in a spatially scalable enhancement layer). The organisation of the three
picture types in a sequence is very flexible. The choice is left to the encoder and will depend on the requirements of the
application. Figure Intro. 1 illustrates an example of the relationship among the three different picture types.
Figure Intro. 1 - Example of temporal picture structure (figure not reproduced; it carries the label "Bidirectional Interpolation")
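The dependencies among the three picture types can be sketched as follows. This is a simplified illustration, not the normative decoding process: it assumes frame pictures given in display order, assumes every B-picture has both a past and a future reference, and ignores the scalable-layer exception. The function name `reference_pictures` is hypothetical.

```python
def reference_pictures(display_types):
    """Map each picture index (display order) to the indices it is predicted from.

    display_types is a string such as "IBBPBBP" of picture types in display order.
    """
    # Only I- and P-pictures serve as references in this simplified model.
    anchors = [i for i, t in enumerate(display_types) if t in "IP"]
    refs = {}
    for i, t in enumerate(display_types):
        past = [a for a in anchors if a < i]
        future = [a for a in anchors if a > i]
        if t == "I":
            refs[i] = []                      # coded without reference to other pictures
        elif t == "P":
            if not past:
                raise ValueError(f"P-picture {i} has no past reference")
            refs[i] = [past[-1]]              # nearest past I- or P-picture
        elif t == "B":
            if not past or not future:
                raise ValueError(f"B-picture {i} needs past and future references")
            refs[i] = [past[-1], future[0]]   # nearest past and nearest future reference
        else:
            raise ValueError(f"unknown picture type {t!r}")
    return refs


print(reference_pictures("IBBPBBP"))
```

Because a B-picture depends on a reference that comes later in display order, a decoder must receive that reference first; this is one reason the arrangement of picture types is left to the encoder and the application.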
Intro. 4.1.2 Coding interlaced video
Each frame of interlaced video consists of two fields which are separated by one field-period. The Specification allows
either the frame to be encoded as a picture or the two fields to be encoded as two pictures. Frame encoding or field
encoding can be adaptively selected on a frame-by-frame basis. Frame encoding is typically preferred when the video
scene contains significant detail with limited motion. Field encoding, in which the second field can be predicted from the
first, works better when there is fast movement.
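The relationship between a frame and its two fields can be sketched as below. The sketch assumes a frame stored as a list of lines with an even line count, with the top field taking lines 0, 2, 4, ... and the bottom field the lines immediately below them; the function names are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of lines) into its top and bottom fields."""
    top = frame[0::2]     # lines 0, 2, 4, ...
    bottom = frame[1::2]  # lines 1, 3, 5, ... (each just below the matching top-field line)
    return top, bottom


def merge_fields(top, bottom):
    """Interleave two fields back into a frame (assumes equal line counts)."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
top, bottom = split_fields(frame)
print(top, bottom)
print(merge_fields(top, bottom) == frame)
```

Frame encoding would code `frame` as one picture, whereas field encoding would code `top` and `bottom` as two pictures.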
Intro. 4.1.3 Motion representation - Macroblocks
As in ISO/IEC 11172-2, the choice of 16 by 16 macroblocks for the motion-compensation unit is a result of the trade-off
between the coding gain provided by using motion information and the overhead needed to represent it. Each
macroblock can be temporally predicted in one of a number of different ways. For example, in frame encoding, the
prediction from the previous reference frame can itself be either frame-based or field-based. Depending on the type of
the macroblock, motion vector information and other side information is encoded with the compressed prediction error
in each macroblock. The motion vectors are encoded differentially with respect to the last encoded motion vectors using
variable length codes. The maximum length of the motion vectors that may be represented can be programmed, on a
picture-by-picture basis, so that the most demanding applications can be met without compromising the performance of
the system in more normal situations.
It is the responsibility of the encoder to calculate appropriate motion vectors. This Specification does not specify how
this should be done.
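Differential coding of motion vectors, as described above, can be illustrated with a minimal sketch. It shows only the differencing step against the last encoded vector; the variable length codes, the programmable vector range and the other details of the normative process are deliberately left out, and the function names and the `(0, 0)` starting predictor are assumptions of this sketch.

```python
def differential_encode(vectors, initial=(0, 0)):
    """Replace each (dx, dy) vector by its difference from the previous vector."""
    predictor = initial
    deltas = []
    for dx, dy in vectors:
        deltas.append((dx - predictor[0], dy - predictor[1]))
        predictor = (dx, dy)  # the last encoded vector predicts the next one
    return deltas


def differential_decode(deltas, initial=(0, 0)):
    """Invert differential_encode by accumulating the differences."""
    predictor = initial
    vectors = []
    for ddx, ddy in deltas:
        predictor = (predictor[0] + ddx, predictor[1] + ddy)
        vectors.append(predictor)
    return vectors


vectors = [(3, -1), (4, -1), (4, 0), (2, 2)]
deltas = differential_encode(vectors)
print(deltas)
print(differential_decode(deltas) == vectors)
```

Neighbouring macroblocks often move similarly, so the differences tend to be small, which suits short variable length codes.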
Intro. 4.1.4 Spatial redundancy reduction
Both source pictures and prediction errors have high spatial redundancy. This Specification uses a block-based DCT
method with visually weighted quantisation and run-length coding. After motion compensated prediction or
interpolation, the resulting prediction error is split into 8 by 8 blocks. These are transformed into the DCT domain where
they are weighted before being quantised. After quantisation many of the DCT coefficients are zero in value and so
two-dimensional run-length and variable length coding is used to encode the remaining DCT coefficients efficiently.
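The weighting, quantisation and run-length steps can be sketched on a short list of coefficients that is assumed to be already in scan order. This is not the normative arithmetic of clause 7.4: the truncating division, the weights, the scale value and the function names are all assumptions chosen to keep the example small, and the DCT itself is omitted.

```python
def quantise(coeffs, weights, scale):
    """Divide each coefficient by its weighted step size, truncating toward zero."""
    return [int(c / (w * scale)) for c, w in zip(coeffs, weights)]


def run_level_pairs(levels):
    """Encode non-zero levels as (run of preceding zeros, level) pairs.

    Trailing zeros produce no pair; a real coder would signal them with an
    end-of-block code.
    """
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


coeffs = [80, -33, 0, 12, 0, 0, -9, 3]   # illustrative DCT coefficients in scan order
weights = [1, 2, 2, 3, 3, 3, 4, 4]       # larger weights for higher frequencies
levels = quantise(coeffs, weights, scale=4)
print(levels)
print(run_level_pairs(levels))
```

The coarser steps applied to higher frequencies turn the small trailing coefficients into zeros, so the whole block reduces to three (run, level) pairs.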
Intro. 4.1.5 Chrominance formats
In addition to the 4:2:0 format supported in ISO/IEC 11172-2 this Specification supports 4:2:2 and 4:4:4 chrominance
formats.
Intro. 4.2 Scalable extensions
The scalability tools in this Specification are designed to support applications beyond those supported by single layer
video. Among the noteworthy application areas addressed are video telecommunications, video on Asynchronous
Transfer Mode networks (ATM), interworking of video standards, video service hierarchies with multiple spatial,
temporal and quality resolutions, HDTV with embedded TV, systems allowing migration to higher temporal resolution
HDTV, etc. Although a simple solution to scalable video is the simulcast technique which is based on
transmission/storage of multiple independently coded reproductions of video, a more efficient alternative is scalable
video coding, in which the bandwidth allocated to a given reproduction of video can be partially re-utilised in coding of
the next reproduction of video. In scalable video coding, it is assumed that given a coded bitstream, decoders of various
complexities can decode and display appropriate reproductions of coded video. A scalable video encoder is likely to
have increased complexity when compared to a single layer encoder. However, this Recommendation | International
Standard provides several different forms of scalabilities that address non-overlapping applications with corresponding
complexities. The basic scalability tools offered are:
- data partitioning;
- SNR scalability;
- spatial scalability; and
- temporal scalability.
Moreover, combinations of these basic scalability tools are also supported and are referred to as hybrid scalability. In the
case of basic scalability, two layers of video referred to as the lower layer and the enhancement layer are allowed,
whereas in hybrid scalability up to three layers are supported. Tables Intro. 1 to Intro. 3 provide a few example
applications of various scalabilities.
Table Intro. 1 - Applications of SNR scalability
Lower layer                 | Enhancement layer                         | Application
Recommendation ITU-R BT.601 | Same resolution and format as lower layer | Two quality service for Standard TV (SDTV)
High Definition             | Same resolution and format as lower layer | Two quality service for HDTV
4:2:0 high definition       | 4:2:2 chroma simulcast                    | Video production / distribution
Table Intro. 2 - Applications of spatial scalability
Base                | Enhancement         | Application
Progressive (30 Hz) | Progressive (30 Hz) |
Interlace (30 Hz)   | Interlace (30 Hz)   | HDTV/SDTV scalability
Progressive (30 Hz) | Interlace (30 Hz)   | ISO/IEC 11172-2 compatibility with this Specification
Interlace (30 Hz)   | Progressive (60 Hz) | Migration to high resolution progressive HDTV
Table Intro. 3 - Applications of temporal scalability
Base                | Enhancement         | Higher              | Application
Progressive (30 Hz) | Progressive (30 Hz) | Progressive (60 Hz) | Migration to high resolution progressive HDTV
Interlace (30 Hz)   | Interlace (30 Hz)   | Progressive (60 Hz) | Migration to high resolution progressive HDTV
Intro. 4.2.1 Spatial scalable extension
Spatial scalability is a tool intended for use in video applications involving telecommunications, interworking of video
standards, video database browsing, interworking of HDTV and TV, etc., i.e. video systems with the primary common
feature that a minimum of two layers of spatial resolution are necessary. Spatial scalability involves generating two
spatial resolution video layers from a single video source such that the lower layer is coded by itself to provide the basic
spatial resolution and the enhancement layer employs the spatially interpolated lower layer and carries the full spatial
resolution of the input video source. The lower and the enhancement layers may either both use the coding tools in this
Specification, or the ISO/IEC 11172-2 Standard for the lower layer and this Specification for the enhancement layer.
The latter case achieves a further advantage by facilitating interworking between video coding standards. Moreover,
spatial scalability offers flexibility in choice of video formats to be employed in each layer. An additional advantage of
spatial scalability is its ability to provide resilience to transmission errors as the more important data of the lower layer
can be sent over a channel with better error performance, while the less critical enhancement layer data can be sent over a
channel with poor error performance.
Intro. 4.2.2 SNR scalable extension
SNR scalability is a tool intended for use in video applications involving telecommunications, video services with
multiple qualities, standard TV and HDTV, i.e. video systems with the primary common feature that a minimum of two
layers of video quality are necessary. SNR scalability involves generating two video layers of same spatial resolution but
different video qualities from a single video source such that the lower layer is coded by itself to provide the basic video
quality and the enhancement layer is coded to enhance the lower layer. The enhancement layer when added back to the
lower layer regenerates a higher quality reproduction of the input video. The lower and the enhancement layers may
either both use this Specification, or use the ISO/IEC 11172-2 Standard for the lower layer and this Specification for the
enhancement layer. An additional advantage of SNR scalability is its ability to provide a high degree of resilience to
transmission errors, as the more important data of the lower layer can be sent over a channel with better error
performance, while the less critical enhancement layer data can be sent over a channel with poor error performance.
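The idea of an enhancement layer that is added back to the lower layer can be sketched with two quantisers. This is an assumption-laden illustration rather than the normative process of clause 7.8: it models the lower layer as a coarse quantisation of some coefficients and the enhancement layer as a finer quantisation of what the lower layer missed. The step sizes and function names are hypothetical.

```python
def encode_two_layers(coeffs, coarse_step, fine_step):
    """Return (lower-layer levels, enhancement-layer levels) for the coefficients."""
    lower = [round(c / coarse_step) for c in coeffs]
    # The enhancement layer codes the error left by the lower layer.
    residual = [c - level * coarse_step for c, level in zip(coeffs, lower)]
    enhancement = [round(r / fine_step) for r in residual]
    return lower, enhancement


def decode(lower, coarse_step, enhancement=None, fine_step=None):
    """Reconstruct from the lower layer alone, or from both layers."""
    recon = [level * coarse_step for level in lower]
    if enhancement is not None:
        recon = [r + e * fine_step for r, e in zip(recon, enhancement)]
    return recon


coeffs = [100, -37, 13, 5]
lower, enhancement = encode_two_layers(coeffs, coarse_step=16, fine_step=4)
print(decode(lower, 16))                  # basic quality
print(decode(lower, 16, enhancement, 4))  # higher quality
```

A decoder that receives only the lower layer still produces a usable result; one that receives both layers ends up closer to the original values.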
Intro. 4.2.3 Temporal scalable extension
Temporal scalability is a tool intended for use in a range of diverse video applications from telecommunications
to HDTV for which migration to higher temporal resolution systems from that of lower temporal resolution systems may
be necessary. In many cases, the lower temporal resolution video systems may be either the existing systems or the less
expensive early generation systems, with the motivation of introducing more sophisticated systems gradually. Temporal
scalability involves partitioning of video frames into layers, where the lower layer is coded by itself to provide the
basic temporal rate and the enhancement layer is coded with temporal prediction with respect to the lower layer; these
layers, when decoded and temporally multiplexed, yield the full temporal resolution of the video source. The lower
temporal resolution systems may only decode the lower layer to provide basic temporal resolution, whereas more
sophisticated systems of the future may decode both layers and provide high temporal resolution video while maintaining
interworking with earlier generation systems. An additional advantage of temporal scalability is its ability to provide
resilience to transmission errors as the more important data of the lower layer can be sent over a channel with better error
performance, while the less critical enhancement layer can be sent over a channel with poor error performance.
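The partitioning and temporal multiplexing of frames can be sketched as below. The sketch assumes the simplest split, alternate frames to each layer, and it ignores the coding of each layer (including the temporal prediction of the enhancement layer from the lower layer); the function names are hypothetical.

```python
def split_temporal_layers(frames):
    """Partition frames into a lower layer and an enhancement layer (alternate frames)."""
    return frames[0::2], frames[1::2]


def multiplex_temporal_layers(lower, enhancement):
    """Interleave decoded layers back into the full temporal resolution sequence."""
    out = []
    for i, frame in enumerate(lower):
        out.append(frame)
        if i < len(enhancement):
            out.append(enhancement[i])
    return out


frames = ["A", "B", "C", "D", "E", "F", "G"]
lower, enhancement = split_temporal_layers(frames)
print(lower)                                          # basic temporal rate
print(multiplex_temporal_layers(lower, enhancement))  # full temporal rate
```

A lower temporal resolution system would display `lower` only, at half the frame rate in this example, while a more capable system multiplexes both layers.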
Intro. 4.2.4 Data partitioning extension
Data partitioning is a tool intended for use when two channels are available for transmission and/or storage of a
video bitstream, as may be the case in ATM networks, terrestrial broadcast, magnetic media, etc. The bitstream is
partitioned between these channels such that more critical parts of the bitstream (such as headers, motion vectors, low
frequency DCT coefficients) are transmitted in the channel with the better error performance, and less critical data (such
as higher frequency DCT coefficients) is transmitted in the channel with poor error performance. Thus, degradation due to
channel errors is minimised since the critical parts of a bitstream are better protected. Data from neither channel may be
decoded on a decoder that is not intended for decoding data partitioned bitstreams.
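The split between more critical and less critical data can be sketched for the DCT coefficients of a single block. This is a simplified illustration under stated assumptions: coefficients are taken to be in scan order from low to high frequency, headers and motion vectors are left out, and the split point `split_at` and the function names are hypothetical rather than syntax elements of this Specification.

```python
def partition_coefficients(coeffs, split_at):
    """Split scan-ordered coefficients into (low-frequency, high-frequency) parts."""
    return coeffs[:split_at], coeffs[split_at:]


def reassemble(low_part, high_part, block_size):
    """Rebuild a block; if the less critical part was lost, substitute zeros."""
    if high_part is None:
        high_part = [0] * (block_size - len(low_part))
    return low_part + high_part


coeffs = [20, -4, 0, 1, 0, 2, 0, 0]
low, high = partition_coefficients(coeffs, split_at=3)
print(reassemble(low, high, len(coeffs)))  # both channels received
print(reassemble(low, None, len(coeffs)))  # poorer channel lost
```

Losing the data sent over the poorer channel removes only the higher frequency detail, which is the sense in which degradation is limited.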

INTERNATIONAL STANDARD
ITU-T RECOMMENDATION
INFORMATION TECHNOLOGY -
GENERIC CODING OF MOVING PICTURES AND
ASSOCIATED AUDIO INFORMATION: VIDEO
1 Scope
This Recommendation | International Standard specifies the coded representation of picture information for digital
storage media and digital video communication and specifies the decoding process. The representation supports constant
bitrate transmission, variable bitrate transmission, random access, channel hopping, scalable decoding, bitstream editing,
as well as special functions such as fast forward playback, fast reverse playback, slow motion, pause and still pictures.
This Recommendation | International Standard is forward compatible with ISO/IEC 11172-2 and upward or downward
compatible with EDTV, HDTV, SDTV formats.
This Recommendation | International Standard is primarily applicable to digital storage media, video broadcast and
communication. The storage media may be directly connected to the decoder, or via communications means such as
busses, LANs, or telecommunications links.
2 Normative references
The following Recommendations and International Standards contain provisions which, through reference in this text,
constitute provisions of this Recommendation | International Standard. At the time of publication, the editions indicated
were valid. All Recommendations and Standards are subject to revision, and parties to agreements based on this
Recommendation | International Standard are encouraged to investigate the possibility of applying the most recent
edition of the Recommendations and Standards indicated below. Members of IEC and ISO maintain registers of
currently valid International Standards. The Telecommunication Standardization Bureau of the ITU maintains a list of
currently valid ITU-T Recommendations.
- Recommendations and Reports of the CCIR, 1990, XVIIth Plenary Assembly, Dusseldorf 1990,
  Volume XI - Part 1 Broadcasting Service (Television) - Recommendation ITU-R BT.601-3 Encoding
  parameters of digital television for studios.
- CCIR Volume X and XI Part 3 - Recommendation ITU-R BR.648 Recording of audio signals.
- CCIR Volume X and XI Part 3 - Report ITU-R 955-2 Satellite sound broadcasting to vehicular, portable
  and fixed receivers in the range 500 - 3000 MHz.
- ISO/IEC 11172-1:1993, Information technology - Coding of moving pictures and associated audio for
  digital storage media at up to about 1,5 Mbit/s - Part 1: Systems.
- ISO/IEC 11172-2:1993, Information technology - Coding of moving pictures and associated audio for
  digital storage media at up to about 1,5 Mbit/s - Part 2: Video.
- ISO/IEC 11172-3:1993, Information technology - Coding of moving pictures and associated audio for
  digital storage media at up to about 1,5 Mbit/s - Part 3: Audio.
- IEEE Standard Specifications for the Implementations of 8 by 8 Inverse Discrete Cosine Transform, IEEE
  Std 1180-1990, December 6, 1990.
- IEC Publication 908:1987, Compact disc digital audio system.
- IEC Publication 461:1986, Time and control code for video tape recorders.
- ITU-T Recommendation H.261 (1993), Video codec for audiovisual services at p x 64 kbit/s.
- CCITT Recommendation T.81 (1992) (JPEG) | ISO/IEC 10918-1:1994, Information technology - Digital
  compression and coding of continuous-tone still images - Requirements and guidelines.
ITU-T Rec. H.262 (1995 E) 1
3 Definitions
For the purposes of this Recommendation | International Standard, the following definitions apply.
3.1 AC coefficient: Any DCT coefficient for which the frequency in one or both dimensions is non-zero.
3.2 big picture: A coded picture that would cause VBV buffer underflow as defined in C.7. Big pictures can only
occur in sequences where low_delay is equal to 1. “Skipped picture” is a term that is sometimes used to describe the
same concept.
3.3 B-field picture: A field structure B-Picture.
3.4 B-frame picture: A frame structure B-Picture.
3.5 B-picture; bidirectionally predictive-coded picture: A picture that is coded using motion compensated
prediction from past and/or future reference fields or frames.
3.6 backward compatibility: A newer coding standard is backward compatible with an older coding standard if
decoders designed to operate with the older coding standard are able to continue to operate by decoding all or part of a
bitstream produced according to the newer coding standard.
3.7 backward motion vector: A motion vector that is used for motion compensation from a reference frame or
reference field at a later time in display order.
3.8 backward prediction: Prediction from the future reference frame (field).
3.9 base layer: First, independently decodable layer of a scalable hierarchy.
3.10 bit-stream; stream: An ordered series of bits that forms the coded representation of the data.
3.11 bitrate: The rate at which the coded bitstream is delivered from the storage medium to the input of a decoder.
3.12 block: An 8-row by 8-column matrix of samples, or 64 DCT coefficients (source, quantised or dequantised).
3.13 bottom field: One of two fields that comprise a frame. Each line of a bottom field is spatially located
immediately below the corresponding line of the top field.
3.14 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8 bits from the first bit in
the stream.
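As an illustration of 3.14, the test reduces to a modulo check on the bit position. The sketch below assumes that positions are counted from 0 at the first bit in the stream; the function name `is_byte_aligned` is hypothetical and is not part of this Recommendation | International Standard.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """Return True if bit_position is a multiple of 8 bits from the first bit.

    Assumption: the first bit in the stream has position 0.
    """
    if bit_position < 0:
        raise ValueError("bit position must be non-negative")
    return bit_position % 8 == 0


# Positions 0, 8, 16, ... are byte aligned; position 13 is not.
aligned_positions = [p for p in range(24) if is_byte_aligned(p)]
```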
3.15 byte: Sequence of 8 bits.
3.16 channel: A digital medium that stores or transports a bitstream constructed according to ITU-T Rec. H.262 |
ISO/IEC 13818-2.
3.17 chrominance format: Defines the number of chrominance blocks in a macroblock.
3.18 chroma simulcast: A type of scalability (which is a subset of SNR scalability) where the enhancement layer(s)
contain only coded refinement data for the DC coefficients, and all the data for the AC coefficients, of the chrominance
components.
3.19 chrominance component: A matrix, block or single sample representing one of the two colour difference
signals related to the primary colours in the manner defined in the bitstream. The symbols used for the chrominance
signals are Cr and Cb.
3.20 coded B-frame: A B-frame picture or a pair of B-field pictures.
3.21 coded frame: A coded frame is a coded I-frame, a coded P-frame or a coded B-frame.
3.22 coded I-frame: An I-frame picture or a pair of field pictures, where the first field picture is an I-picture and
the second field picture is an I-picture or a P-picture.
3.23 coded P-frame: A P-frame picture or a pair of P-field pictures.
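The field-pair rules in 3.20, 3.22 and 3.23 can be summarised as a small lookup. This is a minimal sketch under stated assumptions: picture types are represented by the single letters 'I', 'P' and 'B', and the function name `classify_field_pair` is hypothetical. It is not a conformance check.

```python
def classify_field_pair(first, second):
    """Classify a pair of field pictures as a coded frame type.

    `first` and `second` are the picture types ('I', 'P' or 'B') of the
    two field pictures, in that order. Returns 'I', 'P' or 'B' for a pair
    that forms a coded I-, P- or B-frame, or None if it matches none of
    the definitions.
    """
    if first == "I" and second in ("I", "P"):
        return "I"  # coded I-frame: I-picture followed by an I- or P-picture
    if first == "P" and second == "P":
        return "P"  # coded P-frame: pair of P-field pictures
    if first == "B" and second == "B":
        return "B"  # coded B-frame: pair of B-field pictures
    return None
```

Note that the pair ('I', 'P') counts as a coded I-frame, whereas ('P', 'I') matches none of the three definitions.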
3.24 coded picture: A coded picture is made of a picture header, the optional extensions immediately following it,
and the following picture data. A coded picture may be a coded frame or a coded field.
3.25 coded video bitstream: A coded representation of a series of one or more pictures as defined in ITU-T
Rec. H.262 | ISO/IEC 13818-2.
3.26 coded order: The order in which the pictures are transmitted and decoded. This order is not necessarily the
same as the display order.
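The difference between coded order and display order can be illustrated with a short sketch. It assumes a simple pattern in which every B-picture is transmitted after the next reference picture (I or P) that follows it in display order; the labels such as "B1" and the function name `display_to_coded_order` are hypothetical.

```python
def display_to_coded_order(display_order):
    """Reorder picture labels from display order to coded order.

    Each label starts with its picture type: 'I', 'P' or 'B'.
    Assumption: a B-picture is coded after the next reference picture
    (I or P) that follows it in display order.
    """
    coded = []
    pending_b = []  # B-pictures waiting for their future reference
    for picture in display_order:
        if picture[0] == "B":
            pending_b.append(picture)
        else:
            coded.append(picture)    # the reference picture goes first
            coded.extend(pending_b)  # then the B-pictures displayed before it
            pending_b.clear()
    coded.extend(pending_b)  # trailing B-pictures with no later reference
    return coded


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
coded = display_to_coded_order(display)
# coded is ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
```

A sequence without B-pictures is returned unchanged, since its coded order and display order coincide.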
3.27 coded representation: A data element as represented in its encoded form.
3.28 coding parameters: The set of user-definable parameters that characterise a coded video bitstream. Bitstreams
are characterised by coding parameters. Decoders are characterised by the bitstreams that they are capable of decoding.
3.29 component: A matrix, block or single sample from one of the three matrices (luminance and two
chrominance) that make up a picture.
3.30 compression: Reduction in the number of bits used to represent an item of data.
3.31 constant bitrate coded video: A coded video bitstream with a constant bitrate.
3.32 constant bitrate: Operation where the bitrate is constant from start to finish of the coded bitstream.
3.33 data element: An item of data as represented before encoding and after decoding.
3.34 data partitioning: A method for dividing a bitstream into two separate bitstreams for error resilience
purposes. The two bitstreams have to be recombined before decoding.
3.35 D-Picture: A type of picture that shall not be used except in ISO/IEC 11172-2.
3.36 DC coefficient: The DCT coefficient for which the frequency is zero in both dimensions.
3.37 DCT coefficient: The amplitude of a specific cosine basis function.
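To make the terms DC coefficient, AC coefficient and DCT coefficient concrete, the following sketch computes a forward 8 x 8 DCT directly from the cosine basis functions and separates the result into the single DC coefficient and the 63 AC coefficients. It assumes the commonly used orthonormal scaling; the function name `forward_dct_8x8` is hypothetical, and the direct summation is chosen for clarity rather than speed.

```python
import math

N = 8  # block size from 3.12: 8 rows by 8 columns


def forward_dct_8x8(block):
    """Forward 2-D DCT of an 8x8 block (a list of 8 rows of 8 samples)."""
    def c(k):
        # Scaling factor for the zero-frequency basis function.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):      # vertical frequency index
        for u in range(N):  # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = (2 / N) * c(u) * c(v) * total
    return coeffs


flat = [[100] * N for _ in range(N)]  # a block with no spatial variation
coeffs = forward_dct_8x8(flat)
dc = coeffs[0][0]  # frequency is zero in both dimensions
ac = [coeffs[v][u] for v in range(N) for u in range(N) if (u, v) != (0, 0)]
```

For the flat block used here, all the energy falls in the DC coefficient and every AC coefficient is zero up to floating-point rounding.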
3.38 decoder input buffer: The First-In First-Out (FIFO) buffer specified in the video buffering verifier.
3.39 decoder: An embodiment of a decoding process.
3.40 decoding (process): The process defined in ITU-T Rec. H.262 | ISO/IEC 13818-2 that reads an input coded
bitstream and produces decoded pictures.
3.41 dequantisation: The process of rescaling the quantised DCT coefficients after their representation in the
bitstream has been decoded and before they are presented to the inverse DCT.
3.42 digital storage media; DSM: A digital storage or transmission device or system.
3.43 discrete cosine transform; DCT: Either the forward discrete cosine transform or the inverse discrete
cosine transform. The DCT is an invertible, discrete orthogonal transformation. The inverse DCT is defined in Annex A
of ITU-T Rec. H.262 | ISO/IEC 13818-2.
3.44 display aspect ratio: The ratio height/width (in SI units) of the intended display.
3.45 display order: The order in which the decoded pictures are displayed. Normally this is the same order in
which they were presented at the input of the encoder.
3.46 display process: The (non-normative) process by which reconstructed frames are displayed.
3.47 dual-prime prediction: A prediction mode in which two forward field-based predictions are averaged. The
predicted block size is 16 x 16 luminance samples. Dual-prime prediction is only used in interlaced P-pictures.
3.48 editing: The process by which one or more coded bitstreams are manipulated to produce a
new coded bitstream. Conforming edited bitstreams must meet the requirements defined in ITU-T Rec. H.262 |
ISO/IEC 13818-2.
3.49 encoder: An embodiment of an encoding process.
3.50 encoding (process): A process, not specified in ITU-T Rec. H.262 | ISO/IEC 13818-2, that reads a
stream of input pictures and produces a valid coded bitstream as defined in ITU-T Rec. H.262 | ISO/IEC 13818-2.
3.51 enhancement layer: A relative reference to a layer (above the base layer) in a scalable hierarchy. For all forms
of scalability, its decoding process can be described by reference to the lower layer decoding process and the appropriate
additional decoding process for the enhancement layer itself.
3.52 fast forward playback: The process of displaying a sequence, or parts of a sequence, of pictures in
display-order faster than real-time.
3.53 fast reverse playback: The process of displaying the picture sequence in the reverse of display order faster
than real-time.
3.54 field: For an interlaced video signal, a “field” is the assembly of alternate lines of a frame. Therefore an
interlaced frame is composed of two fields, a top field and a bottom field.
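The relationship between a frame and its two fields can be sketched as a split into alternate lines. The example assumes that the first element of the list is the topmost line of the frame, so that the top field holds lines 0, 2, 4, ... and the bottom field holds lines 1, 3, 5, ..., consistent with 3.13. The function names are hypothetical.

```python
def split_into_fields(frame_lines):
    """Split the lines of an interlaced frame into (top_field, bottom_field).

    Assumption: frame_lines[0] is the topmost line of the frame.
    """
    top_field = frame_lines[0::2]     # lines 0, 2, 4, ...
    bottom_field = frame_lines[1::2]  # lines 1, 3, 5, ...
    return top_field, bottom_field


def merge_fields(top_field, bottom_field):
    """Interleave two fields back into a frame (inverse of the split)."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(top_line)     # each top-field line comes first
        frame.append(bottom_line)  # its bottom-field line lies just below
    return frame


frame = ["line0", "line1", "line2", "line3"]
top, bottom = split_into_fields(frame)
```

Merging the two fields again reproduces the original frame when the frame has an even number of lines.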
3.55 field-based prediction: A prediction mode using only one field of the reference frame. The predicted block
size is 16 x 16 luminance samples. Field-based prediction is not used in progressive frames.
3.56 field period: The reciprocal of twice the frame rate.
3.57 field picture; field structure picture: A field structure picture is a coded picture with picture_structure
equal to “Top field” or “Bottom field”.
3.58 flag: A one bit integer variable which may take one of only two values (zero and one).
3.59 forbidden: The term “forbidden” when used in the clauses defining the coded bitstream indicates that the
value shall never be used. This is usually to avoid emulation of start codes.
3.60 forced updating: The process by which macroblocks are intra-coded from time-to-time to ensure that
mismatch errors between the inverse DCT processes in encoders and decoders cannot build up excessively.
3.61 forward compatibility: A newer coding standard is forward compatible with an older coding standard if
decoders designed to operate with the newer coding standard are able to decode bitstreams of the older coding standard.
3.62 forward motion vector: A motion vector that is used for motion compensation from a reference frame or
reference field at an earlier time in display order.
3.63 forward prediction: Prediction from the past reference frame (field).
3.64 frame: A frame contains lines of spatial information of a video signal. For progressive video, these lines
contain samples starting from one time instant and continuing through successive lines to the bottom of the frame. For
interlaced video, a frame consists of two fields, a top field and a bottom field. One of these fields will commence one
field period later than the other.
3.65 frame-based prediction: A prediction mode using both fields of the reference frame.
3.66 frame period: The reciprocal of the frame rate.
3.67 frame picture; frame structure picture: A frame structure picture is a coded picture with picture_structure
equal to “Frame”.
3.68 frame rate: The rate at which frames are output from the decoding process.
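The definitions of frame period (3.66) and field period (3.56) are simple reciprocals of the frame rate, which the following sketch evaluates with exact rational arithmetic. The rate of 30000/1001 frames per second is used only as an assumed example value, and the function names are hypothetical.

```python
from fractions import Fraction


def frame_period(frame_rate):
    """Frame period in seconds: the reciprocal of the frame rate."""
    return 1 / Fraction(frame_rate)


def field_period(frame_rate):
    """Field period in seconds: the reciprocal of twice the frame rate."""
    return 1 / (2 * Fraction(frame_rate))


rate = Fraction(30000, 1001)  # assumed example, about 29.97 frames per second
frame_seconds = frame_period(rate)  # Fraction(1001, 30000)
field_seconds = field_period(rate)  # Fraction(1001, 60000)
```

Using `Fraction` avoids rounding error for rates that are not whole numbers; a frame period is always exactly two field periods.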
3.69 future reference frame (field): A future reference frame (field) is a reference frame (field) that occurs at a
later time than the current picture in display order.
3.70 frame re-ordering: The process of re-ordering the reconstructed frames when the coded order is different
from the
...
