Information technology — Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s — Part 4: Compliance testing

Specifies how tests can be designed to verify whether bitstreams and decoders meet requirements specified in parts 1, 2 and 3 of ISO/IEC 11172. Summarizes the requirements, cross-references them to characteristics, and defines how compliance with them can be tested. Gives guidelines on how to construct tests and determine their outcome. Actual tests are defined for audio only.

Technologies de l'information — Codage de l'image animée et du son associé pour les supports de stockage numérique jusqu'à environ 1,5 Mbit/s — Partie 4: Essais de conformité

General Information

Status
Published
Publication Date
01-Mar-1995
Current Stage
9093 - International Standard confirmed
Start Date
29-Jul-2008
Completion Date
30-Oct-2025
Ref Project
Standard
ISO/IEC 11172-4:1995 - Information technology -- Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s
English language
42 pages

Standards Content (Sample)


INTERNATIONAL STANDARD
ISO/IEC 11172-4
First edition
1995-03-15
Information technology - Coding of
moving pictures and associated audio for
digital storage media at up to about
1,5 Mbit/s -
Part 4:
Compliance testing
Technologies de l'information - Codage de l'image animée et du son
associé pour les supports de stockage numérique jusqu'à environ
1,5 Mbit/s -
Partie 4: Essais de conformité
Reference number
ISO/IEC 11172-4:1995(E)
Contents
Foreword
Introduction
Section 1: General
1.1 Scope
1.2 Normative references
Section 2: Technical elements
2.1 Definitions
2.2 Symbols and abbreviations
2.3 Bitstream characteristics
2.4 Decoder characteristics
2.5 Procedures to test bitstream compliance
2.6 Procedures to test decoder compliance
Annexes
A Definition of audio decoder tests
B Descriptions of the ISO/IEC 11172 (MPEG) audio test bitstreams
© ISO/IEC 1995
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any
form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in
writing from the publisher.
ISO/IEC Copyright Office • Case Postale 56 • CH-1211 Genève 20 • Switzerland
Printed in Switzerland.
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for world-wide standardization. National Bodies that are members
of ISO and IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations,
governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC
JTC 1. Draft International Standards adopted by the joint technical committee are circulated to national
bodies for voting. Publication as an International Standard requires approval by at least 75 % of the
national bodies casting a vote.
ISO/IEC 11172 consists of the following parts, under the general title Information technology - Coding of
moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s:
- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Compliance testing
Annex A forms an integral part of this part of ISO/IEC 11172. Annex B is for information only.
Introduction
This International Standard was prepared by ISO/IEC JTC1/SC29/WG11, also known as MPEG (Moving
Pictures Expert Group). MPEG was formed in 1988 to establish an International Standard for the coded
representation of moving pictures and associated audio stored on digital storage media. Parts 1, 2 and 3 of
this International Standard were unanimously approved by the participating National Bodies in November
1992.
This International Standard is published in four parts. Part 1 - Systems - specifies the system coding layer
of the standard. It defines a multiplexed structure for combining audio and video data and means of
representing the timing information needed to replay synchronized sequences in real-time. Part 2 - video -
specifies the coded representation of video data and the decoding process required to reconstruct pictures.
Part 3 - audio - specifies the coded representation of audio data and the decoding process required to
reconstruct audio. Part 4 - compliance testing - specifies procedures to determine characteristics of coded
bitstreams and to test compliance of bitstreams and decoders with the requirements specified in Parts 1, 2
and 3.
Parts 1, 2 and 3 of ISO/IEC 11172 specify a multiplex structure and coded representations of audiovisual
information. Parts 1, 2 and 3 of ISO/IEC 11172 allow for large flexibility, achieving suitability of this
International Standard for many different applications. The flexibility is obtained by including parameters
in the bitstream that define the characteristics of coded bitstreams. Examples are the audio sampling
frequency, picture size, picture rate and bitrate parameters.
This part of ISO/IEC 11172 specifies how tests can be designed to verify whether bitstreams and decoders
meet the requirements as specified in parts 1, 2 and 3 of ISO/IEC 11172. These tests can be used for
various purposes such as:
- manufacturers of encoders, and their customers, can use the tests to verify whether the encoder
produces valid bitstreams.
- manufacturers of decoders and their customers can use the tests to verify whether the decoder meets
the requirements specified in parts 1, 2 and 3 of ISO/IEC 11172 for the claimed decoder
capabilities.
- applications can use the tests to verify whether the characteristics of a given bitstream meet the
application requirements, for example whether the size of the coded picture does not exceed the
maximum value allowed for the application.

INTERNATIONAL STANDARD ISO/IEC 11172-4:1995(E)
Information technology - Coding of moving pictures
and associated audio for digital storage media at up to
about 1,5 Mbit/s -
Part 4:
Compliance testing
Section 1: General
1.1 Scope
This part of ISO/IEC 11172 specifies how tests can be designed to verify whether bitstreams and decoders
meet requirements specified in parts 1, 2 and 3 of ISO/IEC 11172. In this part of ISO/IEC 11172, encoders
are not addressed specifically. An encoder is entitled to be an ISO/IEC 11172 encoder if it generates
bitstreams compliant with the syntactic and semantic bitstream requirements specified in parts 1, 2 and 3 of
ISO/IEC 11172.
Characteristics of coded bitstreams and decoders are defined for parts 1, 2 and 3 of ISO/IEC 11172. The
characteristics of a bitstream define the subset of the standard that is exploited in the bitstream. Examples
are the applied values or range of the picture size and bitrate parameters. Decoder characteristics define the
properties and capabilities of the applied decoding process. An example of a property is the applied
arithmetic accuracy. The capabilities of a decoder specify which coded bitstreams the decoder can decode and
reconstruct, by defining the subset of the standard that may be exploited in decodable bitstreams. A
bitstream can be decoded by a decoder if the characteristics of the coded bitstream are within the subset of the
standard specified by the decoder capabilities.
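The decodability rule in the paragraph above can be illustrated with a minimal sketch. The class names, the choice of four parameters, and the numeric limits are hypothetical and chosen only for illustration; they are not taken from ISO/IEC 11172.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoCharacteristics:
    """Hypothetical summary of the parameter values a bitstream actually uses."""
    horizontal_size: int   # pels
    vertical_size: int     # pels
    picture_rate: float    # pictures per second
    bit_rate: int          # bit/s


@dataclass(frozen=True)
class VideoCapabilities:
    """Hypothetical upper bounds claimed for a decoder."""
    max_horizontal_size: int
    max_vertical_size: int
    max_picture_rate: float
    max_bit_rate: int


def is_decodable(stream: VideoCharacteristics, decoder: VideoCapabilities) -> bool:
    """True when every stream characteristic lies within the claimed capabilities."""
    return (
        stream.horizontal_size <= decoder.max_horizontal_size
        and stream.vertical_size <= decoder.max_vertical_size
        and stream.picture_rate <= decoder.max_picture_rate
        and stream.bit_rate <= decoder.max_bit_rate
    )


# Illustrative values only (assumed, not normative limits).
decoder = VideoCapabilities(352, 288, 30.0, 1_500_000)
small = VideoCharacteristics(352, 240, 29.97, 1_150_000)
large = VideoCharacteristics(720, 576, 25.0, 1_150_000)
print(is_decodable(small, decoder), is_decodable(large, decoder))
```

A real capability claim would cover many more parameters; the point of the sketch is only that decodability reduces to a subset test between the two descriptions.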
Procedures are described for testing compliance of bitstreams and decoders to the requirements defined in parts
1, 2 and 3 of ISO/IEC 11172. Given the set of characteristics claimed, the requirements that must be met
are fully determined by parts 1, 2 and 3 of ISO/IEC 11172. This part of ISO/IEC 11172 summarizes the
requirements, cross-references them to characteristics, and defines how compliance with them can be tested.
Guidelines are given on how to construct tests and determine their outcome. Some actual tests are defined
only for audio.
1.2 Normative references
The following International Standards contain provisions which, through reference in this text, constitute
provisions of this part of ISO/IEC 11172. At the time of publication, the editions indicated were valid.
All standards are subject to revision, and parties to agreements based on this part of ISO/IEC 11172 are
encouraged to investigate the possibility of applying the most recent editions of the standards indicated
below. Members of IEC and ISO maintain registers of currently valid International Standards.
ISO/IEC 11172-1:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 1: Systems.
ISO/IEC 11172-2:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 2: Video.
ISO/IEC 11172-3:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 3: Audio.
CCIR Recommendation 601-2 Encoding parameters of digital television for studios.
CCIR Report 624-4 Characteristics of systems for monochrome and colour television.
CCIR Recommendation 648 Recording of audio signals.
CCIR Report 955-2 Sound broadcasting by satellite for portable and mobile receivers, including Annex IV
Summary description of Advanced Digital System II.
CCITT Recommendation J.17 Pre-emphasis used on Sound-Programme Circuits.
IEEE Draft Standard P1180/D2 1990 Specification for the implementation of 8x8 inverse discrete cosine
transform.
IEC publication 908: 1987 CD Digital Audio System.
Section 2: Technical elements
2.1 Definitions
For the purposes of this part of ISO/IEC 11172, the following definitions apply. If the definition is
specific to a part, this is noted in square brackets.
2.1.1 ac coefficient [video]: Any DCT coefficient for which the frequency in one or both dimensions
is non-zero.
2.1.2 access unit [system]: In the case of compressed audio an access unit is an audio access unit. In
the case of compressed video an access unit is the coded representation of a picture.
2.1.3 adaptive segmentation [audio]: A subdivision of the digital representation of an audio signal
into variable segments of time.
2.1.4 adaptive bit allocation [audio]: The assignment of bits to subbands in a time and frequency
varying fashion according to a psychoacoustic model.
2.1.5 adaptive noise allocation [audio]: The assignment of coding noise to frequency bands in a
time and frequency varying fashion according to a psychoacoustic model.
2.1.6 alias [audio]: Mirrored signal component resulting from sub-Nyquist sampling.
2.1.7 analysis filterbank [audio]: Filterbank in the encoder that transforms a broadband PCM audio
signal into a set of subsampled subband samples.
2.1.8 audio access unit [audio]: For Layers I and II an audio access unit is defined as the smallest
part of the encoded bitstream which can be decoded by itself, where decoded means “fully reconstructed
sound”. For Layer III an audio access unit is part of the bitstream that is decodable with the use of
previously acquired main information.
2.1.9 audio buffer [audio]: A buffer in the system target decoder for storage of compressed audio data.
2.1.10 audio sequence [audio]: A non-interrupted series of audio frames in which the following
parameters are not changed:
- ID
- Layer
- Sampling Frequency
- For Layer I and II: Bitrate index
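As a minimal sketch of definition 2.1.10, the following splits a list of frame headers into audio sequences whenever one of the listed parameters changes. The `FrameHeader` type and its field names are hypothetical; a real implementation would parse these fields from the coded audio frame header.

```python
from typing import List, NamedTuple, Optional, Tuple


class FrameHeader(NamedTuple):
    """Hypothetical container for the header fields named in the definition."""
    id_bit: int
    layer: int               # 1, 2 or 3
    sampling_frequency: int  # Hz
    bitrate_index: int


def sequence_key(h: FrameHeader) -> Tuple[int, int, int, Optional[int]]:
    # The bitrate index belongs to the key only for Layers I and II.
    bitrate = h.bitrate_index if h.layer in (1, 2) else None
    return (h.id_bit, h.layer, h.sampling_frequency, bitrate)


def split_into_sequences(headers: List[FrameHeader]) -> List[List[FrameHeader]]:
    """Group consecutive frames whose sequence parameters are unchanged."""
    sequences: List[List[FrameHeader]] = []
    for h in headers:
        if sequences and sequence_key(sequences[-1][-1]) == sequence_key(h):
            sequences[-1].append(h)
        else:
            sequences.append([h])
    return sequences


layer2 = [FrameHeader(1, 2, 44100, 10), FrameHeader(1, 2, 44100, 10),
          FrameHeader(1, 2, 44100, 12)]
layer3 = [FrameHeader(1, 3, 44100, 9), FrameHeader(1, 3, 44100, 11)]
print(len(split_into_sequences(layer2)), len(split_into_sequences(layer3)))
```

With these inputs the Layer II frames form two sequences, because the bitrate index changes, while the Layer III frames stay in one.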
2.1.11 backward motion vector [video]: A motion vector that is used for motion compensation
from a reference picture at a later time in display order.
2.1.12 Bark [audio]: Unit of critical band rate. The Bark scale is a non-linear mapping of the frequency
scale over the audio range closely corresponding with the frequency selectivity of the human ear across the
band.
2.1.13 bidirectionally predictive-coded picture; B-picture [video]: A picture that is coded
using motion compensated prediction from a past and/or future reference picture.
2.1.14 bitrate: The rate at which the compressed bitstream is delivered from the storage medium to the
input of a decoder.
2.1.15 bitstream characteristics [compliance]: The subset of the standard that is exploited by the
encoder in generating the bitstream. For example, an encoder may apply syntactic and semantic constraints,
such as restricted ranges of parameters, to produce a bitstream that exploits a subset of the capabilities
supported by parts 1, 2 and 3 of ISO/IEC 11172. Examples are the applied values or range of the picture
size and bitrate parameters in video bitstreams.
2.1.16 bitstream compliance [compliance]: A bitstream is compliant if the bitstream meets the
syntactic and semantic bitstream requirements specified in the normative clauses of parts 1, 2 and 3 of
ISO/IEC 11172.
2.1.17 bitstream requirements [compliance]: Requirements for bitstreams defined in the normative
clauses of parts 1, 2 and 3 of ISO/IEC 11172.
2.1.18 block companding [audio]: Normalizing of the digital representation of an audio signal
within a certain time period.
2.1.19 block [video]: An 8-row by 8-column orthogonal block of pels.
2.1.20 bound [audio]: The lowest subband in which intensity stereo coding is used.
2.1.21 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8 bits
from the first bit in the stream.
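A minimal sketch of the byte-alignment test, assuming bit positions are counted from 0 at the first bit of the stream. The function names are illustrative only.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """True if the position is a multiple of 8 bits from the first bit (position 0)."""
    return bit_position % 8 == 0


def bits_to_next_alignment(bit_position: int) -> int:
    """Number of bits to skip to reach the next byte-aligned position (0 if aligned)."""
    return (-bit_position) % 8


print(is_byte_aligned(16), is_byte_aligned(13), bits_to_next_alignment(13))
```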
2.1.22 byte: Sequence of 8 bits.
2.1.23 channel: A digital medium that stores or transports an ISO/IEC 11172 stream.
2.1.24 channel [audio]: The left and right channels of a stereo signal.
2.1.25 chrominance (component) [video]: A matrix, block or single pel representing one of the
two colour difference signals related to the primary colours in the manner defined in CCIR Rec 601. The
symbols used for the colour difference signals are Cr and Cb.
2.1.26 coded audio bitstream [audio]: A coded representation of an audio signal as specified in
ISO/IEC 11172-3.
2.1.27 coded video bitstream [video]: A coded representation of a series of one or more pictures as
specified in ISO/IEC 11172-2.
2.1.28 coded order [video]: The order in which the pictures are stored and decoded. This order is not
necessarily the same as the display order.
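The difference between coded order and display order can be sketched as follows. The sketch assumes the usual arrangement in which a B-picture can only be decoded after both of its reference pictures, so each I- or P-picture is moved ahead of the B-pictures that precede it in display order. The picture labels and the function name are hypothetical.

```python
from typing import List


def display_to_coded(display: List[str]) -> List[str]:
    """Reorder picture labels such as 'I0', 'B1', 'P3' from display to coded order."""
    coded: List[str] = []
    pending_b: List[str] = []
    for pic in display:
        if pic.startswith("B"):
            # A B-picture waits until its future reference has been emitted.
            pending_b.append(pic)
        else:
            coded.append(pic)
            coded.extend(pending_b)
            pending_b.clear()
    # Trailing B-pictures with no later reference in this list are kept at the end.
    coded.extend(pending_b)
    return coded


print(display_to_coded(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
# -> ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```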
2.1.29 coded representation: A data element as represented in its encoded form.
2.1.30 coding parameters [video]: The set of user-definable parameters that characterize a coded video
bitstream. Bitstreams are characterized by coding parameters. Decoders are characterized by the bitstreams
that they are capable of decoding.

2.1.31 component [video]: A matrix, block or single pel from one of the three matrices (luminance
and two chrominance) that make up a picture.
2.1.32 compression: Reduction in the number of bits used to represent an item of data.
2.1.33 constant bitrate coded video [video]: A compressed video bitstream with a constant
average bitrate.
2.1.34 constant bitrate: Operation where the bitrate is constant from start to finish of the compressed
bitstream.
2.1.35 constrained parameters [video]: The values of the set of coding parameters defined in
2.4.3.2 of ISO/IEC 11172-2.
2.1.36 constrained system parameter stream (CSPS) [system]: An ISO/IEC 11172
multiplexed stream for which the constraints defined in 2.4.6 of this part of ISO/IEC 11172 apply.
2.1.37 CRC: Cyclic redundancy code.
2.1.38 critical band rate [audio]: Psychoacoustic function of frequency. At a given audible
frequency it is proportional to the number of critical bands below that frequency. The units of the critical
band rate scale are Barks.
2.1.39 critical band [audio]: Psychoacoustic measure in the spectral domain which corresponds to the
frequency selectivity of the human ear. This selectivity is expressed in Bark.
2.1.40 data element: An item of data as represented before encoding and after decoding.
2.1.41 dc-coefficient [video]: The DCT coefficient for which the frequency is zero in both
dimensions.
2.1.42 dc-coded picture; D-picture [video]: A picture that is coded using only information from
itself. Of the DCT coefficients in the coded representation, only the dc-coefficients are present.
2.1.43 DCT coefficient: The amplitude of a specific cosine basis function.
2.1.44 decoded stream: The decoded reconstruction of a compressed bitstream.
2.1.45 decoder characteristics [compliance]: The properties and capabilities of the decoding
process applied in the decoder.
2.1.46 decoder compliance [compliance]: A decoder is compliant, if the decoder meets the decoder
requirements, specified in the normative clauses of parts 1, 2 and 3 of ISO/IEC 11172, to decode compliant
bitstreams within the subset of the standard defined by the specified capabilities of the decoder.
2.1.47 decoder input buffer [video]: The first-in first-out (FIFO) buffer specified in the video
buffering verifier.
2.1.48 decoder input rate [video]: The data rate specified in the video buffering verifier and encoded
in the coded video bitstream.
2.1.49 decoder: An embodiment of a decoding process.
2.1.50 decoding (process): The process defined in ISO/IEC 11172 that reads an input coded bitstream
and produces decoded pictures or audio samples.
2.1.51 decoder requirements [compliance]: Requirements for decoders defined in the normative
clauses of parts 1, 2 and 3 of ISO/IEC 11172.

2.1.52 decoding time-stamp; DTS [system]: A field that may be present in a packet header that
indicates the time that an access unit is decoded in the system target decoder.
2.1.53 de-emphasis [audio]: Filtering applied to an audio signal after storage or transmission to undo
a linear distortion due to emphasis.
2.1.54 dequantization [video]: The process of rescaling the quantized DCT coefficients after their
representation in the bitstream has been decoded and before they are presented to the inverse DCT.
2.1.55 digital storage media; DSM: A digital storage or transmission device or system.
2.1.56 discrete cosine transform; DCT [video]: Either the forward discrete cosine transform or the
inverse discrete cosine transform. The DCT is an invertible, discrete orthogonal transformation. The
inverse DCT is defined in annex A of ISO/IEC 11172-2.
2.1.57 display order [video]: The order in which the decoded pictures should be displayed. Normally
this is the same order in which they were presented at the input of the encoder.
2.1.58 dual channel mode [audio]: A mode, where two audio channels with independent programme
contents (e.g. bilingual) are encoded within one bitstream. The coding process is the same as for the stereo
mode.
2.1.59 editing: The process by which one or more compressed bitstreams are manipulated to produce a
new compressed bitstream. Conforming edited bitstreams must meet the requirements defined in ISO/IEC
11172.
2.1.60 elementary stream [system]: A generic term for one of the coded video, coded audio or other
coded bitstreams.
2.1.61 emphasis [audio]: Filtering applied to an audio signal before storage or transmission to
improve the signal-to-noise ratio at high frequencies.
2.1.62 encoder: An embodiment of an encoding process.
2.1.63 encoding (process): A process, not specified in ISO/IEC 11172, that reads a stream of input
pictures or audio samples and produces a valid coded bitstream as defined in ISO/IEC 11172.
2.1.64 entropy coding: Variable length lossless coding of the digital representation of a signal to
reduce redundancy.
2.1.65 fast forward playback [video]: The process of displaying a sequence, or parts of a sequence,
of pictures in display-order faster than real-time.
2.1.66 FFT: Fast Fourier Transformation. A fast algorithm for performing a discrete Fourier transform
(an orthogonal transform).
2.1.67 filterbank [audio]: A set of band-pass filters covering the entire audio frequency range.
2.1.68 fixed segmentation [audio]: A subdivision of the digital representation of an audio signal
into fixed segments of time.
2.1.69 forbidden: The term “forbidden” when used in the clauses defining the coded bitstream indicates
that the value shall never be used. This is usually to avoid emulation of start codes.
2.1.70 forced updating [video]: The process by which macroblocks are intra-coded from time-to-time
to ensure that mismatch errors between the inverse DCT processes in encoders and decoders cannot build up
excessively.
2.1.71 forward motion vector [video]: A motion vector that is used for motion compensation from
a reference picture at an earlier time in display order.
2.1.72 frame [audio]: A part of the audio signal that corresponds to audio PCM samples from an audio
access unit.
2.1.73 free format [audio]: Any bitrate other than the defined bitrates that is less than the maximum
valid bitrate for each layer.
2.1.74 future reference picture [video]: The future reference picture is the reference picture that
occurs at a later time than the current picture in display order.
2.1.75 granules [Layer II] [audio]: The set of 3 consecutive subband samples from all 32 subbands
that are considered together before quantization. They correspond to 96 PCM samples.
2.1.76 granules [Layer III] [audio]: 576 frequency lines that carry their own side information.
2.1.77 group of pictures [video]: A series of one or more coded pictures intended to assist random
access. The group of pictures is one of the layers in the coding syntax defined in ISO/IEC 11172-2.
2.1.78 Hann window [audio]: A time function applied sample-by-sample to a block of audio samples
before Fourier transformation.
2.1.79 Huffman coding: A specific method for entropy coding.
2.1.80 hybrid filterbank [audio]: A serial combination of subband filterbank and MDCT.
2.1.81 IMDCT [audio]: Inverse Modified Discrete Cosine Transform.
2.1.82 intensity stereo [audio]: A method of exploiting stereo irrelevance or redundancy in
stereophonic audio programmes based on retaining at high frequencies only the energy envelope of the right
and left channels.
2.1.83 interlace [video]: The property of conventional television pictures where alternating lines of
the picture represent different instances in time.
2.1.84 intra coding [video]: Coding of a macroblock or picture that uses information only from that
macroblock or picture.
2.1.85 intra-coded picture; I-picture [video]: A picture coded using information only from itself.
2.1.85a ISO/IEC 11172-1 decoder [compliance]: An embodiment of a decoding process for an
ISO/IEC 11172-1 bitstream. MPEG-system decoder is a synonym.
2.1.85b ISO/IEC 11172-2 decoder [compliance]: An embodiment of a decoding process for an
ISO/IEC 11172-2 bitstream. MPEG-video decoder is a synonym.
2.1.85c ISO/IEC 11172-3 decoder [compliance]: An embodiment of a decoding process for an
ISO/IEC 11172-3 bitstream. MPEG-audio decoder is a synonym.
2.1.86 ISO/IEC 11172 (multiplexed) stream [system]: A bitstream composed of zero or more
elementary streams combined in the manner defined in this part of ISO/IEC 11172.
2.1.87 joint stereo coding [audio]: Any method that exploits stereophonic irrelevance or
stereophonic redundancy.
2.1.88 joint stereo mode [audio]: A mode of the audio coding algorithm using joint stereo coding.
2.1.89 layer [audio]: One of the levels in the coding hierarchy of the audio system defined in ISO/IEC
11172-3.
2.1.90 layer [video and systems]: One of the levels in the data hierarchy of the video and system
specifications defined in this part of ISO/IEC 11172 and ISO/IEC 11172-2.
2.1.91 luminance (component) [video]: A matrix, block or single pel representing a monochrome
representation of the signal and related to the primary colours in the manner defined in CCIR Rec 601. The
symbol used for luminance is Y.
2.1.92 macroblock [video]: The four 8 by 8 blocks of luminance data and the two corresponding 8 by
8 blocks of chrominance data coming from a 16 by 16 section of the luminance component of the picture.
Macroblock is sometimes used to refer to the pel data and sometimes to the coded representation of the pel
values and other data elements defined in the macroblock layer of the syntax defined in ISO/IEC 11172-2.
The usage is clear from the context.
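The block counts in the macroblock definition can be checked with a short sketch. Rounding the picture dimensions up to a whole number of macroblocks is an assumption made here for illustration, and the function names are hypothetical.

```python
LUMA_BLOCKS = 4      # 8 by 8 luminance blocks per macroblock
CHROMA_BLOCKS = 2    # 8 by 8 chrominance blocks per macroblock
MB_SIZE = 16         # luminance section covered by one macroblock, in pels


def macroblock_count(width: int, height: int) -> int:
    """Macroblocks needed to cover a picture, rounding each dimension up (assumed)."""
    mb_wide = (width + MB_SIZE - 1) // MB_SIZE
    mb_high = (height + MB_SIZE - 1) // MB_SIZE
    return mb_wide * mb_high


def block_count(width: int, height: int) -> int:
    """Total number of 8 by 8 blocks (luminance plus chrominance) in the picture."""
    return macroblock_count(width, height) * (LUMA_BLOCKS + CHROMA_BLOCKS)


print(macroblock_count(352, 240), block_count(352, 240))  # -> 330 1980
```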
2.1.93 mapping [audio]: Conversion of an audio signal from time to frequency domain by subband
filtering and/or by MDCT.
2.1.94 masking [audio]: A property of the human auditory system by which an audio signal cannot be
perceived in the presence of another audio signal.
2.1.95 masking threshold [audio]: A function in frequency and time below which an audio signal
cannot be perceived by the human auditory system.
2.1.96 MDCT [audio]: Modified Discrete Cosine Transform.
2.1.97 motion compensation [video]: The use of motion vectors to improve the efficiency of the
prediction of pel values. The prediction uses motion vectors to provide offsets into the past and/or future
reference pictures containing previously decoded pel values that are used to form the prediction error signal.
2.1.98 motion estimation [video]: The process of estimating motion vectors during the encoding
process.
2.1.99 motion vector [video]: A two-dimensional vector used for motion compensation that provides
an offset from the coordinate position in the current picture to the coordinates in a reference picture.
2.1.100 MS stereo [audio]: A method of exploiting stereo irrelevance or redundancy in stereophonic
audio programmes based on coding the sum and difference signal instead of the left and right channels.
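Sum and difference coding as described in 2.1.100 can be sketched as below. The 1/sqrt(2) scaling is an assumption made here so that the same operation inverts itself; the definition above does not state a scaling, and the function names are illustrative.

```python
import math
from typing import List, Tuple

_SCALE = 1.0 / math.sqrt(2.0)  # assumed normalization, not given in the definition


def ms_encode(left: List[float], right: List[float]) -> Tuple[List[float], List[float]]:
    """Convert left/right samples to sum (mid) and difference (side) signals."""
    mid = [(l + r) * _SCALE for l, r in zip(left, right)]
    side = [(l - r) * _SCALE for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid: List[float], side: List[float]) -> Tuple[List[float], List[float]]:
    """Recover left/right samples from the sum and difference signals."""
    left = [(m + s) * _SCALE for m, s in zip(mid, side)]
    right = [(m - s) * _SCALE for m, s in zip(mid, side)]
    return left, right


left, right = [0.5, -0.25, 0.125], [0.5, -0.20, 0.100]
mid, side = ms_encode(left, right)
print(ms_decode(mid, side))
```

When the two channels are nearly identical the difference signal is close to zero, which is the redundancy this method exploits.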
2.1.101 non-intra coding [video]: Coding of a macroblock or picture that uses information both
from itself and from macroblocks and pictures occurring at other times.
2.1.102 non-tonal component [audio]: A noise-like component of an audio signal.
2.1.103 Nyquist sampling: Sampling at or above twice the maximum bandwidth of a signal.
2.1.104 pack [system]: A pack consists of a pack header followed by one or more packets. It is a layer
in the system coding syntax described in this part of ISO/IEC 11172.
2.1.105 packet data [system]: Contiguous bytes of data from an elementary stream present in a
packet.
2.1.106 packet header [system]: The data structure used to convey information about the elementary
stream data contained in the packet data.
2.1.107 packet [system]: A packet consists of a header followed by a number of contiguous bytes
from an elementary data stream. It is a layer in the system coding syntax described in this part of ISO/IEC
11172.
2.1.108 padding [audio]: A method to adjust the average length in time of an audio frame to the
duration of the corresponding PCM samples, by conditionally adding a slot to the audio frame.

2.1.109 past reference picture [video]: The past reference picture is the reference picture that occurs
at an earlier time than the current picture in display order.
2.1.110 pel aspect ratio [video]: The ratio of the nominal vertical height of pel on the display to its
nominal horizontal width.
2.1.111 pel [video]: Picture element.
2.1.112 picture period [video]: The reciprocal of the picture rate.
2.1.113 picture rate [video]: The nominal rate at which pictures should be output from the decoding
process.
2.1.114 picture [video]: Source, coded or reconstructed image data. A source or reconstructed picture
consists of three rectangular matrices of 8-bit numbers representing the luminance and two chrominance
signals. The Picture layer is one of the layers in the coding syntax defined in ISO/IEC 11172-2. Note that
the term “picture” is always used in ISO/IEC 11172 in preference to the terms field or frame.
2.1.115 polyphase filterbank [audio]: A set of equal bandwidth filters with special phase
interrelationships, allowing for an efficient implementation of the filterbank.
2.1.116 prediction [video]: The use of a predictor to provide an estimate of the pel value or data
element currently being decoded.
2.1.117 predictive-coded picture; P-picture [video]: A picture that is coded using motion
compensated prediction from the past reference picture.
2.1.118 prediction error [video]: The difference between the actual value of a pel or data element and
its predictor.
2.1.119 predictor [video]: A linear combination of previously decoded pel values or data elements.
2.1.120 presentation time-stamp; PTS [system]: A field that may be present in a packet header
that indicates the time that a presentation unit is presented in the system target decoder.
2.1.121 presentation unit; PU [system]: A decoded audio access unit or a decoded picture.
2.1.122 psychoacoustic model [audio]: A mathematical model of the masking behaviour of the
human auditory system.
2.1.123 quantization matrix [video]: A set of sixty-four 8-bit values used by the dequantizer.
2.1.124 quantized DCT coefficients [video]: DCT coefficients before dequantization. A variable
length coded representation of quantized DCT coefficients is stored as part of the compressed video
bitstream.
2.1.125 quantizer scalefactor [video]: A data element represented in the bitstream and used by the
decoding process to scale the dequantization.
2.1.126 random access: The process of beginning to read and decode the coded bitstream at an arbitrary
point.
2.1.127 reference picture [video]: Reference pictures are the nearest adjacent I- or P-pictures to the
current picture in display order.
2.1.128 reorder buffer [video]: A buffer in the system target decoder for storage of a reconstructed
I-picture or a reconstructed P-picture.
2.1.129 requantization [audio]: Decoding of coded subband samples in order to recover the original
quantized values.
2.1.130 reserved: The term “reserved” when used in the clauses defining the coded bitstream indicates
that the value may be used in the future for ISO/IEC defined extensions.
2.1.131 reverse playback [video]: The process of displaying the picture sequence in the reverse of
display order.
2.1.132 scalefactor band [audio]: A set of frequency lines in Layer III which are scaled by one
scalefactor.
2.1.133 scalefactor index [audio]: A numerical code for a scalefactor.
2.1.134 scalefactor [audio]: Factor by which a set of values is scaled before quantization.
2.1.135 sequence header [video]: A block of data in the coded bitstream containing the coded
representation of a number of data elements.
2.1.136 side information: Information in the bitstream necessary for controlling the decoder.
2.1.137 skipped macroblock [video]: A macroblock for which no data are stored.
2.1.138 slice [video]: A series of macroblocks. It is one of the layers of the coding syntax defined in
ISO/IEC 11172-2.
2.1.139 slot [audio]: A slot is an elementary part in the bitstream. In Layer I a slot equals four bytes,
in Layers II and III one byte.
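The slot sizes above, together with the padding slot of 2.1.108, can be turned into a frame length in bytes. In this sketch the slot sizes come from the definition, while the slots-per-frame factors (12 for Layer I, 144 for Layers II and III) are recalled from ISO/IEC 11172-3 and are not stated in this text, so they should be treated as assumptions. The function names are hypothetical.

```python
SLOT_BYTES = {1: 4, 2: 1, 3: 1}          # bytes per slot, from definition 2.1.139
SLOT_FACTOR = {1: 12, 2: 144, 3: 144}    # assumed, as recalled from ISO/IEC 11172-3


def frame_slots(layer: int, bitrate: int, sampling_frequency: int,
                padding: bool = False) -> int:
    """Slots in one audio frame; a padded frame carries one additional slot."""
    slots = SLOT_FACTOR[layer] * bitrate // sampling_frequency
    return slots + (1 if padding else 0)


def frame_bytes(layer: int, bitrate: int, sampling_frequency: int,
                padding: bool = False) -> int:
    """Frame length in bytes, i.e. slots times the slot size of the layer."""
    return frame_slots(layer, bitrate, sampling_frequency, padding) * SLOT_BYTES[layer]


print(frame_bytes(1, 384_000, 44_100), frame_bytes(1, 384_000, 44_100, padding=True))
print(frame_bytes(2, 192_000, 44_100), frame_bytes(2, 192_000, 48_000))
```

At 44,1 kHz the slot count is not an integer, which is why padding is applied conditionally from frame to frame to keep the average frame duration correct.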
2.1.140 source stream: A single non-multiplexed stream of samples before compression coding.
2.1.141 spreading function [audio]: A function that describes the frequency spread of masking.
2.1.142 start codes [system and video]: 32-bit codes embedded in the coded bitstream that are
unique. They are used for several purposes including identifying some of the layers in the coding syntax.
2.1.143 STD input buffer [system]: A first-in first-out buffer at the input of the system target
decoder for storage of compressed data from elementary streams before decoding.
2.1.144 stereo mode [audio]: Mode, where two audio channels which form a stereo pair (left and
right) are encoded within one bitstream. The coding process is the same as for the dual channel mode.
2.1.145 stuffing (bits); stuffing (bytes): Code-words that may be inserted into the compressed
bitstream that are discarded in the decoding process. Their purpose is to increase the bitrate of the stream.
2.1.146 subband [audio]: Subdivision of the audio frequency band.
2.1.147 subband filterbank [audio]: A set of band filters covering the entire audio frequency range.
In ISO/IEC 11172-3 the subband filterbank is a polyphase filterbank.
2.1.148 subband samples [audio]: The subband filterbank within the audio encoder creates a filtered
and subsampled representation of the input audio stream. The filtered samples are called subband samples.
From 384 time-consecutive input audio samples, 12 time-consecutive subband samples are generated within
each of the 32 subbands.
2.1.149 syncword [audio]: A 12-bit code embedded in the audio bitstream that identifies the start of a
frame.
2.1.150 synthesis filterbank [audio]: Filterbank in the decoder that reconstructs a PCM audio
signal from subband samples.
2.1.151 system header [system]: The system header is a data structure defined in this part of
ISO/IEC 11172 that carries information summarising the system characteristics of the ISO/IEC 11172
multiplexed stream.
2.1.152 system target decoder; STD [system]: A hypothetical reference model of a decoding
process used to describe the semantics of an ISO/IEC 11172 multiplexed bitstream.
2.1.153 test procedure [compliance]: A method to verify compliance of a bitstream or a decoder.
2.1.154 time-stamp [system]: A term that indicates the time of an event.
2.1.155 triplet [audio]: A set of 3 consecutive subband samples from one subband. A triplet from
each of the 32 subbands forms a granule.
2.1.156 tonal component [audio]: A sinusoid-like component of an audio signal.
2.1.157 variable bitrate: Operation where the bitrate varies with time during the decoding of a
compressed bitstream.
2.1.158 variable length coding; VLC: A reversible procedure for coding that assigns shorter code-
words to frequent events and longer code-words to less frequent events.
2.1.159 video buffering verifier; VBV [video]: A hypothetical decoder that is conceptually
connected to the output of the encoder. Its purpose is to provide a constraint on the variability of the data
rate that an encoder or editing process may produce.
2.1.160 video sequence [video]: A series of one or more groups of pictures. It is one of the layers of
the coding syntax defined in ISO/IEC 11172-2.
2.1.161 zig-zag scanning order [video]: A specific sequential ordering of the DCT coefficients from
(approximately) the lowest spatial frequency to the highest.
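As an illustration of definition 2.1.161, the following minimal sketch generates the conventional zig-zag order for an 8 x 8 block by walking the anti-diagonals in alternating directions. The normative scan table is given in ISO/IEC 11172-2; the function name zigzag_order and the (row, column) convention are assumptions made for this example only.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order.

    Positions are grouped by anti-diagonal (row + column). Odd diagonals are
    walked with the row increasing, even diagonals with the row decreasing.
    """
    cells = [(r, c) for r in range(n) for c in range(n)]

    def key(rc):
        r, c = rc
        diagonal = r + c
        # Alternate the direction of travel on successive diagonals.
        return (diagonal, r if diagonal % 2 else -r)

    return sorted(cells, key=key)


order = zigzag_order()
# The scan starts at the lowest spatial frequency and ends at the highest.
first_six = order[:6]
```

Here first_six holds the positions (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), and the last entry of order is (7,7).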
2.2 Symbols and abbreviations
The mathematical operators used to describe this International Standard are similar to those used in the C
programming language. However, integer division with truncation and rounding are specifically defined.
The bitwise operators are defined assuming twos-complement representation of integers. Numbering and
counting loops generally begin from zero.
2.2.1 Arithmetic operators
+ Addition.
- Subtraction (as a binary operator) or negation (as a unary operator).
++ Increment.
-- Decrement.
* Multiplication.
^ Power.
/ Integer division with truncation of the result toward zero. For example, 7/4 and -7/-4 are
truncated to 1 and -7/4 and 7/-4 are truncated to -1.
// Integer division with rounding to the nearest integer. Half-integer values are rounded away
from zero unless otherwise specified. For example 3//2 is rounded to 2, and -3//2 is rounded
to -2.
DIV Integer division with truncation of the result towards -∞.
| | Absolute value. | x | = x when x > 0
| x | = 0 when x == 0
| x | = -x when x < 0
% Modulus operator. Defined only for positive numbers.
Sign( ) Sign(x) = 1 when x > 0
Sign(x) = 0 when x == 0
Sign(x) = -1 when x < 0
NINT ( ) Nearest integer operator. Returns the nearest integer value to the real-valued argument. Half-
integer values are rounded away from zero.
sin Sine.
cos Cosine.
exp Exponential.
√ Square root.
log10 Logarithm to base ten.
loge Logarithm to base e.
log2 Logarithm to base 2.
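The three integer divisions and the NINT and Sign operators differ only in how they treat negative and half-integer values, which is easy to get wrong in an implementation. The following sketch shows one possible rendering in Python. The function names are hypothetical and are not part of ISO/IEC 11172. Python's own // operator floors the result, so it corresponds to DIV and not to the // operator of this clause.

```python
import math


def sign(x):
    """Sign(x): 1, 0 or -1."""
    return (x > 0) - (x < 0)


def div_trunc(a, b):
    """The '/' operator: integer division truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def div_round(a, b):
    """The '//' operator: nearest integer, halves rounded away from zero."""
    q, r = divmod(abs(a), abs(b))
    if 2 * r >= abs(b):
        q += 1
    return q if (a < 0) == (b < 0) else -q


def div_floor(a, b):
    """The DIV operator: integer division truncated toward minus infinity."""
    return a // b


def nint(x):
    """NINT(x): nearest integer, halves rounded away from zero."""
    return sign(x) * int(math.floor(abs(x) + 0.5))
```

With these definitions div_trunc(-7, 4) gives -1 while div_floor(-7, 4) gives -2, and div_round(-3, 2) gives -2, in line with the examples in the operator descriptions above.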
2.2.2 Logical operators
|| Logical OR.
&& Logical AND.
! Logical NOT.
2.2.3 Relational operators
> Greater than.
>= Greater than or equal to.
< Less than.
<= Less than or equal to.
== Equal to.
!= Not equal to.
max [,...,] the maximum value in the argument list.
min [,...,] the minimum value in the argument list.
2.2.4 Bitwise operators
A twos complement number representation is assumed where the bitwise operators are used.
& AND.
| OR.
>> Shift right with sign extension.
<< Shift left with zero fill.
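The shift operators can be illustrated on a fixed-width twos complement word. The sketch below assumes a 16-bit word purely for the example (this clause does not fix a word width), and the helper names are hypothetical.

```python
def to_signed(value, bits=16):
    """Interpret the low `bits` bits of value as a twos complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def to_unsigned(value, bits=16):
    """Return the bit pattern of a twos complement integer as a non-negative int."""
    return value & ((1 << bits) - 1)


def shift_right(value, n, bits=16):
    """'>>': shift right with sign extension (arithmetic shift)."""
    return to_signed(to_signed(value, bits) >> n, bits)


def shift_left(value, n, bits=16):
    """'<<': shift left with zero fill; bits shifted past the word are lost."""
    return to_signed(value << n, bits)
```

For example, shift_right(-8, 1) gives -4 because the sign bit is replicated, and shift_left(0x4000, 1) gives -32768 because the shifted bit lands in the sign position of the 16-bit word.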
2.2.5 Assignment
= Assignment operator.
2.2.6 Mnemonics
The following mnemonics are defined to describe the different data types used in the coded bit-stream.
bslbf Bit string, left bit first, where “left” is the order in which bit strings are written in
ISO/IEC 11172. Bit strings are written as a string of 1s and 0s within single quote
marks, e.g. ‘1000 0001’. Blanks within a bit string are for ease of reading and have no
significance.
ch Channel. If ch has the value 0, the left channel of a stereo signal or the first of two
independent signals is indicated. (Audio)
nch Number of channels; equal to 1 for single channel mode, 2 in other modes. (Audio)
gr Granule of 3 * 32 subband samples in audio Layer II, 18 * 32 sub-band samples in
audio Layer III. (Audio)
main_data The main data portion of the bitstream contains the scalefactors, Huffman encoded
data, and ancillary information. (Audio)
main_data_beg The location in the bitstream of the beginning of the main data for the frame. The
location is equal to the ending location of the previous frame’s main data plus one bit.
It is calculated from the main_data_end value of the previous frame.
part2_length The number of main_data bits used for scalefactors. (Audio)
rpchof Remainder polynomial coefficients, highest order first. (Audio)
sb Subband. (Audio)
sblimit The number of the lowest sub-band for which no bits are allocated. (Audio)
scfsi Scalefactor selection information. (Audio)
switch_point_l Number of scalefactor band (long block scalefactor band) from which point on window
switching is used. (Audio)
switch_point_s Number of scalefactor band (short block scalefactor band) from which point on window
switching is used. (Audio)
uimsbf Unsigned integer, most significant bit first.
vlclbf Variable length code, left bit first, where “left” refers to the order in which the VLC
codes are written.
window Number of the actual time slot in case of block_type==2, 0 <= window <= 2. (Audio)
The byte order of multi-byte words is most significant byte first.
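To illustrate the bslbf and uimsbf mnemonics, the following sketch reads unsigned integers from a byte sequence most significant bit first and converts a written bit string such as ‘1000 0001’ to its value. The class and function names are hypothetical and the input bytes are arbitrary example data, not a conforming bitstream.

```python
class BitReader:
    """Reads bits from a bytes object, left (most significant) bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def read_uimsbf(self, nbits):
        """Read an nbits-wide unsigned integer, most significant bit first."""
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_bslbf(text):
    """Convert a written bit string to an integer; blanks have no significance."""
    return int(text.replace(" ", ""), 2)


reader = BitReader(bytes([0x12, 0x34]))
first = reader.read_uimsbf(4)    # the four leftmost bits of 0x12
second = reader.read_uimsbf(12)  # the remaining twelve bits, spanning both bytes
```

A field that crosses a byte boundary, as the second read does here, is assembled with the earlier byte supplying the more significant bits, consistent with the byte order stated above.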
2.2.7 Constants
π 3,14159265358.
e 2,71828182845.
2.3 Bitstream characteristics
Bitstream characteristics specify the constraints that are applied by the encoder in generating the bitstream.
These syntactic and semantic constraints may for example restrict the range or the values of parameters that
are encoded directly or indirectly in the bitstream. The constraints applied to a given bitstream may or may
not be known a priori.
2.3.1 System bitstreams
System encoders may apply restrictions to the following parameters of system bitstreams (see
ISO/IEC 11172-1):
a) mux_rate
b) rate_bound
c) STD_buffer_size
d) STD_buffer_size_bound
e) delay caused by system target decoder input buffering
f) difference between two SCRs in successive packs
g) length of a pack
h) length of a packet
i) number of packets in a pack
j) presence of time stamps in packet headers (DTS, PTS)
k) CSPS_flag
l) use of private streams
m) packet rate
n) fixed or variable bitrate operation (fixed_flag parameter)
o) number of multiplexed audio streams (audio_bound parameter)
p) number of multiplexed video streams (video_bound parameter)
q) locking of audio sampling frequency and frequency of system clock
(system_audio_lock_flag parameter)
r) locking of video picture rate and frequency of system clock (system_video_lock_flag
parameter)
2.3.2 Video bitstreams
A requirement for MPEG video encoders is that the arithmetic precision in the decoder process used in the
encoder to produce the coded bitstream shall have the full accuracy specified in ISO/IEC 11172-2.
Video encoders may apply restrictions to the following parameters of video bitstreams (see ISO/IEC 11172-2):
a) horizontal_size
b) vertical_size
c) pel_aspect_ratio
d) picture_rate
e) bit_rate
f) VBV_buffer_size
g) constrained_parameter_flag
h) forward_f_code
i) backward_f_code
j) total number of macroblocks.
k) The number of macroblocks per second.
l) range and accuracy (half or integer pixel) of motion vectors.
m) use of a non-default quantizer matrix for intra coded blocks
n) use of a non-default quantizer matrix for non-intra coded blocks
o) slice structure, that is the definition of where slices start and end within the picture.
p) IBPD structure. That is the picture coding types and sequences of different picture coding
types, such as for example the number of consecutive B frames, may be restricted.
q) fixed and/or variable bitrate operation (encoded in the bit_rate field and in the vbv_delay
field)
r) the occurrence and specification of user_data
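Two of the listed characteristics, the total number of macroblocks and the number of macroblocks per second, follow from the picture dimensions and the picture rate. The sketch below assumes the 16 x 16 pel macroblock of ISO/IEC 11172-2 and rounds partial macroblocks up. The function names and the example picture format are illustrative assumptions, not requirements of this part.

```python
MACROBLOCK_SIZE = 16  # pels per macroblock side (assumption taken from ISO/IEC 11172-2)


def macroblock_count(horizontal_size, vertical_size):
    """Total macroblocks per picture; partial macroblocks count as whole ones."""
    mb_width = (horizontal_size + MACROBLOCK_SIZE - 1) // MACROBLOCK_SIZE
    mb_height = (vertical_size + MACROBLOCK_SIZE - 1) // MACROBLOCK_SIZE
    return mb_width * mb_height


def macroblocks_per_second(horizontal_size, vertical_size, picture_rate):
    """Macroblock throughput implied by the picture size and picture rate."""
    return macroblock_count(horizontal_size, vertical_size) * picture_rate


# Example: a 352 x 288 picture at 25 pictures per second.
per_picture = macroblock_count(352, 288)
per_second = macroblocks_per_second(352, 288, 25)
```

For this example picture format per_picture is 22 * 18 = 396 and per_second is 396 * 25 = 9900.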
2.3.3 Audio bitstreams
Audio encoders may apply restrictions to the following parameters of audio bitstreams (see ISO/IEC 11172-3):
layer
bitrate_index
sampling_frequency
mode
mode_extension
emphasis
generation of crc_check
value of fixed bitrate when coding in free format mode.
generation of ancillary data
2.4 Decoder characteristics
The characteristics of a decoder specify the properties and capabilities of the decoding process applied in the
decoder. An example of a property is the arithmetic accuracy that is applied. The capabilities of a decoder
specify which coded bitstreams the decoder can reconstruct, by defining the subset of the standard that may
be exploited in decodable bitstreams. A bitstream can be decoded by a decoder if the characteristics of the
coded bitstream are within the subset of the standard defined by the decoder capabilities. Compliance to
ISO/IEC 11172 by a decoder requires that the capabilities of the decoder are specified. That is, each
constraint on the subset of the standard that may be exploited in bitstreams decodable by the decoder shall be
specified. Compliant decoders are required to decode all bitstreams that are compliant with the defined
subset without relying on private data, user data, ancillary data or other information.
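The relationship between bitstream characteristics and decoder capabilities described above can be sketched as a containment check. This is a minimal illustration under stated assumptions: the names Range and is_decodable, the parameter keys, and the example values are all hypothetical, and treating an unspecified capability as unsupported is a choice made for the sketch, not a rule taken from this part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """An inclusive range of supported values for one parameter."""
    low: float
    high: float

    def allows(self, value):
        return self.low <= value <= self.high


def is_decodable(characteristics, capabilities):
    """True if every bitstream characteristic lies within the decoder capabilities.

    capabilities maps a parameter name to either a Range or a set of values.
    """
    for name, value in characteristics.items():
        allowed = capabilities.get(name)
        if allowed is None:
            return False  # assumption: an unspecified capability is unsupported
        if isinstance(allowed, Range):
            if not allowed.allows(value):
                return False
        elif value not in allowed:
            return False
    return True


# Hypothetical audio decoder capabilities and two hypothetical bitstreams.
decoder = {
    "layer": {1, 2},
    "sampling_frequency": {44100, 48000},
    "bitrate_kbit_s": Range(32, 384),
}
stream_ok = {"layer": 2, "sampling_frequency": 48000, "bitrate_kbit_s": 192}
stream_bad = {"layer": 3, "sampling_frequency": 48000, "bitrate_kbit_s": 128}
```

In this example is_decodable(stream_ok, decoder) is true, while is_decodable(stream_bad, decoder) is false because Layer III lies outside the declared subset.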
2.4.1 System decoders
An ISO/IEC 11172-1 decoder may support specific values only, or a specific range of the following
parameters in system bitstreams. These parameters are encoded directly or indirectly in the bitstream.
mux_rate
STD_buffer_size
packet rate
Furthermore, a decoder may constrain the support of fixed and/or variable bitrate operation (see definition of
the fixed_flag field in 2.4.4.2 of ISO/IEC 11172-1), and may require locking between the 90 kHz system
clock and the audio sampling frequency and/or the video picture rate (see the system_audio_lock_flag and the
system_video_lock_flag fields in 2.4.4.2 of ISO/IEC 11172-1). Decoders shall specify how private streams
are handled.
2.4.2 Video decoders
An ISO/IEC 11172-2 video decoder may support specific values only, or a specific
...
