Information technology — JPEG XR image coding system — Part 2: Image coding specification

This document specifies a coding format, referred to as JPEG XR, which is designed primarily for continuous-tone photographic content.

Technologies de l'information — Système de codage d'image JPEG XR — Partie 2: Spécification de codage d'image

General Information

Current Stage
6060 - International Standard published


ISO/IEC 29199-2:2020 - Information technology -- JPEG XR image coding system
English language
228 pages

Standards Content (Sample)

STANDARD 29199-2
Fourth edition
Information technology — JPEG XR image coding system — Part 2: Image coding specification
Technologies de l'information — Système de codage d'image JPEG XR — Partie 2: Spécification de codage d'image
Reference number
ISO/IEC 29199-2:2020(E)
ISO/IEC 2020


© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Fax: +41 22 749 09 47
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviated terms
5 Conventions
5.6 Adaptive VLC deltaDisc tables
5.7 Adaptive inverse scanning tables
6 General provisions, provisions specified in annexes, and image and codestream structures
6.1 General
6.2 Image planes and component arrays
6.3 Image windowing
6.4 Image partitioning
6.5 Transform coefficients and frequency bands
6.6 Codestream structure
6.7 Precision and word length
7 Overview of decoder
7.1 General
7.2 Overview of parsing process
7.3 Overview of the decoding process
8 Syntax, semantics, and parsing process
8.1 General
8.2 CODED_IMAGE( )
8.3 IMAGE_HEADER( )
8.7 CODED_TILES( )
8.8 Adaptive VLC code table selection
8.9 Adaptation of CBPLP state variables
8.10 Adaptive CBPHP prediction
8.11 Adaptive inverse scanning
8.12 Adaptive coefficient normalization
9 Decoding process
9.1 General
9.2 Image decoding
9.3 Image plane decoding
9.4 Tile transform coefficient processing
9.5 Coefficient remapping
9.6 Transform coefficient prediction
9.7 Derivation of quantization parameters
9.8 Dequantization
9.9 Sample reconstruction
9.10 Output formatting
Annex A (normative) Tag-based file format
Annex B (normative) Profiles and levels
Annex C (informative) Colour imagery representation and colour management
Annex D (informative) Encoder processing
Annex E (normative) Media type specification for the Annex A tag-based file format
Annex F (normative) Storage in the ISO/IEC 23008-12 image file format and associated media type registrations
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical activity.
ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see
Attention is drawn to the possibility that some of the elements of this document may be the subject of
patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see or the
IEC list of patent declarations received (see
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT)
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration
with ITU-T. The technically identical text is published as Rec. ITU-T T.832.
This fourth edition cancels and replaces the third edition (ISO/IEC 29199-2:2012), which has been
technically revised. It also incorporates the Amendment ISO/IEC 29199-2:2012/Amd.1:2017.
The main changes compared to the previous edition include:
  the specification of additional colour type identifiers;
  the specification of an alternative file storage format based on ISO/IEC 23008-12 for the storage
and interchange of JPEG XR coded images and image sequences;
  the specification of media type identifiers for use in various internet protocols.
A list of all parts in the ISO/IEC 29199 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at

Introduction
This document specifies requirements and implementation guidelines for the compressed representation of
digital images for storage and interchange in a form referred to as JPEG XR. The JPEG XR design provides a
practical coding technology for a broad range of applications with excellent compression capability and important
additional functionalities. An input image is typically operated on by an encoder to create a JPEG XR coded image.
The decoder then operates on the coded image to produce an output image that is either an exact or approximate
reconstruction of the input image.
The primary intended application of JPEG XR is the representation of continuous-tone still images such as
photographic images. The manner of representation of the compressed image data and the associated decoding
process are specified. These processes and representations are generic, that is, they are applicable to a broad
range of applications using compressed colour and grayscale images in communications and computer systems
and within embedded applications, including mobile devices.
As of 2008, the most widely used digital photography format is a nominal implementation of the first JPEG coding
format as specified in ITU-T Recommendation T.81 | ISO/IEC 10918-1. This encoding uses a bit depth of 8 for each
of three channels, resulting in 256 representable values per channel (a total of 16 777 216 representable colour
values). More demanding applications may require a bit depth of 16, providing 65 536 representable values for each
channel, and resulting in over 2.8 * 10^14 colour values. Additional scenarios may necessitate even greater bit
depths and sample representation formats. When memory or processing power is at a premium, as few as five or
six bits per channel may be used.
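The colour counts quoted above follow directly from the bit depth. The short sketch below reproduces the arithmetic for three-channel images; the function name `representable` is illustrative only and does not come from this document.

```python
# Representable values per channel and per colour for a given bit depth.
# Assumes every channel uses the same bit depth (hypothetical helper).
def representable(bit_depth: int, channels: int = 3) -> tuple[int, int]:
    per_channel = 2 ** bit_depth          # distinct values in one channel
    total = per_channel ** channels       # distinct channel combinations
    return per_channel, total


for depth in (5, 6, 8, 16):
    per_channel, total = representable(depth)
    print(f"{depth:2d} bits: {per_channel} values per channel, {total} colours")
```

For a bit depth of 8 this gives 256 values per channel and 16 777 216 colours; for a bit depth of 16 it gives 65 536 values per channel and 2^48 colours, which is slightly more than 2.8 * 10^14.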
The JPEG XR specification enables greater effective use of compressed imagery with this broadened diversity of
application requirements. JPEG XR supports a wide range of colour encoding formats including monochrome, RGB,
CMYK and n-component encodings using a variety of unsigned integer, fixed point, and floating point decoded
numerical representations with a variety of bit depths. The primary goal is to provide a compressed format
specification appropriate for a wide range of applications while keeping the implementation requirements for
encoders and decoders simple. A special focus of the design is support for emerging high dynamic range (HDR)
imagery applications.
JPEG XR combines the benefits of optimized image quality and compression efficiency together with low-
complexity encoding and decoding implementation requirements. It also provides an extensive set of additional
functionalities, including:
 high compression capability;
 low computational and memory resource requirements;
 lossless and lossy compression;
 image tile segmentation for random access and large image formats;
 support for low-complexity compressed-domain image manipulations;
 support for embedded thumbnail images and progressive resolution refinement;
 embedded codestream scalability for both image resolution and fidelity;
 alpha plane support;
 bit-exact decoder results for fixed and floating point image formats.

Important detailed design properties include:
 high performance, embedded system friendly compression;
 small memory footprint;
 integer-only operations with no divides;
 a signal processing structure that is highly amenable to parallel processing;
 use of the same signal processing operations for both lossless and lossy compression operation;
 support for a wide range of decoded sample formats (many of which support high dynamic range imagery):
   monochrome, RGB, CMYK or n-component image representation;
   8- or 16-bit unsigned integer;
   16- or 32-bit fixed point;
   16- or 32-bit floating point;
   several packed bit formats;
     1-bit per sample monochrome;
     5- or 10-bit per sample RGB;
     radiance RGBE.
The algorithm uses a reversible hierarchical lifting-based lapped biorthogonal transform. The transform has
lossless image representation capability and requires only a small number of integer processing operations for
both encoding and decoding. The processing is based on 16 × 16 macroblocks in the transform domain, which may
or may not affect overlapping areas in the spatial domain (with the overlapping property selected under the
control of the encoder). The design provides encoding and decoding with a minimal memory footprint suitable for
embedded implementations.
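To illustrate how a lifting structure can be exactly reversible while using only integer additions, subtractions and shifts, the following sketch applies a generic two-point lifting step (often called an S-transform). This is not the lapped biorthogonal transform specified in this document; the function names are hypothetical and the example only shows the general principle.

```python
# Generic reversible two-point lifting step (S-transform).
# Illustration only: NOT the JPEG XR transform operators.
def forward_lift(a: int, b: int) -> tuple[int, int]:
    d = a - b            # difference of the two inputs
    s = b + (d >> 1)     # floor of the average, computed without a divide
    return s, d


def inverse_lift(s: int, d: int) -> tuple[int, int]:
    b = s - (d >> 1)     # undo the second lifting step with the same shift
    a = d + b            # undo the first lifting step
    return a, b


s, d = forward_lift(7, 2)
assert inverse_lift(s, d) == (7, 2)   # the inputs are recovered exactly
```

Because each step is undone by subtracting exactly the quantity that was added, the rounding introduced by the shift cancels, which is why the same structure can serve both lossless and lossy operation.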
The algorithm provides native support for both RGB and CMYK colour types by converting these colour formats to
an internal luma-dominant format through the use of a reversible colour transform. In addition, YUV,
monochrome and arbitrary n-channel colour formats are supported.
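As a sketch of what a reversible, luma-dominant colour transform looks like, the example below uses the well-known YCoCg-R lifting transform. It is a stand-in chosen for illustration; the colour transform actually used by JPEG XR is defined in the normative clauses of this document and may differ in detail.

```python
# YCoCg-R: a reversible integer colour transform, used here as a stand-in.
# Assumption: this is NOT claimed to be the JPEG XR colour transform.
def rgb_to_ycocg_r(r: int, g: int, b: int) -> tuple[int, int, int]:
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)    # luma-like component
    return y, co, cg


def ycocg_r_to_rgb(y: int, co: int, cg: int) -> tuple[int, int, int]:
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


pixel = (200, 100, 50)
assert ycocg_r_to_rgb(*rgb_to_ycocg_r(*pixel)) == pixel   # exact round trip
```

A neutral grey input maps to a luma value equal to the grey level with both chroma-like components equal to zero, which is the sense in which such a representation is luma-dominant.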
The transforms employed are reversible; both lossless and lossy operations are supported using the same
algorithm. Using the same algorithm for both types of operation simplifies implementation, which is especially
important for embedded applications.
A wide range of numerical encodings at multiple bit depths are supported: 8-bit and 16-bit formats, as well as
additional specialized packed bit formats, are supported for both lossy and lossless compression. (32-bit formats
are supported using lossy compression.) Up to 24 bits are retained through the various transforms. While only
integer arithmetic is used for internal processing, lossless and lossy coding are supported for floating point and
fixed point image data – as well as for integer image formats.
The main body of this document specifies the syntax and semantics of JPEG XR coded images and the associated
decoding process that produces an output image from a coded image. Annex A specifies a tag-based file storage
format for storage and interchange of such coded images. Annex B specifies profiles and levels, which determine
conformance requirements for classes of encoders and decoders. Aspects of colour imagery representations and
colour management are discussed in Annex C. The typical expected encoding process is described in Annex D.
Annex E contains a media type specification for images encoded according to the tag-based format specified in
Annex A for use in various internet protocols. Annex F specifies an alternative file storage format based on
ISO/IEC 23008-12 and associated media type specifications for the storage and interchange of JPEG XR coded
images and image sequences. Annexes A, B, E, and F are an integral part of this document and contain normative
The International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC)
draw attention to the fact that it is claimed that compliance with this document may involve the use of patents.
ISO and IEC take no position concerning the evidence, validity and scope of these patent rights.
The holders of these patent rights have assured ISO and IEC that they are willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect,
the statements of the holders of these patent rights are registered with ISO and IEC. Information may be obtained
from the patent database available at
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights other than those in the patent database. ISO and IEC shall not be held responsible for identifying any or all
such patent rights.


Information technology — JPEG XR image coding system — Part 2: Image coding specification
1 Scope
This document specifies a coding format, referred to as JPEG XR, which is designed primarily for continuous-tone
photographic content.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references, the
latest edition of the referenced document (including any amendments) applies.
Rec. ITU-T T.833 | ISO/IEC 29199-3, Information technology — JPEG XR image coding system — Part 3: Motion JPEG XR
ISO/IEC/IEEE 60559, Information technology — Microprocessor systems — Floating-Point arithmetic
ISO/IEC 10646:2017, Information technology — Universal coded character set (UCS)
ISO/IEC 23008-12:2017, Information technology — High efficiency coding and media delivery in heterogeneous
environments — Part 12: Image file format
3 Terms and definitions
For the purposes of this document, the terms, definitions and abbreviated terms specified in ISO/IEC 23008-12
and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
 ISO Online browsing platform: available at
 IEC Electropedia: available at
NOTE For the avoidance of doubt, in case of ambiguities, the definitions in this document take precedence over the
definitions of ISO/IEC 23008-12 except in regard to the file format specified in Annex F. A sample in the context of ISO/IEC
23008-12 and in Annex F is "all the data associated with a single time". In Annex F, this is meant as all data associated with one
coded image, not "element in a two-dimensional image array that comprises an image plane".
adaptive coefficient normalization
parsing sub-process where transform coefficients (3.75) are dynamically partitioned into a VLC-coded (3.77) part
and a fixed-length coded (3.28) part, in a manner designed to control (i.e., "normalize") bits used to represent the
VLC-coded part
Note 1 to entry: The fixed-length coded part of DC coefficients and low-pass coefficients is called FLC refinement and the fixed-
length coded part of high-pass coefficients is called flexbits.

adaptive inverse scanning
parsing sub-process where the zigzag scan order (3.80) associated with a set of transform coefficients (3.75) is
dynamically modified, based on the statistics of previously-parsed transform coefficients
adaptive VLC
parsing sub-process where the code table associated with VLC (3.77) parsing of a particular syntax element is
switched, among a finite set of fixed tables, based on the statistics of previously-parsed instances of this syntax element
alpha image plane
optional secondary image plane (3.36) associated with an image of the same dimensions as the luma (3.45)
component of the primary image plane (3.57)
Note 1 to entry: The alpha image plane has one component, a luma component.
block
m×n array of samples (3.64), or an m×n array of transform coefficients (3.75)
block index
integer in the range 0 to 15 identifying, by its position in raster scan order (3.61), a particular 4×4 block (3.5)
within a partition of a 16×16 block into 16 4×4 blocks
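The block index definition can be sketched as a simple mapping between a sample position inside a 16×16 block and the index of the 4×4 block that contains it. The sketch assumes that raster scan order means left to right within a row and rows from top to bottom, with x as the horizontal offset; the function names are hypothetical.

```python
# Map a position inside a 16x16 block to the index (0..15) of its 4x4 block.
# Assumption: raster scan order is row by row, left to right.
def block_index(x: int, y: int) -> int:
    if not (0 <= x < 16 and 0 <= y < 16):
        raise ValueError("position must lie inside the 16x16 block")
    return (y >> 2) * 4 + (x >> 2)


# Inverse mapping: top-left sample position (x, y) of a given 4x4 block.
def block_origin(index: int) -> tuple[int, int]:
    if not 0 <= index <= 15:
        raise ValueError("block index must be in the range 0 to 15")
    return (index % 4) * 4, (index // 4) * 4


print(block_index(5, 9))   # block in the second column, third row
```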
byte
sequence of 8 bits
byte-aligned position
bit in a codestream (3.13) where its position is an integer multiple of 8 bits from the beginning of the codestream,
where the first bit in the codestream is at position 0
chroma
component (3.14) of the primary image plane (3.57) with non-zero index, or the transform coefficients (3.75) and
sample values associated with this component
coded block pattern high-pass
syntax element indicating the coded block status (3.12), i.e. the presence or absence of non-zero high-pass
coefficients (3.34), for each of the blocks (3.5) in the macroblock (3.46)
coded block pattern low-pass
syntax element indicating the presence or absence of non-zero low-pass coefficients (3.44) in the macroblock
coded block status
indication of the presence or absence of non-zero transform coefficients (3.75) in that block (3.5)
codestream
sequence of bits contained in a sequence of bytes (3.7) from which syntax elements are parsed
Note 1 to entry: The most significant bit of the first byte is the first bit of the codestream, the next most significant bit of the
first byte is the second bit of the codestream, and so on, to the least significant bit of the first byte (which is the eighth bit of the
codestream), followed by the most significant bit of the second byte (which is the ninth bit of the codestream), and so on, up to
and including the least significant bit of the last byte of the sequence of bytes (which is the last bit of the codestream).
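The bit ordering described in this note (most significant bit of each byte first) can be sketched as follows. The helper names are hypothetical, and `read_bits` additionally assumes that a multi-bit field is interpreted as an unsigned integer whose first bit is the most significant; that interpretation is an assumption of this sketch.

```python
# Bit position 0 is the most significant bit of the first byte.
def bit_at(data: bytes, position: int) -> int:
    byte = data[position >> 3]                  # byte containing the bit
    return (byte >> (7 - (position & 7))) & 1   # MSB-first within the byte


# Read `count` consecutive bits as an unsigned integer (assumed MSB first).
def read_bits(data: bytes, position: int, count: int) -> int:
    value = 0
    for i in range(count):
        value = (value << 1) | bit_at(data, position + i)
    return value


# A position is byte-aligned when it is an integer multiple of 8.
def is_byte_aligned(position: int) -> bool:
    return position % 8 == 0


data = bytes([0b10100000, 0b00000001])
assert bit_at(data, 0) == 1 and bit_at(data, 15) == 1
```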
component
array of samples associated with an image plane (3.36)
context
possible value of a specific instance of a context variable (3.16)
context variable
variable used in the parsing process (3.54) to select which data structure is to be used for the adaptive VLC (3.3)
parsing of a given syntax element
DC coefficient
first subset when the transform coefficients (3.75), that are contained in a specific macroblock (3.46) and a specific
component (3.14), are partitioned into 3 subsets
DC-LP array
array of all DC (3.17) and low-pass (3.44) transform coefficients (3.75), for all macroblocks (3.46) associated with a
specific component (3.14)
decoder
embodiment of a parsing process (3.54) and decoding process (3.20)
decoding process
process of computing output sample values from the parsed syntax elements of the codestream (3.13)
dequantization
process of rescaling the quantized transform coefficients (3.75) after their value has been parsed from the
codestream (3.13) and before they are presented to the inverse transform process (3.41)
discriminant
one of DiscrimVal1 or DiscrimVal2, which are the two member variables of an instance of the adaptive VLC (3.3)
data structure
Note 1 to entry: The adaptive VLC data structure is specified in subclause 5.5.5.
encoder
embodiment of an encoding process (3.24)
encoding process
process of converting source sample values into a codestream (3.13)

extended image
image (3.35) produced by the decoding process (3.20) prior to windowing (3.79)
Note 1 to entry: The extended image has a luma (3.45) array that is an integer multiple of 16 in width and height.
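As a rough illustration of the multiple-of-16 property, the sketch below rounds a dimension up to the next multiple of 16. This is only the smallest padding that satisfies the note; how the extended image dimensions are actually derived is specified elsewhere in this document (see 6.3, Image windowing), and the helper name is hypothetical.

```python
# Smallest multiple of 16 that is greater than or equal to n.
# Assumption: illustrative lower bound, not the normative derivation.
def round_up_to_16(n: int) -> int:
    if n < 1:
        raise ValueError("dimension must be positive")
    return (n + 15) & ~15


for width, height in ((1920, 1080), (640, 480), (17, 1)):
    print((width, height), "->", (round_up_to_16(width), round_up_to_16(height)))
```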
file
finite-length sequence of bytes (3.7) that is accessible to a decoder (3.19) in a manner such that the decoder can
obtain access to the data at specified positions within the sequence of bytes
EXAMPLE Access to the data can be achieved by storing the entire sequence of bytes in random access memory or by
performing "position seek" operations to specified positions within the sequence of bytes.
file format
specified structure for the content of a file (3.26)
fixed-length code
code which assigns a finite set of allowable bit patterns to a specific set of values, where each bit pattern has the
same length
FLC refinement
fixed-length coded (3.28) part of a DC coefficient (3.17) or low-pass coefficient (3.44) that is parsed using adaptive
fixed-length codes
flexbits
fixed-length coded (3.28) part of the high-pass coefficient (3.34) information which is parsed using adaptive
fixed-length codes
frequency band
one of three subsets of the transform coefficients (3.75) for an image (3.35), which are separately parsed: DC
coefficients (3.17), low-pass coefficients (3.44) and high-pass coefficients (3.34)
frequency mode
codestream (3.13) structure mode where the DC (3.17), low-pass (3.44), high-pass (3.34) and flexbits (3.30)
frequency bands (3.31) for each tile (3.72) are grouped separately
hard tiles
codestream (3.13) structure mode where the overlap operators are not applied across tile boundaries; instead,
boundary overlap operators are applied at tile boundaries
high-pass coefficient
third subset, when the transform coefficients (3.75) that are contained in a specific macroblock (3.46) and a
specific component (3.14) are partitioned into 3 subsets
image
result of the decoding process (3.20) consisting of a primary image plane (3.57) and an optional alpha image plane
