ISO/IEC 18477-1:2015
Information technology — Scalable compression and coding of continuous-tone still images — Part 1: Scalable compression and coding of continuous-tone still images
ISO/IEC 18477-1:2015 specifies a coding format, referred to as JPEG XT, which is designed primarily for continuous-tone photographic content.
Technologies de l'information — Compression échelonnable et codage d'images plates en ton continu — Partie 1: Codage des images à gamme dynamique élevée
INTERNATIONAL STANDARD ISO/IEC 18477-1
First edition
2015-06-15
Information technology — Scalable
compression and coding of
continuous-tone still images —
Part 1:
Scalable compression and coding of
continuous-tone still images
Technologies de l’information — Compression échelonnable et codage
d’images plates en ton continu —
Partie 1: Codage des images à gamme dynamique élevée
Reference number: ISO/IEC 18477-1:2015(E)
© ISO/IEC 2015
COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2015, Published in Switzerland
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form
or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior
written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of
the requester.
ISO copyright office
Ch. de Blandonnet 8 • CP 401
CH-1214 Vernier, Geneva, Switzerland
Tel. +41 22 749 01 11
Fax +41 22 749 09 47
copyright@iso.org
www.iso.org
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviated terms
4.1 Symbols
4.2 Abbreviated terms
5 Conventions
5.1 Conformance language
5.2 Operators
5.2.1 Arithmetic operators
5.2.2 Assignment operators
5.2.3 Precedence order of operators
5.2.4 Mathematical functions
6 General
6.1 General definitions
6.2 Functional overview on the decoding process
6.3 Encoder requirements
6.4 Decoder requirements
Annex A (normative) Component subsampling and expansion of subsampling
Annex B (normative) Codestream syntax
Annex C (normative) Multi-component decorrelation
Bibliography
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
Details of any patent rights identified during the development of the document will be in the Introduction
and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation on the meaning of ISO specific terms and expressions related to conformity
assessment, as well as information about ISO’s adherence to the WTO principles in the Technical Barriers
to Trade (TBT) see the following URL: Foreword - Supplementary information
The committee responsible for this document is ISO/IEC JTC 1, Information technology, SC 29, Coding of
audio, picture, multimedia and hypermedia information.
ISO/IEC 18477 consists of the following parts, under the general title: JPEG HDR image coding system:
— Part 1: Coding of high dynamic range images
— Part 2: Extensions for high dynamic range images
— Part 3: Box file format
— Part 6: IDR Integer Coding
— Part 7: HDR Floating-Point Coding
The following parts are under preparation:
— Part 4: Conformance testing
— Part 5: Reference software
— Part 8: Coding of high dynamic range images
— Part 9: Encoding of alpha channels
Introduction
This part of ISO/IEC 18477 specifies a coded codestream format for storage of continuous-tone
photographic content. JPEG XT is a scalable image coding system that builds on top of the legacy
Rec. ITU-T T.81 | ISO/IEC 10918-1 coding system, also known as JPEG, but extends it in a backwards
compatible way. This part of ISO/IEC 18477 specifies the commonly deployed components of the JPEG
coding system. Additional parts of ISO/IEC 18477 will extend on this baseline.
JPEG XT has been designed to be backwards compatible to legacy applications while at the same time
having a small coding complexity; JPEG XT uses, whenever possible, functional blocks of Rec. ITU-T T.81 |
ISO/IEC 10918-1, Rec. ITU-T T.86 | ISO/IEC 10918-4 and Rec. ITU-T T.871 | ISO/IEC 10918-5 to extend the
functionality of the legacy JPEG Coding System. It is optimized for good image quality and compression
efficiency while also enabling low-complexity encoding and decoding implementations.
INTERNATIONAL STANDARD ISO/IEC 18477-1:2015(E)
Information technology — Scalable compression and
coding of continuous-tone still images —
Part 1:
Scalable compression and coding of continuous-tone still
images
1 Scope
This part of ISO/IEC 18477 specifies a coding format, referred to as JPEG XT, which is designed primarily
for continuous-tone photographic content.
2 Normative references
The following documents, in whole or in part, are normatively referenced in this document and are
indispensable for its application. For dated references, only the edition cited applies. For undated
references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 10918-1, Information technology — Digital compression and coding of continuous-tone still images
— Requirements and guidelines
ISO/IEC 10918-4, Information technology — Digital compression and coding of continuous-tone still
images: Registration of JPEG profiles, SPIFF profiles, SPIFF tags, SPIFF colour spaces, APPn markers, SPIFF
compression types, and Registration Authorities (REGAUT)
ISO/IEC 10918-5, Information technology — Digital compression and coding of continuous-tone still images:
JPEG File Interchange Format (JFIF)
3 Terms and definitions
For the purposes of this document, the following definitions apply.
3.1
bit stream
partially encoded or decoded sequence of bits comprising an entropy-coded segment
3.2
block
8 × 8 array of samples or an 8 × 8 array of DCT coefficient values of one component
3.3
byte
group of 8 bits
3.4
coder
embodiment of a coding process
3.5
coding
encoding or decoding
3.6
compression
reduction in the number of bits used to represent source image data
3.7
component
two-dimensional array of samples having the same designation in the output or display device
Note 1 to entry: An image typically consists of several components, e.g. red, green and blue.
3.8
continuous-tone image
image whose components have more than one bit per sample
3.9
discrete cosine transform
DCT
either the forward discrete cosine transform or the inverse discrete cosine transform
3.10
downsampling
procedure by which the spatial resolution of a component is reduced
3.11
entropy-coded (data) segment
independently decodable sequence of entropy encoded bytes of compressed image data
3.12
entropy decoder
embodiment of an entropy decoding procedure
3.13
entropy decoding
lossless procedure which recovers the sequence of symbols from the sequence of bits produced by the
entropy encoder
3.14
entropy encoder
embodiment of an entropy encoding procedure
3.15
entropy encoding
lossless procedure which converts a sequence of input symbols into a sequence of bits such that the
average number of bits per symbol approaches the entropy of the input symbols
3.16
grayscale image
continuous-tone image that has only one component
3.17
joint photographic experts group
JPEG
informal name of the committee which created this International Standard
Note 1 to entry: The “joint” comes from the ITU-T and ISO/IEC collaboration.
3.18
legacy decoder
embodiment of a decoding process conforming to Rec. ITU-T T.81 | ISO/IEC 10918-1 operating in baseline,
sequential or progressive mode with a sample resolution of eight bits
3.19
marker
two-byte code in which the first byte is hexadecimal FF and the second byte is a value between 1 and
hexadecimal FE
3.20
marker segment
marker and associated set of parameters
3.21
pixel
collection of sample values in the spatial image domain having all the same sample coordinates
Note 1 to entry: A pixel may consist of three samples describing its red, green and blue value.
3.22
precision
number of bits allocated to a particular sample or DCT coefficient
3.23
procedure
set of steps which accomplishes one of the tasks which comprise an encoding or decoding process
3.24
quantization value
integer value used in the quantization procedure
3.25
quantize
act of performing the quantization procedure for a DCT coefficient
3.26
sample
one element in the two-dimensional array which comprises a component
3.27
sample grid
common coordinate system for all samples of an image, with the samples at the top left edge of the image
having the coordinates (0, 0); the first coordinate increases towards the right, the second towards the bottom
3.28
scan
single pass through the data for one or more of the components in an image
3.29
scan header
marker segment that contains a start-of-scan marker and associated scan parameters that are coded at
the beginning of a scan
3.30
upsampling
procedure by which the spatial resolution of a component is increased
3.31
vertical sampling factor
relative number of vertical data units of a particular component with respect to the number of vertical
data units in the other components in the frame
4 Symbols and abbreviated terms
4.1 Symbols
X            width of the sample grid in positions
Y            height of the sample grid in positions
Nf           number of components in an image
$s_{i,x}$    subsampling factor of component i in horizontal direction
$s_{i,y}$    subsampling factor of component i in vertical direction
$H_i$        subsampling indicator of component i in the frame header
$V_i$        subsampling indicator of component i in the frame header
$v_{x,y}$    sample value at the sample grid position x, y
4.2 Abbreviated terms
ASCII American Standard Code for Information Interchange
DC Lowpass
AC Highpass
LSB Least Significant Bit
MSB Most Significant Bit
DCT Discrete Cosine Transformation
5 Conventions
5.1 Conformance language
The keyword “reserved” indicates a provision that is not specified at this time, shall not be used, and
may be specified in the future. The keyword “forbidden” indicates “reserved” and in addition indicates
that the provision will never be specified in the future.
5.2 Operators
NOTE Many of the operators used in this part of ISO/IEC 18477 are similar to those used in the C
programming language.
5.2.1 Arithmetic operators
+ Addition
− Subtraction (as a binary operator) or negation (as a unary prefix operator)
× Multiplication
/ Division without truncation or rounding
5.2.2 Assignment operators
= Assignment operator
5.2.3 Precedence order of operators
Operators are listed below in descending order of precedence. If several operators appear in the same
line, they have equal precedence. When several operators of equal precedence appear at the same level
in an expression, evaluation proceeds according to the associativity of the operator either from right to
left or from left to right.
Operators        Type of operation           Associativity
(), [ ], .       Expression                  Left to Right
−                Unary negation
×, /             Multiplication              Left to Right
+, −             Addition and Subtraction    Left to Right
<, >, <=, >=     Relational                  Left to Right
5.2.4 Mathematical functions
⌈x⌉                  Ceiling of x. Returns the smallest integer that is greater than or equal to x.
⌊x⌋                  Floor of x. Returns the largest integer that is less than or equal to x.
|x|                  Absolute value, is −x for x < 0, otherwise x.
sign(x)              Sign of x, zero if x is zero, +1 if x is positive, −1 if x is negative.
clamp(x, min, max)   Clamps x to the range [min, max]: returns min if x < min, max if x > max or otherwise x.
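As an illustration only, the functions of 5.2.4 could be written in Python roughly as follows. The names `sign` and `clamp` mirror the notation above; the parameter names `lo` and `hi` are hypothetical stand-ins for min and max, and mapping ceiling and floor to `math.ceil` and `math.floor` is an assumption of this sketch, not part of the standard.

```python
import math


def sign(x):
    """Return 0 if x is zero, +1 if x is positive, -1 if x is negative."""
    return (x > 0) - (x < 0)


def clamp(x, lo, hi):
    """Clamp x to the range [lo, hi] (lo and hi play the role of min and max)."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def ceiling(x):
    """Smallest integer greater than or equal to x."""
    return math.ceil(x)


def floor(x):
    """Largest integer less than or equal to x."""
    return math.floor(x)
```

A typical use would be limiting a reconstructed sample to an eight-bit range with `clamp(value, 0, 255)`.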
6 General
6.1 General definitions
The purpose of this clause is to give an informative overview of the elements specified in this part of
ISO/IEC 18477. Another purpose is to introduce many of the terms which are defined in Clause 3. These
terms are printed in italics upon first usage in this clause.
There are three elements specified in this part of ISO/IEC 18477:
a) An encoder is an embodiment of an encoding process. An encoder takes as input digital source image data
and encoder specifications, and by means of a specified set of procedures generates as output a codestream.
b) A decoder is an embodiment of a decoding process. A decoder takes as input a codestream, and by
means of a specified set of procedures generates as output digital reconstructed image data.
c) The codestream is a compressed image data representation which includes all necessary data to
allow a (full or approximate) reconstruction of the sample values of a digital image. Additional data
might be required that define the interpretation of the sample data, such as the spatial dimensions
of the samples.
6.2 Functional overview on the decoding process
The high-level algorithm for decoding is as follows: The samples are first reconstructed following the
decoder specifications defined in Rec. ITU-T T.81 | ISO/IEC 10918-1. If the resulting component arrays
are subsampled, they are upsampled on a common sample grid following the specifications in Annex A.
Following that, the output data is processed by an inverse decorrelation transformation. If the data
is already in an RGB type colour space, e.g. RGB with ITU-R Recommendation BT.601 primaries, this
transformation will be the identity transformation. Otherwise, the ICT is used to transform the
data into RGB. The inverse decorrelation transformation is defined in Annex C, and the markers that are
required to select the transformation are defined in Annex B.
6.3 Encoder requirements
An encoding process converts source image data to compressed image data. This includes first generating
a low dynamic range image, and representing it by a coding process specified in Annex F or Annex G of
Rec. ITU-T T.81 | ISO/IEC 10918-1, and then generating a residual image which is encoded by one of the
processes defined in this part of ISO/IEC 18477.
In order to comply with this part of ISO/IEC 18477, an encoder shall satisfy at least one of the following
two requirements. An encoder shall, with appropriate accuracy, convert source image data to compressed
image data which comply with the codestream format syntax specified in Annex B for the encoding
process(es) embodied by the encoder. A limited accuracy sufficient to match the error bounds specified
in the compliance tests is acceptable.
There is no requirement in this part of ISO/IEC 18477 that any encoder which embodies one of the
encoding processes specified here shall be able to operate for all ranges of the parameters which are
allowed for that process. An encoder is only required to meet the compliance tests and to generate the
compressed data format according to Annex B for those parameter values which it does use.
6.4 Decoder requirements
A decoding process converts compressed image data to reconstructed image data. For that, it has to
follow the decoding operation specified in Rec. ITU-T T.81 | ISO/IEC 10918-1 with sufficient accuracy,
using either the Baseline, Sequential or Progressive Scan process of the earlier Recommendation |
International Standard defined there in Annexes F and G. This process generates sample values on a
sample grid, which are then converted into a digital image by following the upsampling specifications in
Annex A and the multi-component decorrelation (ICT) process in Annex C.
Annex A
(normative)
Component subsampling and expansion of subsampling
In this Annex and all of its subclauses, the flow charts and tables are normative only in the sense that
they are defining an output that alternative implementations shall duplicate.
A.1 Component dimensions and subsampling factors (normative)
An image is defined to consist of Nf components, each of which is identified by a unique identifier $C_i$
defined in the frame header of the codestream format specified in Annex B. The number of components
Nf shall be either one or three. A component consists of a rectangular array of samples, $x_i$ samples wide
and $y_i$ samples high. The component dimensions are derived from the image dimensions X and Y, which are
also parameters recorded in the frame header. These two parameters define a sample grid X grid points
wide and Y grid points high, where the top-left grid coordinate is (0, 0) and coordinates increase
from left to right and from top to bottom. However, the dimensions of a component do not need to
coincide with the dimensions of the image. For each component, two subsampling factors $s_{i,x}$ and $s_{i,y}$
define the spacing between sample points of component i relative to the sample grid and the size of
the component array. If X and Y are the dimensions of the sample grid, the size of component i with
subsampling factors $s_{i,x}$ and $s_{i,y}$ is
$$\lceil X / s_{i,x} \rceil \quad \text{and} \quad \lceil Y / s_{i,y} \rceil$$
Upsampling by interpolation from surrounding samples, as specified in Annex A, then generates sample
values on all grid points of the sample grid.
The subsampling factors $s_{i,x}$ and $s_{i,y}$ are not directly represented in the binary codestream or any of its
markers, but shall be derived from the parameters $H_i$ and $V_i$ recorded in the frame header. If Nf equals
one, i.e. the image consists of a single component, $H_1$ and $V_1$ shall be one, and $s_{1,x}$ and $s_{1,y}$ are both one.
If Nf equals three, Table A.1 defines the relation between $H_i$, $V_i$ and $s_{i,x}$ and $s_{i,y}$. No other combinations
of $H_i$ and $V_i$ than those listed in Table A.1 shall be used.
Table A.1 — Subsampling values
$H_1$  $V_1$  $H_2$  $V_2$  $H_3$  $V_3$  $s_{1,x}$  $s_{1,y}$  $s_{2,x}$  $s_{2,y}$  $s_{3,x}$  $s_{3,y}$
1      1      1      1      1      1      1          1          1          1          1          1
2      2      2      1      2      1      1          1          1          2          1          2
2      2      1      2      1      2      1          1          2          1          2          1
2      2      1      1      1      1      1          1          2          2          2          2
All other values reserved for ITU/ISO purposes.
NOTE Rec. ITU-T T.81 | ISO/IEC 10918-1 allowed other component arrangements and relations between grid
positions and sample positions that are not valid in this part of ISO/IEC 18477. However, the definitions given here
are special cases of the more general relations provided in Rec. ITU-T T.81 | ISO/IEC 10918-1 and both definitions
agree whenever both are defined.
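The lookup described in A.1 can be sketched in Python as follows. This is a minimal illustration, not a normative procedure: the names `TABLE_A1`, `subsampling_factors` and `component_size` are hypothetical, and the component size is assumed to round up (ceiling), in line with the more general relations of Rec. ITU-T T.81 | ISO/IEC 10918-1.

```python
# Rows of Table A.1: ((H1, V1), (H2, V2), (H3, V3)) -> ((s1x, s1y), (s2x, s2y), (s3x, s3y))
TABLE_A1 = {
    ((1, 1), (1, 1), (1, 1)): ((1, 1), (1, 1), (1, 1)),
    ((2, 2), (2, 1), (2, 1)): ((1, 1), (1, 2), (1, 2)),
    ((2, 2), (1, 2), (1, 2)): ((1, 1), (2, 1), (2, 1)),
    ((2, 2), (1, 1), (1, 1)): ((1, 1), (2, 2), (2, 2)),
}


def subsampling_factors(hv):
    """Map the frame header pairs (H_i, V_i) to the factors (s_ix, s_iy)."""
    hv = tuple(tuple(pair) for pair in hv)
    if len(hv) == 1:
        # A single component shall use H_1 = V_1 = 1.
        if hv[0] != (1, 1):
            raise ValueError("single component requires H = V = 1")
        return ((1, 1),)
    if hv not in TABLE_A1:
        # Every combination not listed in Table A.1 is reserved.
        raise ValueError("reserved combination of H_i and V_i")
    return TABLE_A1[hv]


def component_size(X, Y, sx, sy):
    """Size of a component on an X by Y grid, assuming ceiling division."""
    return -(-X // sx), -(-Y // sy)
```

For example, the last row of Table A.1 subsamples both chroma components by two in each direction, so on a 5 by 3 sample grid those components would hold 3 by 2 samples under the rounding assumption above.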
A.2 Expansion of subsampled components
Whenever the subsampling factors $s_{i,x}$ and $s_{i,y}$ are not both one, interpolation is required to populate
all grid positions of the image sample grid. The precise algorithm of how this interpolation has to be
performed is not specified in this part of ISO/IEC 18477 and various implementation choices exist. This
subclause provides an example for a simple bi-linear interpolation that is sufficient for most applications,
though more sophisticated choices may provide better subjective image quality.
A.3 Bilinear expansion of subsampled components (informative)
In a first step, check for each component i whether $s_{i,y}$ is two or one. If $s_{i,y}$ is one, perform no activity. If
$s_{i,y}$ equals two, interpolate samples v at odd lines from even lines as follows:
$$v_{x,2y+1} = \left[ (v_{x,2y} + v_{x,2y+2} + (y \bmod 2)) / 2 \right]$$
If 2y+2 is larger than or equal to Y, the vertical sample grid dimension, set $v_{x,2y+1}$ equal to $v_{x,2y}$.
In a second step, check for each component i whether $s_{i,x}$ is two or one. If $s_{i,x}$ is one, no further activity
is required and the algorithm terminates. Otherwise, if $s_{i,x}$ is equal to two, interpolate samples at odd
positions from samples at even positions accordingly:
$$v_{2x+1,y} = \left[ (v_{2x,y} + v_{2x+2,y} + (x \bmod 2)) / 2 \right]$$
Again, if 2x+2 is larger than or equal to X, the horizontal sample grid dimension, set $v_{2x+1,y}$ to $v_{2x,y}$.
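The two interpolation steps above can be sketched in Python as follows. This is an informal illustration under stated assumptions: the square brackets in the formulas are read as rounding down to an integer, a component is stored as a list of rows, and the names `expand_line` and `expand_component` are hypothetical.

```python
def expand_line(line, length):
    """Expand samples at even grid positions to `length` grid positions."""
    out = []
    for g in range(length):
        k, odd = divmod(g, 2)
        if not odd or 2 * k + 2 >= length:
            # Even position, or no right/lower neighbour inside the grid:
            # copy the sample at position 2k.
            out.append(line[k])
        else:
            # Odd position: mean of both neighbours, with (k mod 2) as the
            # alternating rounding term; floor division is an assumption.
            out.append((line[k] + line[k + 1] + (k % 2)) // 2)
    return out


def expand_component(samples, X, Y, sx, sy):
    """Upsample one component (list of rows) onto the X by Y sample grid."""
    rows = [list(r) for r in samples]
    if sy == 2:
        # First step: interpolate odd lines, column by column.
        cols = [expand_line(list(col), Y) for col in zip(*rows)]
        rows = [list(r) for r in zip(*cols)]
    if sx == 2:
        # Second step: interpolate odd positions within each line.
        rows = [expand_line(r, X) for r in rows]
    return rows
```

Applying the vertical step before the horizontal step follows the order given in the text; other interpolation filters remain equally valid because this subclause is informative.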
A.4 Downsampling of components (informative)
This part of ISO/IEC 18477 does not define a normative procedure by which the resolution of components
whose $s_{i,x}$ and $s_{i,y}$ factors are not both one shall be reduced. Any procedure that generates components
of the size $\lceil X / s_{i,x} \rceil$ and $\lceil Y / s_{i,y} \rceil$ is acceptable as long as it is compatible with the upsampling procedure
defined above. A very simple downsampling filter is given in the next subclause.
A.5 Downsampling by a box filter (informative)
The box filter is the simplest possible downsampling filter and provides only poor quality. Even though
better alternatives exist, the box filter is nevertheless presented here as an example. The input of the box
filter is an X×Y component array of samples, where the sample value at position x, y is denoted by $v_{x,y}$. Now set
$$x_{min} := s_{i,x} \times x_s, \quad x_{max} := \min(s_{i,x} \times x_s + s_{i,x} - 1,\; X - 1)$$
$$y_{min} := s_{i,y} \times y_s, \quad y_{max} := \min(s_{i,y} \times y_s + s_{i,y} - 1,\; Y - 1)$$
The output of the box filter at position $x_s, y_s$ is then defined as
$$v^s_{x_s,y_s} := \left( \sum_{x=x_{min}}^{x_{max}} \sum_{y=y_{min}}^{y_{max}} v_{x,y} \right) / \left( (x_{max} - x_{min} + 1) \times (y_{max} - y_{min} + 1) \right)$$
i.e. the average over the box $x_{min}, y_{min}$ to $x_{max}, y_{max}$. The array of downsampled sample values $v^s_{x_s,y_s}$ is
then subject to further processing, e.g. DCT transformation and entropy coding.
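The box filter can be sketched in Python as follows. This is a minimal illustration, assuming that the input is a list of rows indexed as `v[y][x]`, that the output has the rounded-up component size, and that the average is taken over the number of samples actually inside the box. The name `box_downsample` is hypothetical, and the result is left as an unrounded average because the text does not state a rounding rule.

```python
def box_downsample(v, X, Y, sx, sy):
    """Average each sx by sy box of the X by Y array v (indexed v[y][x])."""
    out_w = -(-X // sx)  # ceiling division, assumed component width
    out_h = -(-Y // sy)  # ceiling division, assumed component height
    out = []
    for ys in range(out_h):
        y_min = sy * ys
        y_max = min(sy * ys + sy - 1, Y - 1)
        row = []
        for xs in range(out_w):
            x_min = sx * xs
            x_max = min(sx * xs + sx - 1, X - 1)
            total = sum(
                v[y][x]
                for y in range(y_min, y_max + 1)
                for x in range(x_min, x_max + 1)
            )
            # Number of samples in the (possibly clipped) box.
            count = (x_max - x_min + 1) * (y_max - y_min + 1)
            row.append(total / count)
        out.append(row)
    return out
```

Boxes at the right and bottom edges are clipped to the grid by the `min` terms, so they average fewer samples than interior boxes.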
Annex B
(normative)
Codestream syntax
This Annex defines the compressed bitstream syntax which, structurally, consists of an ordered
collection of marker segments and entropy coded data segments. Marker segments specify parameters
necessary to reconstruct the sample values from the entropy coded data segments. Because all of these
constituent parts are represented with byte-aligned codes, each compressed data format consists of an
ordered sequence of 8-bit bytes. For each byte, a most significant bit (MSB) and a least significant bit
(LSB) are defined.
NOTE The codestream syntax defined here agrees mostly with the “interchange format” defined in Rec.
ITU-T T.81 | ISO/IEC 10918-1, with some additional constraints on the parameters in the marker segments and
some additional markers carrying information that is irrelevant for the older standard.
B.1 Parameters
Parameters are integers, with values specific to the encoding process, source image characteristics, and
other features selectable by the application. Parameters are assigned either 4-bit, 1-byte, 2-byte or 4-byte
codes. Except for certain optional groups of parameters, parameters encode critical information without
which the decoding process cannot properly reconstruct the image. The code assignment for a parameter
shall be an unsigned integer of the specified length in bits with the particular value of the parameter.
For parameters which are 2 bytes (16 bits) in length, the most significant byte shall come first in the
compressed data’s ordered sequence of bytes. The same holds for parameters that are 4 bytes (32 bits)
in length, where bits are ordered in the codestream in decreasing significance. Parameters which are
4 bits in length always come in pairs, and the pair shall always be encoded in a single byte. The first 4-bit
parameter of the pair shall occupy the most significant 4 bits of the byte. Within any 32-, 16-, 8-, or 4-bit
parameter, the MSB shall come first and LSB shall come last. This encoding is commonly known as “big
endian” representation of unsigned integers.
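The byte order rules above can be illustrated with a short Python sketch. The helper names `read_u16`, `read_u32` and `read_nibble_pair` are hypothetical and not part of the standard; the sketch only assumes that the codestream is available as a `bytes` object.

```python
def _take(data, pos, size):
    """Return `size` bytes starting at `pos`, or raise if the data is too short."""
    chunk = data[pos:pos + size]
    if len(chunk) != size:
        raise ValueError("parameter extends past the end of the data")
    return chunk


def read_u16(data, pos):
    """Read a 2-byte unsigned parameter, most significant byte first."""
    return int.from_bytes(_take(data, pos, 2), "big")


def read_u32(data, pos):
    """Read a 4-byte unsigned parameter, most significant byte first."""
    return int.from_bytes(_take(data, pos, 4), "big")


def read_nibble_pair(data, pos):
    """Read two 4-bit parameters; the first occupies the high four bits."""
    byte = _take(data, pos, 1)[0]
    return byte >> 4, byte & 0x0F
```

For instance, the byte 0x21 decodes to the pair (2, 1), and the two bytes 0x01 0x02 decode to the 16-bit value 258.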
B.2 Markers
Markers serve to identify the various structural parts of the compressed data formats. Most markers
start marker segments containing a related group of parameters; some markers stand alone. All markers
are assigned two-byte codes: an 0xff byte followed by a byte which is not equal to 0x00 or 0xff. Any
marker may optionally be preceded by any number of fill bytes, which are bytes of the value 0xff.
NOTE Because of this special code-assignment structure, markers make it possible for a decoder to parse the
compressed data and locate its various parts without having to decode other segments of image data.
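How a decoder might locate markers under these rules can be sketched as follows. This is an illustrative scan only: the function name `next_marker` is hypothetical, and the sketch simply skips any 0xff byte followed by 0x00, since that pair is not a marker by the definition above.

```python
def next_marker(data, pos=0):
    """Find the next marker at or after `pos`.

    Returns (marker_code, offset_after_marker), or None if no marker is found.
    """
    n = len(data)
    while pos + 1 < n:
        if data[pos] != 0xFF:
            pos += 1
            continue
        # Skip optional fill bytes (additional 0xff bytes before the marker).
        nxt = pos + 1
        while nxt < n and data[nxt] == 0xFF:
            nxt += 1
        if nxt >= n:
            return None
        if data[nxt] != 0x00:
            # 0xff followed by a byte that is neither 0x00 nor 0xff.
            return 0xFF00 | data[nxt], nxt + 1
        # 0xff 0x00 is not a marker; continue after it.
        pos = nxt + 1
    return None
```

The scan never decodes entropy-coded data; it relies only on the two-byte code structure, which is the property the NOTE above describes.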
B.3 Marker assignments
All markers shall be assigned two-byte codes: a 0xff byte followed by a second byte w
...