Electronic imaging -- Guidance for selection of document image compression methods

Imagerie électronique -- Guide pour la sélection des méthodes de compression d'image

General Information

Status: Withdrawn
Publication Date: 21-Nov-2001
Withdrawal Date: 21-Nov-2001
Current Stage: 6060 - International Standard published
Start Date: 01-Nov-2001
Completion Date: 22-Nov-2001

Technical specification
ISO/TS 12033:2001 - Electronic imaging -- Guidance for selection of document image compression methods
English language
12 pages

Standards Content (sample)

TECHNICAL SPECIFICATION ISO/TS 12033
First edition
2001-11-15
Electronic imaging -- Guidance for selection of document image compression methods
Imagerie électronique -- Guide pour la sélection des méthodes de compression d'image
Reference number: ISO/TS 12033:2001(E)
© ISO 2001
© ISO 2001

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 General
5 Type of document and digitization parameters
5.1 General
5.2 Types of documents
5.3 Document classification and digitization
5.3.1 General
5.3.2 Black and white documents
5.3.3 Greyscale documents
5.3.4 Pseudo-grey documents
5.3.5 Colour documents
5.3.6 Mixed documents
6 Compression methods and standards
6.1 RLE compression (Run-Length Encoding)
6.2 LZW compression (Lempel-Ziv-Welch)
6.3 ITU-T algorithms
6.3.1 General
6.3.2 Group 3 one-dimensional method (G3 1D)
6.3.3 Group 3 two-dimensional method (G3 2D) and Group 4 method
6.4 JBIG compression
6.5 JPEG compression
6.5.1 General
6.5.2 Discrete Cosine Transform (DCT)
6.5.3 JPEG steps
6.5.4 Components of JPEG
6.6 Fractal compression
6.7 Wavelet compression
7 Selecting compression parameters
7.1 Pertinence of compression
7.2 Selecting a compression method
7.3 Adjusting JPEG compression
8 Conclusion
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

In other circumstances, particularly when there is an urgent market requirement for such documents, a technical committee may decide to publish other types of normative document:

- an ISO Publicly Available Specification (ISO/PAS) represents an agreement between technical experts in an ISO working group and is accepted for publication if it is approved by more than 50 % of the members of the parent committee casting a vote;

- an ISO Technical Specification (ISO/TS) represents an agreement between the members of a technical committee and is accepted for publication if it is approved by 2/3 of the members of the committee casting a vote.

An ISO/PAS or ISO/TS is reviewed after three years with a view to deciding whether it should be confirmed for a further three years, revised to become an International Standard, or withdrawn. In the case of a confirmed ISO/PAS or ISO/TS, it is reviewed again after six years at which time it has to be either transposed into an International Standard or withdrawn.

Attention is drawn to the possibility that some of the elements of this Technical Specification may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO/TS 12033 was prepared by Technical Committee ISO/TC 171, Document imaging applications, Subcommittee SC 2, Application issues.
Introduction

With the rapid increase in applications using digitization techniques, compression methods have become increasingly important for managing the volume of stored data.

The effects of the available compression methods vary greatly, depending on the source documents. For example, an Electronic Image Management (EIM) system configured for scanning and storing continuous tone images will have different image compression requirements from an application involving only text.

Practical methods for analyzing user requirements for image compression, in order to select accurate and optimal image compression schemes, are complex. This Technical Specification was therefore issued to guide users and system developers in their selection of these methods.
1 Scope

This Technical Specification provides information to enable a user or EIM integrator to make an informed decision on selecting compression methods for digital images of business documents. It is designed to provide technical guidance for analyzing the types of documents and determining which compression methods are most suitable for particular documents, in order to optimize their storage and use.

For the user, this Technical Specification provides information on image compression methods incorporated in hardware or software, to help the user select equipment in which the methods are embedded.

For the equipment or software designer, it provides planning information.

This Technical Specification is applicable only to still images in bit-map mode. It only takes into account compression algorithms based on well-tested mathematical work.
2 Normative references

The following normative documents contain provisions which, through reference in this text, constitute provisions of this Technical Specification. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this Technical Specification are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.

ISO 12651:1999, Electronic imaging -- Vocabulary

ITU-T Recommendation T.4:1999, Standardization of Group 3 facsimile terminals for document transmission

ITU-T Recommendation T.6:1988, Facsimile coding schemes and coding control functions for group 4 facsimile apparatus
3 Terms and definitions

For the purposes of this Technical Specification, the terms and definitions given in ISO 12651 and the following apply.
3.1
lossless compression
compression algorithm that is capable of restoring all of the original information of a compressed image
3.2
lossy compression
compression algorithm which loses some of the original information during compression, so that the decompressed image is only an approximation of the original
NOTE This type of algorithm is especially useful in image compression where details can be eliminated because they are not perceptible, or are minimally perceptible, to the human eye. In this case, the compression ratio is dramatically increased.
3.3
resolution
number of pixels per unit of length
3.4
dots per inch
dpi

number of dots that a scanner (printer) can scan (print) per inch, horizontally or vertically

3.5
brightness
visual sensation that enables an observer to detect luminance
3.6
contrast
difference between the highest and the lowest densities of an image
3.7
bit level
number of bits used to define a pixel
3.8
luminance
luminous flux emitted from a surface
NOTE The former term was photometric brightness.
3.9
chrominance
Cr,Cb

colour portion of the video signal including hue and saturation but not brightness

NOTE Low chroma means the colour picture looks pale or washed out; high chroma means intense colour; black, grey and white have a chrominance equal to zero.
3.10
ITU-T Group 3 and Group 4
standard compression algorithms set by the ITU-T
3.11
Joint Photographic Experts Group
JPEG
popular name of the ISO/IEC 10918 standard
3.12
Comité Consultatif International pour le Télégraphe et le Téléphone
CCITT

former name of the International Telecommunication Union – Telecommunication Standardization sector (ITU-T)

3.13
compression ratio
ratio between image size before compression and image size after compression
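
The definitions of lossless compression (3.1) and compression ratio (3.13) can be illustrated with a minimal sketch. It assumes Python's standard `zlib` module (DEFLATE) purely as a stand-in lossless method, which is not one of the methods covered by this Technical Specification; the helper name `compression_ratio` and the test data are hypothetical.

```python
import zlib

def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Ratio between image size before and after compression (see 3.13)."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return original_size / compressed_size

# Hypothetical test image: 100 rows of 200 bytes, mostly white (0x00)
# with a short black run (0xFF) in the middle of each row.
row = bytes(90) + bytes([0xFF] * 20) + bytes(90)
raw = row * 100

packed = zlib.compress(raw)        # stand-in lossless method (assumption)
restored = zlib.decompress(packed)

# Lossless means the decompressed data is bit-for-bit identical to the input.
is_lossless = restored == raw
ratio = compression_ratio(len(raw), len(packed))
print(f"lossless round trip: {is_lossless}, ratio about {ratio:.0f}:1")
```

With a lossy method the equality check would normally fail, and the comparison would have to be made on perceived quality instead.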
4 General

In a document imaging system, users are concerned about the quality of archived images for two reasons: first, because it can affect the imaging system's future in the medium or even long term; and second, because they must choose the imaging tools based on an evolving technology.

The digitization process, which by nature transforms an image conveying comprehensible information into a dematerialized one, changes the observer's perception of that image. The observer may consider the image improved, though more frequently considers it degraded. In fact, images undergo a number of successive transformations at different points during the digitization process. At each of these stages, attempts are made to keep the image within acceptable legibility limits, but also to restrict its size to within acceptable economic limits.

The specific role of one of the digitization stages, compression, is to reduce the size of the image. Some compression methods are reversible in that the decompression algorithm restores the initial digital information. These methods are lossless and have no impact on the quality of the image as it is perceived by the human eye. Other methods are lossy, and may cause degradation perceptible to the eye. By adjusting parameters, the user can keep the degradation caused by a lossy method within acceptable limits.

While numerous compression methods are described in technical literature, few have been stabilized as industry standards. These are based on a limited number of principles: dominance of certain patterns, pattern repetition, and notable mathematical properties. In any individual method, the number of parameters the user can modify is small.
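
The principle of pattern repetition can be sketched with a simple run-length encoder for one scan line of a bi-level image. This is an illustrative sketch only, not the coding defined by any of the standards discussed in this document; the function names and the sample scan line are hypothetical.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse a row of pixel values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row

# Hypothetical scan line: 0 = white, 1 = black.
scan_line = [0] * 12 + [1] * 5 + [0] * 8 + [1] * 3 + [0] * 4
runs = rle_encode(scan_line)
print(runs)  # [(0, 12), (1, 5), (0, 8), (1, 3), (0, 4)]
```

Here 32 pixels are described by five runs, and decoding the runs restores the line exactly, so the scheme is lossless. The gain depends entirely on how long the runs are in the source document.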

The choice of a method and of its compression parameters is largely determined by the characteristics of the document. Obviously, the graphical contents of a document play a key role in determining the method and its parameters. However, other factors characterizing the application context are also very important (see Figure 1).

A document's graphical contents are themselves important to the digitization process. Thus, a photograph is not digitized in the same way when it is in greyscale as when it is based on a “pseudo-grey” process. In the first case, JPEG compression is used, while the second would require ITU-T or JBIG compression.

Before discussing compression methods, therefore, we need to review the types of documents and how they are represented following digitization.
5 Type of document and digitization parameters
5.1 General

A document is a set of organized information intended for presentation to a human user. Documents can be a single page or a set of pages, and can contain arbitrary content types, such as character content, graphical content, and various types of image content.

The following document content may be found in various types of documents. The classification below is somewhat arbitrary, but for a given application, these distinctions may be used to understand how to handle a given document.
5.2 Types of documents

Here we will present only those documents (generally called “word processing documents”) that are most likely to be archived electronically. These documents include:

- black text on a white background or, less frequently, coloured text or a coloured background;

- photographs, black and white or colour;

- mixed documents containing both text and photographs reproduced by a printing process, in black and white or colour.
Figure 1 — Interactions with the compression method
5.3 Document classification and digitization
5.3.1 General

For the purpose of determining a compression scheme, documents may be described in the following five ways. For each type of document, digitization methods are briefly described.
5.3.2 Black and white documents

Digitizing pages printed in black and white (primarily text) generates bi-level images where each pixel is represented by a bit. This form of representation can also be applied to images in text documents with a coloured background or characters, as well as to line drawings.

The most important digitization parameter is resolution.

Resolution must be determined according to visual perception needs and the limits of the complete imaging process (e.g. 200 dpi for word processing documents, 300 dpi for digitized books).
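
To show how resolution drives the volume of data before any compression is applied, the following sketch estimates the uncompressed size of a bi-level page at the two resolutions mentioned above. The A4 page size, the padding of each row to a whole byte, and the function name `raw_size_bytes` are assumptions made for illustration and are not taken from this Technical Specification.

```python
import math

def raw_size_bytes(width_in: float, height_in: float, dpi: int,
                   bits_per_pixel: int = 1) -> int:
    """Uncompressed size of a scanned page, with each row padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = math.ceil(width_px * bits_per_pixel / 8)
    return bytes_per_row * height_px

# Assumed page size: A4, 210 mm x 297 mm, converted to inches.
A4_WIDTH_IN = 210 / 25.4
A4_HEIGHT_IN = 297 / 25.4

for dpi in (200, 300):
    size = raw_size_bytes(A4_WIDTH_IN, A4_HEIGHT_IN, dpi)
    print(f"{dpi} dpi bi-level A4 page: {size} bytes uncompressed")
```

Under these assumptions the page occupies roughly half a megabyte at 200 dpi and about one megabyte at 300 dpi, because the pixel count grows with the square of the resolution.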

There are also other parameters, related to image processing, which vary according to the kind of image. If we know, for example, that the images to be digitized are text, we will try to produce black characters that are sharply defined against a white background. Thus, we have brightness parameters (adjusting the colour of a pixel against a threshold) and contrast parameters (adjusting the colour of a pixel against that of the surrounding pixels).
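
The two kinds of parameter can be sketched as two ways of deciding whether a pixel becomes black or white. This is a simplified illustration, assuming 8-bit grey input where 0 is black; the function names, the default threshold of 128 and the neighbourhood radius are hypothetical and do not correspond to the settings of any particular scanner.

```python
def binarize_global(grey, threshold=128):
    """Brightness-style decision: compare each pixel with one fixed threshold.
    Returns 1 for black and 0 for white."""
    return [[1 if value < threshold else 0 for value in row] for row in grey]

def binarize_local(grey, radius=1, offset=0):
    """Contrast-style decision: compare each pixel with the mean of its neighbourhood."""
    height, width = len(grey), len(grey[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            window = [
                grey[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out_row.append(1 if grey[y][x] < local_mean - offset else 0)
        result.append(out_row)
    return result

# Hypothetical 8-bit patch: a dark stroke (40) on a light grey background (200).
patch = [
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
]
print(binarize_global(patch))
print(binarize_local(patch))
```

On this clean patch both decisions isolate the same stroke. They would differ on a page with uneven background, where a single fixed threshold can turn a shaded area entirely black while the neighbourhood comparison still follows the characters.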

5.3.3 Greyscale documents

This form of representation is applied to photographic documents, printed on paper from a black and white film.

Digitization changes an initially continuous document into a matrix of pixels whose intensity is encoded in a range of levels. Thus, 8-bit encoding produces 256 grey levels.

The number of grey levels, or the bit level, must be determined according to visual perception needs and the limits of the complete imaging process.
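
The relation between bit level and number of grey levels is simply a power of two, as the short sketch below shows. The function names are hypothetical, and reducing the bit level by dropping low-order bits is only one possible approach, chosen here for brevity.

```python
def grey_levels(bit_level: int) -> int:
    """Number of distinct intensity levels available at a given bit level."""
    return 2 ** bit_level

def requantize(value_8bit: int, bit_level: int) -> int:
    """Reduce an 8-bit intensity (0..255) to the given bit level
    by dropping low-order bits."""
    if not 1 <= bit_level <= 8:
        raise ValueError("bit_level must be between 1 and 8")
    return value_8bit >> (8 - bit_level)

for bits in (1, 4, 8):
    print(f"{bits}-bit encoding: {grey_levels(bits)} levels")

print(requantize(200, 4))  # 200 >> 4 == 12, one of 16 possible levels
```

Each bit removed halves the number of levels and also reduces the uncompressed size of the image, which is why the bit level has to be weighed against visual perception needs.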
5.3.4 Pseudo-grey documents

This category includes images that simulate grey using a variable arrangement of black and white pixels. There can be two cases:

1) the source document is a photographic reproduction in a text; it was produced using a printing technique and is itself a pseudo-grey document (rastering uses black pixels of variable size);

2) the source document is a true photograph, but was digitized in pseudo-grey for performance reasons: to reduce the storage volume or transmission times on a network (the “half-tone” technique involves arranging a variable number of black pixe
...
