ISO/IEC 22091:2002
Information technology -- Streaming Lossless Data Compression algorithm (SLDC)
ISO/IEC 22091:2002 specifies a lossless compression algorithm to reduce the number of 8-bit bytes required to represent data records and File Marks. The algorithm is known as the Streaming Lossless Data Compression algorithm (SLDC). ISO/IEC 22091:2002 is based on ISO/IEC 15200. It extends that algorithm with control symbols that allow records of different sizes and compressibility, along with File Marks, to be efficiently encoded into an output stream that requires little or no additional control information for later decoding. The numerical identifier according to ISO/IEC 11576 allocated to this algorithm is 6.
Technologies de l'information -- Algorithme de compression sans perte de données en mode continu (SLDC)
Standards Content (sample)
INTERNATIONAL STANDARD ISO/IEC 22091
First edition 2002-09-15
Information technology -- Streaming Lossless Data Compression algorithm (SLDC)
Technologies de l’information -- Algorithme de compression sans perte de données en mode continu (SLDC)
Reference number ISO/IEC 22091:2002(E)
© ISO/IEC 2002
© ISO/IEC 2002
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body
in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.ch
Web www.iso.ch
Printed in Switzerland
Contents
1 Scope
2 Conformance
3 Normative reference
4 Terms and definitions
4.1 Access Point
4.2 Control Symbol
4.3 Copy Pointer
4.4 data byte
4.5 Data Symbol
4.6 Displacement Field
4.7 Encoded Data Stream
4.8 Encoded Record
4.9 End Marker
4.10 End Of Record Symbol (EOR Symbol)
4.11 File Mark
4.12 File Mark Symbol
4.13 Flush Symbol
4.14 History Buffer
4.15 Literal 1
4.16 Literal 2
4.17 Matching String
4.18 Match Count
4.19 Match Count Field
4.20 Pad
4.21 Record
4.22 Record Segment
4.23 Reset X Symbol
4.24 Reset 1 Symbol
4.25 Reset 2 Symbol
4.26 scheme 1
4.27 Scheme 1 Symbol
4.28 scheme 2
4.29 Scheme 2 Symbol
4.30 user data
5 Conventions and Notations
5.1 Representation of numbers
5.2 Names
6 Acronyms
7 Algorithm Overview
7.1 Scheme 1 Encoding
7.2 Scheme 2 Encoding
7.3 History Buffer
8 Encoding Specification
8.1 User Data
8.2 History Buffer
8.3 Encoded Data Stream
8.3.1 Access Point
8.4 Data Symbols
8.4.1 Literal 1 Data Symbols
8.4.2 Copy Pointer Data Symbols
8.4.3 Literal 2 Data Symbols
8.5 Control Symbols
8.6 Pad
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the
specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the
development of International Standards through technical committees established by the respective organization to deal with
particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the
field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 3.
The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by
the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires
approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this International Standard may be the subject of patent rights.
ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 22091 was prepared by ECMA (as ECMA-321) and was adopted, under a special “fast-track procedure”, by Joint
Technical Committee ISO/IEC JTC 1, Information Technology, in parallel with its approval by national bodies of ISO and IEC.
INTERNATIONAL STANDARD ISO/IEC 22091:2002(E)
Information technology — Streaming Lossless Data Compression algorithm (SLDC)
1 Scope
This International Standard specifies a lossless compression algorithm to reduce the number of 8-bit bytes required to represent
data records and File Marks. The algorithm is known as Streaming Lossless Data Compression algorithm (SLDC).
One buffer size (1 024 bytes) is specified.
The numerical identifier according to ISO/IEC 11576 allocated to this algorithm is 6.
2 Conformance
A compression algorithm shall be in conformance with this International Standard if its Encoded Data Stream satisfies the
requirements of this International Standard.
3 Normative reference
The following normative document contains provisions which, through reference in this text, constitute provisions of this
International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply.
However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying
the most recent editions of the normative document indicated below. For undated references, the latest edition of the normative
document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.
ISO/IEC 11576:1994 Information technology -- Procedure for the registration of algorithms for the lossless compression
of data
4 Terms and definitions
For the purpose of this International Standard the following terms and definitions apply.
4.1 Access Point
A location in the Encoded Data Stream at which data may be decoded.
4.2 Control Symbol
A Control Symbol may change the compression scheme, reset the History Buffer, mark the end of a Record, indicate a File
Mark, or indicate the termination of an Encoded Data Stream.
4.3 Copy Pointer
A part of the Encoded Data Stream output in scheme 1 that replaces a string of data bytes with a specification of a Matching
String.
4.4 data byte
An element of user data that is to be encoded.
4.5 Data Symbol
An element of an Encoded Record that represents one or more data bytes.
4.6 Displacement Field
A field in the Copy Pointer that specifies the location within the History Buffer of the first byte of a Matching String.
4.7 Encoded Data Stream
The output stream after encoding User Data.
4.8 Encoded Record
The output stream after encoding one Record of user data.
4.9 End Marker
A Control Symbol that denotes termination of an Encoded Data Stream.
4.10 End Of Record Symbol (EOR Symbol)
A Control Symbol that denotes the end of a Record in the Encoded Data Stream.
4.11 File Mark
A recorded element used to mark organisational boundaries (e.g. directory boundaries) in user data.
4.12 File Mark Symbol
A Control Symbol in the Encoded Data Stream that denotes a File Mark in user data.
4.13 Flush Symbol
A Control Symbol that, if required, is followed by Pad to make the size of the Encoded Data Stream an integer multiple of
32 bits.
4.14 History Buffer
A data structure where incoming data bytes are stored for use by scheme 1 compression and decompression.
4.15 Literal 1
A part of the Encoded Data Stream, output in scheme 1, that represents a single data byte not encoded into any Copy Pointer.
4.16 Literal 2
A part of the Encoded Data Stream, output in scheme 2, that represents a single data byte.
4.17 Matching String
A sequence of two or more bytes in the History Buffer that is identical with a sequence of bytes in the user data.
4.18 Match Count
The length, in bytes, of a Matching String.
4.19 Match Count Field
That part of a Copy Pointer that specifies the Match Count.
4.20 Pad
A number of bits inserted into the Encoded Data Stream so that the size of the Encoded Data Stream is an integer multiple of
32 bits.
4.21 Record
An element of user data that contains at least one data byte.
4.22 Record Segment
A section of a Record encoded in a given scheme.
4.23 Reset X Symbol
A generic reference to either the Reset 1 Symbol or the Reset 2 Symbol.
4.24 Reset 1 Symbol
A Control Symbol that indicates History Buffer reset, and that subsequent symbols are encoded in scheme 1.
4.25 Reset 2 Symbol
A Control Symbol that indicates History Buffer reset, and that subsequent symbols are encoded in scheme 2.
4.26 scheme 1
A compression scheme that uses a History Buffer to achieve data compression.
4.27 Scheme 1 Symbol
A Control Symbol that indicates subsequent Data Symbols are either Copy Pointers or Literal 1s.
4.28 scheme 2
A packing scheme designed to encode uncompressible data with minimal expansion.
4.29 Scheme 2 Symbol
A Control Symbol that indicates subsequent Data Symbols are encoded in scheme 2.
4.30 user data
Information that is to be encoded, according to this compression algorithm.
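As an informal illustration of the Pad definition in 4.20 (and of the Flush Symbol in 4.13), the number of Pad bits needed to bring an Encoded Data Stream up to an integer multiple of 32 bits can be computed as sketched below. The function name pad_length and the constant WORD_BITS are hypothetical and not part of this International Standard. The sketch gives only the count of Pad bits and assumes nothing about their values, which this excerpt does not state.

```python
WORD_BITS = 32  # 4.20: the stream size must be an integer multiple of 32 bits


def pad_length(stream_bits: int) -> int:
    """Return the number of Pad bits needed after stream_bits bits.

    Hypothetical helper for illustration; it returns 0 when the stream
    is already a multiple of 32 bits long.
    """
    if stream_bits < 0:
        raise ValueError("stream_bits must be non-negative")
    return (-stream_bits) % WORD_BITS
```

For example, a stream of 45 bits would need 19 Pad bits to reach 64 bits, while a stream of exactly 32 bits would need none.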
5 Conventions and Notations
5.1 Representation of numbers
The following conventions and notations apply in this document unless otherwise stated.
− The setting of bits is denoted by ZERO or ONE.
− Numbers in binary notation and bit combinations are strings of digits represented by ZEROs and ONEs with the most
significant bit to the left.
− Letters and digits in parentheses represent numbers in hexadecimal notation.
− All other numbers are in decimal form.
5.2 Names
The names of basic elements, e.g. specific fields, are written with a capital initial letter.
6 Acronyms
EOR End Of Record
lsb least significant bit
msb most significant bit
7 Algorithm Overview
User data that is to be compressed according to this International Standard consists of Records and File Marks. Records consist
of 8-bit data bytes, and may be of any non-zero length.
Data bytes may be encoded in either scheme 1 or scheme 2.
7.1 Scheme 1 Encoding
There may exist within Records repeating strings of two or more data bytes such that information about the length and position
of one string may be substituted in place of a subsequent copy or copies of that same string. This information is known as a
Copy Pointer. This International Standard allows Copy Pointer substitution when corresponding bytes of the two strings are
offset by 1 to 1 023 data bytes within user data. Where string matches occur, the number of
bits of encoded data can be less than the number of bits of user data, and data compression is possible. Any data bytes that are
part of a repeated string may be encoded as a Copy Pointer. Any data byte that is not encoded as a Copy Pointer is encoded as a
Literal 1, in which a leading bit set to ZERO is added to the data byte, thereby indicating that this is a Literal 1. Regions over
which Copy Pointers and literal values are encoded are defined as being encoded according to scheme 1. Scheme 1 encoding is
identical with that of ISO/IEC 15200, except for the addition of Control Symbols.
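The two ideas in this subclause can be sketched informally in Python. This is an illustration under stated assumptions, not the normative encoding. The names literal_1, longest_match and MAX_OFFSET are hypothetical. The sketch assumes the data bits of a Literal 1 follow the leading ZERO with the most significant bit first. It searches the user data directly rather than modelling the History Buffer. It does not let a match run into the bytes being encoded and applies no upper limit to the match length, since this excerpt states neither rule. The bit layout of a Copy Pointer is not given in this excerpt, so the search returns only an (offset, length) pair.

```python
from typing import Tuple

MAX_OFFSET = 1023  # 7.1: matching bytes are offset by 1 to 1 023 data bytes


def literal_1(byte: int) -> str:
    """Return a Literal 1 as a bit string: a leading ZERO, then the data byte.

    Assumption: the 8 data bits are written most significant bit first.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("a data byte must be in the range 0..255")
    return "0" + format(byte, "08b")


def longest_match(data: bytes, pos: int) -> Tuple[int, int]:
    """Find the longest earlier string that matches data[pos:].

    Returns (offset, length), or (0, 0) when no match of two or more
    bytes exists. Assumption: the earlier string must end before pos.
    """
    best_offset, best_len = 0, 0
    for offset in range(1, min(pos, MAX_OFFSET) + 1):
        length = 0
        while (length < offset
               and pos + length < len(data)
               and data[pos + length - offset] == data[pos + length]):
            length += 1
        if length > best_len:
            best_offset, best_len = offset, length
    if best_len < 2:  # 4.17: a Matching String has two or more bytes
        return (0, 0)
    return (best_offset, best_len)
```

With the input b"abcabcabc", the first three bytes have no earlier copy and would each be emitted as a 9-bit Literal 1, while the search at position 3 finds a three-byte match at an offset of 3, which a scheme 1 encoder could replace with a Copy Pointer.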
...