Information and documentation — Romanization of Chinese

ISO 7098:2015 explains the principles of the Romanization of Modern Chinese Putonghua (Mandarin Chinese), the official language of the People's Republic of China as defined in the Directives for the Promotion of Putonghua, promulgated on 1956-02-06 by the State Council of China. This International Standard can be applied in documentation of bibliographies, catalogues, indices, toponymic lists, etc.

Information et documentation — Romanisation du chinois

Third edition
Information and documentation —
Romanization of Chinese
Information et documentation — Romanisation du chinois
Reference number
ISO 7098:2015(E)
ISO 2015

The first edition of ISO 7098 was published in 1982 after ISO/TC 46 recognized the need for an
International Standard specifying the Chinese phonetic alphabet. The second edition was published in
This third edition is in response to new application needs, for instance to reflect current Chinese
romanization practice and new developments in China and the rest of the world.
Information and documentation — Romanization of Chinese
1 Scope
This International Standard explains the principles of the Romanization of Modern Chinese Putonghua
(Mandarin Chinese), the official language of the People’s Republic of China as defined in the Directives for
the Promotion of Putonghua, promulgated on 1956-02-06 by the State Council of China. This International
Standard can be applied in documentation of bibliographies, catalogues, indices, toponymic lists, etc.
2 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
element of a writing system, whether or not alphabetical, that represents a phoneme, a syllable, a word
or even prosodic characteristics of the language, by using graphical symbols (letters, diacritical marks,
syllabic signs, punctuation marks, prosodic accents, etc.) or a combination of these signs (a letter having
an accent or a diacritical mark)
EXAMPLE a, B, ω or Γ are, therefore, characters as well as basic letters.
ordered character set, the order of which has been agreed upon
alphabetical characters
character set that contains letters (2.8)
alphanumeric characters
character set that contains both letters (2.8) and digits
graphic character
character that has a visual representation and is normally produced by writing, printing or displaying
ideophonographical character
graphic character (2.6) that represents an object or a concept and is associated with a sound element in
a natural language
EXAMPLE Chinese hanzi 鹤(crane), Japanese kanji 戦(war) and Korean hanja 册(book) are
ideophonographical characters.
Chinese characters
ideophonographical character set for recording the Chinese language
Note 1 to entry: Chinese characters (hanzi) are also used in the writing systems of other languages.
graphic character (2.6) that, when appearing alone or combined with others, is primarily used to
represent a sound element of a spoken language
word segmentation
process of splitting text into a sequence of word segmentation units
[SOURCE: ISO 24614-1:2010, 2.25]
3 General principles of conversion of writing systems
3.1 The words in a language, which are written according to a given script (the converted system),
sometimes have to be rendered according to a different system (the conversion system), normally used
for a different language.
This operation is often performed for historical or geographical texts, cartographical documents and,
in particular, for bibliographical work in every case where it is necessary to write words supplied in
various alphabets in a manner that allows intercalation with other words in a single alphabet so as
to enable a uniform alphabetization to be made in bibliographies, catalogues, indices, toponymic lists,
etc. It is indispensable in that it permits the univocal transmission of a written message between two
countries using different writing systems or exchanging a message, the writing of which is different
from their own. It, thereby, permits transmission by manual as well as mechanical or electronic means.
The two basic methods of conversion of a system of writing are transliteration and transcription.
3.2 Transliteration is the operation which consists of representing the characters of an entirely
alphabetical character or alphanumeric character system of writing by the characters of the
conversion alphabet.
In principle, this conversion should be made character by character: each character of the converted
alphabet is rendered by one character, and one only of the conversion alphabet, to ensure the complete
and unambiguous reversibility of the conversion alphabet into the converted alphabet.
When the number of characters used in the conversion system is smaller than the number of characters
of the converted system, it is necessary to use digraphs or diacritical marks. In this case, one shall avoid
as far as possible arbitrary choices and the use of purely conventional marks and try to maintain a
certain phonetic logic in order to give the system a wide acceptance.
However, it shall be accepted that the graphism obtained may not always be correctly pronounced
according to the phonetic habits of the language (or of all the languages) which usually use(s) the
conversion alphabet. On the other hand, this graphism shall be such that the reader who knows the
converted language may mentally restore unequivocally the original graphism and, thus, pronounce
it correctly.
3.3 Retransliteration is the operation which consists of converting the characters of a conversion
alphabet to those of the converted alphabet.
This operation is the exact opposite of transliteration; it is carried out by applying the rules of a system
of transliteration in reverse order so as to reconstitute the transliterated word to its original form.
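The reversibility described in 3.2 and 3.3 can be illustrated with a minimal sketch. The four-letter table below is hypothetical and is not taken from any published transliteration standard; it only shows that a one-to-one table can be applied in reverse to restore the original text.

```python
# Toy one-to-one table for a few Greek letters; it is illustrative only
# and is NOT the table of any published transliteration standard.
TABLE = {"α": "a", "β": "b", "γ": "g", "δ": "d"}

# Retransliteration applies the same table in reverse, which is only
# possible when no two source characters share a target character.
REVERSE = {latin: greek for greek, latin in TABLE.items()}
assert len(REVERSE) == len(TABLE), "table is not one-to-one"


def transliterate(text: str) -> str:
    """Convert character by character; unknown characters pass through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


def retransliterate(text: str) -> str:
    """Restore the original characters by applying the table in reverse."""
    return "".join(REVERSE.get(ch, ch) for ch in text)
```

Under these assumptions, retransliterate(transliterate(text)) returns the original text for any input made up of the four letters in the table. A real system would also have to cover digraphs and diacritical marks, which this sketch leaves out.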
3.4 Transcription is the operation which consists of representing the characters of a language, whatever
the original system of writing, by the phonetic system of letters or signs of the conversion language.
A transcription system is of necessity based on the orthographical conventions of a conversion language
and its alphabet. The users of a transcription system shall, therefore, have a knowledge of the conversion
language to be able to pronounce the characters correctly. Transcription is not strictly reversible.
Transcription may be used for the conversion of all writing systems. It is the only method that can
be used for systems that are not entirely alphabetical and for all ideophonographic writing systems
(Chinese, Japanese, etc.).
3.5 Romanization is the conversion of non-Latin writing systems to the Latin alphabet by means of
transliteration or transcription.
To carry out Romanization, it is possible to use either transliteration or transcription or a combination
of these two methods, according to the nature of the converted system.
3.6 A conversion system proposed for international use may call for compromise and the sacrifice of
certain national customs.
It is, therefore, necessary for each national community of users to accept concessions, fully abstaining
in every case from imposing as a matter of course solutions that are actually justified only by national
practice (for example, regarding pronunciation, orthography, etc.). However, these concessions would
obviously not relate to the use that a country makes of its national writing system: when this national
system is not converted, the characters constituting it shall be accepted in the form in which they are
written in the national language.
When a country uses two systems univocally, converting one into the other to write its own language,
the system of transliteration thus implemented shall be taken a priori as a basis for the international
standardized system, as far as it is compatible with the other principles mentioned hereafter.
3.7 Where necessary, the conversion systems should specify an equivalent for each character, not only
the letters but also the punctuation marks, numbers, etc.
They should similarly take into account the arrangement of the sequence of characters that make up the
text, for example, the direction of the script, and specify the way of distinguishing words and of using
separation signs and capital letters, following as closely as possible the customs of the language(s)
which use the converted writing system.
4 Principles for converting ideophonographic characters
4.1 The structure of ideophonographic characters, where conveyance of meaning is of greater
importance than that of pronunciation, entails the existence of a large number of characters (more than
60 000 in the case of Chinese), thus, making sign by sign transliteration impossible and resulting in the
need to devise a system of transcription.
Each character shall, therefore, be transcribed by one or more Latin letters standing for the
pronunciation or pronunciations of the character in question. This means that the transcriber shall be
familiar with the reading or readings of the text to be transcribed.
4.2 In as much as the transcription of ideophonographic characters is merely a matter of phonetic
notation in Latin letters of characters of the languages which use them, identical characters will require
different transcriptions depending on whether they are found in Chinese, Japanese or Korean texts.
4.3 On the other hand, the same character within the same language shall always be transcribed in the
same way regardless of the type of graphic representation utilized (traditional form or simplified form of
a Chinese character), except where a single character has more than one pronunciation.
4.4 Reversibility of Romanization systems of ideophonographic characters is impossible due to the
following factors:
— the disparity in pronunciation of a given character in two different languages or within a single language;
— the high frequency of homophones within the same language (see Annex C);
— the possible coexistence of several writing systems within a given text.
4.5 In the case of those languages which use, even within the same text, more than one kind of script
(for example Kana and Chinese characters in Japanese, Hangul and Chinese characters in Korean), both
the transcription of the ideophonographic characters and the conversion of the other types of characters
(for example Kana/Hangul) should yield a consistent and homogeneous system of Romanization.
4.6 Although, as a rule, spacing between syllables of Chinese is regular, it is usual to transcribe the
different characters (or syllables) forming a single word by linking them together, in order to separate
the different words by the space.
The principles and rules for formation of words (orthography) shall be standardized to the language
4.7 Although there are no capital letters in ideophonographic characters, it is usual when romanizing
to capitalize some words, following the national uses.
5 Pinyin
The Scheme of the Chinese Phonetic Alphabet (Hanyu Pinyin Fang’an or Pinyin Fang’an), which was
officially adopted on 1958-02-11 by the National People’s Congress of the People’s Republic of China,
is used to transcribe Chinese. The transcriber writes down the pronunciation of Chinese characters
according to their readings in Standard Chinese (Putonghua).
6 Syllabic forms
6.1 Each Chinese character generally represents one syllable. One word may consist of one or more
6.2 A Chinese syllable can be divided into two parts: initial and final.
6.2.1 Initial
— Bilabial: b p m;
— Labio-dental: f;
— Dorso-prepalatal: d t n l;
— Dorso-velar: g k h;
— Apico-alveolar: z c s;
— Apico-postalveolar: zh ch sh r;
— Dorso-palatal: j q x;
— Zero initial: nothing before the far left of the final.
6.2.2 Final
— Articulation A: Articulation with a, o, e as medial or main vowel. For example, a, o, e, ei, ao, ou, an,
ang, en, eng, ong, er, and with i in zi, ci, si, zhi, chi, shi, ri as main vowel.
— Articulation B: Articulation with u as medial or main vowel. For example, u, ua, uo, uai, ui, uan,
uang, un, ueng.
— Articulation C: Articulation with i as medial or main vowel. For example, i, ia, ie, iao, iu, ian, iang,
in, ing, iong.
— Articulation D: Articulation with ü as medial or main vowel. For example, ü, üe, üan, ün. Hanyu
Pinyin simplifies the spellings of syllables with ü by using the u form instead in cases where no
ambiguity could result.
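The division of a syllable into initial and final (6.2) can be sketched as follows. This is a minimal illustration that assumes lower-case, toneless input; the function name split_syllable is hypothetical. Because y and w are not among the initials listed in 6.2.1, a syllable such as "yi" is treated here as having a zero initial.

```python
# Initials listed in 6.2.1; two-letter initials come first so that
# "zh", "ch" and "sh" are matched before "z", "c" and "s".
INITIALS = ("zh", "ch", "sh",
            "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "z", "c", "s", "r", "j", "q", "x")


def split_syllable(syllable: str) -> tuple[str, str]:
    """Return (initial, final); the initial is "" for a zero initial."""
    s = syllable.lower()
    for initial in INITIALS:
        if s.startswith(initial):
            return initial, s[len(initial):]
    return "", s
```

For example, split_syllable("zhang") separates the initial "zh" from the final "ang", while split_syllable("an") reports a zero initial. The sketch does not check that the remaining final is one of the finals allowed by 6.2.2 or Annex A.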
6.3 Table of syllabic forms
The table of Chinese syllabic forms is given in Annex A. This table covers all syllables of Chinese
Putonghua except the syllable ê and retroflex syllables.
6.4 Reference dictionaries
Among reference books of modern Chinese are the following dictionaries.
— 中国社会科学院语言研究所词典编辑室编. 《现代汉语词典》(第6版). 北京: 商务印书馆, 2012.
Dictionary Compilation Division, Institute of Linguistics, Chinese Academy of Social Sciences. The
Contemporary Chinese Dictionary (6th Edition). Beijing: The Commercial Press, 2012.
This dictionary gives the transcriptions in Pinyin of more than 69 000 words.
— 《现代汉语词典(汉英双语)》. 北京: 外语教学与研究出版社, 2002.
The Contemporary Chinese Dictionary (Chinese-English). Beijing: Foreign Language Teaching and
Research Press, 2002.
This dictionary includes equivalent English explanations for Chinese words.
— 德范克主编. 《ABC 汉英大词典》. 夏威夷: 夏威夷大学出版社, 2003.
John DeFrancis. ABC Chinese-English Comprehensive Dictionary. Hawai’i: University of Hawai’i Press, 2003.
This dictionary includes 71 344 words arranged in Pinyin alphabetical order, so entries are easy to look up by Pinyin.
— 《新华字典》(第11版). 北京: 商务印书馆,2011.
Xinhua Zidian (11th Edition). Beijing: The Commercial Press, 2011.
This dictionary includes the transcriptions in Pinyin of more than 10 000 characters.
These dictionaries can be complemented by the following list of Chinese characters.
— 中华人民共和国国务院. 《通用规范汉字表》. 北京: 语文出版社, 2013.
State Council of People’s Republic of China. List of Standard Chinese Characters for General Use.
Beijing: Language and Culture Press, 2013.
This list includes 8 105 commonly-used Chinese characters. In addition, it has a concordance table
of simplified characters and non-simplified characters.
7 Tones
7.1 Chinese is a tonal language.
This means that the tone affects meaning. The same sound pronounced in different tones can express
very different concepts.
Each syllable may have one of four tones or may be toneless. The four tones are marked by the following
diacritic signs (every diacritic sign has a special hexadecimal code):
— 1st tone (high and level tone) ˉ (hex: 0304);
— 2nd tone (rising tone) ˊ (hex: 0301);
— 3rd tone (falling-rising tone) ˇ (hex: 030C);
— 4th tone (falling tone) ˋ (hex: 0300).
Here is a graphical representation of the four tones.
Figure 1 — Graphical representation of the four tones of Putonghua (superposed)
Figure 2 — Graphical representation of the four tones of Putonghua (separate): a) 1st tone; b) 2nd tone; c) 3rd tone; d) 4th tone
7.2 In the table of Chinese syllabic forms (see Annex A), the syllables do not carry tone marks. But in
the text, it is usual to indicate the tone of a syllable by placing the diacritic sign on a vowel.
EXAMPLE ē, é, ě, è.
The diacritic sign for tone is placed on the main vowel in the final part of a syllable.
EXAMPLE /béi/, /què/.
In the final part /éi/ of syllable /béi/, /é/ is the main vowel; and in the final part /uè/ of syllable /què/,
/è/ is the main vowel. Therefore, the tone sign is placed on /é/ and /è/, accordingly.
In this case, Latin script letters have to be extended to indicate the Chinese vowels with different tones.
The hexadecimal codes of these extended Latin script letters are given in Annex B.
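As an illustration of how the extended letters relate to the diacritic codes in 7.1, the following minimal sketch combines a base vowel with a combining tone mark and lets Unicode normalization (form NFC) produce the single precomposed letter where one exists. The helper names toned_vowel and code_points are hypothetical, and the output is not claimed to reproduce Annex B.

```python
import unicodedata

# Combining diacritics for the four tones, using the codes listed in 7.1.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}


def toned_vowel(vowel: str, tone: int) -> str:
    """Return the vowel carrying the tone mark, precomposed where possible."""
    return unicodedata.normalize("NFC", vowel + TONE_MARKS[tone])


def code_points(text: str) -> list[str]:
    """List the hexadecimal code point of every character in the text."""
    return [f"{ord(ch):04X}" for ch in text]
```

With these helpers, code_points(toned_vowel("a", 1)) reports the single code 0101 for ā, and the same approach works for ü when it is supplied as the letter with code 00FC.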
7.3 Neutral tone (atony) is indicated by the lack of a diacritic sign.
7.4 Changes of tone induced by the tone of the next syllable in a word are not shown.
7.5 For practical or technical reasons, tones can also be expressed by numbers or letters.
For example, the Arabic numerals 1, 2, 3, 4 and 5 usually express the 1st tone, 2nd tone, 3rd tone,
4th tone and atony of Chinese, respectively.
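Converting the numeric notation of 7.5 into the diacritic notation of 7.2 can be sketched as follows. The text only states that the mark goes on the main vowel; the rule used here to find that vowel (a or e if present, the o of ou, otherwise the last vowel) is a commonly used heuristic and is an assumption of this sketch, as are the function names. Input is assumed to be a single lower-case syllable.

```python
import unicodedata

# Combining diacritics for tones 1 to 4 (codes from 7.1); 5 means atony.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030C", "4": "\u0300"}
VOWELS = "aeiou\u00fc"  # \u00fc is the letter ü


def main_vowel_index(syllable: str) -> int:
    """Locate the vowel that carries the tone mark (heuristic, see lead-in)."""
    for vowel in ("a", "e"):
        if vowel in syllable:
            return syllable.index(vowel)
    if "ou" in syllable:
        return syllable.index("o")
    # Otherwise the last vowel of the syllable carries the mark.
    positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
    if not positions:
        raise ValueError(f"no vowel in {syllable!r}")
    return positions[-1]


def number_to_mark(syllable: str) -> str:
    """Convert e.g. "bei2" to "béi"; tone 5 or no digit yields no mark."""
    body, tone = syllable, ""
    if syllable and syllable[-1] in "12345":
        body, tone = syllable[:-1], syllable[-1]
    if tone not in TONE_MARKS:
        return body
    i = main_vowel_index(body)
    marked = body[: i + 1] + TONE_MARKS[tone] + body[i + 1:]
    return unicodedata.normalize("NFC", marked)
```

Applied to the examples of 7.2, number_to_mark("bei2") and number_to_mark("que4") place the mark on the e, in agreement with /béi/ and /què/.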
7.6 Tone marks may be written as a learning tool; however, they can be omitted for convenience.
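Where tone marks are omitted, they can be removed mechanically. The following minimal sketch (the function name strip_tones is hypothetical) decomposes the text, drops only the four tone diacritics listed in 7.1 and recomposes the rest, so that the two dots of ü are preserved.

```python
import unicodedata

# Combining marks used for the four tones (codes from 7.1).
TONE_MARKS = {"\u0304", "\u0301", "\u030C", "\u0300"}


def strip_tones(text: str) -> str:
    """Remove tone diacritics but keep other marks such as the dots of ü."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)
```

The design choice of filtering a fixed set of marks, rather than every combining character, is what keeps nǚ from collapsing to nu.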
8 Punctuation
Punctuation marks similar to those existing in the sets of Latin characters are transcribed as their
Latin counterparts. Chinese specific punctuation marks are transcribed as follows.
Table 1 — Romanization of Chinese punctuation marks
Chinese mark (hex)       Latin mark (hex)     Note
。 (3002)                . (002E)             full stop
、 (3001)                , (002C)             special comma used to set off a short pause in a series
• (2022)                 space (0020)         disconnect mark
…… (2026 2026)           … (2026)             horizontal ellipsis
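The conversions of Table 1 can be sketched as a simple character mapping. This is a minimal illustration using only the code points given in the table; the function name romanize_punctuation is hypothetical, and marks that merely resemble Latin punctuation are not handled here.

```python
# Code points taken from Table 1; keys are Chinese marks, values Latin marks.
PUNCTUATION = {
    0x3002: ".",   # full stop
    0x3001: ",",   # special comma for short pauses in a series
    0x2022: " ",   # disconnect mark
}


def romanize_punctuation(text: str) -> str:
    """Replace the Chinese-specific marks of Table 1 by their Latin marks."""
    # The Chinese ellipsis is two U+2026 characters; collapse it to one.
    text = text.replace("\u2026\u2026", "\u2026")
    return text.translate(PUNCTUATION)
```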
9 Numerals
Numerals written in Chinese characters are transcribed in Pinyin. Numerals written in Arabic or
Roman characters are kept as such.
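The rule for numerals can be sketched as follows, assuming a small table of basic numeral characters with their usual Putonghua readings. The function name transcribe_numerals is hypothetical; the sketch returns one item per input character and does not decide how syllables are joined into words, which is a matter for the orthography rules of Clause 10.

```python
# Pinyin readings of the basic Chinese numeral characters.
NUMERALS = {
    "〇": "líng", "一": "yī", "二": "èr", "三": "sān", "四": "sì",
    "五": "wǔ", "六": "liù", "七": "qī", "八": "bā", "九": "jiǔ",
    "十": "shí", "百": "bǎi", "千": "qiān", "万": "wàn",
}


def transcribe_numerals(text: str) -> list[str]:
    """Transcribe Chinese numeral characters; keep everything else as is."""
    return [NUMERALS.get(ch, ch) for ch in text]
```

Arabic and Roman numerals are not in the table, so they pass through unchanged, as required.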
10 Chinese Pinyin Orthography
10.1 Most commonly used Chinese words are polysyllabic.
In international documentation and information, it is reasonable to link different Pinyin monosyllables
to form a polysyllabic Chinese word (see Annex C).
10.2 Before the Middle Ages, the Greeks and Romans always knew what a word was, and they were able
to identify words even though texts at that time were written without spaces between neighbouring words.
Afterwards, spaces between words were introduced in Europe. The use of spaces implies the concept
of the word. Inserting spaces between words has become the standard for all alphabetical writing,
and publishers and librarians around the world apply this common standard.
10.3 In Chinese Pinyin, it is also necessary to use spaces to separate words, not syllables.
Word segmentation is a long-established tradition of world civilization. In the Romanization of Chinese, it
is beneficial to respect this tradition.
10.4 In Chinese Pinyin, a monosyllable is ambiguous.
One syllable can represent several Chinese characters. Therefore, a Pinyin syllable is ambiguous in
its representation of Chinese characters. In Chinese Pinyin, the ambiguity index of monosyllables is high.
On average, one Chinese syllable has to represent more than 20 Chinese characters for general use.
However, if different Chinese monosyllables are linked to form a polysyllabic Chinese word, the
ambiguity index of the Pinyin syllables is reduced. In order to disambiguate Pinyin syllables, it is
necessary to link different monosyllables to form a polysyllabic Chinese word.
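The ambiguity of monosyllables can be made concrete by counting how many characters share one toneless syllable. The sketch below uses a tiny hand-picked sample, so its result says nothing about the real figure of more than 20 characters per syllable, and the measure computed here (average number of characters per syllable) is an assumption of this sketch rather than the definition used in Annex C.

```python
from collections import defaultdict

# Tiny hand-picked sample of characters with their toneless readings.
# It is illustrative only; real figures need a full character list.
READINGS = {
    "马": "ma", "妈": "ma", "麻": "ma", "骂": "ma",
    "是": "shi", "市": "shi", "事": "shi", "十": "shi", "时": "shi", "石": "shi",
    "国": "guo", "果": "guo",
}


def characters_per_syllable(readings: dict[str, str]) -> float:
    """Average number of characters represented by one syllable."""
    by_syllable = defaultdict(set)
    for character, syllable in readings.items():
        by_syllable[syllable].add(character)
    return len(readings) / len(by_syllable)
```

In this sample, twelve characters are spread over three syllables, so each syllable stands for four characters on average.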
10.5 A further description of the ambiguity index for Chinese syllables is given in Annex C.
10.6 Basic Rules for Chinese Pinyin Orthography (GB/T 16159-2012, Chinese Standard, 2012) contains
rules for separating or joining syllables to form a word: rules for spelling common words (nouns, verbs,
adjectives, pronouns, etc.), rules for spelling fused phrase expressions, rules for spelling personal names
and place names, rules for representing tones, rules for hyphenation at the end of line, etc.
10.7 At present, Chinese linguistics has no clear, commonly agreed definition of a Chinese word, so it
is sometimes difficult to decide the boundary (dividing line) of a common Chinese word and, consequently,
to link monosyllables to form a common polysyllabic Chinese word.
However, the boundary of a Chinese proper noun is relatively clear. It is not so difficult to link different
monosyllables to form a Chinese polysyllabic proper noun (a named entity such as a personal name,
geographic name, language name, ethnic name, tribal name or religion name), because the boundary
of a Chinese polysyllabic named entity is easy to decide according to the standards or regulations of
Chinese. In international documentation and information, it is necessary and possible to link different
Pinyin monosyllables to form a Chinese polysyllabic named entity in order to avoid ambiguity.
11 Transcription rules for named entities
11.1 Chinese personal names are to be written separately with the surname first, followed by the given
name written as one word, with the initial letters of both capitalized.
EXAMPLE 1 Li Hua (李华).
EXAMPLE 2 Wang Jianguo (王建国).
The traditional compound surnames are to be written together.
EXAMPLE 3 Zhuge Kongming (诸葛孔明).
The two-character or multi-character double surnames without traditional permanence are to be
written separately with the initial letters of both capitalized.
EXAMPLE 4 Zhang Wang Shufang (张王淑芳).
EXAMPLE 5 Xiang Situ Wenliang (项司徒文良).
EXAMPLE 6 Ouyang Meng Xiang (欧阳孟翔).
The pen names and other aliases are to be treated in the same manner.
EXAMPLE 7 Lu Xun (鲁迅).
EXAMPLE 8 Mao Dun (茅盾).
EXAMPLE 9 Zhang San (张三).
EXAMPLE 10 Wang Pangzi (王胖子).
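The rules of 11.1 can be sketched as a small formatting function. The function name format_personal_name and its input shape are hypothetical: the caller is assumed to know already which syllables belong to which surname, since that decision cannot be made from the Pinyin alone. Tone marks and the apostrophe used between certain syllables are not handled.

```python
def format_personal_name(surnames: list[list[str]], given: list[str]) -> str:
    """Join syllables into capitalized words: surname(s) first, then given name.

    Each inner list of `surnames` holds the syllables of one surname, so a
    traditional compound surname is one list and a double surname is two.
    """
    words = ["".join(syllables).capitalize() for syllables in surnames]
    words.append("".join(given).capitalize())
    return " ".join(words)
```

For instance, passing the compound surname as one list, format_personal_name([["zhu", "ge"]], ["kong", "ming"]), yields the form of EXAMPLE 3, while passing two separate surnames yields the form of EXAMPLE 4.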
11.2 A surname, given name or seniority order after the adjuncts “xiao”, “lao”, “da” and “a” is to be
written separately and with the initial letter of the last name capitalized.
Adjuncts such as “xiao”, “lao”, “da” and “a” should not be capitalized unless they appear at the beginning
of a sentence.
EXAMPLE 1 xiao Liu (小刘, younger Liu).
EXAMPLE 2 lao Qian (老钱, older Qian).
EXAMPLE 3 da Li (大李, older Li).
EXAMPLE 4 a Gui (阿贵, Mr. Gui).
If the character “xiao”, “lao”, “da” or “a” is part of a given name, the practice for given names applies.
EXAMPLE 5 Wang Xiaojuan (王小娟).
EXAMPLE 6 Zhao Laoshan (赵老山).
EXAMPLE 7 Li Daqin (李大勤).
EXAMPLE 8 Lou Ashu (娄阿鼠).
11.3 Certain Chinese personal names and titles have already fused traditionally and are written as one
word with the initial letter capitalized.
EXAMPLE 1 Kongzi (孔子, Master Confucius).
EXAMPLE 2 Baogong (包公, Duke Bao).
EXAMPLE 3 Xishi (西施, acme of beauty, 5th cent. B.C.).
11.4 In Chinese place names, a geographical proper name should be separated from the name of
jurisdiction or the geographical feature name.
The multi-character geographical proper name, the name of jurisdiction and the geographical feature name
should each be written as one word. The first letter of each element should be capitalized.
EXAMPLE 1 Beijing Shi (北京市, Beijing Municipality).
EXAMPLE 2 Hebei Sheng (河北省, Hebei Province).
EXAMPLE 3 Xikou Zhen (溪口镇, Xikou Town).
EXAMPLE 4 Shenzhen Tequ (深圳特区, Shenzhen Special E

SIST ISO 7098:2017

SIST ISO 7098:2017
Third edition
Information and documentation —
Romanization of Chinese
Information et documentation — Romanisation du chinois
Reference number
ISO 7098:2015(E)
ISO 2015

SIST ISO 7098:2017
ISO 7098:2015(E)

SIST ISO 7098:2017
ISO 7098:2015(E)

SIST ISO 7098:2017
ISO 7098:2015(E)

SIST ISO 7098:2017
ISO 7098:2015(E)

The first edition of ISO 7098 was published in 1982 after ISO/TC 46 recognized the need for an
International Standard specifying the Chinese phonetic alphabet. The second edition was published in
This third edition is in response to new application needs, for instance to reflect current Chinese
romanization practice and new developments in China and the rest of the world.
© ISO 2015 – All rights reserved v

SIST ISO 7098:2017

SIST ISO 7098:2017
Information and documentation — Romanization of Chinese
1 Scope
This International Standard explains the principles of the Romanization of Modern Chinese Putonghua
(Mandarin Chinese), the official language of the People’s Republic of China as defined in the Directives for
the Promotion of Putonghua, promulgated on 1956-02-06 by the State Council of China. This International
Standard can be applied in documentation of bibliographies, catalogues, indices, toponymic lists, etc.
2 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
element of a writing system, whether or not alphabetical, that represents a phoneme, a syllable, a word
or even prosodic characteristics of the language, by using graphical symbols (letters, diacritical marks,
syllabic signs, punctuation marks, prosodic accents, etc.) or a combination of these signs (a letter having
an accent or a diacritical mark)
EXAMPLE a, B, ω or Γ are, therefore, characters as well as basic letters.
ordered character set, the order of which has been agreed upon
alphabetical characters
character set that contains letters (2.8)
alphanumeric characters
character set that contains both letters (2.8) and digits
graphic character
character that has a visual representation and is normally produced by writing, printing or displaying
ideophonographical character
graphic character (2.6) that represents an object or a concept and is associated with a sound element in
a natural language
EXAMPLE Chinese hanzi 鹤(crane), Japanese kanji 戦(war) and Korean hanja 册(book) are
ideophonographical characters.
Chinese characters
ideophonographical character set for recording the Chinese language
Note 1 to entry: Chinese characters (hanzi) are also used in the writing systems of other languages.
graphic character (2.6) that, when appearing alone or combined with others, is primarily used to
represent a sound element of a spoken language
SIST ISO 7098:2017
ISO 7098:2015(E)

word segmentation
process of splitting text into a sequence of word segmentation unit
[SOURCE: ISO 24614-1:2010, 2.25]
3 General principles of conversion of writing systems
3.1 The words in a language, which are written according to a given script (the converted system),
sometimes have to be rendered according to a different system (the conversion system), normally used
for a different language.
This operation is often performed for historical or geographical texts, cartographical documents and,
in particular, for bibliographical work in every case where it is necessary to write words supplied in
various alphabets in a manner that allows intercalation with other words in a single alphabet so as
to enable a uniform alphabetization to be made in bibliographies, catalogues, indices, toponymic lists,
etc. It is indispensable in that it permits the univocal transmission of a written message between two
countries using different writing systems or exchanging a message, the writing of which is different
from their own. It, thereby, permits transmission by manual as well as mechanical or electronic means.
The two basic methods of conversion of a system of writing are transliteration and transcription.
3.2 Transliteration is the operation which consists of representing the characters of an entirely
alphabetical character or alphanumeric character system of writing by the characters of the
conversion alphabet.
In principle, this conversion should be made character by character: each character of the converted
alphabet is rendered by one character, and one only of the conversion alphabet, to ensure the complete
and unambiguous reversibility of the conversion alphabet into the converted alphabet.
When the number of characters used in the conversion system is smaller than the number of characters
of the converted system, it is necessary to use digraphs or diacritical marks. In this case, one shall avoid
as far as possible arbitrary choices and the use of purely conventional marks and try to maintain a
certain phonetic logic in order to give the system a wide acceptance.
However, it shall be accepted that the graphism obtained may not always be correctly pronounced
according to the phonetic habits of the language (or of all the languages) which usually use(s) the
conversion alphabet. On the other hand, this graphism shall be such that the reader who knows the
converted language may mentally restore unequivocally the original graphism and, thus, pronounce
it correctly.
3.3 Retransliteration is the operation which consists of converting the characters of a conversion
alphabet to those of the converted alphabet.
This operation is the exact opposite of transliteration; it is carried out by applying the rules of a system
of transliteration in reverse order so as to reconstitute the transliterated word to its original form.
3.4 Transcription is the operation which consists of representing the characters of a language, whatever
the original system of writing, by the phonetic system of letters or signs of the conversion language.
A transcription system is of necessity based on the orthographical conventions of a conversion language
and its alphabet. The users of a transcription system shall, therefore, have a knowledge of the conversion
language to be able to pronounce the characters correctly. Transcription is not strictly reversible.
Transcription may be used for the conversion of all writing systems. It is the only method that can
be used for systems that are not entirely alphabetical and for all ideophonographic writing systems
(Chinese, Japanese, etc.).
SIST ISO 7098:2017
ISO 7098:2015(E)

3.5 Romanization is the conversion of non-Latin writing systems to the Latin alphabet by means of
transliteration or transcription.
To carry out Romanization, it is possible to use either transliteration or transcription or a combination
of these two methods, according to the nature of the converted system.
3.6 A conversion system proposed for international use may call for compromise and the sacrifice of
certain national customs.
It is, therefore, necessary for each national community of users to accept concessions, fully abstaining
in every case from imposing as a matter of course solutions that are actually justified only by national
practice (for example, regarding pronunciation, orthography, etc.). However, these concessions would
obviously not relate to the use that a country makes of its national writing system: when this national
system is not converted, the characters constituting it shall be accepted in the form in which they are
written in the national language.
When a country uses two systems univocally, converting one into the other to write its own language,
the system of transliteration thus implemented shall be taken a priori as a basis for the international
standardized system, as far as it is compatible with the other principles mentioned hereafter.
3.7 Where necessary, the conversion systems should specify an equivalent for each character, not only
the letters but also the punctuation marks, numbers, etc.
They should similarly take into account the arrangement of the sequence of characters that make up the
text, for example, the direction of the script, and specify the way of distinguishing words and of using
separation signs and capital letters, following as closely as possible the customs of the language(s)
which use the converted writing system.
4 Principles for converting ideophonographic characters
4.1 The structure of ideophonographic characters, where conveyance of meaning is of greater
importance than that of pronunciation, entails the existence of a large number of characters (more than
60 000 in the case of Chinese), thus, making sign by sign transliteration impossible and resulting in the
need to devise a system of transcription.
Each character shall, therefore, be transcribed by one or more Latin letters standing for the
pronunciation or pronunciations of the character in question. This means that the transcriber shall be
familiar with the reading or readings of the text to be transcribed.
4.2 In as much as the transcription of ideophonographic characters is merely a matter of phonetic
notation in Latin letters of characters of the languages which use them, identical characters will require
different transcriptions depending on whether they are found in Chinese, Japanese or Korean texts.
4.3 On the other hand, the same character within the same language shall always be transcribed in the
same way regardless of the type of graphic representation utilized (traditional form or simplified form of
a Chinese character), except where a single character has more than one pronunciation.
4.4 Reversibility of Romanization systems of ideophonographic characters is impossible due to the
following factors:
— the disparity in pronunciation of a given character in two different languages or within a single
language;
— the high frequency of homophones within the same language (see Annex C);
— the possible coexistence of several writing systems within a given text.
4.5 In the case of those languages which use, even within the same text, more than one kind of script
(for example Kana and Chinese characters in Japanese, Hangul and Chinese characters in Korean), both
the transcription of the ideophonographic characters and the conversion of the other types of characters
(for example Kana/Hangul) should yield a consistent and homogeneous system of Romanization.
4.6 Although, as a rule, spacing between syllables of Chinese is regular, it is usual to transcribe the
different characters (or syllables) forming a single word by linking them together, so that the different
words are separated by spaces.
The principles and rules for formation of words (orthography) shall be standardized to the language
4.7 Although there are no capital letters in ideophonographic characters, it is usual when romanizing
to capitalize some words, following national usage.
5 Pinyin
The Scheme of the Chinese Phonetic Alphabet (Hanyu Pinyin Fang’an or Pinyin Fang’an), which was
officially adopted on 1958-02-11 by the National People’s Congress of the People’s Republic of China,
is used to transcribe Chinese. The transcriber writes down the pronunciation of Chinese characters
according to their readings in Standard Chinese (Putonghua).
6 Syllabic forms
6.1 Each Chinese character generally represents one syllable. One word may consist of one or more
syllables.
6.2 A Chinese syllable can be divided into two parts: initial and final.
6.2.1 Initial
— Bilabial: b p m;
— Labio-dental: f;
— Dorso-prepalatal: d t n l;
— Dorso-velar: g k h;
— Apico-alveolar: z c s;
— Apico-postalveolar: zh ch sh r;
— Dorso-palatal: j q x;
— Zero initial: nothing precedes the final.
6.2.2 Final
— Articulation A: Articulation with a, o, e as medial or main vowel. For example, a, o, e, ei, ao, ou, an,
ang, en, eng, ong, er, and with i in zi, ci, si, zhi, chi, shi, ri as main vowel.
— Articulation B: Articulation with u as medial or main vowel. For example, u, ua, uo, uai, ui, uan,
uang, un, ueng.
— Articulation C: Articulation with i as medial or main vowel. For example, i, ia, ie, iao, iu, ian, iang,
in, ing, iong.
— Articulation D: Articulation with ü as medial or main vowel. For example, ü, üe, üan, ün. Hanyu
Pinyin simplifies the spellings of syllables with ü by using the u form instead in cases where no
ambiguity could result.
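The division of a syllable into initial and final described in 6.2 can be sketched in code. The following Python fragment is a minimal illustration, not part of this International Standard: the function name split_syllable is hypothetical, the input is assumed to be a toneless, lower-case Pinyin syllable, and syllables spelled with y or w are simply treated as having a zero initial without undoing the spelling convention.

```python
from typing import Tuple

# Initials listed in 6.2.1; two-letter initials come first so that
# "zh" is not mistaken for "z" followed by "h".
INITIALS = ("zh", "ch", "sh",
            "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "z", "c", "s", "r", "j", "q", "x")


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Split a toneless, lower-case Pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero initial: the whole syllable is the final


print(split_syllable("zhuang"))  # ('zh', 'uang')
print(split_syllable("bei"))     # ('b', 'ei')
print(split_syllable("an"))      # ('', 'an')
```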
6.3 Table of syllabic forms
The table of Chinese syllabic forms is given in Annex A. This table covers all syllables of Chinese
Putonghua except the syllable ê and retroflex syllables.
6.4 Reference dictionaries
Among reference books of modern Chinese are the following dictionaries.
— 中国社会科学院语言研究所词典编辑室编. 《现代汉语词典》(第6版). 北京: 商务印书馆, 2012.
Dictionary Compilation Division, Institute of Linguistics, Chinese Academy of Social Sciences, The
Contemporary Chinese Dictionary (6th Edition). Beijing: The Commercial Press, 2012.
This dictionary gives the transcriptions in Pinyin of more than 69 000 words.
— 《现代汉语词典(汉英双语)》. 北京: 外语教学与研究出版社, 2002.
The Contemporary Chinese Dictionary (Chinese-English). Beijing: Foreign Language Teaching and
Research Press, 2002.
This dictionary includes equivalent English explanations for Chinese words.
— 德范克主编. 《ABC 汉英大词典》. 夏威夷: 夏威夷大学出版社, 2003.
John DeFrancis. ABC Chinese-English Comprehensive Dictionary. Hawai’i: University of Hawai’i Press,
2003.
This dictionary includes 71 344 words arranged in Pinyin alphabetical order, which makes lookup by
Pinyin easy.
— 《新华字典》(第11版). 北京: 商务印书馆, 2011.
Xinhua Zidian (11th Edition). Beijing: The Commercial Press, 2011.
This dictionary includes the transcriptions in Pinyin of more than 10 000 characters.
These dictionaries can be complemented by the following list of Chinese characters.
— 中华人民共和国国务院. 《通用规范汉字表》. 北京: 语文出版社, 2013.
State Council of the People’s Republic of China. List of Standard Chinese Characters for General Use.
Beijing: Language and Culture Press, 2013.
This list includes 8 105 commonly-used Chinese characters. In addition, it has a concordance table
of simplified characters and non-simplified characters.
7 Tones
7.1 Chinese is a tonal language.
This means that the tone affects meaning. The same sound pronounced in different tones can mean
very different concepts.
Each syllable may have one of four tones or may be toneless. The four tones are marked by the following
diacritic signs (every diacritic sign has a special hexadecimal code):
— 1st tone (high and level tone) ˉ (hex: 0304);
— 2nd tone (rising tone) ˊ (hex: 0301);
— 3rd tone (falling-rising tone) ˇ (hex: 030C);
— 4th tone (falling tone) ˋ (hex: 0300).
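The hexadecimal codes listed above are those of combining diacritical marks. As an illustration only, the following Python sketch attaches such a mark to a vowel and composes the pair into a single character with the standard unicodedata module. The names TONE_MARKS and add_tone are hypothetical and not defined by this International Standard.

```python
import unicodedata

# Combining diacritics for the four tones, using the codes listed above.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}


def add_tone(vowel: str, tone: int) -> str:
    """Attach a tone mark to a vowel and compose the result (NFC)."""
    return unicodedata.normalize("NFC", vowel + TONE_MARKS[tone])


print([add_tone("e", t) for t in (1, 2, 3, 4)])  # ['ē', 'é', 'ě', 'è']
```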
Here is a graphical representation of the four tones.
Figure 1 — Graphical representation of the four tones of Putonghua (superposed)
Figure 2 — Graphical representation of the four tones of Putonghua (separate): a) 1st tone, b) 2nd tone, c) 3rd tone, d) 4th tone
7.2 In the table of Chinese syllabic forms (see Annex A), the syllables do not carry tone marks. But in
the text, it is usual to indicate the tone of a syllable by placing the diacritic sign on a vowel.
EXAMPLE ē, é, ě, è.
The diacritic sign for tone is placed on the main vowel in the final part of a syllable.
EXAMPLE /béi/, /què/.
In the final part /éi/ of syllable /béi/, /é/ is the main vowel; and in the final part /uè/ of syllable /què/,
/è/ is the main vowel. Therefore, the tone sign is placed on /é/ and /è/, accordingly.
In this case, Latin script letters have to be extended to indicate the Chinese vowels with different tones.
The hexadecimal codes of these extended Latin script letters are given in Annex B.
7.3 Neutral tone (atony) is indicated by the lack of a diacritic sign.
7.4 Changes of tone induced by the tone of the next syllable in a word are not shown.
7.5 For practical or technical reasons, tones can also be expressed by numbers or letters.
For example, Arabic numbers 1, 2, 3, 4 and 5 usually express respectively 1st tone, 2nd tone, 3rd tone,
4th tone and atony of Chinese.
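The correspondence between tone numbers and diacritic signs can be illustrated as follows. This Python sketch is an assumption-laden example rather than a normative procedure: the function names are hypothetical, the input is assumed to be a single lower-case syllable followed by an optional digit, and the main vowel is located with a widely used heuristic (a or e if present, otherwise the o of "ou", otherwise the last vowel) that is not quoted from this International Standard.

```python
import unicodedata

TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030C", "4": "\u0300"}
VOWELS = "aeiouü"


def main_vowel_index(syllable: str) -> int:
    """Heuristic (an assumption): a or e if present, else the o of 'ou',
    else the last vowel of the syllable."""
    for vowel in "ae":
        if vowel in syllable:
            return syllable.index(vowel)
    if "ou" in syllable:
        return syllable.index("o")
    positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
    if not positions:
        raise ValueError(f"no vowel in {syllable!r}")
    return positions[-1]


def numbered_to_marked(syllable: str) -> str:
    """Convert e.g. 'bei2' to 'béi'; tone 5 or no digit means neutral tone."""
    base, tone = syllable, "5"
    if syllable and syllable[-1] in "12345":
        base, tone = syllable[:-1], syllable[-1]
    if tone == "5":
        return base  # neutral tone carries no diacritic sign
    i = main_vowel_index(base)
    marked = base[:i + 1] + TONE_MARKS[tone] + base[i + 1:]
    return unicodedata.normalize("NFC", marked)


print(numbered_to_marked("bei2"))  # béi
print(numbered_to_marked("que4"))  # què
print(numbered_to_marked("ma5"))   # ma
```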
7.6 Tone marks may be written as a learning tool; however, they can be omitted for convenience.
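Where tone marks are omitted, they can be removed mechanically. The Python sketch below is one possible approach, given as an illustration only; the name strip_tones is hypothetical. It removes just the four tone diacritics rather than every combining mark, so that the diaeresis of ü is preserved.

```python
import unicodedata

# The four tone diacritics; the diaeresis (U+0308) is deliberately not listed.
TONE_DIACRITICS = {"\u0304", "\u0301", "\u030C", "\u0300"}


def strip_tones(text: str) -> str:
    """Remove Pinyin tone marks while keeping the diaeresis of ü."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_DIACRITICS)
    return unicodedata.normalize("NFC", kept)


print(strip_tones("Pǔtōnghuà"))  # Putonghua
print(strip_tones("lǜ"))         # lü
```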
8 Punctuation
Punctuation marks similar to those existing in the sets of Latin characters are transcribed as their
Latin counterparts. Chinese specific punctuation marks are transcribed as follows.
Table 1 — Romanization of Chinese punctuation marks
Chinese mark           Latin mark          Note
。 hex: 3002           . hex: 002E         full stop
、 hex: 3001           , hex: 002C         special comma used to set off a short pause in the series
• hex: 2022            Space hex: 0020     disconnect mark
…… hex: 2026 2026      … hex: 2026         horizontal ellipsis
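The mapping in Table 1 can be applied programmatically. The following Python sketch is illustrative only: the names PUNCTUATION and romanize_punctuation are hypothetical, and the code points are taken from Table 1 as given.

```python
# Single-character replacements from Table 1.
PUNCTUATION = str.maketrans({
    "\u3002": ".",  # Chinese full stop
    "\u3001": ",",  # special comma for a short pause in a series
    "\u2022": " ",  # disconnect mark
})


def romanize_punctuation(text: str) -> str:
    """Replace Chinese-specific punctuation marks by their Latin counterparts."""
    text = text.replace("\u2026\u2026", "\u2026")  # double ellipsis -> single
    return text.translate(PUNCTUATION)


print(romanize_punctuation("a\u3001b\u3002"))  # a,b.
```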
9 Numerals
Numerals written in Chinese characters are transcribed in Pinyin. Numerals written in Arabic or
Roman characters are kept as such.
10 Chinese Pinyin Orthography
10.1 Most commonly used Chinese words are polysyllabic.
In international documentation and information, it is reasonable to link different Pinyin monosyllables
to form a polysyllabic Chinese word (see Annex C).
10.2 Before the Middle Ages, the Greeks and Romans always knew what a word was, and they were able
to identify words even though texts at that time were written without spaces between neighbouring words.
Spaces between words were introduced in Europe later. The use of spaces implies the concept
of the word. Inserting spaces between words has become the standard for all modes of writing
alphabetical languages, and publishers and librarians throughout the world apply this common standard.
10.3 In Chinese Pinyin, it is also necessary to use spaces to separate words, not syllables.
Word segmentation is a well-established tradition of world civilization, and it is beneficial to respect
this tradition in the Romanization of Chinese.
10.4 In Chinese Pinyin, a monosyllable is ambiguous.
One syllable can represent several Chinese characters, so a Pinyin syllable is ambiguous in its
representation of Chinese characters. The ambiguity index of monosyllables in Chinese Pinyin is high:
on average, one Chinese syllable has to represent more than 20 Chinese characters for general use.
However, if different Chinese monosyllables are linked to form a polysyllabic Chinese word, the
ambiguity index of the Pinyin syllables is reduced. In order to disambiguate Pinyin syllables, it is
necessary to link different monosyllables to form a polysyllabic Chinese word.
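The effect of linking syllables can be illustrated with a toy example. The Python sketch below groups a small, hand-picked sample of characters and words by their toneless spelling. The sample and all names in it are assumptions made for illustration; they are not the data or the definition of the ambiguity index used in Annex C, and the counts say nothing about real frequencies.

```python
from collections import defaultdict

# Hand-picked sample readings (toneless); NOT the data behind Annex C.
CHARACTER_READINGS = {
    "是": "shi", "事": "shi", "十": "shi", "时": "shi", "市": "shi", "史": "shi",
    "李": "li", "里": "li", "力": "li", "立": "li", "历": "li", "理": "li",
}
WORD_READINGS = {"历史": "lishi", "理事": "lishi"}


def group_by_spelling(readings):
    """Map each spelling to the list of items that share it."""
    groups = defaultdict(list)
    for item, spelling in readings.items():
        groups[spelling].append(item)
    return dict(groups)


chars = group_by_spelling(CHARACTER_READINGS)
words = group_by_spelling(WORD_READINGS)
print(len(chars["shi"]), len(chars["li"]))  # 6 6
print(len(words["lishi"]))                  # 2
```

In this sample, each monosyllable stands for six characters, whereas the linked form "lishi" stands for only two words.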
10.5 A further description of the ambiguity index for Chinese syllables is given in Annex C.
10.6 Basic Rules for Chinese Pinyin Orthography (GB/T 16159-2012, Chinese Standard, 2012) contains
rules for separating or joining syllables to form a word: rules for spelling common words (nouns, verbs,
adjectives, pronouns, etc.), rules for spelling fused phrase expressions, rules for spelling personal names
and place names, rules for representing tones, rules for hyphenation at the end of line, etc.
10.7 At present, Chinese linguistics has no clear, commonly accepted definition of a Chinese word, so it
is sometimes difficult to decide the boundary (dividing line) of a common Chinese word and, consequently,
to link monosyllables to form a common polysyllabic Chinese word.
However, the boundary of a Chinese proper noun is relatively clear. It is not so difficult to link different
monosyllables to form a Chinese polysyllabic proper noun (a named entity such as a personal name,
geographic name, language name, ethnic name, tribal name, religion name, etc.), because the boundary
of a Chinese polysyllabic named entity is easy to decide according to the standards or regulations of
Chinese. In international documentation and information, it is necessary and possible to link different
Pinyin monosyllables to form a Chinese polysyllabic named entity in order to avoid ambiguity.
11 Transcription rules for named entities
11.1 Chinese personal names are to be written separately with the surname first, followed by the given
name written as one word, with the initial letters of both capitalized.
EXAMPLE 1 Li Hua (李华).
EXAMPLE 2 Wang Jianguo (王建国).
The traditional compound surnames are to be written together.
EXAMPLE 3 Zhuge Kongming (诸葛孔明).
The two-character or multi-character double surnames without traditional permanence are to be
written separately with the initial letters of both capitalized.
EXAMPLE 4 Zhang Wang Shufang (张王淑芳).
EXAMPLE 5 Xiang Situ Wenliang (项司徒文良).
EXAMPLE 6 Ouyang Meng Xiang (欧阳孟翔).
Pen names and other aliases are to be treated in the same manner.
EXAMPLE 7 Lu Xun (鲁迅).
EXAMPLE 8 Mao Dun (茅盾).
EXAMPLE 9 Zhang San (张三).
EXAMPLE 10 Wang Pangzi (王胖子).
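The rules in 11.1 can be expressed as a short routine. The following Python sketch is an illustration under stated assumptions: the function name format_personal_name is hypothetical, the caller is assumed to know already which syllables belong to each surname and to the given name, and the sketch does not insert the apostrophe used to divide syllables in spellings such as Fang’an.

```python
from typing import List


def format_personal_name(surnames: List[List[str]], given: List[str]) -> str:
    """Each surname (a list of syllables) becomes one capitalized word;
    the given-name syllables are joined into a single capitalized word."""
    parts = ["".join(syllables).capitalize() for syllables in surnames]
    parts.append("".join(given).capitalize())
    return " ".join(parts)


print(format_personal_name([["li"]], ["hua"]))                       # Li Hua
print(format_personal_name([["zhu", "ge"]], ["kong", "ming"]))       # Zhuge Kongming
print(format_personal_name([["zhang"], ["wang"]], ["shu", "fang"]))  # Zhang Wang Shufang
```

A traditional compound surname is passed as one list of syllables, while a double surname without traditional permanence is passed as two separate lists.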
11.2 A surname, given name or seniority order after the adjuncts “xiao”, “lao”, “da” and “a” is to be
written separately and with the initial letter of the last name capitalized.
Adjuncts such as “xiao”, “lao”, “da” and “a” should not be capitalized unless they appear at the beginning
of a sentence.
EXAMPLE 1 xiao Liu (小刘, younger Liu).
EXAMPLE 2 lao Qian (老钱, older Qian).
EXAMPLE 3 da Li (大李, older Li).
EXAMPLE 4 a Gui (阿贵, Mr. Gui).
If the character “xiao”, “lao”, “da” and “a” is part

