Translation-oriented writing — Text production and text evaluation

This document provides recommendations for the production and evaluation of technical texts intended for translation into other languages by means of human and/or machine translation. In addition, this document also addresses text production and translation interface requirements in connection with technical texts. This document is intended for authors and editors of technical texts intended for translation. This document also enables translators and translation service providers to assess the suitability of technical texts for translation. This document can also be used by tool manufacturers to develop automatic language testing and verification procedures. This document does not provide recommendations for the specific requirements of producing fiction, journalistic, advertising and other non-technical texts.

Rédaction adaptée à la traduction — Production et évaluation de textes

General Information

Status: Not Published

ICS: 01.020 - Terminology (principles and coordination); 01.140.10 - Writing and transliteration; 35.240.30 - IT applications in information, documentation and publishing

Technical Committee: ISO/TC 37 - Language and terminology
Drafting Committee: ISO/TC 37 - Language and terminology

Current Stage: 5020 - FDIS ballot initiated: 2 months. Proof sent to secretariat
Start Date: 14-Apr-2026
Completion Date: 14-Apr-2026

Overview

ISO/FDIS 18968:2026, "Translation-oriented writing - Text production and text evaluation," is a final draft international standard prepared by ISO/TC 37, Language and terminology. It provides recommendations for the production, formatting, and evaluation of technical texts intended for translation. It addresses best practices for authors and editors creating technical documentation that will be translated by human translators, machine translation, or both. ISO/FDIS 18968 also covers the handover between text production and translation, with the aim of improving translation quality, reducing errors, and making workflows more efficient. The document is particularly relevant for technical writers, translation service providers, language technology tool developers, and quality assurance professionals engaged in multilingual communication.

Key Topics

ISO/FDIS 18968 centers on practical guidelines for enhancing translatability and supporting seamless integration with translation technologies. Key areas include:

Text Production: Guidance for creating technical texts that are clear, consistent, and free of ambiguities. Emphasis is placed on source language quality, domain-appropriate terminology, and suitable document structures.
Formatting for Translation: Recommendations on the correct use of formatting features (such as hard breaks, soft breaks, manual hyphenation, and lists) to ensure compatibility with Computer-Assisted Translation (CAT) tools and Translation Memory (TM) systems.
Handover Procedures: Best practices for preparing and transferring source texts to translators, including considerations for file formats, context provision (reference materials, terminology aids), and document templates.
Text Evaluation: Criteria and methods for evaluating the suitability of technical texts for translation, helping translators and language service providers identify potential issues before translation begins.

Applications

The practical value of ISO/FDIS 18968 spans the entire translation workflow for technical content in globalized industries:

Technical Authors and Editors: Create source documents that are easily understood and efficiently processed by human translators and machine translation systems. The standard helps reduce the need for rework and translator queries, streamlining the translation process.
Translators and Language Service Providers: Assess incoming technical documents for translatability, identify potential issues, and improve translation quality by referencing standardized criteria.
Tool Manufacturers: Develop or enhance CAT tools, language checking software, and verification procedures based on standard-compliant formatting and language checks. This optimizes integration of automatic language testing and verification into translation workflows.
Quality Assurance Teams: Use the standard’s evaluation criteria and checklists to assess whether technical content meets established guidelines for translation readiness, reducing localization errors and ensuring consistency across languages.

Related Standards

ISO/FDIS 18968 references and complements several important international standards in the fields of technical communication, language technology, and translation services:

ISO 20539: Vocabulary for translation, interpreting, and related technology.
ISO 17100: Requirements for translation services.
IEC/IEEE 82079-1: Preparation of instructions for use of products.
ISO 639: Codes for the representation of language names and identifiers.
ISO 11669: Translation projects - Guidelines for managing translation processes.
ISO 21720: XML Localization Interchange File Format (XLIFF), supporting interoperability in translation workflows.

Conclusion

ISO/FDIS 18968 delivers a framework for producing and evaluating technical texts explicitly designed for translation. By following its recommendations, organizations can make their documentation more internationally adaptable, reduce translation costs, and minimize post-editing and error correction. Implementing this standard supports best practices in multilingual content management, raises the quality of translations, and improves efficiency in technical communication processes.

For further details, consult the ISO website or your national standardization body.

Draft

ISO/FDIS 18968 - Translation-oriented writing — Text production and text evaluation

Release Date: 31-Mar-2026

English language (49 pages)



Standards Content (Sample)

FINAL DRAFT International Standard
ISO/TC 37
Secretariat: SAC
Translation-oriented writing — Text production and text evaluation
Rédaction adaptée à la traduction — Production et évaluation de textes
Voting begins on: 2026-04-14
Voting terminates on: 2026-06-09
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
© ISO 2026
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Recommendations for specialized content intended for translation
4.1 General
4.2 Formatting
4.2.1 General
4.2.2 Formatting in terms of text flow and segmentation
4.2.3 Formatting graphic layout of texts
4.2.4 Formatting text content and references
4.3 Terminology
4.3.1 General
4.3.2 Synonymy
4.3.3 Uncommon or unknown abbreviations
4.3.4 Orthographic variants
4.3.5 Ambiguities
4.3.6 Compound terms and constructions
4.3.7 Assuring correct use of terminology
4.4 Grammar, syntax and style
4.4.1 General
4.4.2 Sentence structure
4.4.3 Word choice and word formation
4.4.4 Unambiguous references
4.4.5 Style and reader engagement
4.4.6 Gender-sensitive language
4.5 Presentation of content
4.5.1 Culture
4.5.2 Logic
5 Recommendations for the handover from text production to translation
5.1 General
5.2 File formats
5.3 Layout
5.4 Contextual information, reference material and terminology aids
5.5 Locales, document templates and styles
5.6 Lists, indices and glossaries
5.7 Reviewing source language content prior to providing it to the translation service provider
6 Translation-oriented texts: Evaluation
6.1 General
6.2 Criteria
6.3 Evaluation methods
Annex A (informative) List of evaluation criteria
Annex B (informative) Checklist of recommendations
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
0.1 Overview
As goods and services are increasingly distributed across international markets, specialized content must
be made available in the languages required for each target market. Such content may be translated by
human translators, with or without computer-assisted translation (CAT) tools, or by machine translation
(MT) systems, with or without subsequent post-editing by a human translator. In this context,
translation-oriented writing can reduce the number of translator queries, lower costs, minimize translation errors, and
shorten delivery times, particularly when translation technologies are used.
Technical documentation in particular (e.g. product manuals, online help, safety data sheets or assembly
instructions) as a form of specialized content is intended to ensure that products are used safely, efficiently
and effectively. Standards such as IEC/IEEE 82079-1 define the requirements for technical documentation.
In-house style guides supplement these requirements by codifying enterprise-specific writing conventions.
This document provides recommendations for authoring or editing specialized content in the source
language intended for translation into one or more target languages and for the evaluation of
translation-oriented texts. Some clauses can, however, also be useful to non-specialized texts intended for translation.
0.2 Notations
This document uses structured examples to illustrate textual features to avoid because they can cause
problems in translation. The examples in this document have been chosen for illustrative purposes and are
not necessarily equally applicable to every language pair (e.g. the recommendation in 4.2.2.4 is not applicable to
Japanese).
The examples are each laid out and formatted to support the recommendations in question. Due to the
variety of issues covered, the layout varies throughout this document. However, an attempt has been made
to use as consistent a layout as possible.
— Every example contains at least source language content and an explanation. The explanation points out
any issues with the source language content and their possible effect on translation. Target language
examples are provided only when they provide added value for the explanation and understanding of
the issue at hand. The explanation gives information on errors in the target language content that would
otherwise not be understood due to readers’ lack of proficiency in the target language.
— The symbol (+) is used for positive examples and the symbol (–) is used for negative examples.
— Numbers in parentheses such as (1), (2), (3) indicate possible variants of target language content.
— Characters in parentheses are only used to specify the example type (positive, negative, variant). They
do not represent a part of the source or target language content displayed.
— No specific symbol is used for neutral examples.
— The source language of examples is always English; therefore, the source language is not indicated.
— The target languages of examples vary, but are mainly German, French and Italian. The target language
is indicated in square brackets in the column or line marking the target language content. Italian target
language content, for example, is marked with [it] in accordance with ISO 639.

EXAMPLE 1
Source language content:
(–) A negative example in English to highlight source language content issues.
(+) A positive example in English of translation-oriented writing avoiding the issues of the negative example.
Target language content [it]:
(–) A negative example (in Italian) to highlight possible translation errors resulting from the issues in the source language content.
(+) A positive example (in Italian) containing no translation error.
Explanation: An explanation pointing out the issues with the source language content and their possible effect on translation.
EXAMPLE 2
Source language content:
(–) A negative example in English to highlight source language content issues.
(+) A positive example in English of translation-oriented writing avoiding the issues of the negative example.
Target language content [fr]:
(1) One possible translation (in French) to highlight possible translation variants.
(2) Another possible translation (in French) to highlight possible translation variants.
(+) A positive example (in French), containing unambiguous target language content.
Explanation: An explanation pointing out the issues with the source language content and their possible effect on translation.
Examples in 4.2 are more complex than other examples, since they display the effects of formatting in the
text editor on translation with CAT tools. In addition, segments are displayed for the target language content
if segmentation is relevant.
EXAMPLE 3
Source language content in text editor:
(–) A negative example in English to highlight source language content formatting errors as they would appear in a text editor.
(+) A positive example in English, containing no formatting errors.
Source language content in TM system:
Source language content examples as they would appear in a TM system during translation.
(–) A negative example to highlight the issues CAT tools have with incorrect formatting.
(+) A positive example to show how correct formatting avoids negative effects on translation using CAT tools.
Target language content in TM system [de]:
(–) A negative example (in German) to highlight possible translation errors in the target language content in the TM system resulting from the formatting errors in the source language content.
(+) A positive example (in German) of correct target language content in the TM system resulting from correctly formatted source language content.
Target language content in text editor [de]:
(–) A negative example (in German) to highlight possible translation errors in the target language content in the text editor resulting from the formatting errors in the source language content.
(+) A positive example (in German) of correct target language content in the text editor resulting from correctly formatted source language content.
Explanation: An explanation pointing out the formatting issues with the source language content and their possible effect on translation with CAT tools.

FINAL DRAFT International Standard ISO/FDIS 18968:2026(en)
Translation-oriented writing — Text production and text
evaluation
1 Scope
This document gives guidance on authoring, editing and evaluating specialized source language content
intended for translation. In addition, this document addresses handover recommendations in connection
with the production and translation of specialized content. This document is meant as practical guidance
and recognizes that not all clauses are equally applicable to every use case.
This document is applicable to authors and editors of specialized content intended for translation. It also
enables translators and translation service providers to assess the suitability of specialized content for
translation. This document is also applicable to tool providers (e.g. to develop and improve automatic source
language testing and verification procedures).
This document does not apply to authoring, editing and evaluating fictional, journalistic, advertising and
other non-specialized content.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 20539, Translation, interpreting and related technology — Vocabulary
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 20539 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
automatic line break
software feature that automatically adapts the line break (3.5) of digital texts (3.14) to the layout and creates
an approximately uniform line length throughout the text
3.2
exchange format
machine-readable format for representing information that is intended to facilitate exchange of the
information between different applications
[SOURCE: ISO 25964-1:— 1), 3.24, modified — Note 1 to entry deleted.]
1) Under preparation. Stage at the time of publication: ISO/FDIS 25964-1:2026.

3.3
match value
value of correspondence between a segment in the source text (3.12) and a segment found in a translation
memory
Note 1 to entry: Match value is expressed as a percentage. Higher percentages indicate higher similarities between
segments, while lower percentages indicate lower similarities between segments.
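
The idea of a percentage match can be illustrated with a short sketch. It uses the character-based similarity ratio from Python's standard `difflib` module purely as a stand-in; this is an assumption for illustration, and the function name `match_value` is hypothetical. TM systems compute match values with their own algorithms, which this sketch does not reproduce.

```python
from difflib import SequenceMatcher


def match_value(source_segment: str, tm_segment: str) -> int:
    """Return an illustrative match value between 0 and 100.

    Assumption: similarity is approximated by difflib's character-based
    ratio; a real TM system may weigh words, tags and formatting differently.
    """
    ratio = SequenceMatcher(None, source_segment, tm_segment).ratio()
    return round(ratio * 100)


if __name__ == "__main__":
    stored = "Switch off the warning tone using the alarm button."
    new = "Switch off the warning tone using the reset button."
    print(match_value(stored, stored))  # identical segments
    print(match_value(new, stored))     # similar, but not identical
```

Identical segments yield the highest value, and the value drops as the segments diverge, which mirrors the relationship described in Note 1 to entry.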
3.4
hard break
hard return
hard line break
paragraph break
line break (3.5) interpreted by text editors (3.15) as the end of a paragraph
Note 1 to entry: Hard breaks are represented by control characters and interrupt segmentation.
3.5
line break
point at which one line of text (3.14) ends and another begins
3.6
language checker
language checking tool
software or software feature that supports users in creating orthographically and grammatically correct,
consistent, comprehensible and corporate language-compliant content
3.7
language identifier
language symbol
string of characters assigned to an individual language or a language group for the purpose of identifying it
unequivocally
Note 1 to entry: In the ISO 639 language code, the string of characters consists of a string of letters.
EXAMPLE The individual language “Dutch” is assigned the two-letter language identifier “nl”, a three-letter
identifier “nld” for use in the field of terminology and other language applications, and another three-letter identifier
“dut” for use in the field of librarianship and documentation. The individual language “Polish” is assigned the
two-letter language identifier “pl” and the three-letter identifier “pol”. The language group “Khoisan languages” is assigned
the three-letter language identifier “khi”.
[SOURCE: ISO 639:2023, 3.7.10, modified — Note 2 to entry deleted.]
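
As a minimal sketch of how a tool might handle such identifiers, the following lookup holds only the identifiers named in the EXAMPLE above. The table is deliberately incomplete, the names `EXAMPLE_IDENTIFIERS`, `has_identifier_shape` and `languages_for` are hypothetical, and the shape check assumes the two-letter and three-letter forms shown in the example rather than the full rules of ISO 639.

```python
import re
from typing import List

# Identifiers taken from the EXAMPLE in this entry; not a complete ISO 639 table.
EXAMPLE_IDENTIFIERS = {
    "Dutch": {"nl", "nld", "dut"},
    "Polish": {"pl", "pol"},
    "Khoisan languages": {"khi"},
}

# Assumption: identifiers are two or three lowercase letters, as in the example.
_IDENTIFIER_SHAPE = re.compile(r"[a-z]{2,3}")


def has_identifier_shape(code: str) -> bool:
    """Check only the shape of a code, not whether it is actually assigned."""
    return _IDENTIFIER_SHAPE.fullmatch(code) is not None


def languages_for(code: str) -> List[str]:
    """Return the example languages or language groups that use this code."""
    return sorted(
        name for name, codes in EXAMPLE_IDENTIFIERS.items() if code in codes
    )


if __name__ == "__main__":
    print(has_identifier_shape("nl"), languages_for("nl"))
    print(has_identifier_shape("dut"), languages_for("dut"))
```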
3.8
segment end
segment boundary
punctuation mark or other marker used to identify the end of a segment
3.9
segment pair
translation unit
TU
segment of source language content matched with its corresponding target language content
3.10
segmentation
process of splitting a text (3.14) into segments

3.11
soft break
soft return
soft line break
line break (3.5) which retains paragraph formatting
Note 1 to entry: Soft breaks are represented by control characters and retain segmentation.
3.12
source text
text (3.14) to be translated into one or more target languages
3.13
target text
translated text (3.14) written in the intended target language
3.14
text
content in written form
[SOURCE: ISO 17100:2015, 2.3.4]
3.15
text editor
software that enables a user to create and revise text (3.14)
[SOURCE: ISO/IEC 2382:2015, 2126196, modified — Notes 1 and 2 to entry deleted.]
3.16
termbase
terminology database
terminological database
database comprising a terminological data collection
[SOURCE: ISO 26162-3:2023, 3.11, modified — “terminological database” added as an admitted term.]
3.17
TM system
translation memory system
CAT tool that uses a translation memory
3.18
translation-oriented writing
writing for translation
text production resulting in content that lends itself to translation
3.19
translation project specifications
set of agreed upon and defined requirements for producing translation output
Note 1 to entry: ISO 17100:2015, Annex B, lists a set of sample project specifications.
Note 2 to entry: For detailed information on developing project specifications, see ISO 11669:2024, Clauses 4 and 5 and
Annexes B and C.
[SOURCE: ISO 5060:2024, 3.3.2, modified — Note 2 to entry added.]
3.20
XLIFF
XML Localization Interchange File Format
XML file format for the interchange of translation data or localization data, or both
Note 1 to entry: For more information on XLIFF, see ISO 21720.
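
To make the idea of an interchange file concrete, the sketch below writes segment pairs into a small XML document. It assumes the element layout and namespace of OASIS XLIFF 2.0 (`xliff`, `file`, `unit`, `segment`, `source`, `target`); the edition of ISO 21720 in use should be checked before relying on these details. The function name `build_xliff` and the ids `f1`, `u1` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace and structure follow OASIS XLIFF 2.0.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"


def build_xliff(pairs, src_lang="en", trg_lang="de"):
    """Serialize (source, target) segment pairs as a minimal XLIFF string."""
    ET.register_namespace("", XLIFF_NS)  # serialize with a default namespace
    root = ET.Element(
        f"{{{XLIFF_NS}}}xliff",
        {"version": "2.0", "srcLang": src_lang, "trgLang": trg_lang},
    )
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {"id": "f1"})
    for number, (source, target) in enumerate(pairs, start=1):
        unit = ET.SubElement(file_el, f"{{{XLIFF_NS}}}unit", {"id": f"u{number}"})
        segment = ET.SubElement(unit, f"{{{XLIFF_NS}}}segment")
        ET.SubElement(segment, f"{{{XLIFF_NS}}}source").text = source
        ET.SubElement(segment, f"{{{XLIFF_NS}}}target").text = target
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_xliff([
        ("Switch off the warning tone using the alarm button.",
         "Den Warnton mit der Alarmtaste ausschalten."),
    ]))
```

Each `unit` here carries one segment pair in the sense of 3.9; the language attributes use language identifiers as described in 3.7.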

4 Recommendations for specialized content intended for translation
4.1 General
Specialized content must fulfil its intended communicative function by having a specific purpose and
addressing one or more target audiences.
Specialized content is characterized by certain features such as:
— credibility and relevance;
— appropriateness and correctness;
— formal grammatical cohesion and functional semantic coherence of meaning to ensure that the content
is comprehensible in a given language;
— correct, consistent and domain-appropriate terminology;
— due consideration of the specific communicative situation between author and target audience;
— relationship with other specialized content;
— clear document structure.
Additional consideration should be given to any specialized content intended for translation. On the formal
side, the fact that specialized content will be processed using a computer is critical. Prior to translation,
specialized source language content is usually imported into TM systems. See 4.2.1 for additional detail. On
the content side, both country-specific and culture-specific elements should be avoided, as these kinds of
elements limit the international applicability of specialized content. See 4.2 to 4.5 for additional detail.
4.2 Formatting
4.2.1 General
Often, specialized source language content is translated using CAT tools. TM systems divide texts into
individual segments, matching source and target segment pairs and storing them for future retrieval in a
translation memory (TM). In order to prevent incorrect segmentation or the need for manual interference in
the TM system’s pre-set segmentation, such as merging and splitting of segments, texts should be formatted
correctly. For example, incorrectly placed line breaks and the like can result in incorrect segmentation of
the source text. Incorrect segmentation of the source text will cause the formation of inappropriate segment
pairs in the TM and will therefore render them unusable for future translation. If such inappropriate segment
pairs are accepted without validation, they can even cause translation errors. To avoid unpredictable
and irregular formatting results, the built-in text editor formatting and style functions should be used as
opposed to manual formatting. Failure to use built-in style features can result in unnecessary corrective
work, which in turn demands extra time and effort, and entails extra cost, cutting into overall translation
process efficiency.
The text editor’s built-in setting should be set to display paragraph marks and formatting symbols in order
to ensure that the formatting is visible. When these marks and symbols are visible, unnecessary, incorrect
or inconsistent formatting can be identified and removed prior to translation.
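
The same kind of check can be partly automated. The following sketch is a heuristic, not a procedure defined by this document: it assumes that hard breaks are represented by newline characters in exported plain text, and it flags a paragraph that does not end in sentence-final punctuation when the next paragraph starts with a lowercase letter. The function name `find_suspect_hard_breaks` and the punctuation set are hypothetical choices.

```python
from typing import List

# Assumption: these characters close a sentence or introduce a list.
SENTENCE_END = (".", "!", "?", ":")


def find_suspect_hard_breaks(text: str) -> List[int]:
    """Return 1-based numbers of paragraphs that seem to end mid-sentence.

    Assumption: each newline in `text` stands for one hard break.
    """
    paragraphs = text.split("\n")
    suspects = []
    for index in range(len(paragraphs) - 1):
        current = paragraphs[index].rstrip()
        following = paragraphs[index + 1].lstrip()
        if not current or not following:
            continue  # empty paragraphs are not judged by this heuristic
        if not current.endswith(SENTENCE_END) and following[0].islower():
            suspects.append(index + 1)
    return suspects


if __name__ == "__main__":
    sample = (
        "Large distances and thick walls reduce the coverage of the radio\n"
        "signal.\n"
        "Switch off the warning tone using the alarm button."
    )
    print(find_suspect_hard_breaks(sample))
```

Headings are not flagged as long as the following paragraph starts with a capital letter, but the heuristic will miss or misreport cases in languages and text types with other conventions, so its results need human review.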
When a source text is drafted, it should also be borne in mind that the text can expand or contract in length
when translated into other languages and that extra space should therefore be included in the layout, where
necessary. Because of the expansion and contraction factor, line breaks, tabs or other manual interventions
intended to optimize the source-language layout can also be rendered in completely different positions in
the target language.
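
A rough way to reason about expansion is to compare character counts of source and target content against the space available in the layout. The sketch below is only an illustration: the function names `expansion_ratio` and `fits_layout` and the character limit are hypothetical, and character counts ignore font metrics, which matter in a real layout.

```python
def expansion_ratio(source: str, target: str) -> float:
    """Return target length divided by source length, in characters."""
    if not source:
        raise ValueError("source text must not be empty")
    return len(target) / len(source)


def fits_layout(target: str, max_chars: int) -> bool:
    """Check the target text against a hypothetical character budget."""
    return len(target) <= max_chars


if __name__ == "__main__":
    source = "Large distances and thick walls reduce the coverage of the radio signal."
    target = "Große Entfernungen und dicke Wände reduzieren die Reichweite des Funksignals."
    print(f"expansion ratio: {expansion_ratio(source, target):.2f}")
    print("fits in 80 characters:", fits_layout(target, 80))
```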
4.2.2 Formatting in terms of text flow and segmentation
4.2.2.1 Hard breaks
Recommendation: Hard breaks should not be manually inserted within sentences.
Line breaks are used to influence where the line ends. Software applications are designed to automatically
break the line as close to the margin as possible at or toward the end of the defined line. Line breaks can
include a blank space (e.g. space character), an existing hyphen, manually predetermined hyphenation (soft
hyphen, zero-width space), or predetermined, automatic hyphenation.
When automatic line breaks are used, the text is displayed across several lines in the text editor, but will not
be split into several segments by the TM system. However, manually inserted line breaks, especially hard
breaks, can be interpreted by segmentation utilities as segment breaks, which has a negative impact on
segmentation in CAT tools and can result in translation errors.
A hard break (symbol: ¶) is an end-of-line marker interpreted by text editors as the end of a paragraph.
A hard break always marks a segment end for automatic processes. Inappropriately placed hard breaks
cause incorrect segment pairs to be formed in the text editor and unnecessary corrective work, such as
merging segments manually. Especially where the sentence structure in the target language requires a
different word order than in the source language, a hard break can cause the source language and target
language content to misalign if the segmentation is not corrected manually.
EXAMPLE 1
Source language content in text editor:
(–) Large distances and thick walls reduce the coverage of the radio¶
signal.
Source language content in TM system:
Segment 1: Large distances and thick walls reduce the coverage of the radio
Segment 2: signal.
Target language content in TM system [de]:
Segment 1: Große Entfernungen und dicke Wände reduzieren die Reichweite des
Segment 2: Funksignals.
Source language content in text editor:
(+) Large distances and thick walls reduce the coverage of the radio signal.
Source language content in TM system:
Segment 1: Large distances and thick walls reduce the coverage of the radio signal.
Target language content in TM system [de]:
Segment 1: Große Entfernungen und dicke Wände reduzieren die Reichweite des Funksignals.
Explanation: The two segments created by the hard break contain two incomplete text fragments: The first
segment is lacking an end of sentence character, and the second segment starts with a lower case
letter and contains no verb.
— This makes it more difficult (for humans and computers) to interpret the source language content
as one message in one sentence.
— This suggests an inappropriate segmentation for CAT tools.
— This divides the multi-word term “radio signal” into two separate components. This division can
prevent language technology processes from working correctly, such as machine translation or
terminology look-up during translation, term extraction or named entity recognition.
The sentence without hard breaks conveys one message, produces no segmentation errors and
displays the multi-word term correctly.
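The effect shown in EXAMPLE 1 can be reproduced with a minimal sketch. It assumes a simplified segmenter in which a hard break (modelled here as a newline character) always ends a segment and sentence-final punctuation followed by white space also ends a segment. The function name `segment` and these rules are illustrative assumptions and do not describe the behaviour of any particular TM system.

```python
import re

def segment(text: str) -> list[str]:
    """Split text into segments (simplified, hypothetical rules).

    A hard break (newline) always ends a segment. A full stop, exclamation
    mark or question mark followed by white space also ends a segment.
    """
    segments = []
    for paragraph in text.split("\n"):  # hard break = segment end
        for part in re.split(r"(?<=[.!?])\s+", paragraph):
            if part.strip():
                segments.append(part.strip())
    return segments

negative = "Large distances and thick walls reduce the coverage of the radio\nsignal."
positive = "Large distances and thick walls reduce the coverage of the radio signal."

print(segment(negative))  # two incomplete fragments
print(segment(positive))  # one complete sentence
```

Under these assumptions, the hard break inside the sentence yields two fragments, whereas the sentence without a hard break remains one segment.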

EXAMPLE 2
Source language content in text editor:
(–) Switch off the warning tone using¶
the alarm button.
Source language content in TM system:
Segment 1: Switch off the warning tone using
Segment 2: the alarm button.
Target language content in TM system [de]:
Variant (1)
Segment 1: Den Warnton ausschalten unter Verwendung
Segment 2: der Alarmtaste.
Variant (2)
Segment 1: Den Warnton mit
Segment 2: der Alarmtaste ausschalten.
Explanation: In this case, the translator has no good option to translate the source text split into two
segments into the target language.
Variant (1):
The language pairs contain the same content both in the source and the target language content
segments. In future translations, the content of identical pre-translated segments will most likely
match the intended meaning.
However, the complete sentence of the translation does not comply with the German grammar
rules and will irritate readers.
Variant (2):
The verb is translated and positioned correctly to comply with the German grammar rules.
However, when the wording “the alarm button.” is used in another context, the TM system will
identify it as a 100 % match, which then can result in an incorrect translation, because the
German target language content in Variant (2) also contains the content “switch off”.
EXAMPLE 3
Source language content in text editor:
(–) Switch off the warning tone using¶ After 10 seconds the display is switched off.¶
the alarm button.
Source language content in TM system:
Segment 1: Switch off the warning tone using
Segment 2: After 10 seconds the display is switched off.
Segment 3: the alarm button.
Explanation: The order of segments in the TM system is not always the same as in the text editor. Finding
the matching segments takes time and is a potential source of errors.
4.2.2.2 Soft breaks
Recommendations:
— Soft breaks should be avoided in the middle of a sentence if they serve no purpose other than for layout.
— Soft breaks should be avoided at the end of a sentence.
A soft break (symbol: ↵ ) forces a new line, but does not interrupt paragraph formatting or segmentation.
Soft breaks are therefore preferable to a hard break within a sentence.
However, the soft break is also stored in the TM and thus reduces the match value of the segment. The
probability that the sentence will occur in exactly the same way, i.e. with the line break in exactly the same
place in another text, is low, and an adjustment by the translator will therefore be necessary.
Because the target text and source text can vary in length, it is often necessary to remove or move the soft
breaks in the target language content to keep the layout as it is in the source language content.
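The reduction in match value caused by a stored soft break can be illustrated with a minimal sketch. It assumes that the soft break is stored as the character U+2028 LINE SEPARATOR and uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a match value calculation. Both are assumptions made for illustration only, since TM systems use their own encodings and matching algorithms.

```python
from difflib import SequenceMatcher

SOFT_BREAK = "\u2028"  # assumption: soft break modelled as U+2028 LINE SEPARATOR

def match_value(new_segment: str, stored_segment: str) -> float:
    """Return an illustrative similarity percentage between two segments."""
    return 100 * SequenceMatcher(None, new_segment, stored_segment).ratio()

# Segment stored in the TM, including the soft break inserted for layout reasons.
stored = ("Artificial neural networks are a fundamental component of deep learning, "
          "mimicking the " + SOFT_BREAK + "human brain to process complex data.")
# The same sentence in a new text, without the soft break.
clean = ("Artificial neural networks are a fundamental component of deep learning, "
         "mimicking the human brain to process complex data.")

print(f"{match_value(stored, stored):.1f} %")  # identical segment
print(f"{match_value(clean, stored):.1f} %")   # same sentence without the soft break
```

In this sketch, the sentence without the soft break no longer reaches the value of an identical segment, although its wording is unchanged.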

EXAMPLE 1
Source language content in text editor:
(–) Artificial neural networks are a fundamental
component of deep learning, mimicking the ↵
human brain to process complex data.
Source language content in TM system:
Artificial neural networks are a fundamental component of deep learning, mimicking the ↵
human brain to process complex data.
Target language content in TM system [de]:
Künstliche neuronale Netze sind ein grundlegender Bestandteil des Deep Learning und ahmen das ↵
menschliche Gehirn nach, um komplexe Daten zu verarbeiten.
Target language content in text editor [de]:
Künstliche neuronale Netze sind ein
grundlegender Bestandteil des Deep Learning
und ahmen das ↵
menschliche Gehirn nach, um komplexe Daten
zu verarbeiten.
Explanation: The soft break in the source language content in the text editor helps to keep “human brain”
together in one line. Since the soft break is displayed in the TM system’s editor during
translation, it is very likely that it will be placed in the same location in the target language
content. However, this will cause unwanted line breaks in the translated texts due to the
different text lengths of English and German.
Since soft breaks do not interrupt segmentation, they should not be used to separate entire source language
content into different lines. Instead, a hard break should be used.
EXAMPLE 2
Source language content in text editor:
(–) Email: info@example.com↵
Office hours: Monday to Friday from 8:00 am to 6:30 pm
Source language content in TM system:
Segment 1: Email:
Segment 2: info@example.com↵
Office hours:
Segment 3: Monday to Friday from 8:00 am to 6:30 pm
Source language content in text editor:
(+) Email: info@example.com¶
Office hours: Monday to Friday from 8:00 am to 6:30 pm
Source language content in TM system:
Segment 1: Email:
Segment 2: info@example.com
Segment 3: Office hours:
Segment 4: Monday to Friday from 8:00 am to 6:30 pm
Explanation: In the negative example, the soft break after the email address causes parts of the two lines to
appear in one segment because there is no segment end marker (e.g. full stop) following the email
address. In the TM system, both the email address and all content before the next segment end
marker (colon) of the next line appear in one segment. This reduces the value of the segment for
later reuse. If both lines are separated by a hard break as in the positive example, then the
segments are separated correctly with respect to their status as information units.
4.2.2.3 Manual hyphenation
Recommendations:
— Manually inserting hyphens in order to influence layout should be avoided. Instead, the automatic
hyphenation feature should be used.
— Non-breaking hyphens should be used instead of hard or soft breaks in order to prevent an unwanted
line break after a hyphen.
Hyphens are used to join words and to separate syllables of a single word. “Automatic hyphenation” is a
software feature that automatically hyphenates words at the end of lines to improve the document’s
appearance.
When automatic hyphenation is used, the hyphens shown in the text editor at the end of the line do not
appear in the TM system or in the TM as hyphens. However, manually inserting hyphens to influence
hyphenation can lead to issues in the TM system.
Additional hyphens manually inserted for layout purposes (e.g. to change where the line breaks automatically)
do not influence segmentation, but will be imported into the TM system as a regular character.
EXAMPLE 1
Source language content in text editor:
(–) The economics and history departments at the university are offering an interdisci-
plinary seminar on Asia.
Source language content in TM system:
The economics and history departments at the university are offering an interdisci-plinary
seminar on Asia.
Target language content in TM system [fr]:
Les départements d’économie et d’histoire de l’université proposent un séminaire
interdisciplinaire sur l’Asie.
Explanation: The hyphen inserted manually for layout reasons is displayed as a hyphen in the middle of the
word “interdisciplinary” in the source segment. It does not affect the target language content,
but if the source text is later updated and the hyphen deleted, it will lead to the TM match being
reduced to a fuzzy match in a subsequent translation.
Also, if “interdisciplinary” and its translations are saved in the termbase, term recognition will
likely not work for “interdisci-plinary”. This can lead to incorrect or inconsistent use of
terminology in the target language content.
In order to keep the source text within the TM clean and reusable, the automatic hyphenation feature of the
text editor should be used instead of manual hyphenation. Automatic hyphens will be displayed in the text
editor, but will not be imported into the TM system as regular characters.
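The failure of term recognition described in EXAMPLE 1 can be sketched as follows. The sketch assumes a deliberately simple term recognition that looks for termbase entries verbatim in the segment. The function `find_terms` and the one-entry termbase are hypothetical and only serve as an illustration.

```python
def find_terms(segment: str, termbase: dict[str, str]) -> dict[str, str]:
    """Return the termbase entries whose source term occurs verbatim in the segment."""
    lowered = segment.lower()
    return {term: target for term, target in termbase.items() if term in lowered}

termbase = {"interdisciplinary": "interdisciplinaire"}  # hypothetical en-fr entry

# Segment as imported into the TM system with a manually inserted hyphen.
manual = "The departments are offering an interdisci-plinary seminar on Asia."
# Segment as imported when automatic hyphenation was used instead.
automatic = "The departments are offering an interdisciplinary seminar on Asia."

print(find_terms(manual, termbase))     # {}
print(find_terms(automatic, termbase))  # {'interdisciplinary': 'interdisciplinaire'}
```

Because the manually inserted hyphen is imported as a regular character, the term is not found in the first segment.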
For certain hyphenated words or units of words separated by hyphens, automatic line breaks should be
avoided. Disabling automatic line breaks can facilitate reading fluency and prevent related units from being
placed on different lines.
Where a line break after a hyphen is unwanted, a so-called “non-breaking hyphen” or “no-break hyphen” can
be inserted instead of a normal hyphen. Non-breaking hyphens should be used where hyphenated character
strings should not be separated at the end of a line. This applies, for instance, to character strings containing
single letters, abbreviations, numbers, or to formulae.
Non-breaking hyphens are usually displayed as such in the TM system and can be used also in the target
language content, if applicable.

EXAMPLE 2
Source language content in text editor:
(–) The teacher has provided the students with information sheets on how to calculate the k-
factor.
(+) The teacher has provided the students with information sheets on how to calculate the
k-factor.
Source language content in TM system:
The teacher has provided the students with information sheets on how to calculate the k-
factor.
The teacher has provided the students with information sheets on how to calculate the
k[-]factor.
Target language content in text editor [de]:
(–) Die Lehrkraft hat Studierenden Informationsblätter bereitgestellt, mit denen sie den k-
Faktor berechnen können.
(+) Die Lehrkraft hat Studierenden Informationsblätter bereitgestellt, mit denen sie den
k-Faktor berechnen können.
Explanation: In the positive example, a non-breaking hyphen is used within “k-factor”. It prevents the
automatic line break after “k-”. Instead, the whole expression is moved to the next line. The same
applies to the target language content if the non-breaking hyphen is used in the translation as
well.
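The difference between a normal hyphen and a non-breaking hyphen can be made visible with a minimal sketch. It assumes that the authoring tool stores the non-breaking hyphen as the Unicode character U+2011, which is not necessarily the case, since some tools use internal codes. The helper `protect` is hypothetical.

```python
import unicodedata

HYPHEN_MINUS = "\u002D"         # normal hyphen as typed on the keyboard
NON_BREAKING_HYPHEN = "\u2011"  # assumption: tool stores U+2011

def protect(term: str) -> str:
    """Replace normal hyphens in a term with non-breaking hyphens."""
    return term.replace(HYPHEN_MINUS, NON_BREAKING_HYPHEN)

protected = protect("k-factor")

print(unicodedata.name(NON_BREAKING_HYPHEN))  # NON-BREAKING HYPHEN
print(protected == "k-factor")                # False: similar appearance, different character
```

Since the two characters differ, a termbase or TM entry containing one of them is not necessarily found by a search for the other, so consistent use in source and target language content is advisable.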
4.2.2.4 Lists
Recommendation: Lists in the middle of a sentence should be avoided.
Where a sentence begins before and continues after a list, segments in many languages fail to align properly
and misaligned segment pairs of no use for future translations will be stored in the TM. Misaligned segment
pairs in the TM can lead to translation errors when reused without checking.
EXAMPLE
Source language content in text editor:
(–) When mounting the holding plate
— bolts,
— fasteners, and
— glue
must be used.
Source language content in TM system:
Segment 1: When mounting the holding plate
Segment 2: bolts,
Segment 3: fasteners, and
Segment 4: glue
Segment 5: must be used.
Target language content in TM system [fr]:
Segment 1: Lors du montage de la plaque de maintien,
Segment 2: des boulons,
Segment 3: des fixations, et
Segment 4: de la colle
Segment 5: doivent être utilisés.
Explanation: In some languages, the target language content can be structured in the same way. However, in
this French example, the verb “doivent” must be plural, because the list contains several items.
The segment pair stored in the TM is “must be used – doivent être utilisés”. “must” can be used
with singular and plural parts of speech, but “doivent” can only be used in a plural context.
Additionally, “utilisés” will likely have to be inflected depending on the parts of speech to which
it refers.
For more information and another example on lists inside sentences, see 4.4.2.6.
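A simple automatic check for lists in the middle of a sentence can be sketched as follows. The sketch assumes that list items start with a known marker and that a line starting in lower case directly after the last list item continues the sentence begun before the list. This heuristic, the marker set and the function name are assumptions and are suitable for English source language content only. The rewritten positive example is likewise hypothetical.

```python
LIST_MARKERS = ("- ", "* ")  # hypothetical list markers used in the source file

def sentence_continues_after_list(lines: list[str]) -> bool:
    """Return True if a non-list line directly after a list item starts in lower case."""
    for previous, current in zip(lines, lines[1:]):
        previous_is_item = previous.lstrip().startswith(LIST_MARKERS)
        current_is_item = current.lstrip().startswith(LIST_MARKERS)
        if previous_is_item and not current_is_item and current.strip()[:1].islower():
            return True
    return False

negative = [
    "When mounting the holding plate",
    "- bolts,",
    "- fasteners, and",
    "- glue",
    "must be used.",
]
positive = [
    "The following must be used when mounting the holding plate:",
    "- bolts",
    "- fasteners",
    "- glue",
]

print(sentence_continues_after_list(negative))  # True
print(sentence_continues_after_list(positive))  # False
```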

4.2.2.5 Tabs
Recommendation: Tabs should be avoided within sente
...

ISO/FDIS 18968:2026(en)
ISO/TC 37/WG 12
Secretariat: SAC
Date: 2026-02-03-27
Translation-oriented writing — Text production and text evaluation
Rédaction adaptée à la traduction — Production et évaluation de textes
FDIS stage
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication
may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying,
or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO
at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: + 41 22 749 01 11
E-mail: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Recommendations for specialized content intended for translation
4.1 General
4.2 Formatting
4.3 Terminology
4.4 Grammar, syntax and style
4.5 Presentation of content
5 Recommendations for the handover from text production to translation
5.1 General
5.2 File formats
5.3 Layout
5.4 Contextual information, reference material and terminology aids
5.5 Locales, document templates and styles
5.6 Lists, indices and glossaries
5.7 Reviewing source language content prior to providing it to the translation service provider
6 Translation-oriented texts: Evaluation
6.1 General
6.2 Criteria
6.3 Evaluation methods
Annex A (informative) List of evaluation criteria
Annex B (informative) Checklist of recommendations
Bibliography

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of
ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent rights
in respect thereof. As of the date of publication of this document, ISO had not received notice of (a) patent(s)
which may be required to implement this document. However, implementers are cautioned that this may not
represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
0.1 Overview
As goods and services are increasingly distributed across international markets, specialized content must be
made available in the languages required for each target market. Such content may be translated by human
translators, with or without computer-assisted translation (CAT) tools, or by machine translation (MT)
systems, with or without subsequent post-editing by a human translator. In this context, translation-oriented
writing can reduce the number of translator queries, lower costs, minimize translation errors, and shorten
delivery times, particularly when translation technologies are used.
Technical documentation in particular (e.g. product manuals, online help, safety data sheets or assembly
instructions) as a form of specialized content is intended to ensure that products are used safely, efficiently
and effectively. Standards such as IEC/IEEE 82079-1 define the requirements for technical documentation.
In-house style guides supplement these requirements by codifying enterprise-specific writing conventions.
This document provides recommendations for authoring or editing specialized content in the source language
intended for translation into one or more target languages and for the evaluation of translation-oriented texts.
Some clauses can, however, also be useful to non-specialized texts intended for translation.
0.2 Notations
This document uses structured examples to illustrate textual features to avoid because they can cause
problems in translation. The examples in this document have been chosen for illustrative purposes and are
not necessarily equally applicable to every language pair (e.g. the recommendation in 4.2.2.4 is not applicable
to Japanese).
The examples are each laid out and formatted to support the recommendations in question. Due to the variety
of issues covered, the layout varies throughout this document. However, an attempt has been made to use as
consistent a layout as possible.
— Every example contains at least source language content and an explanation. The explanation points out
any issues with the source language content and their possible effect on translation. Target language
examples are provided only when they provide added value for the explanation and understanding of the
issue at hand. The explanation gives information on errors in the target language content that would
otherwise not be understood due to readers’ lack of proficiency in the target language.
— The symbol (+) is used for positive examples and the symbol (–) is used for negative examples.
— Numbers in parentheses such as (1), (2), (3) indicate possible variants of target language content.
— Characters in parentheses are only used to specify the example type (positive, negative, variant). They do
not represent a part of the source or target language content displayed.
— No specific symbol is used for neutral examples.
— The source language of examples is always English; therefore, the source language is not indicated.
— The target languages of examples vary, but are mainly German, French and Italian. The target language is
indicated in square brackets in the column or line marking the target language content. Italian target
language content, for example, is marked with [it] in accordance with ISO 639.

EXAMPLE 1
Source language content:
(–) A negative example in English to highlight source language content issues.
(+) A positive example in English of translation-oriented writing avoiding the issues of the negative example.
Target language content [it]:
(–) A negative example (in Italian) to highlight possible translation errors resulting from the issues in the
source language content.
(+) A positive example (in Italian) containing no translation error.
Explanation: An explanation pointing out the issues with the source language content and their possible effect on
translation.
EXAMPLE 2
Source language content:
(–) A negative example in English to highlight source language content issues.
(+) A positive example in English of translation-oriented writing avoiding the issues of the negative example.
Target language content [fr]:
(1) One possible translation (in French) to highlight possible translation variants.
(2) Another possible translation (in French) to highlight possible translation variants.
(+) A positive example (in French), containing unambiguous target language content.
Explanation: An explanation pointing out the issues with the source language content and their possible effect on
translation.
Examples in 4.2 are more complex than other examples, since they display the effects of formatting in the text
editor on translation with CAT tools. In addition, segments are displayed for the target language content if
segmentation is relevant.
EXAMPLE 3
Source language content in text editor:
(–) A negative example in English to highlight source language content formatting errors as they
would appear in a text editor.
(+) A positive example in English, containing no formatting errors.
Source language content in TM system:
Source language content examples as they would appear in a TM system during translation.
(–) A negative example to highlight the issues CAT tools have with incorrect formatting.
(+) A positive example to show how correct formatting avoids negative effects on translation
using CAT tools.
Target language content in TM system [de]:
(–) A negative example (in German) to highlight possible translation errors in the target language
content in the TM system resulting from the formatting errors in the source language content.
(+) A positive example (in German) of correct target language content in the TM system resulting
from correctly formatted source language content.
Target language content in text editor [de]:
(–) A negative example (in German) to highlight possible translation errors in the target language
content in the text editor resulting from the formatting errors in the source language content.
(+) A positive example (in German) of correct target language content in the text editor resulting
from correctly formatted source language content.
Explanation: An explanation pointing out the formatting issues with the source language content and their
possible effect on translation with CAT tools.
1 Scope
This document gives guidance on authoring, editing and evaluating specialized source language content
intended for translation. In addition, this document addresses handover recommendations in connection with
the production and translation of specialized content. This document is meant as practical guidance and
recognizes that not all clauses are equally applicable to every use case.
This document is applicable to authors and editors of specialized content intended for translation. It also
enables translators and translation service providers to assess the suitability of specialized content for
translation. This document is also applicable to tool providers (e.g. to develop and improve automatic source
language testing and verification procedures).
This document does not apply to authoring, editing and evaluating fictional, journalistic, advertising and other
non-specialized content.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 20539, Translation, interpreting and related technology — Vocabulary
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 20539 and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
automatic line break
software feature that automatically adapts the line break (3.5) of digital texts (3.14) to the layout and creates
an approximately uniform line length throughout the text
3.2
exchange format
machine-readable format for representing information that is intended to facilitate exchange of the
information between different applications
[SOURCE: ISO 25964-1:—1), 3.24, modified — Note 1 to entry deleted.]

1) Under preparation. Stage at the time of publication: ISO/FDIS 25964-1:2026.

3.3
match value
value of correspondence between a segment in the source text (3.12) and a segment found in a translation
memory
Note 1 to entry: Match value is expressed as a percentage. Higher percentages indicate higher similarities between
segments, while lower percentages indicate lower similarities between segments.
3.4
hard break
hard return
hard line break
paragraph break
line break (3.5) interpreted by text editors (3.15) as the end of a paragraph
Note 1 to entry: Hard breaks are represented by control characters and interrupt segmentation.
3.5
line break
point at which one line of text (3.14) ends and another begins
3.6
language checker
language checking tool
software or software feature that supports users in creating orthographically and grammatically correct,
consistent, comprehensible and corporate language-compliant content
3.7
language identifier
language symbol
string of characters assigned to an individual language or a language group for the purpose of identifying it
unequivocally
Note 1 to entry: In the ISO 639 language code, the string of characters consists of a string of letters.
EXAMPLE The individual language “Dutch” is assigned the two-letter language identifier “nl”, a three-letter
identifier “nld” for use in the field of terminology and other language applications, and another three-letter identifier
“dut” for use in the field of librarianship and documentation. The individual language “Polish” is assigned the two-letter
language identifier “pl” and the three-letter identifier “pol”. The language group “Khoisan languages” is assigned the
three-letter language identifier “khi”.
[SOURCE: ISO 639:2023, 3.7.10, modified — Note 2 to entry deleted.]
3.8
segment end
segment boundary
punctuation mark or other marker used to identify the end of a segment
3.9
segment pair
translation unit
TU
segment of source language content matched with its corresponding target language content
3.10
segmentation
process of splitting a text (3.14) into segments
3.11
soft break
soft return
soft line break
line break (3.5) which retains paragraph formatting
Note 1 to entry: Soft breaks are represented by control characters and retain segmentation.
3.12
source text
text (3.14) to be translated into one or more target languages
3.13
target text
translated text (3.14) written in the intended target language
3.14
text
content in written form
[SOURCE: ISO 17100:2015, 2.3.4]
3.15
text editor
software that enables a user to create and revise text (3.14)
[SOURCE: ISO/IEC 2382:2015, 2126196, modified — Notes 1 and 2 to entry deleted.]
3.16
termbase
terminology database
terminological database
database comprising a terminological data collection
[SOURCE: ISO 26162-3:2023, 3.11, modified — “terminological database” added as an admitted term.]
3.17
TM system
translation memory system
CAT tool that uses a translation memory
3.18
translation-oriented writing
writing for translation
text production resulting in content that lends itself to translation
3.19
translation project specifications
set of agreed upon and defined requirements for producing translation output
Note 1 to entry: ISO 17100:2015, Annex B, lists a set of sample project specifications.
Note 2 to entry: For detailed information on developing project specifications, see ISO 11669:2024, Clauses 4 and 5
and Annexes B and C.
[SOURCE: ISO 5060:2024, 3.3.2, modified — Note 2 to entry added.]
3.20
XLIFF
XML Localization Interchange File Format
XML file format for the interchange of translation data or localization data, or both
Note 1 to entry: For more information on XLIFF, see ISO 21720.
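As an illustration of how a segment pair can be represented in an XLIFF file, the following minimal sketch builds a document with one segment. The element and attribute names (`xliff`, `file`, `unit`, `segment`, `source`, `target`, `srcLang`, `trgLang`) and the namespace follow the XLIFF 2.0 core structure as understood here. The identifiers `f1` and `u1` and the function `build_xliff` are hypothetical. ISO 21720 remains the authoritative description of the format.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"  # XLIFF 2.x core namespace

def build_xliff(source: str, target: str, src_lang: str, trg_lang: str) -> str:
    """Return a minimal XLIFF document containing one segment pair."""
    root = ET.Element("xliff", {
        "xmlns": XLIFF_NS,
        "version": "2.0",
        "srcLang": src_lang,
        "trgLang": trg_lang,
    })
    file_el = ET.SubElement(root, "file", {"id": "f1"})  # hypothetical identifier
    unit = ET.SubElement(file_el, "unit", {"id": "u1"})  # hypothetical identifier
    seg = ET.SubElement(unit, "segment")
    ET.SubElement(seg, "source").text = source
    ET.SubElement(seg, "target").text = target
    return ET.tostring(root, encoding="unicode")

doc = build_xliff(
    "Large distances and thick walls reduce the coverage of the radio signal.",
    "Große Entfernungen und dicke Wände reduzieren die Reichweite des Funksignals.",
    "en",
    "de",
)
print(doc)
```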
4 Recommendations for specialized content intended for translation
4.1 General
Specialized content must fulfil its intended communicative function by having a specific purpose and
addressing one or more target audiences.
Specialized content is characterized by certain features such as:
— credibility and relevance;
— appropriateness and correctness;
— formal grammatical cohesion and functional semantic coherence of meaning to ensure that the content is
comprehensible in a given language;
— correct, consistent and domain-appropriate terminology;
— due consideration of the specific communicative situation between author and target audience;
— relationship with other specialized content;
— clear document structure.
Additional consideration should be given to any specialized content intended for translation. On the formal
side, the fact that specialized content will be processed using a computer is critical. Prior to translation,
specialized source language content is usually imported into TM systems. See 4.2.1 for additional detail. On
the content side, both country-specific and culture-specific elements should be avoided, as these kinds of
elements limit the international applicability of specialized content. See 4.2 to 4.5 for additional detail.
4.2 Formatting
4.2.1 General
Often, specialized source language content is translated using CAT tools. TM systems divide texts into
individual segments, matching source and target segment pairs and storing them for future retrieval in a
translation memory (TM). In order to prevent incorrect segmentation or the need for manual interference in
the TM system’s pre-set segmentation, such as merging and splitting of segments, texts should be formatted
correctly. For example, incorrectly placed line breaks and the like can result in incorrect segmentation of the
source text. Incorrect segmentation of the source text will cause the formation of inappropriate segment pairs
in the TM and will therefore render them unusable for future translation. If such inappropriate segment pairs
are accepted without validation, they can even cause translation errors. To avoid unpredictable and irregular
formatting results, the built-in text editor formatting and style functions should be used as opposed to manual
formatting. Failure to use built-in style features can result in unnecessary corrective work, which in turn
demands extra time and effort, and entails extra cost, cutting into overall translation process efficiency.
The text editor’s built-in setting should be set to display paragraph marks and formatting symbols in order to
ensure that the formatting is visible. When these marks and symbols are visible, unnecessary, incorrect or
inconsistent formatting can be identified and removed prior to translation.
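Where a text editor offers no such display setting, formatting characters in a plain-text export can be made visible with a minimal sketch such as the following. The symbols ¶ and ↵ are those used in this document for hard and soft breaks. The encoding of the soft break as U+2028, the symbols chosen for tabs and non-breaking spaces, and the function name `show_marks` are assumptions.

```python
# Mapping of formatting characters to visible symbols (partly assumed, see above).
VISIBLE = str.maketrans({
    "\n": "¶\n",      # hard break, line structure is kept
    "\u2028": "↵\n",  # soft break, assumed to be stored as U+2028
    "\t": "→",        # tab
    "\u00a0": "°",    # non-breaking space
})

def show_marks(text: str) -> str:
    """Return the text with formatting characters replaced by visible symbols."""
    return text.translate(VISIBLE)

sample = "Switch off the warning tone using\nthe alarm button.\n"
print(show_marks(sample))
```

In the output, the hard break inside the sentence appears as ¶ after “using” and can be removed before the text is handed over for translation.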
When a source text is drafted, it should also be borne in mind that the text can expand or contract in length
when translated into other languages and that extra space should therefore be included in the layout, where
necessary. Because of the expansion and contraction factor, line breaks, tabs or other manual interventions
intended to optimize the source-language layout can also be rendered in completely different positions in the
target language.
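The expansion or contraction of a text can be estimated with a simple ratio of character counts. The following sketch uses the sentence pair from 4.2.2.1, EXAMPLE 1. The function name `expansion_factor` and the character-based measure are assumptions, and the result for a single sentence does not represent a language pair in general.

```python
def expansion_factor(source: str, target: str) -> float:
    """Return the target length divided by the source length, both in characters."""
    if not source:
        raise ValueError("source must not be empty")
    return len(target) / len(source)

source_en = "Large distances and thick walls reduce the coverage of the radio signal."
target_de = "Große Entfernungen und dicke Wände reduzieren die Reichweite des Funksignals."

factor = expansion_factor(source_en, target_de)
print(f"{factor:.2f}")  # a value above 1 indicates expansion, below 1 contraction
```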
4.2.2 Formatting in terms of text flow and segmentation
4.2.2.1 Hard breaks
Recommendation: Hard breaks should not be manually inserted within sentences.
Line breaks determine where a line ends. Software applications are designed to break the line automatically
as close to the margin as possible, at or toward the end of the defined line. An automatic line break can occur
at a blank space (e.g. space character), at an existing hyphen, at a manually predetermined hyphenation point
(soft hyphen, zero-width space), or at a point determined by automatic hyphenation.
When automatic line breaks are used, the text is displayed across several lines in the text editor, but will not
be split into several segments by the TM system. However, manually inserted line breaks, especially hard
breaks, can be interpreted by segmentation utilities as segment breaks, which has a negative impact on
segmentation in CAT tools and can result in translation errors.
A hard break (symbol: ¶) is an end-of-line marker interpreted by text editors as the end of a paragraph.
A hard break always marks a segment end for automatic processes. Inappropriately placed hard breaks cause
incorrect segment pairs to be formed in the TM system and lead to unnecessary corrective work, such as merging
segments manually. Especially where the sentence structure in the target language requires a different word
order than in the source language, a hard break can cause the source language and target language content to
misalign if the segmentation is not corrected manually.

EXAMPLE 1
Source language content in text editor (–):
Large distances and thick walls reduce the coverage of the radio¶
signal.
Source language content in TM system:
Segment 1: Large distances and thick walls reduce the coverage of the radio
Segment 2: signal.
Target language content in TM system [de]:
Segment 1: Große Entfernungen und dicke Wände reduzieren die Reichweite des
Segment 2: Funksignals.
Source language content in text editor (+):
Large distances and thick walls reduce the coverage of the radio signal.
Source language content in TM system:
Segment 1: Large distances and thick walls reduce the coverage of the radio signal.
Target language content in TM system [de]:
Segment 1: Große Entfernungen und dicke Wände reduzieren die Reichweite des Funksignals.
Explanation:
The two segments created by the hard break contain two incomplete text fragments: The first segment is
lacking an end of sentence character, and the second segment starts with a lower case letter and contains no
verb.
— This makes it more difficult (for humans and computers) to interpret the source language content as one
message in one sentence.
— This suggests an inappropriate segmentation for CAT tools.
— This divides the multi-word term “radio signal” into two separate components. This division can prevent
language technology processes from working correctly, such as machine translation or terminology look-up
during translation, term extraction or named entity recognition.
The sentence without hard breaks conveys one message, produces no segmentation errors and displays the
multi-word term correctly.
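The effect described in EXAMPLE 1 can be reproduced with a small sketch. It assumes a deliberately simplified segmenter that ends a segment at every hard break and after sentence-final punctuation followed by whitespace. Real TM systems use their own configurable segmentation rules, so the function `segment` and the use of "¶" as a stand-in for the paragraph mark are illustrative assumptions only.

```python
import re

HARD_BREAK = "\u00b6"  # the pilcrow sign, used here as a stand-in for a hard break


def segment(text: str) -> list[str]:
    """Split text into segments at hard breaks and after '.', '!' or '?'."""
    segments = []
    for paragraph in text.split(HARD_BREAK):
        # Inside a paragraph, a segment ends after sentence-final punctuation
        # that is followed by whitespace.
        for piece in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if piece:
                segments.append(piece)
    return segments


with_break = "Large distances and thick walls reduce the coverage of the radio\u00b6signal."
without_break = "Large distances and thick walls reduce the coverage of the radio signal."

print(segment(with_break))     # two fragments, "radio" and "signal." are separated
print(segment(without_break))  # one complete sentence
```

Under these assumptions, the hard break yields two fragments, whereas the sentence without the hard break stays in a single segment.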

EXAMPLE 2
Source language content in text editor (–):
Switch off the warning tone using¶
the alarm button.
Source language content in TM system:
Segment 1: Switch off the warning tone using
Segment 2: the alarm button.
Target language content in TM system [de], Variant (1):
Segment 1: Den Warnton ausschalten unter Verwendung
Segment 2: der Alarmtaste.
Target language content in TM system [de], Variant (2):
Segment 1: Den Warnton mit
Segment 2: der Alarmtaste ausschalten.
Explanation:
In this case, the translator has no good option to translate the source text split into two segments into the
target language.
Variant (1):
The segment pairs contain the same content in both the source and the target language segments. In future
translations, the content of identical pre-translated segments will most likely match the intended meaning.
However, the complete sentence of the translation does not comply with the German grammar rules and will
irritate readers.
Variant (2):
The verb is translated and positioned correctly to comply with the German grammar rules.
However, when the wording “the alarm button.” is used in another context, the TM system will identify it as a
100 % match, which then can result in an incorrect translation, because the German target language content in
Variant (2) also contains the content “switch off”.
EXAMPLE 3
Source language content in text editor (–):
Switch off the warning tone using¶ After 10 seconds the display is switched off.¶
the alarm button.
Source language content in TM system:
Segment 1: Switch off the warning tone using
Segment 2: After 10 seconds the display is switched off.
Segment 3: the alarm button.
Explanation:
The order of segments in the TM system is not always the same as in the text editor. Finding the matching
segments takes time and is a potential source of errors.
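Hard breaks inside sentences can often be found before translation with a simple check. The following sketch is a heuristic, not a method prescribed by this document: it flags a paragraph that does not end in closing punctuation when the next paragraph starts with a lower-case letter. The function name `find_suspect_hard_breaks` and the punctuation set are assumptions chosen for illustration.

```python
def find_suspect_hard_breaks(paragraphs: list[str]) -> list[int]:
    """Return the indexes of paragraphs that look as if they were cut mid-sentence."""
    suspects = []
    for i in range(len(paragraphs) - 1):
        current = paragraphs[i].rstrip()
        following = paragraphs[i + 1].lstrip()
        if not current or not following:
            continue  # empty paragraphs are not evaluated by this sketch
        ends_open = current[-1] not in ".!?:;"
        starts_lower = following[0].islower()
        if ends_open and starts_lower:
            suspects.append(i)
    return suspects


paragraphs = [
    "Switch off the warning tone using",
    "the alarm button.",
    "After 10 seconds the display is switched off.",
]
print(find_suspect_hard_breaks(paragraphs))  # [0]
```

Headings and list items without final punctuation are only reported if the following paragraph starts in lower case, so the check errs on the side of few false alarms. Flagged positions still need a human decision.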
4.2.2.2 Soft breaks
Recommendations:
— Soft breaks should be avoided in the middle of a sentence if they serve no purpose other than for layout.
— Soft breaks should be avoided at the end of a sentence.
A soft break (symbol: ↵) forces a new line, but does not interrupt paragraph formatting or segmentation. Soft
breaks are therefore preferable to a hard break within a sentence.
However, the soft break is also stored in the TM and thus reduces the match value of the segment. The
probability that the sentence will occur in exactly the same way, i.e. with the line break in exactly the same
place in another text, is low, and an adjustment by the translator will therefore be necessary.
Because the target text and source text can vary in length, it is often necessary to remove or move the soft
breaks in the target language content to keep the layout as it is in the source language content.
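The reduced match value can be illustrated with a character-based similarity score. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in. Actual TM systems calculate match values with their own algorithms, and the "↵" character is used here only as a visible placeholder for the stored soft break.

```python
from difflib import SequenceMatcher

SOFT_BREAK = "\u21b5"  # visible placeholder for a soft break stored in the segment


def match_score(stored: str, new: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two segments."""
    return SequenceMatcher(None, stored, new).ratio()


stored = (
    "Artificial neural networks are a fundamental component of deep learning, "
    "mimicking the " + SOFT_BREAK + "human brain to process complex data."
)
new = (
    "Artificial neural networks are a fundamental component of deep learning, "
    "mimicking the human brain to process complex data."
)

print(match_score(stored, stored))  # identical segments: 1.0
print(match_score(stored, new))     # slightly below 1.0 because of the soft break
```

Even though the wording is identical, the stored soft break prevents an exact match in this model, which corresponds to the adjustment effort described above.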
EXAMPLE 1
Source language content in text editor (–):
Artificial neural networks are a fundamental
component of deep learning, mimicking the ↵
human brain to process complex data.
Source language content in TM system:
Artificial neural networks are a fundamental component of deep learning, mimicking the ↵
human brain to process complex data.
Target language content in TM system:
Künstliche neuronale Netze sind ein grundlegender Bestandteil des Deep Learning und ahmen das ↵
menschliche Gehirn nach, um komplexe Daten zu verarbeiten.
Target language content in text editor [de]:
Künstliche neuronale Netze sind ein
grundlegender Bestandteil des Deep Learning
und ahmen das ↵
menschliche Gehirn nach, um komplexe Daten
zu verarbeiten.
Explanation:
The soft break in the source language content in the text editor helps to keep “human brain” together in one
line. Since the soft break is displayed in the TM system’s editor during translation, it is very likely that it
will be placed in the same location in the target language content. However, this will cause unwanted line
breaks in the translated texts due to the different text lengths of English and German.
Since soft breaks do not interrupt segmentation, they should not be used to place separate units of source
language content on different lines. Instead, a hard break should be used.

EXAMPLE 2
Source language content in text editor (–):
Email: info@example.com↵
Office hours: Monday to Friday from 8:00 am to 6:30 pm
Source language content in TM system:
Segment 1: Email:
Segment 2: info@example.com↵
Office hours:
Segment 3: Monday to Friday from 8:00 am to 6:30 pm
Source language content in text editor (+):
Email: info@example.com¶
Office hours: Monday to Friday from 8:00 am to 6:30 pm
Source language content in TM system:
Segment 1: Email:
Segment 2: info@example.com
Segment 3: Office hours:
Segment 4: Monday to Friday from 8:00 am to 6:30 pm
Explanation:
In the negative example, the soft break after the email address causes parts of the two lines to appear in one
segment because there is no segment end marker (e.g. full stop) following the email address. In the TM system,
both the email address and all content before the next segment end marker (colon) of the next line appear in
one segment. This reduces the value of the segment for later reuse. If both lines are separated by a hard break
as in the positive example, then the segments are separated correctly with respect to their status as
information units.
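The segmentation shown in EXAMPLE 2 can be modelled with a minimal sketch. It assumes that a segment ends at a hard break ("¶") and after a full stop, exclamation mark, question mark or colon followed by whitespace, while a soft break ("↵") is not a segment boundary. These rules are an assumption made to mirror the example, not the rule set of any particular TM system.

```python
import re

HARD_BREAK = "\u00b6"  # stand-in for the paragraph mark
SOFT_BREAK = "\u21b5"  # stand-in for the soft break, kept inside the segment


def segment(text: str) -> list[str]:
    """Split at hard breaks and after '.', '!', '?' or ':' followed by whitespace."""
    segments = []
    for paragraph in text.split(HARD_BREAK):
        for piece in re.split(r"(?<=[.!?:])\s+", paragraph.strip()):
            if piece:
                segments.append(piece)
    return segments


negative = ("Email: info@example.com" + SOFT_BREAK
            + "Office hours: Monday to Friday from 8:00 am to 6:30 pm")
positive = ("Email: info@example.com" + HARD_BREAK
            + "Office hours: Monday to Friday from 8:00 am to 6:30 pm")

for seg in segment(negative):
    print(repr(seg))  # three segments, the second one mixes both lines
for seg in segment(positive):
    print(repr(seg))  # four segments, one per information unit
```

The colons in "8:00" and "6:30" do not end a segment in this sketch because they are not followed by whitespace.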
4.2.2.3 Manual hyphenation
Recommendations:
— Manually inserting hyphens in order to influence layout should be avoided. Instead, the automatic
hyphenation feature should be used.
— Non-breaking hyphens should be used instead of hard or soft breaks in order to prevent an unwanted line
break after a hyphen.
Hyphens are used to join words and to separate syllables of a single word. “Automatic hyphenation” is a
software feature that automatically hyphenates words at the end of lines to improve the document’s
appearance.
When automatic hyphenation is used, the hyphens shown in the text editor at the end of the line do not appear
in the TM system or in the TM as hyphens. However, manually inserting hyphens to influence hyphenation can
lead to issues in the TM system.
Additional hyphens manually inserted for layout purposes (e.g. to change where the line breaks automatically)
do not influence segmentation, but will be imported into the TM system as regular characters.

EXAMPLE 1
Source language content in text editor (–):
The economics and history departments at the university are offering an interdisci-
plinary seminar on Asia.
Source language content in TM system:
The economics and history departments at the university are offering an interdisci-plinary seminar on Asia.
Target language content in TM system [fr]:
Les départements d’économie et d’histoire de l’université proposent un séminaire interdisciplinaire sur l’Asie.
Explanation:
The hyphen inserted manually for layout reasons is displayed as a hyphen in the middle of the word
“interdisciplinary” in the source segment. It does not affect the target language content, but if the source
text is later updated and the hyphen deleted, it will lead to the TM match being reduced to a fuzzy match in a
subsequent translation.
Also, if “interdisciplinary” and its translations are saved in the termbase, term recognition will likely not
work for “interdisci-plinary”. This can lead to incorrect or inconsistent use of terminology in the target
language content.
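The term recognition problem can be sketched with a very simple lookup. The one-entry `termbase` dictionary and the function `recognise_terms` are hypothetical and far simpler than the matching used in real terminology tools, which can include stemming or fuzzy matching.

```python
import re

# Hypothetical termbase with a single English-French entry.
termbase = {"interdisciplinary": "interdisciplinaire"}


def recognise_terms(segment: str, terms: dict[str, str]) -> list[str]:
    """Return the termbase entries found as whole tokens in the segment."""
    tokens = re.findall(r"[\w-]+", segment.lower())
    return [token for token in tokens if token in terms]


clean = "The departments are offering an interdisciplinary seminar on Asia."
hyphenated = "The departments are offering an interdisci-plinary seminar on Asia."

print(recognise_terms(clean, termbase))       # ['interdisciplinary']
print(recognise_terms(hyphenated, termbase))  # []
```

With exact token matching, the manually hyphenated form is not found, so the translator would receive no terminology suggestion for this segment.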
In order to keep the source text within the TM clean and reusable, the automatic hyphenation feature of the
text editor should be used instead of manual hyphenation. Automatic hyphens will be displayed in the text
editor, but will not be imported into the TM system as regular characters.
For certain hyphenated words or units of words separated by hyphens, automatic line breaks should be
avoided. Disabling automatic line breaks can facilitate reading fluency and prevent related units from being
placed on different lines.
Where a line break after a hyphen is unwanted, a so-called “non-breaking hyphen” or “no-break hyphen” can
be inserted instead of a normal hyphen. Non-breaking hyphens should be used where hyphenated character
strings should not be separated at the end of a line. This applies, for instance, to character strings containing
single letters, abbreviations, numbers, or to formulae.
Non-breaking hyphens are usually displayed as such in the TM system and can be used also in the target
language content, if applicable.

EXAMPLE 2
Source language content in text editor (–):
The teacher has provided the students with information sheets on how to calculate the k-
factor.
Source language content in text editor (+):
The teacher has provided the students with information sheets on how to calculate the
k-factor.
Source language content in TM system:
The teacher has provided the students with information sheets on how to calculate the k-factor.
The teacher has provided the students with information sheets on how to calculate the k[-]factor.
Target language content in text editor [de] (–):
Die Lehrkraft hat Studierenden Informationsblätter bereitgestellt, mit denen sie den k-
Faktor berechnen können.
Target language content in text editor [de] (+):
Die Lehrkraft hat Studierenden Informationsblätter bereitgestellt, mit denen sie den
k-Faktor berechnen können.
Explanation:
In the positive example, a non-breaking hyphen is used within “k-factor”. It prevents the automatic line break
after “k-”. Instead, the whole expression is moved to the next line. The same applies for the target language
content if the non-breaking hyphen is used in the translation as well.
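In plain text, the non-breaking hyphen corresponds to the Unicode character U+2011 (NON-BREAKING HYPHEN). The sketch below replaces a normal hyphen with this character when it directly follows a single letter or a number at the start of a token. The function name `protect_short_compounds` and the choice of pattern are assumptions for illustration. In a text editor, the character is normally inserted through the editor's own function.

```python
import re

NON_BREAKING_HYPHEN = "\u2011"


def protect_short_compounds(text: str) -> str:
    """Use a non-breaking hyphen after a single letter or a number, e.g. in 'k-factor'."""
    return re.sub(
        r"\b([A-Za-z]|\d+)-(?=\w)",
        lambda match: match.group(1) + NON_BREAKING_HYPHEN,
        text,
    )


sample = "Information sheets on how to calculate the k-factor for a well-known case."
protected = protect_short_compounds(sample)
print(protected)
print(NON_BREAKING_HYPHEN in protected)  # True
```

Ordinary hyphenated words such as "well-known" are left unchanged, because the pattern only matches a hyphen that follows a single letter or a number.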
4.2.2.4 Lists
Recommendation: Lists in the middle of a sentence should be avoided.
Where a sentence begins before and continues after a list, segments in many languages fail to align properly,
and misaligned segment pairs that are of no use for future translations are stored in the TM. Misaligned segment
pairs in the TM can lead to translation errors when reused without checking.

EXAMPLE
Source language content in text editor (–):
When mounting the holding plate
— bolts,
— fasteners, and
— glue
must be used.
Source language content in TM system:
Segment 1: When mounting the holding plate
Segment 2: bolts,
Segment 3: fasteners, and
Segment 4: glue
Segment 5: must be used.
Target language content in TM system [fr]:
Segment 1: Lors du montage de la plaque de maintien,
Segment 2: des boulons,
Segment 3: des fixations, et
Segment 4: de la colle
Segment 5: doivent être utilisés.
Explanation:
In some languages, the target language content can be structured in the same way. However, in this French
example, the verb “doivent” must be plural, because the list contains several items. The segment pair stored
in the TM is “must be used – doivent être utilisés”. “must” can be used with singular and plural parts of
speech, but “doivent” can only be used in a plural context. Additionally, “utilisés” will likely have to be
inflected depending on the parts of speech to which it refers.
For more information and another example on lists inside sentences, see 4.4.2.6.
4.2.2.5 Tabs
Recommendation: Tabs should be avoided within sentences or lines of text when inserted for layout
purposes. Instead, built-in text editor formatting features (columns, tables, indents, numbering, bullets, etc.)
should be used.
Tabs inserted manually for layout purposes (e.g. to change where the line breaks automatically) do not
influence segmentation, but will be imported into the TM system.
Also, source text and target text can vary in length. The tabs can therefore appear in a completely different
position within the target text and fail to fulfil their intended purpose.

EXAMPLE
Source language content in text editor (–):
When changing tyres, make sure that your vehicle is on a level, stable surface,  →  →
use the jack correctly and tighten the wheel nuts securely.
Source language content in TM system:
When changing tyres, make sure that your vehicle is on a level, stable surface,  →  →  use the
jack correctly and tighten the wheel nuts securely.
Target language content in TM system [de]:
Achten Sie beim Reifenwechsel darauf, dass Ihr Fahrzeug auf einer ebenen, stabilen Fläche steht,
→  →  verwenden Sie den Wagenheber richtig und ziehen Sie die Radmuttern fest an.
Target language content in text editor [de] (–):
Achten Sie beim Reifenwechsel darauf, dass Ihr Fahrzeug auf einer ebenen, stabilen Fläche
steht,  →  →  verwenden Sie den Wagenheber richtig und ziehen Sie die Radmuttern fest an.
Explanation:
The tabs in the source language content in the text editor help to increase readability by moving the next
task to a new line. However, they will cause unwanted gaps in the target language content in the text editor
due to the different text lengths of German and English content.
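Tabs inside running text can be located automatically before the text is sent for translation. The following sketch reports every line that still contains a tab after leading and trailing whitespace has been ignored. The function name `lines_with_inline_tabs` is hypothetical, and the sketch works on plain text only, so it is no substitute for the formatting display of a text editor.

```python
def lines_with_inline_tabs(text: str) -> list[int]:
    """Return the 1-based numbers of lines that contain a tab inside the text."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Tabs at the very start or end of a line are ignored by this sketch.
        if "\t" in line.strip():
            flagged.append(number)
    return flagged


sample = (
    "When changing tyres, make sure that your vehicle is on a level, stable surface,\t\t"
    "use the jack correctly and tighten the wheel nuts securely.\n"
    "\tThis line is only indented with a tab.\n"
    "This line contains no tab."
)
print(lines_with_inline_tabs(sample))  # [1]
```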
4.2.2.6 Spaces
Recommendations:
— Spaces should not be inserted for layout purposes. Instead, built-in text editor formatting features
(columns, tables, indents, numbering, bullets, etc.) should be used.
— Non-breaking spaces should be used instead of normal spaces in order to prevent unwanted line breaks
after a space.
— Additional spaces should be avoided following full stops (within acronyms and initialisms or dates).
Additional spaces manually inserted for layout purposes (e.g. to change where the line breaks automatically)
do not influence segmentation, but will be imported into the TM system as regular characters.
Also, source text and target text can vary in length, or the order of components in the content can change
due to different text conventions in the target text. Therefore, the spaces can appear in a completely different
position within the target text and fail to fulfil their intended purpose.

EXAMPLE 1
Source language content in text editor (–):
When changing tyres, make sure that your vehicle is on a level, ························
stable surface, use the jack correctly and tighten the wheel nuts securely.
Source language content in TM system:
When changing tyres, make sure that your vehicle is on a level, ························stable surface, use
the jack correctly and tighten the wheel nuts securely.
Target language content in TM system [fr]:
Lorsque vous changez les pneus, assurez-vous que votre véhicule se trouve sur une surface plane
························et stable, utilisez correctement le cric et serrez bien les écrous de roue.
Target language content in text editor (–):
Lorsque vous changez les pneus, assurez-vous que votre véhicule se trouve sur une surface
plane ························et stable, utilisez correctement le cric et serrez bien les écrous de roue.
Explanation:
The spaces in the source language content in the text editor were perhaps intended to increase the readability
by moving the next task to a new line. However, they will cause unwanted gaps in the target language content in
the text editor due to the different text lengths for French and English.
For certain lexical units separated by spaces, automatic line breaks should be avoided. Disabling automatic
line breaks can facilitate reading fluency and prevent related units from being placed on different lines.
Where a line break after a space is unwanted, a so-called “non-breaking space” or “hard space” can be inserted
instead of a normal space. Non-breaking spaces should be used when the text fragments separated by a space
belong together and should not be separated across two lines for better readability or understanding, e.g.
between the title and personal name (Ms Meyer, Dr Miller), between numeric values and units (44 mm, 5 %,
66 min), with special characters (3 + 5, ≤ 3), with specifications of time and dates (14th century,
30 June 2020), with document names or product names (ISO 17100) and in other cases (Version 3,
Figure 23, p. 18).
Non-breaking spaces are usually displayed as such in the TM system and can be used also in the target
language content, if applicable.
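In plain text, the non-breaking space corresponds to the Unicode character U+00A0 (NO-BREAK SPACE). The sketch below covers only two of the cases listed above, numeric values followed by a unit and a title followed by a name. The unit list, the title list and the function name `protect_spaces` are assumptions chosen for illustration and would need to be extended for real use.

```python
import re

NBSP = "\u00a0"

# A space between a number and one of a few sample units (mm, min, %).
UNIT_PATTERN = re.compile(r"(?<=\d) (?=(?:mm|min|%)(?!\w))")
# A space between a sample title (Ms, Dr) and a capitalized name.
TITLE_PATTERN = re.compile(r"\b(Ms|Dr) (?=[A-Z])")


def protect_spaces(text: str) -> str:
    """Replace normal spaces with non-breaking spaces in the two sample cases."""
    text = UNIT_PATTERN.sub(NBSP, text)
    text = TITLE_PATTERN.sub(lambda match: match.group(1) + NBSP, text)
    return text


sample = "Dr Miller measured 44 mm in 66 min."
protected = protect_spaces(sample)
print(protected.count(NBSP))  # 3
```

All other spaces stay unchanged, so automatic line breaks remain possible everywhere except within the protected units.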

EXAMPLE 2
Source language content in text editor (–):
The professor has provided the students with detailed information on how King Louis
XIV died.
Source language content in text editor (+):
The professor has provided the students with detailed information on how King Louis°XIV
died.
Source language content in TM system:
The professor has provided the students with detailed information on how King Louis XIV died.
The professor has provided the students with detailed info
...
