Text-to-speech for television - General requirements

IEC 62731:2018(E) specifies the text-to-speech functionality for a (broadcast) receiver with a text-to-speech system. Such a system may be one device, i.e. a receiver with an integrated text-to-speech generator, or may be two devices, i.e. a receiver interfacing with an external text-to-speech device. This document applies only to completely functional stationary (or semi-stationary) digital TV receivers such as set top boxes, integrated digital TVs, recorders and other products whose primary function is to receive TV content. Where this document refers to TV, this will be shorthand for all such receivers. This document does not apply to products that are capable of receiving TV as a secondary function (e.g. PCs or game consoles with digital television receivers). It also does not apply to sub-assemblies (e.g. PC tuner cards).

This edition includes the following significant technical changes with respect to the previous edition:
a) in 6.2, the levels of announcement quality were revised as well as considerations for ways in which device users can provide service providers with feedback on incorrectly announced terms;
b) in 6.3, the following TV functionality was added: the TV can receive updated words, associated conversions and updated conversion rules for the TTS engine via a network connection.

Text-zu-Sprache für Fernsehen - Allgemeine Anforderungen

Synthèse vocale pour télévision - Exigences générales

Pretvorba besedila v govor (govorna sinteza) za televizijo - Splošne zahteve


General Information

Status: Published
Publication Date: 22-Mar-2018
Current Stage: 6060 - Document made available
Due Date: 12-Feb-2018


Standards Content (sample)

SLOVENIAN STANDARD
SIST EN IEC 62731:2018
01 November 2018
Supersedes: SIST EN 62731:2013
Pretvorba besedila v govor (govorna sinteza) za televizijo - Splošne zahteve
Text to speech for television - General requirements
Text-zu-Sprache für Fernsehen - Allgemeine Anforderungen
Synthèse vocale pour télévision - Exigences générales
This Slovenian standard is identical to: EN IEC 62731:2018
ICS:
33.160.25 Television receivers
SIST EN IEC 62731:2018 en,fr,de

2003-01. Slovenian Institute for Standardization (Slovenski inštitut za standardizacijo). Reproduction of this standard, in whole or in part, is not permitted.

EUROPEAN STANDARD EN IEC 62731
NORME EUROPÉENNE
EUROPÄISCHE NORM
March 2018
ICS 33.160.25; 33.160.99 Supersedes EN 62731:2013
English Version
Text-to-speech for television - General requirements
(IEC 62731:2018)

Synthèse vocale pour télévision - Exigences générales
(IEC 62731:2018)

Text-zu-Sprache für Fernsehen - Allgemeine Anforderungen
(IEC 62731:2018)

This European Standard was approved by CENELEC on 2018-02-14. CENELEC members are bound to comply with the CEN/CENELEC Internal Regulations which stipulate the conditions for giving this European Standard the status of a national standard without any alteration.

Up-to-date lists and bibliographical references concerning such national standards may be obtained on application to the CEN-CENELEC Management Centre or to any CENELEC member.

This European Standard exists in three official versions (English, French, German). A version in any other language made by translation under the responsibility of a CENELEC member into its own language and notified to the CEN-CENELEC Management Centre has the same status as the official versions.

CENELEC members are the national electrotechnical committees of Austria, Belgium, Bulgaria, Croatia, Cyprus, the Czech Republic, Denmark, Estonia, Finland, Former Yugoslav Republic of Macedonia, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey and the United Kingdom.
European Committee for Electrotechnical Standardization
Comité Européen de Normalisation Electrotechnique
Europäisches Komitee für Elektrotechnische Normung
CEN-CENELEC Management Centre: Rue de la Science 23, B-1040 Brussels

© 2018 CENELEC All rights of exploitation in any form and by any means reserved worldwide for CENELEC Members.

Ref. No. EN IEC 62731:2018 E
EN IEC 62731:2018 (E)
European foreword

The text of document 100/2989/FDIS, future edition 2 of IEC 62731, prepared by IEC/TC 100 "Audio, video and multimedia systems and equipment" was submitted to the IEC-CENELEC parallel vote and approved by CENELEC as EN IEC 62731:2018.

The following dates are fixed:
• latest date by which the document has to be implemented at national level by publication of an identical national standard or by endorsement (dop): 2018-11-14
• latest date by which the national standards conflicting with the document have to be withdrawn (dow): 2021-02-14

This document supersedes EN 62731:2013.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. CENELEC shall not be held responsible for identifying any or all such patent rights.

Endorsement notice

The text of the International Standard IEC 62731:2018 was approved by CENELEC as a European Standard without any modification.

In the official version, for Bibliography, the following note has to be added for the standard indicated:

ISO/IEC 13818-1 NOTE Harmonized as EN ISO/IEC 13818-1.
IEC 62731
Edition 2.0 2018-01
INTERNATIONAL STANDARD
Text-to-speech for television – General requirements
INTERNATIONAL ELECTROTECHNICAL COMMISSION
ICS 33.160.25; 33.160.99
ISBN 978-2-8322-5125-6

Warning! Make sure that you obtained this publication from an authorized distributor.

® Registered trademark of the International Electrotechnical Commission
IEC 62731:2018 © IEC 2018
CONTENTS

FOREWORD
1 Scope
2 Normative references
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
3.2 Abbreviated terms
4 Guiding principles and conventions
5 User requirements of visually impaired people
5.1 Users' needs
5.2 Navigating channels
5.3 Navigating TV inputs
5.4 Additional data services
5.5 Operating the TV
5.6 TV use
6 Functional requirements
6.1 Functionality for TV, TTS device combination
6.2 Functionality: TTS device/engine
6.3 Functionality: TV
6.4 Setting up: TV, TTS device combination
7 TV events and TTS data
7.1 TV context and events
7.2 TTS data per event
7.2.1 Details
7.2.2 Channel change
7.2.3 Additional information
7.2.4 Navigation and selection
7.2.5 Context switch
7.2.6 Pop-up message
8 TTS profiles
8.1 Basic profile
8.2 Main profile
8.3 Enhanced profile
8.4 Summary
Bibliography

Figure 1 – TV – TTS device system diagram
Figure 2 – Context event state diagram

Table 1 – Overview of profiles

INTERNATIONAL ELECTROTECHNICAL COMMISSION
____________
TEXT-TO-SPEECH FOR TELEVISION –
GENERAL REQUIREMENTS
FOREWORD

1) The International Electrotechnical Commission (IEC) is a worldwide organization for standardization comprising all national electrotechnical committees (IEC National Committees). The object of IEC is to promote international co-operation on all questions concerning standardization in the electrical and electronic fields. To this end and in addition to other activities, IEC publishes International Standards, Technical Specifications, Technical Reports, Publicly Available Specifications (PAS) and Guides (hereafter referred to as "IEC Publication(s)"). Their preparation is entrusted to technical committees; any IEC National Committee interested in the subject dealt with may participate in this preparatory work. International, governmental and non-governmental organizations liaising with the IEC also participate in this preparation. IEC collaborates closely with the International Organization for Standardization (ISO) in accordance with conditions determined by agreement between the two organizations.

2) The formal decisions or agreements of IEC on technical matters express, as nearly as possible, an international consensus of opinion on the relevant subjects since each technical committee has representation from all interested IEC National Committees.

3) IEC Publications have the form of recommendations for international use and are accepted by IEC National Committees in that sense. While all reasonable efforts are made to ensure that the technical content of IEC Publications is accurate, IEC cannot be held responsible for the way in which they are used or for any misinterpretation by any end user.

4) In order to promote international uniformity, IEC National Committees undertake to apply IEC Publications transparently to the maximum extent possible in their national and regional publications. Any divergence between any IEC Publication and the corresponding national or regional publication shall be clearly indicated in the latter.

5) IEC itself does not provide any attestation of conformity. Independent certification bodies provide conformity assessment services and, in some areas, access to IEC marks of conformity. IEC is not responsible for any services carried out by independent certification bodies.

6) All users should ensure that they have the latest edition of this publication.

7) No liability shall attach to IEC or its directors, employees, servants or agents including individual experts and members of its technical committees and IEC National Committees for any personal injury, property damage or other damage of any nature whatsoever, whether direct or indirect, or for costs (including legal fees) and expenses arising out of the publication, use of, or reliance upon, this IEC Publication or any other IEC Publications.

8) Attention is drawn to the Normative references cited in this publication. Use of the referenced publications is indispensable for the correct application of this publication.

9) Attention is drawn to the possibility that some of the elements of this IEC Publication may be the subject of patent rights. IEC shall not be held responsible for identifying any or all such patent rights.

International Standard IEC 62731 has been prepared by IEC technical committee 100: Audio, video and multimedia systems and equipment.

This second edition cancels and replaces the first edition published in 2013. This edition constitutes a technical revision.

This edition includes the following significant technical changes with respect to the previous edition:

a) in 6.2, the levels of announcement quality were revised as well as considerations for ways in which device users can provide service providers with feedback on incorrectly announced terms;

b) in 6.3, the following TV functionality was added: the TV can receive updated words, associated conversions and updated conversion rules for the TTS engine via a network connection.
The text of this standard is based on the following documents:

FDIS             Report on voting
100/2989/FDIS    100/3013/RVD

Full information on the voting for the approval of this International Standard can be found in the report on voting indicated in the above table.

This document has been drafted in accordance with the ISO/IEC Directives, Part 2.

The committee has decided that the contents of this document will remain unchanged until the stability date indicated on the IEC website under "http://webstore.iec.ch" in the data related to the specific document. At this date, the document will be
• reconfirmed,
• withdrawn,
• replaced by a revised edition, or
• amended.
A bilingual version of this publication may be issued at a later date.
1 Scope

This International Standard specifies the text-to-speech functionality for a (broadcast) receiver with a text-to-speech system. Such a system may be one device, i.e. a receiver with an integrated text-to-speech generator, or may be two devices, i.e. a receiver interfacing with an external text-to-speech device. This document applies only to completely functional stationary (or semi-stationary) digital TV receivers such as set top boxes, integrated digital TVs, recorders and other products whose primary function is to receive TV content. In this document, "TV" is shorthand for all such receivers.

This document does not apply to products that are capable of receiving TV as a secondary function (e.g. PCs or game consoles with digital television receivers). It also does not apply to sub-assemblies (e.g. PC tuner cards).
2 Normative references
There are no normative references in this document.
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:
• IEC Electropedia: available at http://www.electropedia.org/
• ISO Online browsing platform: available at http://www.iso.org/obp
3.1.1
context
one specific function of a TV
EXAMPLE: Watching TV, EPG.
3.1.2
DTV broadcast event classification
general category of programme/event content, or its classification
EXAMPLES: Movie (drama), news/current affairs, talk show, sports (football).
3.1.3
EPG filter
filter that organises or reduces the list of displayed EPG items according to certain criteria
EXAMPLES: Criteria are, for instance, to show only:
• programmes with a certain content type;
• favourites;
• programmes that are audio described;
• programmes for a given time period (for instance "today", "tomorrow", "next 7 days").
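
As a non-normative illustration of the EPG filter concept, the sketch below models each example criterion as a predicate over EPG items and combines predicates to reduce a list. The `EpgItem` fields and all function names are assumptions made for this example; the document itself does not define a data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class EpgItem:
    """Hypothetical EPG entry; field names are illustrative only."""
    title: str
    content_type: str
    start: datetime
    audio_described: bool = False
    favourite: bool = False


EpgFilter = Callable[[EpgItem], bool]


def by_content_type(content_type: str) -> EpgFilter:
    """Keep only programmes with a certain content type."""
    return lambda item: item.content_type == content_type


def favourites_only(item: EpgItem) -> bool:
    """Keep only favourites."""
    return item.favourite


def audio_described_only(item: EpgItem) -> bool:
    """Keep only programmes that are audio described."""
    return item.audio_described


def within(start: datetime, duration: timedelta) -> EpgFilter:
    """Keep only programmes starting in [start, start + duration)."""
    end = start + duration
    return lambda item: start <= item.start < end


def apply_filters(items: Iterable[EpgItem], *filters: EpgFilter) -> List[EpgItem]:
    """Return the items that satisfy every given filter, in original order."""
    return [item for item in items if all(f(item) for f in filters)]
```

Under these assumptions, a "today" view of news programmes would be `apply_filters(items, by_content_type("news"), within(midnight, timedelta(days=1)))`, where `midnight` is the start of the current day.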

3.1.4
event
trigger to start an action
3.1.5
list
collection of items
3.1.6
menu
subsequent order of items
3.1.7
receiver
device capable of receiving or handling digital television signals
3.1.8
service
sequence of programmes under the control of a broadcaster which can be broadcast as part of a schedule
3.1.9
subtitle
textual representation of the dialogue (and frequently additional auditory information), typically shown at the bottom of the screen
Note 1 to entry: Subtitles can be a textual rendering in the same language as the spoken dialogue, or can provide a written translation in a different language.
Note 2 to entry: In some parts of the world, subtitles are called "(closed) captions", and subtitling is referred to as "(closed) captioning".
Note 3 to entry: This document uses the term subtitles throughout.
3.1.10
TTS audio
audio generated by the TTS engine based on the TTS data
Note 1 to entry: If the TV uses an external TTS converter, TTS audio is interpreted as TTS data.
3.1.11
priority audio information
audio information for immediate output typically related to emergency information

3.1.12
TTS data
(text) data converted into TTS audio information by the text-to-speech engine
3.2 Abbreviated terms
DTV digital television
EPG electronic programme guide
STB set top box
TTS text-to-speech
TV television
UI user interface
4 Guiding principles and conventions

This document describes the required basic behaviour for a TV text-to-speech combination in a basic profile, but also provides for enhanced profiles. It also gives a short introduction to the basic problems that visually impaired people experience when using and watching TV.

Providing text-to-speech functionality for a broadcast receiver, for example a TV or an STB, can be of great help to (visually) disabled people. Such speech functionality may be integrated in the receiver or may be external to the receiver in a separate device.

As a general guiding principle, implementers building a TTS interface in the context of this document should aspire to achieve functional equivalence of the user experience. This means that a person operating the device using the speech interface should have access to similar information and be able to accomplish tasks similar to those they would with a graphical UI.

The main features of this document are:
• a basic functional description for a TV-TTS device combination or TV with integrated TTS;
• profiles for different levels of TV-TTS functionality;
• a focus on the digital TV application.

In this document, mandatory requirements are specified; optional and informative features are also included.

A claim of conformity with this document requires conformity with all mandatory requirements.

A TV-TTS device combination or a TV with integrated TTS may provide options for a user to enable or disable each feature of a product. While options for user configurations may be provided, the product shall meet the mandatory requirements.
5 User requirements of visually impaired people
5.1 Users' needs

This subclause explains the needs of visually impaired people as the primary target users for a TV with TTS. Unless these needs are met, the system is not accessible to this user group.

Visually impaired people experience access barriers in the course of the following activities when watching TV:
a) following TV programming, e.g. the TV series;
b) using a remote control;
c) reading subtitles;
d) navigating channels;
e) navigating TV inputs;
f) using additional data (text) services provided by the broadcaster, e.g. an EPG;
g) daily operation of the TV and initial setup of the TV for use.

Items a), b) and c) are outside the scope of this document. Item c) further relates to the fact that, in some countries, foreign language programmes are translated via subtitles. For users who cannot see the subtitles, supplementary audio services are sometimes used to deliver an audio version of the subtitles. This document elaborates on the remaining four items, i.e. d), e), f) and g), in 5.2 to 5.6.

NOTE 1 For DVB systems, item a) is already solved by audio description. Also, the use case of providing supplementary audio services to deliver an audio version of the subtitles is covered in the DVB-SI specification ETSI EN 300 468.

NOTE 2 For ATSC systems, the audio system includes a visually impaired (VI) associate service, which allows a complete programme mix containing music, effects, dialogue, and additionally a narrative description of the picture content, see ATSC A/53 part 5 and part 6.
5.2 Navigating channels

The problem is that a user does not know which channel the TV is displaying, i.e. the user gets "lost during navigation". The TV displays navigation data on the screen but the user is unable to see it. Such data are, for example:
• channel number,
• service name,
• (DTV broadcast) event name.
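
A minimal sketch of how the navigation data listed above might be combined into one text string for the TTS engine is given below. The function name, the wording of the string and the ordering are assumptions made for illustration; they are not taken from this document.

```python
from typing import Optional


def channel_change_text(channel_number: int,
                        service_name: str,
                        event_name: Optional[str] = None) -> str:
    """Build a hypothetical announcement for a channel change.

    The event name is optional because it may be absent from the broadcast.
    """
    parts = [f"Channel {channel_number}", service_name]
    if event_name:
        parts.append(event_name)
    return ", ".join(parts)
```

For example, `channel_change_text(5, "News 24", "Evening News")` yields "Channel 5, News 24, Evening News", so that a user who cannot see the on-screen banner still hears which service has been selected.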
5.3 Navigating TV inputs

The problem is that a user is unable to select the required input to the TV, e.g. the user wishes to select DTV or a specific external input linked to a recording or other device. The choice is shown on the screen but the user is unable to see it.
5.4 Additional data services

With digital TV, a broadcaster may transmit additional data (text) services to augment TV programming, provide additional information on programming, or provide news. Such additional data are:
• information about whether audio description or subtitling is available,
• (next) (DTV broadcast) event name,
• (DVB-) event information (enhanced description of the (DTV broadcast) event),
• EPG data.

The items above are listed in order of importance with the most important item appearing first. It is noted that this data provides additional convenience in using the TV, but is non-essential for the primary functions of watching TV and selecting channels.
5.5 Operating the TV

Besides navigation, users also need to adjust settings. This can be done through buttons on the remote control (out of scope of this specification), but also via on-screen menus. For visually impaired people, on-screen menus are typically of little use.

A distinction exists between initial setup and daily operation of the TV. Initial setup is typically a one-time operation during the lifetime of a TV. Daily operation is more frequent and more important. Consequently, menu items can be divided into those for daily operation, those addressing specific accessibility functions, and TV setup menu items. However, the most frequently used keys are "volume", "channel up/down", and the number keys.
5.6 TV use

Characterizing how a TV is used helps in determining implementation profiles. Navigating channels, for example, is done most often when watching TV, as are commands such as volume up and down. This may be supported by additional data services, but does not affect the primary functions of the TV. Changing the TV's system settings is not done very often, except perhaps for changing sound or video settings or switching audio description on and off. Such settings may have an easy access mode through a special menu. TV installation is typically performed only once during the lifetime of the TV. Often, visually impaired people can benefit from specialized support for installing the TV, i.e. it is part of the service when buying a new TV. Understanding this life cycle and usage cycle of a TV helps with defining the most effective and efficient solutions and is reflected in the profiles. The following paragraphs refer to "basic", "main" and "enhanced" profiles, which are further defined and detailed in Clause 8.

Key operations for a minimum TTS implementation on a receiver for TV use are as set out in the basic profile defined in this document. This basic profile shall include:
a) channel number, name and event information – key for a user to identify which service has been selected;
b) availability of audio description – key for a user to know about the availability of this service feature;
c) availability of subtitles – key for a user to know about the availability of this service feature;
d) basic EPG – allows the user to navigate through the EPG, if such data is present in the broadcast, to identify which future events are available to them;
e) context changes – key for a user to understand if the TV went to another state or when a pop-up message appears.

Additional operations that shall be included in the main profile, in addition to all those in the basic profile, are:
f) receiver menu functions – allow the user to navigate receiver operations and functions.

Additional operations that shall be included in the enhanced profile, in addition to all those from the basic and main profiles, are:
g) event information – provides the event synopsis;
h) additional EPG data – allows the user to get more information on the service or event;
i) operations of a recording device – allows the user to record future events, possibly selected via the EPG, and to play/pause a recorded event.
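
The cumulative structure of the three profiles (each profile includes everything from the profiles below it) can be sketched as follows. This is an illustration only: the enum, the dictionary and the shortened operation labels are assumptions chosen for the example, and Clause 8 remains the authoritative definition of the profiles.

```python
from enum import IntEnum
from typing import Dict, List


class Profile(IntEnum):
    BASIC = 1
    MAIN = 2
    ENHANCED = 3


# Operations that each profile adds on top of the lower profiles,
# using shortened labels for items a) to i) above.
ADDED_OPERATIONS: Dict[Profile, List[str]] = {
    Profile.BASIC: [
        "channel number, name and event information",
        "availability of audio description",
        "availability of subtitles",
        "basic EPG",
        "context changes",
    ],
    Profile.MAIN: [
        "receiver menu functions",
    ],
    Profile.ENHANCED: [
        "event information (synopsis)",
        "additional EPG data",
        "operations of a recording device",
    ],
}


def required_operations(profile: Profile) -> List[str]:
    """Return every operation a profile must support, lower profiles first."""
    operations: List[str] = []
    for level in Profile:  # iterates BASIC, MAIN, ENHANCED in that order
        if level <= profile:
            operations.extend(ADDED_OPERATIONS[level])
    return operations
```

Modelling the profiles as ordered levels makes the "in addition to all those in the basic profile" wording explicit: `required_operations(Profile.MAIN)` returns the five basic operations followed by the receiver menu functions.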
6 Functional requirements
6.1 Functionality for TV, TTS device combination

The TV-TTS device system diagram is illustrated in Figure 1. As shown in the figure, the TTS device is a separate function from the TV, which can be implemented on a device connected with a TV-TTS device interface, or may also be integrated in the TV.
[Diagram not reproduced; surviving label: "Text to speech device"]
Figure 1 – TV – TTS device system diagram
The functionality requirements for a TV-TTS combination are:
• the TV, in principle, only outputs text strings towards the TTS device;
• the delay between an event and the resulting TTS audio related to that event shall be such that they are perceived as belonging together;
• the user should be able to move to the next operation even in the middle of currently playing TTS audio;
• the user should be able to stop currently playing TTS audio;
• the user shall be able to repeat the current or previous TTS audio;
• the user shall be able to mute the TTS audio;
• the user shall be able to switch on/off the TTS function;
• the language of the TTS audio shall be the same as set for the TV's UI, except when signalled differently; the TTS device/engine may choose to pronounce the text or to indicate failure if it does not support the signalled language;
• TTS audio need not literally represent the related visual information on the screen, as long as the meaning of the visual information stays intact;
• priority audio information shall overrule currently playing TTS audio information.

NOTE For example, EAS (Emergency Alert System) [USA] or EWS (Emergency Warning System) [JPN] are higher priority than TTS audio information.
6.2 Functionality: TTS device/engine
The functionality requirements for the TTS device/engine are:

• an external TTS device should be designed to be fully accessible to visually impaired

users without being dependent on the TV;

• the volume level of the TTS device/engine shall be changeable by the user: the TTS

device shall announce the new volume level; it should be possible to do this independently

of the TV volume;

• the user should be able to adjust speech characteristics like speed, pitch, voice type,

when applicable;

• the TTS engine should announce abbreviations in a manner suited to the context, for

example: "IEC" should be announced "I E C"; and "IEEE" should be announced "I triple E".

The TTS engine may also pronounce common acronyms in full, e.g. "sub" could be spoken

as "subtitles" where appropriate;

• the TTS engine should announce numbers in a manner suited to the context, e.g. as a

natural number, digit-by-digit, etc.;

• the TTS engine should announce proper nouns (e.g. names, place names) correctly;

• levels of announcement quality are at the discretion of the implementers; the following are

examples of the implementation in order to achieve good levels of quality:

– consider ways to update the vocabulary database and conversion rules of the TTS

engine;

– consider provision of additional information (= metadata) to help the TTS engine for

pronunciations or readings;

– consider ways in which device users can provide service providers with feedback on

incorrectly announced terms.
6.3 Functionality: TV

The TV determines the user interface, i.e. what is displayed on the screen, and how the TV

interacts with the user. The TV therefore also determines which text is sent to the TTS engine.

The TV can receive updated words, associated conversions and updated conversion rules for

the TTS engine via a network connection.

The user should be able to control, via the TV TTS, settings such as volume, speed, pitch,

voice type.
---------------------- Page: 14 ----------------------
SIST EN IEC 62731:2018
IEC 62731:20
...

SLOVENIAN STANDARD
SIST EN 62731:2018
1 November 2018
Replaces: SIST EN 62731:2013
Pretvorba besedila v govor (govorna sinteza) za televizijo - Splošne zahteve
Text to speech for television - General requirements
Text-zu-Sprache für Fernsehen - Allgemeine Anforderungen
Synthèse vocale pour télévision - Exigences générales
This Slovenian standard is identical to: EN IEC 62731:2018
ICS:
33.160.25 Television receivers
SIST EN 62731:2018 en,fr,de

2003-01. Slovenian Institute for Standardization. Reproduction of all or part of this standard is not permitted.

EUROPEAN STANDARD EN IEC 62731
NORME EUROPÉENNE
EUROPÄISCHE NORM
March 2018
ICS 33.160.25; 33.160.99
Supersedes EN 62731:2013
English Version
Text-to-speech for television - General requirements
(IEC 62731:2018)

Synthèse vocale pour télévision - Exigences générales
(IEC 62731:2018)
Text-zu-Sprache für Fernsehen - Allgemeine Anforderungen
(IEC 62731:2018)

This European Standard was approved by CENELEC on 2018-02-14. CENELEC members are bound to comply with the CEN/CENELEC

Internal Regulations which stipulate the conditions for giving this European Standard the status of a national standard without any alteration.

Up-to-date lists and bibliographical references concerning such national standards may be obtained on application to the CEN-CENELEC

Management Centre or to any CENELEC member.

This European Standard exists in three official versions (English, French, German). A version in any other language made by translation

under the responsibility of a CENELEC member into its own language and notified to the CEN-CENELEC Management Centre has the

same status as the official versions.

CENELEC members are the national electrotechnical committees of Austria, Belgium, Bulgaria, Croatia, Cyprus, the Czech Republic,

Denmark, Estonia, Finland, Former Yugoslav Republic of Macedonia, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia,

Lithuania, Luxembourg, Malta, the Netherlands, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden,

Switzerland, Turkey and the United Kingdom.
European Committee for Electrotechnical Standardization
Comité Européen de Normalisation Electrotechnique
Europäisches Komitee für Elektrotechnische Normung
CEN-CENELEC Management Centre: Rue de la Science 23, B-1040 Brussels

© 2018 CENELEC All rights of exploitation in any form and by any means reserved worldwide for CENELEC Members.

Ref. No. EN IEC 62731:2018 E
European foreword

The text of document 100/2989/FDIS, future edition 2 of IEC 62731, prepared by IEC/TC 100 "Audio,

video and multimedia systems and equipment" was submitted to the IEC-CENELEC parallel vote and

approved by CENELEC as EN IEC 62731:2018.
The following dates are fixed:
• latest date by which the document has to be implemented at national level by
publication of an identical national standard or by endorsement (dop): 2018-11-14
• latest date by which the national standards conflicting with the document have
to be withdrawn (dow): 2021-02-14
This document supersedes EN 62731:2013.

Attention is drawn to the possibility that some of the elements of this document may be the subject of

patent rights. CENELEC shall not be held responsible for identifying any or all such patent rights.

Endorsement notice

The text of the International Standard IEC 62731:2018 was approved by CENELEC as a European

Standard without any modification.

In the official version, for Bibliography, the following note has to be added for the standard indicated:

ISO/IEC 13818-1 NOTE Harmonized as EN ISO/IEC 13818-1.
IEC 62731
Edition 2.0 2018-01
INTERNATIONAL STANDARD
Text-to-speech for television – General requirements
INTERNATIONAL ELECTROTECHNICAL COMMISSION
ICS 33.160.25; 33.160.99
ISBN 978-2-8322-5125-6

Warning! Make sure that you obtained this publication from an authorized distributor.

® Registered trademark of the International Electrotechnical Commission
CONTENTS

FOREWORD
1 Scope
2 Normative references
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
3.2 Abbreviated terms
4 Guiding principles and conventions
5 User requirements of visually impaired people
5.1 Users' needs
5.2 Navigating channels
5.3 Navigating TV inputs
5.4 Additional data services
5.5 Operating the TV
5.6 TV use
6 Functional requirements
6.1 Functionality for TV, TTS device combination
6.2 Functionality: TTS device/engine
6.3 Functionality: TV
6.4 Setting up: TV, TTS device combination
7 TV events and TTS data
7.1 TV context and events
7.2 TTS data per event
7.2.1 Details
7.2.2 Channel change
7.2.3 Additional information
7.2.4 Navigation and selection
7.2.5 Context switch
7.2.6 Pop-up message
8 TTS profiles
8.1 Basic profile
8.2 Main profile
8.3 Enhanced profile
8.4 Summary
Bibliography

Figure 1 – TV – TTS device system diagram
Figure 2 – Context event state diagram

Table 1 – Overview of profiles

INTERNATIONAL ELECTROTECHNICAL COMMISSION
____________
TEXT-TO-SPEECH FOR TELEVISION –
GENERAL REQUIREMENTS
FOREWORD

1) The International Electrotechnical Commission (IEC) is a worldwide organization for standardization comprising

all national electrotechnical committees (IEC National Committees). The object of IEC is to promote

international co-operation on all questions concerning standardization in the electrical and electronic fields. To

this end and in addition to other activities, IEC publishes International Standards, Technical Specifications,

Technical Reports, Publicly Available Specifications (PAS) and Guides (hereafter referred to as "IEC

Publication(s)"). Their preparation is entrusted to technical committees; any IEC National Committee interested

in the subject dealt with may participate in this preparatory work. International, governmental and non-

governmental organizations liaising with the IEC also participate in this preparation. IEC collaborates closely

with the International Organization for Standardization (ISO) in accordance with conditions determined by

agreement between the two organizations.

2) The formal decisions or agreements of IEC on technical matters express, as nearly as possible, an international

consensus of opinion on the relevant subjects since each technical committee has representation from all

interested IEC National Committees.

3) IEC Publications have the form of recommendations for international use and are accepted by IEC National

Committees in that sense. While all reasonable efforts are made to ensure that the technical content of IEC

Publications is accurate, IEC cannot be held responsible for the way in which they are used or for any

misinterpretation by any end user.

4) In order to promote international uniformity, IEC National Committees undertake to apply IEC Publications

transparently to the maximum extent possible in their national and regional publications. Any divergence

between any IEC Publication and the corresponding national or regional publication shall be clearly indicated in

the latter.

5) IEC itself does not provide any attestation of conformity. Independent certification bodies provide conformity

assessment services and, in some areas, access to IEC marks of conformity. IEC is not responsible for any

services carried out by independent certification bodies.

6) All users should ensure that they have the latest edition of this publication.

7) No liability shall attach to IEC or its directors, employees, servants or agents including individual experts and

members of its technical committees and IEC National Committees for any personal injury, property damage or

other damage of any nature whatsoever, whether direct or indirect, or for costs (including legal fees) and

expenses arising out of the publication, use of, or reliance upon, this IEC Publication or any other IEC

Publications.

8) Attention is drawn to the Normative references cited in this publication. Use of the referenced publications is

indispensable for the correct application of this publication.

9) Attention is drawn to the possibility that some of the elements of this IEC Publication may be the subject of

patent rights. IEC shall not be held responsible for identifying any or all such patent rights.

International Standard IEC 62731 has been prepared by IEC technical committee 100: Audio,

video and multimedia systems and equipment.

This second edition cancels and replaces the first edition published in 2013. This edition

constitutes a technical revision.

This edition includes the following significant technical changes with respect to the previous

edition:

a) in 6.2, the levels of announcement quality were revised as well as considerations for ways

in which device users can provide service providers with feedback on incorrectly
announced terms.

b) in 6.3, the following TV functionality was added: the TV can receive updated words,

associated conversions and updated conversion rules for the TTS engine via a network

connection.
The text of this standard is based on the following documents:
FDIS             Report on voting
100/2989/FDIS    100/3013/RVD

Full information on the voting for the approval of this International Standard can be found in

the report on voting indicated in the above table.

This document has been drafted in accordance with the ISO/IEC Directives, Part 2.

The committee has decided that the contents of this document will remain unchanged until the

stability date indicated on the IEC website under "http://webstore.iec.ch" in the data related to

the specific document. At this date, the document will be
• reconfirmed,
• withdrawn,
• replaced by a revised edition, or
• amended.
A bilingual version of this publication may be issued at a later date.
TEXT-TO-SPEECH FOR TELEVISION –
GENERAL REQUIREMENTS
1 Scope

This International Standard specifies the text-to-speech functionality for a (broadcast) receiver

with a text-to-speech system. Such a system may be one device, i.e. a receiver with an

integrated text-to-speech generator, or may be two devices, i.e. a receiver interfacing with an

external text-to-speech device. This document applies only to completely functional stationary

(or semi-stationary) digital TV receivers such as set top boxes, integrated digital TVs,

recorders and other products whose primary function is to receive TV content. Where this

document refers to TV, this will be shorthand for all such receivers.

This document does not apply to products that are capable of receiving TV as a secondary

function (e.g. PCs or game consoles with digital television receivers). It also does not apply to

sub-assemblies (e.g. PC tuner cards).
2 Normative references
There are no normative references in this document.
3 Terms, definitions and abbreviated terms
3.1 Terms and definitions
For the purposes of this document, the following terms and definitions apply.

ISO and IEC maintain terminological databases for use in standardization at the following

addresses:
• IEC Electropedia: available at http://www.electropedia.org/
• ISO Online browsing platform: available at http://www.iso.org/obp
3.1.1
context
one specific function of a TV
EXAMPLE: Watching TV, EPG.
3.1.2
DTV broadcast event classification
general category of programme/event content, or its classification
EXAMPLES: Movie (drama), news/current affairs, talk show, sports (football).
3.1.3
EPG filter

filter that organises or reduces the list of displayed EPG items according to certain criteria

EXAMPLES of criteria are to show only:
• programmes with a certain content type;
• favourites;
• programmes that are audio described;

• programmes for a given time period (for instance "today", "tomorrow", "next 7 days").
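
As an illustration of this definition, an EPG filter can be modelled as a predicate applied to a list of EPG items. The sketch below is not part of the standard: the `EpgItem` fields and the helper names `apply_filters`, `by_content_type` and `audio_described_only` are assumptions chosen for this example only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class EpgItem:
    # Hypothetical fields; the standard does not define an EPG data model.
    title: str
    content_type: str
    start: datetime
    audio_described: bool = False
    favourite: bool = False


# A filter is any function that accepts or rejects a single EPG item.
EpgFilter = Callable[[EpgItem], bool]


def apply_filters(items: Iterable[EpgItem], filters: Iterable[EpgFilter]) -> List[EpgItem]:
    """Keep only the items accepted by every filter, preserving their order."""
    active = list(filters)
    return [item for item in items if all(f(item) for f in active)]


def by_content_type(content_type: str) -> EpgFilter:
    """Build a filter that accepts programmes with the given content type."""
    return lambda item: item.content_type == content_type


def audio_described_only(item: EpgItem) -> bool:
    """Accept only programmes that are audio described."""
    return item.audio_described
```

Combining `by_content_type("sports")` with `audio_described_only` would reduce the list to audio-described sports programmes, which corresponds to applying two of the criteria listed above at the same time.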

3.1.4
event
trigger to start an action
3.1.5
list
collection of items
3.1.6
menu
subsequent order of items
3.1.7
receiver
device capable of receiving or handling digital television signals
3.1.8
service

sequence of programmes under the control of a broadcaster which can be broadcast as part

of a schedule
3.1.9
subtitle

textual representation of the dialogue (and frequently additional auditory information),

typically shown at the bottom of the screen

Note 1 to entry: Subtitles can be a textual rendering in the same language as the spoken dialogue, or can provide

a written translation in a different language.

Note 2 to entry: In some parts of the world, subtitles are called "(closed) captions", and subtitling is referred to as

"(closed) captioning".
Note 3 to entry: This document uses the term subtitles throughout.
3.1.10
TTS audio
audio generated by the TTS engine based on the TTS data

Note 1 to entry: If the TV uses an external TTS converter, TTS audio is interpreted as TTS data.

3.1.11
priority audio information

audio information for immediate output typically related to emergency information

3.1.12
TTS data
(text) data converted into TTS audio information by the text-to-speech engine
3.2 Abbreviated terms
DTV digital television
EPG electronic programme guide
STB set top box
TTS text-to-speech
TV television
UI user interface
4 Guiding principles and conventions

This document describes the required basic behaviour for a TV text-to-speech combination in
a basic profile, but also provides for enhanced profiles. It also gives a short introduction to
the basic problems that visually impaired people experience when using and watching TV.

Providing text-to-speech functionality for a broadcast receiver, for example a TV or an STB,

can be of great help to (visually) disabled people. Such speech functionality may be

integrated in the receiver or may be external to the receiver in a separate device.

As a general guiding principle, when building a TTS interface in the context of this
document, implementers should aspire to achieve functional equivalence of the user
experience. This means that a person operating the device using the speech interface should
have access to similar information and be able to accomplish tasks similar to those they
would with a graphical UI.
The main features of this document are:

• basic functional description for a TV-TTS device combination or TV with integrated TTS;

• profiles for different levels of TV-TTS functionality;
• targeted towards the digital TV application.

In this document, mandatory requirements are specified; optional and informative features are

also included.

A claim of conformity with this document requires conformity with all mandatory requirements.

A TV-TTS device combination or a TV with a TTS that is integrated may provide options for a

user to enable or disable each feature of a product. While options for user configurations may

be provided, the product shall meet the mandatory requirements.
5 User requirements of visually impaired people
5.1 Users' needs

This subclause explains the needs of visually impaired people as the primary target users for

a TV with TTS. Unless these needs are met, the system is not accessible to this user group.

Visually impaired people experience access barriers in the course of the following activities

when watching TV:
a) following TV programming, e.g. the TV series;
b) using a remote control;
c) not being able to see subtitles;
d) navigating channels;
e) navigating TV inputs;

f) using additional data (text) services provided by the broadcaster, e.g. an EPG;

g) daily operation of the TV and initial setup of the TV for use.

Items a), b) and c) are outside the scope of this document. Item c) further relates to the fact

that, in some countries, foreign language programmes are translated via subtitles. For users

who cannot see the subtitles, supplementary audio services are sometimes used to deliver an

audio version of the subtitles. This document elaborates on the remaining four items, i.e. d),

e), f) and g), in 5.2 to 5.6.

NOTE 1 For DVB systems, item a) is already solved by audio description. Also, the use case of providing

supplementary audio services to deliver an audio version of the subtitles is covered in the DVB-SI specification

ETSI EN 300 468.

NOTE 2 For ATSC systems, the audio system includes a visually impaired (VI) associate service, which allows a

complete programme mix containing music, effects, dialogue, and additionally a narrative description of the picture

content, see ATSC A/53 part 5 and part 6.
5.2 Navigating channels

The problem is that a user does not know which channel the TV displays, i.e. the user gets

"lost during navigation". The TV is displaying navigation data on the screen but the user is

unable to see it. Such data are, for example:
• channel number,
• service name,
• (DTV broadcast) event name.
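
The navigation data listed above can be combined into a single text string that the TV passes to the TTS engine after a channel change. The following is a minimal sketch under stated assumptions: the function name `channel_announcement`, the field order and the English phrasing are hypothetical, and the standard does not prescribe any particular wording.

```python
from typing import Optional


def channel_announcement(channel_number: int, service_name: str,
                         event_name: Optional[str] = None) -> str:
    """Build the text sent to the TTS engine after a channel change.

    The phrasing is illustrative only; a real implementation would localise
    it to the language set for the TV's UI.
    """
    parts = [f"Channel {channel_number}", service_name]
    if event_name:  # the event name may be missing from the broadcast data
        parts.append(event_name)
    return ", ".join(parts)
```

Announcing the channel number first lets a user who is stepping quickly through channels interrupt the audio as soon as the number is heard.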
5.3 Navigating TV inputs

The problem is that a user is unable to select the required input to the TV, e.g. the user

wishes to select DTV or a specific external input linked to a recording or other device. The

choice is shown on the screen but the user is unable to see it.
5.4 Additional data services

With digital TV, a broadcaster may transmit additional data (text) services to augment TV

programming, provide additional information on programming, or provide news. Such

additional data are:
• information about whether audio description or subtitling is available,
• (next) (DTV broadcast) event name,
• (DVB-) event information (enhanced description of the (DTV broadcast) event),
• EPG data.

The items above are listed in order of importance with the most important item appearing first.

Note that this data provides additional convenience in using the TV, but is non-essential
for the primary functions of watching TV and selecting channels.
5.5 Operating the TV

Adjusting user settings is another necessary function besides navigation. This can be done through
buttons on the remote control (out of scope of this specification), but also via on-screen
menus. For visually impaired people, on-screen menus are typically of little use.

A distinction exists between initial setup and daily operation of the TV. Initial setup is typically
a one-time operation during the lifetime of a TV. Daily operation is more frequent and more
important. Consequently, a distinction exists among menu items for daily operation, those
addressing specific accessibility functions, and TV setup menu items. However, the most
frequently used keys are "volume", "channel up/down", and the number keys.
5.6 TV use

Characterizing how a TV is used helps in determining implementation profiles. Navigating
channels, for example, is done most often when watching TV, as are commands such as
volume up and down. This may be supported by additional data services, but does not affect
the primary functions of the TV. Changing the TV's system settings is not done very often,
except perhaps for changing sound or video settings or switching audio description on and off.
Such settings may have an easy access mode through a special menu. TV installation is
typically performed only once during the lifetime of the TV. Often, visually impaired people
can benefit from specialized support for installing the TV, i.e. it is part of the service when
buying a new TV. Understanding this life cycle and usage cycle of a TV helps with defining the
most effective and efficient solutions and is reflected in the profiles. The following
paragraphs refer to "basic", "main" and "enhanced" profiles, which are further defined and
detailed in Clause 8.

Key operations for a minimum TTS implementation on a receiver for TV use are as set out in

the basic profile defined in this document. This basic profile shall include:

a) channel number, name and event information – key for a user to identify which service

has been selected;

b) availability of audio description – key for a user to know about the availability of this

service feature;

c) availability of subtitles – key for a user to know about the availability of this service

feature;

d) basic EPG – allows the user to navigate through the EPG, if such data is present in the

broadcast, to identify which future events are available to them;

e) context changes – key for a user to understand if the TV went to another state or when a

pop-up message appears.

Additional operations that shall be included in the main profile, in addition to all those in the
basic profile, are:

f) receiver menu functions – allow the user to navigate receiver operations and functions.

Additional operations that shall be included in the enhanced profile, in addition to all those

from the basic and main profiles, are:
g) event information – provide the event synopsis;

h) additional EPG data – allows the user to get more info on the service or event;

i) operations of a recording device – allows the user to record future events, possibly

selected via the EPG; play/pause a recorded event.
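
The cumulative relationship between the three profiles can be sketched as a small data structure in which each profile adds operations to those of the profile below it. This is an illustrative model only: the `Profile` enumeration, the `_ADDED` table and the `required_operations` function are names invented for this example, and the short descriptions paraphrase list items a) to i) above.

```python
from enum import IntEnum
from typing import FrozenSet


class Profile(IntEnum):
    BASIC = 1
    MAIN = 2
    ENHANCED = 3


# Operations introduced by each profile, keyed by the list letters a) to i).
_ADDED = {
    Profile.BASIC: {
        "a": "channel number, name and event information",
        "b": "availability of audio description",
        "c": "availability of subtitles",
        "d": "basic EPG",
        "e": "context changes",
    },
    Profile.MAIN: {"f": "receiver menu functions"},
    Profile.ENHANCED: {
        "g": "event information (synopsis)",
        "h": "additional EPG data",
        "i": "operations of a recording device",
    },
}


def required_operations(profile: Profile) -> FrozenSet[str]:
    """Return the list letters a profile shall support, including lower profiles."""
    letters = set()
    for level in Profile:
        if level <= profile:
            letters.update(_ADDED[level])  # adds the dictionary keys
    return frozenset(letters)
```

Under this model, `required_operations(Profile.MAIN)` yields the letters a) to f), reflecting that the main profile includes everything in the basic profile.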
6 Functional requirements
6.1 Functionality for TV, TTS device combination

The TV-TTS device system diagram is illustrated in Figure 1. As shown in the figure, the TTS

device is a separate function from the TV, which can be implemented on a device connected

with a TV-TTS device interface, or may also be integrated in the TV.
[Figure not reproduced; recoverable label: "Text to speech device"]
Figure 1 – TV – TTS device system diagram
The functionality requirements for a TV with TTS combination are:
• the TV, in principle, only outputs text strings towards the TTS device;

• the delay between an event and the resulting TTS audio related to that event shall be such
that they are perceived as belonging together;

• the user should be able to move to the next operation even in the middle of currently

playing TTS audio;
• the user should be able to stop currently playing TTS audio;
• the user shall be able to repeat the current or previous TTS audio;
• the user shall be able to mute the TTS audio;
• the user shall be able to switch on/off the TTS function;

• the language of the TTS audio shall be the same as set for the TV’s UI, except when

signalled differently; the TTS device/engine may choose to pronounce the text or to

indicate failure in case it does not support the signalled language;

• TTS audio need not literally represent the related visual information on the screen
as long as the meaning of the visual information stays intact;

• priority audio information shall overrule currently playing TTS audio information.

NOTE For example, EAS (Emergency Alert System) [USA] or EWS (Emergency Warning System) [JPN] are higher

priority than TTS audio information.
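
The user controls in the list above (stop, repeat, mute, switch on/off) and the precedence of priority audio can be sketched as a small state model. This is a toy illustration, not an implementation defined by the standard: the class `TtsController`, its attributes and its method names are all hypothetical, and actual audio output is replaced by recording the strings that would be sent to the TTS engine.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TtsController:
    """Toy model of the user-facing controls listed in 6.1 (names are hypothetical)."""
    enabled: bool = True               # TTS function switched on or off
    muted: bool = False                # TTS audio muted by the user
    playing: Optional[str] = None      # text currently being spoken, if any
    last_text: Optional[str] = None    # most recent text, kept for repeat
    sent: List[str] = field(default_factory=list)  # strings passed to the TTS engine

    def speak(self, text: str) -> None:
        # A new event replaces the current audio so the user can move on.
        self.last_text = text
        if self.enabled and not self.muted:
            self.playing = text
            self.sent.append(text)
        else:
            self.playing = None

    def stop(self) -> None:
        self.playing = None

    def repeat(self) -> None:
        if self.last_text is not None:
            self.speak(self.last_text)

    def priority_audio(self) -> None:
        # Priority audio (for example an emergency alert) overrules TTS audio.
        self.stop()
```

In this sketch a muted or disabled controller still remembers the latest text, so that a repeat request after unmuting announces the current state rather than a stale one. That behaviour is a design choice of the example, not a requirement stated in the list.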
6.2 Functionality: TTS device/engine
The functionality requirements for the TTS device/engine are:

• an external TTS device should be designed to be fully accessible to visually impaired

users without being dependent on the TV;

• the volume level of the TTS device/engine shall be changeable by the user: the TTS

device shall announce the new volume level; it should be possible to do this independently

of the TV volume;

• the user should be able to adjust speech characteristics like speed, pitch, voice type,

when applicable;

• the TTS engine should announce abbreviations in a manner suited to the context, for

example: "IEC" should be announced "I E C"; and "IEEE" should be announced "I triple E".

The TTS engine may also pronounce common abbreviations in full, e.g. "sub" could be spoken

as "subtitles" where appropriate;

• the TTS engine should announce numbers in a manner suited to the context, e.g. as a

natural number, digit-by-digit, etc.;

• the TTS engine should announce proper nouns (e.g. names, place names) correctly;

• levels of announcement quality are at the discretion of the implementers; the following are

examples of the implementation in order to achieve good levels of quality:

– consider ways to update the vocabulary database and conversion rules of the TTS

engine;

– consider provision of additional information (metadata) to help the TTS engine with
pronunciations or readings;

– consider ways in which device users can provide service providers with feedback on

incorrectly announced terms.
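
The context-dependent announcement of abbreviations and numbers described above can be sketched as a text normalisation step that runs before synthesis. The following is a minimal sketch under stated assumptions: the `LEXICON` table, the function names and the rule "spell out upper-case tokens of two to five letters" are illustrative choices, not rules taken from the standard.

```python
import re

# Hypothetical exception lexicon; a real engine would load this from its
# vocabulary database.
LEXICON = {"IEEE": "I triple E", "sub": "subtitles"}


def spell_out(token: str) -> str:
    """Announce a token character by character, e.g. 'IEC' -> 'I E C'."""
    return " ".join(token)


def normalise(text: str, digit_by_digit: bool = False) -> str:
    """Rewrite abbreviations and, optionally, numbers before synthesis."""
    def replace(match: re.Match) -> str:
        token = match.group(0)
        if token in LEXICON:            # explicit exceptions take precedence
            return LEXICON[token]
        if token.isdigit():             # numbers: natural or digit by digit
            return spell_out(token) if digit_by_digit else token
        if token.isupper() and 2 <= len(token) <= 5:
            return spell_out(token)     # short upper-case tokens are spelled out
        return token

    # Process each run of letters or digits as one token.
    return re.sub(r"[A-Za-z]+|\d+", replace, text)
```

With these assumptions, "IEC" is spelled out letter by letter, "IEEE" is taken from the exception lexicon, and a channel number such as 105 is read digit by digit only when the caller requests it. Deciding when digit-by-digit reading is appropriate is left to the caller, since that depends on the context in which the number appears.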
6.3 Functionality: TV

The TV determines the user interface, i.e. what is displayed on the screen, and how the TV

interacts with the user. The TV therefore also determines which text is sent to the TTS engine.

The TV can receive updated words, associated conversions and updated conversion rules for

the TTS engine via a network connection.

The user should be able to control, via the TV, TTS settings such as volume, speed, pitch
and voice type.
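
The network update of words and conversions mentioned in this subclause could, for example, be applied by merging a received update into the local vocabulary of the TTS engine. The sketch below makes several assumptions that are not in the standard: the update arrives as JSON, it has the keys `words` and `remove`, and the function is called `apply_update`. Retrieval over the network is deliberately left out.

```python
import json
from typing import Dict


def apply_update(lexicon: Dict[str, str], payload: str) -> Dict[str, str]:
    """Return a new lexicon with the entries of a JSON update merged in.

    Assumed payload shape (not defined by the standard):
    {"words": {"<written form>": "<spoken form>"}, "remove": ["<written form>"]}
    """
    update = json.loads(payload)
    merged = dict(lexicon)                    # leave the caller's lexicon untouched
    for written in update.get("remove", []):
        merged.pop(written, None)             # ignore entries that are not present
    merged.update(update.get("words", {}))    # new or corrected conversions win
    return merged
```

Returning a new dictionary rather than modifying the existing one means a failed or partial update cannot leave the engine with a half-applied vocabulary, at the cost of copying the lexicon on each update.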
NOTE In Europe
...
