Information technology — Media context and control — Part 1: Architecture

This document specifies the architecture of MPEG-V (media context and control) and its three types of associated use cases: — information adaptation from virtual world to real world; — information adaptation from real world to virtual world; — information exchange between virtual worlds.

Technologies de l'information — Contexte et contrôle des médias — Partie 1: Architecture

General Information

Status: Published
Publication Date: 01-Sep-2020
Current Stage: 9060 - Close of review
Completion Date: 04-Mar-2031
Relations

Standard: ISO/IEC 23005-1:2020 - Information technology — Media context and control — Part 1: Architecture (released 02-Sep-2020)
English language, 40 pages

Standards Content (Sample)


INTERNATIONAL STANDARD ISO/IEC 23005-1
Fourth edition
2020-08

Information technology — Media context and control —
Part 1:
Architecture

Technologies de l'information — Contexte et contrôle des médias — Partie 1: Architecture

Reference number: ISO/IEC 23005-1:2020(E)
© ISO/IEC 2020
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 MPEG-V system architecture
5 Use cases
5.1 General
5.2 System architecture for information adaptation from virtual world to real world
5.3 System architecture for information adaptation from real world to virtual world
5.4 System architecture for exchanges between virtual worlds
6 Instantiations
6.1 Instantiation A: representation of sensory effects (RoSE)
6.1.1 System architecture for representation of sensory effects
6.1.2 Instantiation A.1: multi-sensorial effects
6.1.3 Instantiation A.2: motion effects
6.1.4 Instantiation A.3: arrayed light effects
6.2 Instantiation B: natural user interaction with virtual world
6.2.1 System architecture for natural user interaction with virtual world
6.2.2 Examples of sensors
6.2.3 Instantiation B.1: full motion control and navigation of avatar/object with multi-input sources
6.2.4 Instantiation B.2: serious gaming for ambient assisted living
6.2.5 Instantiation B.3: gesture recognition using multipoint interaction devices
6.2.6 Instantiation B.4: avatar facial expression retargeting using smart camera
6.2.7 Instantiation B.5: motion tracking and facial animation with multimodal interaction
6.2.8 Instantiation B.6: serious gaming and training with multimodal interaction
6.2.9 Instantiation B.7: virtual museum guide with embodied conversational agents
6.3 Instantiation C: traveling and navigating real and virtual worlds
6.3.1 System architecture for traveling and navigating real and virtual worlds
6.3.2 Examples of sensors and path finding mechanisms
6.3.3 Instantiation C.1: virtual travel
6.3.4 Instantiation C.2: virtual traces of real places
6.3.5 Instantiation C.3: virtual tour guides
6.3.6 Instantiation C.4: unmanned aerial vehicle scenario
6.4 Instantiation D: interoperable virtual worlds
6.4.1 System architecture for interoperable virtual worlds
6.4.2 Instantiation D.1: avatar appearance
6.4.3 Instantiation D.2: virtual objects
6.5 Instantiation E: social presence, group decision making and collaboration within virtual worlds
6.5.1 System architecture
6.5.2 Instantiation E.1: social presence
6.5.3 Instantiation E.2: group decision making in the context of spatial planning
6.5.4 Instantiation E.3: consumer collaboration in product design processes along the supply chain
6.6 Instantiation F: interactive haptic sensible media
6.6.1 System architecture for interactive haptic sensible media
6.6.2 Instantiation F.1: Internet haptic service — YouTube, online chatting
6.6.3 Instantiation F.2: next-generation classroom — sensation book
6.6.4 Instantiation F.3: immersive broadcasting — home shopping, fishing channels
6.6.5 Instantiation F.4: entertainment — game (Second Life®, StarCraft®), movie theatre
6.6.6 Instantiation F.5: virtual simulation for training — military task, medical simulations
6.7 Instantiation G: bio-sensed information in the virtual world
6.7.1 System architecture for bio-sensed information in the virtual world
6.7.2 Instantiation G.1: interactive games sensitive to user's conditions
6.7.3 Instantiation G.2: virtual hospital and health monitoring
6.7.4 Instantiation G.3: mental health for lifestyle management
6.7.5 Instantiation G.4: food intake for lifestyle management
6.7.6 Instantiation G.5: cardiovascular rehabilitation for health management
6.7.7 Instantiation G.6: glucose level/diabetes management for health management
6.8 Instantiation H: environmental monitoring with sensors
6.8.1 General
6.8.2 System architecture for environmental monitoring
6.8.3 Instantiation H.1: environmental monitoring system
6.9 Instantiation I: virtual world interfacing with TV platforms
6.10 Instantiation J: seamless integration between real and virtual worlds
6.10.1 System architecture for seamless integration between real and virtual worlds
6.10.2 Instantiation J.1: seamless interaction between real and virtual worlds with integrating virtual and real sensors and actuators
6.11 Instantiation K: hybrid communication
6.12 Instantiation L: makeup avatar
6.12.1 Spectrum data acquisition
6.12.2 Transformation model generation
6.13 Instantiation M: usage scenario for automobile sensors
6.13.1 Helping auto maintenance/regular inspection
6.13.2 Monitoring for eco-friendly driving
6.14 Instantiation N: usage scenario for 3D printing
6.15 Instantiation O: olfactory information in virtual world
6.15.1 System architecture for olfactory information in virtual world
6.15.2 Instantiation O.1: olfactory signature (fingerprint) with e-nose
6.15.3 Instantiation O.2: 4D film with scent effect
6.15.4 Instantiation O.3: healing minds of combat veterans
6.15.5 Instantiation O.4: advertisement with olfactory information
6.15.6 Instantiation O.5: harmful odour monitoring
6.16 Instantiation P: virtual panoramic vision in car
6.16.1 General
6.16.2 Instantiation O.6.1: virtual panoramic IVI (in-vehicle information system)
6.16.3 Instantiation O.6.2: virtual panoramic black box
6.17 Instantiation Q: adaptive sound handling
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are
members of ISO or IEC participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields of technical
activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the
work. In the field of information technology, ISO and IEC have established a joint technical committee,
ISO/IEC JTC 1.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the
different types of ISO documents should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
This fourth edition cancels and replaces the third edition (ISO/IEC 23005-1:2016), which has been
technically revised.
The main changes compared to the previous edition are as follows:
— added a new use case for 3D printing;
— added six new use cases for olfactory information in virtual world;
— added two new use cases for virtual panoramic vision in car;
— added a new use case for adaptive sound handling.
A list of all parts in the ISO/IEC 23005 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user's national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.

Introduction
The ISO/IEC 23005 series provides an architecture and specifies information representation of data
flowing in and out of the real world and virtual worlds.
The data for the real world are communicated through sensors and actuators. The data for virtual
worlds consist of properties of virtual objects and multi-sensorial data embedded in audio-visual
content. The ISO/IEC 23005 series specifies data formats for sensors, actuators, virtual objects and
audio-visual content.
Data captured from the real world can need to be adapted for use in a virtual world and data from
virtual worlds can also need to be adapted for use in the real world. This document does not specify
how the adaptation is carried out but only specifies the interfaces.
Data for sensors are sensor capabilities, sensed data and sensor adaptation preferences.
Data for actuators are sensory device capabilities, sensory device commands and sensory effect
preferences.
Data for virtual objects are characteristics of avatars and virtual world objects.
Data for audio-visual content are sensory effects.
This document contains the tools for exchanging information with interaction devices. Specifically,
it specifies command formats for controlling actuators (e.g. actuator commands for sensory devices)
and data formats for receiving information from sensors (e.g. sensed information from sensors), as
illustrated by the yellow boxes in Figure 1. It also provides illustrative examples. The adaptation
engine itself is not within the scope of this document.

Figure 1 — Scope of the data formats for interaction devices
When this document is used, the adaptation engine (RV or VR engine), which is not within the scope
of standardization, performs bi-directional communication using the data formats specified in this
document. The adaptation engine can also utilize other tools defined in ISO/IEC 23005-2, namely
user's sensory preferences (USP), sensory device capabilities (SDC), sensor capabilities (SC) and
sensor adaptation preferences (SAP), for fine-grained control of devices in both real and virtual worlds.
The International Organization for Standardization (ISO) and International Electrotechnical
Commission (IEC) draw attention to the fact that it is claimed that compliance with this document may
involve the use of a patent.
ISO and IEC take no position concerning the evidence, validity and scope of this patent right.
The holder of this patent right has assured ISO and IEC that he/she is willing to negotiate licences under
reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this
respect, the statement of the holder of this patent right is registered with ISO and IEC. Information may
be obtained from the patent database available at www.iso.org/patents.
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights other than those in the patent database. ISO and IEC shall not be held responsible for
identifying any or all such patent rights.

INTERNATIONAL STANDARD ISO/IEC 23005-1:2020(E)
Information technology — Media context and control —
Part 1:
Architecture
1 Scope
This document specifies the architecture of MPEG-V (media context and control) and its three types of
associated use cases:
— information adaptation from virtual world to real world;
— information adaptation from real world to virtual world;
— information exchange between virtual worlds.
2 Normative references
There are no normative references in this document.
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
device command
description of controlling actuators used to generate sensory effects (3.9)
3.2
R→V adaptation
procedure that:
— processes the sensed information (3.3) from the real world in order to be consumed within the
virtual world's context;
— takes the sensed information with/without the sensor capabilities from sensors (3.4), the sensor
adaptation preferences (3.5) from users (3.12) and/or the virtual world object characteristics from a
virtual world;
— controls the virtual world (3.13) object characteristics or adapts the sensed information, based on
the sensor capabilities and/or the sensor adaptation preferences
3.3
sensed information
information acquired by sensors (3.4)

3.4
sensor
device by which user (3.12) input or environmental information can be gathered
EXAMPLE Temperature sensor, distance sensor, motion sensor, etc.
3.5
sensor adaptation preferences
description schemes and descriptors to represent (user’s) preferences with respect to adapting sensed
information (3.3)
3.6
sensor capability
representation of the characteristics of sensors in terms of the capability of the given sensor (3.4) such
as accuracy, or sensing range
3.7
sensory device
consumer device by which the corresponding sensory effect (3.9) can be made
Note 1 to entry: Real-world devices can contain any combination of sensors (3.4) and actuators in one device.
3.8
sensory device capability
representation of the characteristics of actuators used to generate sensory effects (3.9) in terms of the
capability of the given actuator
3.9
sensory effect
effect to augment perception by stimulating human senses in a particular scene
EXAMPLE Scent, wind, light, haptic (kinesthetic: force, stiffness, weight, friction, texture, widgets
such as buttons, sliders and joysticks; tactile: air-jet, suction pressure, thermal, current, vibration,
etc.). Note that combinations of tactile displays can also provide directional and shape information.
3.10
sensory effect metadata
metadata that defines the description schemes and descriptors to represent sensory effects (3.9)
3.11
user’s sensory preferences
description schemes and descriptors to represent (user’s) preferences with respect to rendering of
sensory effect (3.9)
3.12
user
end user of the system
3.13
virtual world
digital content, real time or non-real time, of various nature
EXAMPLE On-line virtual world, simulation environment, multi-user game, broadcast multimedia
production, peer-to-peer multimedia production or packaged content like a DVD or game.
3.14
V→R adaptation
procedure that:
— processes the sensory effects (3.9) from the virtual world (3.13) in order to be consumed within the
real world’s context;

— takes sensory effect metadata (3.10) from a virtual world, sensory device (actuator) capabilities
from the sensory devices (actuators), the user’s sensory preferences (3.11) from users (3.12) and/or
the sensed information (3.3) as well as the sensor capabilities from sensors (3.4) as inputs;
— generates the device commands (3.1) by adapting the sensory effects based on the sensed information,
the capabilities and/or the preferences
3.15
VW object characteristics
description schemes and descriptors to represent and describe virtual world objects (from the real
world into the virtual world and vice versa)
4 MPEG-V system architecture
A strong connection (defined by an architecture that provides interoperability through standardization)
between the virtual and the real world is needed to reach simultaneous reactions in both worlds to
changes in the environment and human behaviour. Efficient, effective, intuitive and entertaining
interfaces between users and virtual worlds are of crucial importance for their wide acceptance and
use. To improve the process of creating virtual worlds, a better design methodology and better tools are
indispensable. For fast adoption of virtual world technologies, a better understanding of their internal
economics, rules and regulations is needed. The overall system architecture for the MPEG-V framework
is depicted in Figure 2.

Figure 2 — System architecture of the MPEG-V framework
The MPEG-V system architecture can be used to serve three different media exchanges. Two types of
media exchange occur between the real world and a virtual world, i.e. the information exchange from
real world to virtual world and the information exchange from virtual world to real world. The third
type is the information exchange between virtual worlds. The three media exchanges are defined as
use cases in Clause 5.
Sensory effect metadata, sensory device capability, user's sensory preferences, device commands,
sensed information, sensor capability, sensor adaptation preferences and virtual world object
characteristics are within the scope of standardization and are specified in other parts of the
ISO/IEC 23005 series.
On the other hand, the V→R adaptation engine, the R→V adaptation engine, virtual worlds and devices
(sensors and sensory devices) are left open for industry competition.
The metadata is specified in other parts of the ISO/IEC 23005 series. Sensor capability, sensory
device capability, sensor adaptation preferences and user's sensory preferences are specified in
ISO/IEC 23005-2. Sensory effect metadata is specified in ISO/IEC 23005-3. Virtual world object
characteristics are specified in ISO/IEC 23005-4. Device commands and sensed information are specified
in ISO/IEC 23005-5.
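
To make that mapping concrete, the following sketch models the standardized data formats as hypothetical Python containers, annotated with the part of the ISO/IEC 23005 series that specifies each one. The actual formats are XML-based description schemes; the class and field names here are illustrative assumptions only:

from dataclasses import dataclass, field
from typing import Any

# Hypothetical containers mirroring the standardized MPEG-V data formats.
# The real formats are XML description schemes; only the part numbers in
# the comments come from the text above.

@dataclass
class SensorCapability:              # ISO/IEC 23005-2
    accuracy: float = 0.0
    sensing_range: tuple = (0.0, 1.0)

@dataclass
class SensorAdaptationPreference:    # ISO/IEC 23005-2
    preferences: dict = field(default_factory=dict)

@dataclass
class SensoryDeviceCapability:       # ISO/IEC 23005-2
    supported_effects: list = field(default_factory=list)

@dataclass
class UserSensoryPreference:         # ISO/IEC 23005-2
    rendering_preferences: dict = field(default_factory=dict)

@dataclass
class SensoryEffectMetadata:         # ISO/IEC 23005-3
    effects: list = field(default_factory=list)

@dataclass
class VWObjectCharacteristics:       # ISO/IEC 23005-4
    attributes: dict = field(default_factory=dict)

@dataclass
class DeviceCommand:                 # ISO/IEC 23005-5
    device_id: str = ""
    command: dict = field(default_factory=dict)

@dataclass
class SensedInformation:             # ISO/IEC 23005-5
    sensor_id: str = ""
    value: Any = None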

5 Use cases
5.1 General
The three types of media exchange each require information adaptation for the target world, i.e.
adaptation of the information based on capabilities and preferences: information adaptation from
virtual world to real world, information adaptation from real world to virtual world, and information
adaptation between virtual worlds.
5.2 System architecture for information adaptation from virtual world to real world
The system architecture for the information adaptation from virtual world to real world is depicted in
Figure 3. It represents V→R adaptation comprising sensory effect metadata, VW object characteristics,
sensory device capability (actuator capability), device commands, user’s sensory preferences and a
V→R adaptation engine which generates output data based on its input data.
Figure 3 — Example of system architecture for information adaptation from virtual world to real world
A virtual world within the framework is referred to as an entity that acts as the source of the sensory
effect metadata and VW object characteristics such as a broadcaster, content creator/distributor, or
even a service provider. The V→R adaptation engine is an entity that takes the sensory effect metadata,
the sensory device (actuator) capability and the user’s sensory preferences as inputs and generates
the device commands based on those in order to control the consumer devices enabling a worthwhile,
informative experience to the user.
Real-world devices (sensory devices) are entities that act as the sink of the device commands and as
the source of sensory device (actuator) capability. Additionally, entities that provide user's sensory
preferences towards the V→R adaptation engine are also collectively referred to as real-world devices.
Note that sensory devices (actuators) are a sub-set of real-world devices, including fans, lights,
scent devices and human input devices, such as a TV set with a remote control (e.g. for preferences).
The sensory effect metadata provides the means for representing so-called sensory effects, i.e.
effects to augment feeling by stimulating human sensory organs in a particular scene of a multimedia
application. Examples of sensory effects are scent, wind and light. The means for transporting this
kind of metadata is referred to as the sensory effect delivery format, which can be combined with an
audio/visual delivery format, e.g. MPEG-2 transport stream, file format or real-time transport protocol
(RTP) payload format.
The sensory device capability defines description formats to represent the characteristics of sensory
devices (actuators) in terms of which sensory effects they are capable of performing, and how. A sensory
device (actuator) is a consumer device by which the corresponding sensory effect can be made (e.g.
lights, fans, heaters). Device commands are used to control the sensory devices (actuators). As with
the sensory effect metadata, the means for transporting sensory device (actuator) capabilities and
device commands are referred to as the sensory device capability and device command delivery formats,
respectively.
Finally, the user’s sensory preferences allow end users to describe their preferences with respect to
rendering of sensory effects.
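
As a rough illustration of the V→R flow just described (reusing the hypothetical container types sketched in Clause 4; the actual adaptation algorithm is deliberately left outside the scope of this document), an adaptation engine might filter authored effects by actuator capability and scale them by user preference before issuing device commands:

def v2r_adapt(metadata, device_capabilities, user_prefs):
    """Hypothetical V->R adaptation: sensory effect metadata in,
    device commands out. Real engines are implementation-defined."""
    commands = []
    for effect in metadata.effects:              # e.g. {"type": "wind", "intensity": 0.8}
        for device_id, cap in device_capabilities.items():
            if effect["type"] not in cap.supported_effects:
                continue                         # this device cannot render the effect
            # Scale intensity by the user's preference for this effect type.
            scale = user_prefs.rendering_preferences.get(effect["type"], 1.0)
            commands.append(DeviceCommand(
                device_id=device_id,
                command={"type": effect["type"],
                         "intensity": effect.get("intensity", 1.0) * scale}))
    return commands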
5.3 System architecture for information adaptation from real world to virtual world
The system architecture for information adaptation from real world to virtual world is depicted in
Figure 4. It represents R→V adaptation comprising VW object characteristics, sensed information,
sensor capability, sensor adaptation preferences and an R→V adaptation engine which generates output
data based on its input data.
Figure 4 — Example of system architecture for information adaptation from real world to virtual world

The R→V adaptation engine is an entity that:
— processes the sensed information from the real world in order to be consumed within the virtual
world’s context;
— takes the sensed information with/without the sensor capabilities from sensors, the sensor adaptation
preferences from users and/or the virtual world object characteristics from a virtual world;
— controls the virtual world object characteristics or adapts the sensed information, based on the
sensor capabilities and/or the sensor adaptation preferences.
There are two possible implementations to adapt information from real world to virtual world.
In the first implementation, the R→V adaptation engine takes as inputs the sensed information and
the sensor capabilities from sensors and the sensor adaptation preferences from users, and adapts the
sensed information based on the sensor capabilities and/or sensor adaptation preferences.
In the second implementation, the R→V adaptation engine takes the sensed information with/without
the sensor capabilities from sensors, the sensor adaptation preferences from users and/or the virtual
world object characteristics from a virtual world, and controls the virtual world object characteristics
by adapting the sensed information based on the sensor capabilities and/or the sensor adaptation
preferences.
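
The two implementations can be contrasted in a short sketch (hypothetical, again reusing the container types from Clause 4): the first returns adapted sensed information for the virtual world to consume, while the second writes directly into the VW object characteristics:

def r2v_adapt(sensed_items, sensor_caps, prefs):
    """First implementation: adapt the sensed information only."""
    adapted = {}
    for item in sensed_items:
        value = item.value
        cap = sensor_caps.get(item.sensor_id)
        if cap is not None:
            lo, hi = cap.sensing_range           # clamp to the declared sensing range
            value = max(lo, min(hi, value))
        gain = prefs.preferences.get(item.sensor_id, 1.0)
        adapted[item.sensor_id] = value * gain   # apply the user's adaptation preference
    return adapted                               # handed over to the virtual world

def r2v_control(sensed_items, sensor_caps, prefs, vw_object):
    """Second implementation: control the VW object characteristics directly."""
    vw_object.attributes.update(r2v_adapt(sensed_items, sensor_caps, prefs))
    return vw_object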
5.4 System architecture for exchanges between virtual worlds
The system architecture for information exchange between virtual worlds is depicted in Figure 5. It
represents an information exchange built on VW object characteristics, which constitute the
exchangeable information between virtual worlds.
Figure 5 — Example of system architecture for (bidirectional) exchange of information between virtual worlds

V→V adaptation converts the proprietary virtual world object characteristics of one virtual world into
standardized VW object characteristics and sends them to another virtual world to support
interoperability. Based on the data provided in the VW object characteristics, the receiving virtual
world internally adapts its own representation of the virtual object/avatar.
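
A minimal sketch of this V→V path, assuming a purely hypothetical proprietary avatar format (the field names below are invented for illustration), maps the proprietary representation onto the standardized VW object characteristics so that the receiving world can rebuild its own internal representation:

def v2v_adapt(proprietary_avatar):
    """Hypothetical V->V adaptation: proprietary representation in,
    standardized VW object characteristics out."""
    return VWObjectCharacteristics(attributes={
        "appearance": proprietary_avatar.get("skin"),         # assumed source field
        "animations": proprietary_avatar.get("gestures", []), # assumed source field
    })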
6 Instantiations
6.1 Instantiation A: representation of sensory effects (RoSE)
6.1.1 System architecture for representation of sensory effects
The system for representation of sensory effects is partly instantiated from the system architecture
for information adaptation from virtual world to real world. The overall system architecture for
representation of sensory effects (RoSE) is depicted in Figure 6, comprising sensory effect metadata,
sensory device (actuator) capability, device commands, user's sensory preferences and a V→R
adaptation engine which generates output data based on its input data.
Figure 6 — RoSE system architecture
A provider within the RoSE framework is referred to as an entity that acts as the source of the sensory
effect metadata, such as a broadcaster, content creator/distributor or even a service provider. The V→R
adaptation engine is an entity that takes the sensory effect metadata, the sensory device (actuator)
capability and the user's sensory preferences as inputs, and generates the device commands in order to
control the consumer devices, enabling a worthwhile, informative experience for the user.

Consumer devices are entities that act as the sink of the device commands and act as the source of
sensory device (actuator) capability. Additionally, entities that provide user’s sensory preferences
towards the V→R adaptation engine are also collectively referred to as consumer devices. Note that
sensory devices (actuators) are a sub-set of consumer devices including fans, lights, scent devices,
human input devices such as a TV set with a remote control (e.g. for preferences).
6.1.2 Instantiation A.1: multi-sensorial effects
Traditional multimedia with audio/visual content has been presented to users via display devices
and audio speakers. In practice, however, users increasingly expect more advanced, high-fidelity
multimedia experiences. For example, stereoscopic video, virtual reality, 3-dimensional television and
multi-channel audio are typical types of media that enhance the user experience, but they are still
limited to audio/visual content.
From a rich-multimedia perspective, an advanced user experience would also include special effects
such as opening/closing window curtains for a sensation-of-fear effect, turning on a flashbulb for
lightning-flash effects, as well as fragrance, flame, fog and scare effects, which can be produced by
scent devices, flame-throwers, fog generators and shaking chairs, respectively. Such scenarios require
enriching multimedia content with information that enables consumer devices to render it appropriately,
in order to create the advanced user experience described above.
From a technical perspective, this requires a framework for representation of sensory effects
(RoSE) information, which can define metadata about special or sensory effects, characteristics of
target devices, synchronization, etc. The presentation of the RoSE information and the associated
audio/visual content allows for an advanced, worthwhile user experience.
6.1.3 Instantiation A.2: motion effects
One important sensory effect that should not be overlooked is the motion effect. The motion effect
gives the user a feeling of movement similar to what the actor or actress experiences in the movie. It
is a popular sensory effect commonly used in theme parks, game rooms and movie theatres, and is
usually provided by a motion chair. A motion chair typically has one or more motors and axes
underneath or above the seat; the number of motors and the length of the axes determine the range and
depth of the chair's movement. There are many manufacturers of motion chairs, each with its own
mechanical characteristics.
For example, one manufacturer's motion chair provides several types of motion, including tilt (pitch)
at various speeds and accelerations, fast falling, continuous wave motion at variable speeds, sideways
sway at variable speeds and accelerations, vibration, and combinations of wave and sway motions.
Another manufacturer's 4D chair supports falling, rolling and pitching motion with speed control, but
not yawing or forward/backward movement. Still other 4D chair manufacturers support only vibration and
falling effects.
Therefore, the design of a motion sensory effect should not be limited by the capabilities of a
specific chair. Authors produce sensory effect metadata based on the audio-visual data and do not know
beforehand the mechanical characteristics of the motion chair by which the sensory effects will be
rendered; in fact, the author does not know on which device the motion sensory metadata will be
rendered at all. This means that the motion sensory metadata should not restrict the actual movement
of the motion chair; it is more appropriate to describe the conceptual motion in the scene. Figure 10
shows an example of why the motion sensory effect should be conceptual. Suppose the author wants to
express a "turn left" motion effect and there are two kinds of motion chair: the first supports
rolling, yawing and surging, while the second supports only rolling. If the motion effect (SI) is
expressed in physical terms such as "yaw 90 degrees", whether a chair can render it depends on that
chair's capabilities. If the motion effect (SI) is instead expressed in conceptual terms such as "left
turn", both chairs can render it within their own capabilities.
In other words, considering the adaptation process performed by the engine, the SI should carry the
semantics of the motion effect, i.e. the intention of the author, so that the adaptation engine can
find the best combination of command information to satisfy the author's intention under the given
restrictions of the specific motion effect chair.
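
A sketch of that adaptation step, under the assumption of a simple capability model (all names below are hypothetical): the engine looks up the 6DoF motions that can realize the conceptual effect, in order of fidelity, and picks the first one the chair actually supports:

from typing import Optional

# Hypothetical ranking of 6DoF motions that realize each conceptual effect,
# ordered by how faithfully each one renders the author's intention.
CONCEPT_TO_MOTIONS = {
    "turn_left": ["yaw_left", "roll_left"],
    "fall": ["heave_down", "pitch_forward"],
}

def adapt_motion(concept: str, chair_capabilities: set) -> Optional[str]:
    for motion in CONCEPT_TO_MOTIONS.get(concept, []):
        if motion in chair_capabilities:
            return motion            # best motion this chair can actually render
    return None                      # the chair cannot approximate the effect

# The first chair (roll, yaw, surge) yaws; the second (roll only) rolls.
assert adapt_motion("turn_left", {"roll_left", "yaw_left", "surge"}) == "yaw_left"
assert adapt_motion("turn_left", {"roll_left"}) == "roll_left"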
The proposed schema for the motion effect is based on six degrees of freedom (6DoF), which is commonly
used for motion description in robotics and engineering. 6DoF comprises translation along the three
spatial axes and rotation about them (pitch, yaw and roll), as shown in Figure 7. Any motion of a
rigid body can be described by 6DoF motion. The 6DoF motions have been abstracted into several basic
motion patterns, and combinational motion patterns, which are based on the repetition or combination
of the basic patterns and carry specific semantics, were added.

Key: a) 3-dimensional axes; b) pitching; c) rolling; d) yawing.
Figure 7 — Six degrees of freedom
6.1.4 Instantiation A.3: arrayed light effects
From a rich-media perspective, realistic media coupled with or assisted by their target devices are
very beneficial to users, because the media consumption experience can be greatly enhanced. Business
markets such as 4D cinema can be enriched by coupling media and devices together: multiple devices are
used to enhance the realism of the experience by simulating effects such as light, wind and flash.
When creating a 4D movie, a light effect can be used to indicate the mood or warmth of a space, while
an arrayed light effect can augment the events of a story at much higher resolution by expressing
complex light sources through an array of multiple light display actuators.
An example use case of the arrayed light effect is the simulation of fireworks. Fireworks visual
effects can be simulated in sync with the blast occurring in the movie using two arrayed light
actuators in a 4D theatre. Such a theatre provides a large main screen in front of the 4D chairs, with
light display actuators installed on the left and right sides. While the video of the fireworks plays
on the main screen, the arrayed light effect, represented as a sequence of m-by-n matrices, is
rendered by the light display actuators in sync with the timeline of the movie content. The source of
an arrayed light effect can be a light sequence composed of m-by-n matrices mapped to a light display
actuator, or a video or image file composed of m-by-n matrices.
Another use case of the arrayed light effect is visually augmenting, with coloured lights in the 4D
theatre, cars that are passing through a tunnel. When cars pass through a tunnel lit with red, blue and
green lights in the movie, effect editors can add an arrayed light effect so that viewers experience
the same light effect as the passengers in the cars. The arrayed light effect metadata can be authored
by editors who are unaware of the capability or the number of light display actuators installed in the
theatre. The adaptation engine then generates device commands for actuating the appropriate light
actuators by analysing and extending the arrayed light effect metadata.
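
That last step can be sketched as a resampling problem (a hypothetical nearest-neighbour version; a real engine may filter, interpolate or otherwise extend the metadata): the authored m-by-n light matrix is mapped onto however many cells the installed actuator actually has:

def adapt_light_array(frame, out_rows, out_cols):
    """Resample an authored m-by-n matrix of RGB triples onto the
    out_rows-by-out_cols grid of an installed light display actuator."""
    m, n = len(frame), len(frame[0])
    return [[frame[r * m // out_rows][c * n // out_cols]
             for c in range(out_cols)]
            for r in range(out_rows)]

# An authored 4x4 fireworks frame rendered on a 2x2 actuator;
# one such matrix would be produced per step of the movie timeline.
frame = [[(255, 200, 0)] * 4 for _ in range(4)]
device_frame = adapt_light_array(frame, 2, 2)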
6.2 Instantiation B: natural user interaction with virtual world
6.2.1 System architecture for natural user interaction with virtual world
The system for natural user interaction with a virtual world is instantiated from the system
architecture for information adaptation from real world to virtual world. The sensors for such
interaction include gaze tracking sensors, multi-point sensors, smart cameras, motion sensors, gesture
recognition sensors, intelligent cameras, etc. The R→V adaptation engine analyses the interaction
intention from the sensor information and adapts the VW object characteristics and/or the sensed
information to convey that intention to the virtual world.
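
For example (a hypothetical, deliberately simplified intention analysis; real engines combine multiple sensors and richer models), the engine might map recognized gestures onto avatar control updates:

def analyse_intention(sensed_gesture):
    """Map raw gesture-recognition output onto an avatar control update."""
    gesture = sensed_gesture.get("gesture")
    if gesture == "swipe_left":
        return {"avatar_heading_delta": -90}   # turn the avatar to the left
    if gesture == "raise_hand":
        return {"avatar_animation": "wave"}
    return {}                                  # no recognizable intention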
6.2.2 Examples of sensors
6.2.2.1 Gaze tracking sensors
Eye tracking is the process of measuring either the point of gaze ("where we are looking") or the motion
of an eye relative to the head. An eye tracker is a device for measuring eye positions and eye movement.
Eye trackers are used in research on the visual system, in psychology, in cognitive linguistics and
in product design. There are a number of methods for measuring eye movement. The most popular
variant uses video images from which the eye position is extracted. Other methods use search coils or
are based on the electro-oculogram.
In the video-based eye trackers, a camera focuses on one or both eyes and records their movement as
the viewer looks at some kind of stimulus. Modern eye-trackers use contrast to locate the centre of the
pupil and use infrared and near-infrared non-collimated light to create a corneal reflection (CR). The
vector between these
...
