ISO/FDIS 9491-1
Biotechnology — Predictive computational models in personalized medicine research — Part 1: Constructing, verifying and validating models
This document specifies requirements and recommendations for the design, development and establishment of predictive computational models for research purposes in the field of personalized medicine. It addresses the set-up, formatting, validation, simulation, storing and sharing of computational models used for personalized medicine. Requirements and recommendations for data used to construct or required for validating such models are also addressed. This includes rules for formatting, descriptions, annotations, interoperability, integration, access and provenance of such data. This document does not apply to computational models used for clinical, diagnostic or therapeutic purposes.
Biotechnologie — Modèles informatiques prédictifs dans la recherche sur la médecine personnalisée — Partie 1: Construction, vérification et validation des modèles
General Information
- Status
- Not Published
- Technical Committee
- ISO/TC 276 - Biotechnology
- Drafting Committee
- ISO/TC 276 - Biotechnology
- Current Stage
- 5020 - FDIS ballot initiated: 2 months. Proof sent to secretariat
- Start Date
- 24-Mar-2026
- Completion Date
- 24-Mar-2026
Relations
- Effective Date
- 24-Jun-2023
Overview
ISO/FDIS 9491-1: Biotechnology - Predictive computational models in personalized medicine research - Part 1: Constructing, verifying and validating models provides essential requirements and recommendations for the design, development, and establishment of predictive computational models used in personalized medicine research. This International Standard was developed by ISO/TC 276 and addresses the vital processes involved in setting up, formatting, validating, simulating, storing, and sharing computational models. It also covers specifications for the data utilized in constructing and validating such models, including aspects of formatting, annotation, interoperability, integration, access, and provenance.
The scope of ISO/FDIS 9491-1 is strictly research-oriented; it does not extend to computational models deployed for routine clinical, diagnostic, or therapeutic applications.
Key Topics
- Model Creation and Management: Guidance on the entire lifecycle of computational models, from conception to sharing, including key activities like model setup, formatting, and documentation, as well as transparent validation and simulation.
- Data Requirements and Quality: Emphasizes standards for data quality, including requirements for data formatting, semantic annotation, provenance, and accessibility. It highlights the need for harmonizing heterogeneous datasets from various sources to facilitate robust model construction and validation.
- Interoperability and Integration: Stresses the importance of using common standards and interoperable formats for both models and data, fostering collaboration across research domains and institutions.
- Model Validation and Verification: Outlines clear processes for verifying model elements and for validating models against independent datasets. Emphasizes reproducibility and transparency as critical for credible predictive results.
- Ethical and Legal Considerations: Recognizes the need for compliance with ethical principles and regulations, especially when dealing with human data in personalized medicine research.
- Standardized Workflows: Recommends workflows guided by principles like FAIR (Findable, Accessible, Interoperable, Reusable) and ALCOA (Attributable, Legible, Contemporaneous, Original, Accurate), supporting data and model reusability and high-quality results.
Applications
The requirements and recommendations in ISO/FDIS 9491-1 are applicable to a variety of research activities in the field of personalized medicine, including:
- Biotechnology Research: Facilitating the development of predictive models in biomolecular and cellular research, supporting deeper understanding of disease mechanisms and progression.
- Preclinical and Clinical Trials: Enabling simulation-based studies and in silico trials that can accelerate hypothesis testing, drug development, and personalized therapy assessment.
- Data Harmonization Projects: Supporting collaborative research projects that require integration of diverse health data, including genomics, proteomics, clinical observations, and environmental data.
- Artificial Intelligence and Machine Learning Applications: Providing a framework for developing, sharing, and validating AI-driven predictive models in health research contexts.
- Education and Methodological Standardization: Serving as a reference for best practices in constructing, validating, and exchanging computational models among academic, industrial, and regulatory communities.
Related Standards
ISO/FDIS 9491-1 references and aligns with several other key standards to enhance consistency and data quality in biotechnology and health informatics. These include:
- ISO 20691:2022: Biotechnology - Requirements for data formatting and description in the life sciences
- ISO/TS 23494-1:2023: Biotechnology - Provenance information model for biological material and data - Part 1: Design concepts and general requirements
- Community-driven standards for data annotation, such as SNOMED CT, LOINC, and ICD
Adopting ISO/FDIS 9491-1 promotes harmonization and interoperability, driving advances in personalized medicine research through transparent, reproducible, and collaborative computational modelling approaches. By following these international guidelines, researchers and organizations can ensure higher data quality, better integration, and greater impact for predictive models in biotechnology.
Buy Documents
ISO/FDIS 9491-1 - Biotechnology — Predictive computational models in personalized medicine research — Part 1: Constructing, verifying and validating models
REDLINE ISO/FDIS 9491-1 - Biotechnology — Predictive computational models in personalized medicine research — Part 1: Constructing, verifying and validating models
Frequently Asked Questions
ISO/FDIS 9491-1 is a Final Draft International Standard developed by the International Organization for Standardization (ISO). Its full title is "Biotechnology — Predictive computational models in personalized medicine research — Part 1: Constructing, verifying and validating models". This standard covers the following scope: This document specifies requirements and recommendations for the design, development and establishment of predictive computational models for research purposes in the field of personalized medicine. It addresses the set-up, formatting, validation, simulation, storing and sharing of computational models used for personalized medicine. Requirements and recommendations for data used to construct or required for validating such models are also addressed. This includes rules for formatting, descriptions, annotations, interoperability, integration, access and provenance of such data. This document does not apply to computational models used for clinical, diagnostic or therapeutic purposes.
ISO/FDIS 9491-1 is classified under the following ICS (International Classification for Standards) categories: 07.080 - Biology. Botany. Zoology. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/FDIS 9491-1 has the following relationships with other standards: it has an inter-standard link to ISO/TS 9491-1:2023. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
ISO/FDIS 9491-1 is available in PDF format for immediate download after purchase. The document can be added to your cart and obtained through the secure checkout process. Digital delivery ensures instant access to the complete standard document.
Standards Content (Sample)
DRAFT International Standard
ISO/DIS 9491-1
ISO/TC 276
Secretariat: DIN
Voting begins on: 2025-07-21
Voting terminates on: 2025-10-13
Biotechnology — Predictive computational models in personalized medicine research —
Part 1: Constructing, verifying and validating models
Biotechnologie — Modèles informatiques prédictifs dans la recherche sur la médecine personnalisée —
Partie 1: Construction, vérification et validation des modèles
ICS: 07.080
THIS DOCUMENT IS A DRAFT CIRCULATED FOR COMMENTS AND APPROVAL. IT IS THEREFORE SUBJECT TO CHANGE AND MAY NOT BE REFERRED TO AS AN INTERNATIONAL STANDARD UNTIL PUBLISHED AS SUCH.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
This document is circulated as received from the committee secretariat.
Reference number: ISO/DIS 9491-1:2025(en)
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO's member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principles
4.1 General
4.2 Computational models in personalized medicine
4.2.1 General
4.2.2 Cellular systems biology models
4.2.3 Risk prediction for common diseases
4.2.4 Disease course and therapy response prediction
4.2.5 Pharmacokinetic/-dynamic modelling and in silico trial simulations
4.2.6 Artificial intelligence models
4.3 Standardization needs for computational models
4.3.1 General
4.3.2 Challenges
4.3.3 Common standards relevant for personalized medicine
4.4 Data preparation for integration into computer models
4.4.1 General
4.4.2 Sampling data
4.4.3 Data formatting
4.4.4 Data description
4.4.5 Data annotation (semantics)
4.4.6 Data interoperability requirements across subdomains
4.4.7 Data integration
4.4.8 Data provenance information
4.4.9 Data access
4.5 Model formatting
4.6 Model validation
4.6.1 General
4.6.2 Specific recommendations for model validation
4.7 Model simulation
4.7.1 General
4.7.2 Requirements for capturing and sharing simulation set-ups
4.7.3 Requirements for capturing and sharing simulation results
4.8 Requirements for model storing and sharing
4.9 Application of models in clinical trials and research
4.9.1 General
4.9.2 Specific recommendations
4.10 Ethical requirements for modelling in personalized medicine
Annex A (informative) Common standards relevant for personalized medicine and in silico approaches
Annex B (informative) Information on modelling approaches relevant for personalized medicine
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent
rights identified during the development of the document will be in the Introduction and/or on the ISO list of
patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO’s adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 276, Biotechnology.
A list of all parts in the ISO 9491 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The capacity to generate data in life sciences and health research has greatly increased in the last decade.
In combination with patient/personal-derived data, such as electronic health records, patient registries and
databases, as well as lifestyle information, this big data holds an immense potential for clinical applications,
especially for computer-based models with predictive capacities in personalized medicine. However, and
despite the ever-progressing technological advances in producing data, the exploitation of big data to
generate new knowledge for medical benefits, while guaranteeing data privacy and security, is lagging
behind its full potential. A reason for this obstacle is the inherent heterogeneity of big data and the lack
of broadly accepted standards allowing interoperable integration of heterogeneous health data to perform
analysis and interpretation for predictive modelling approaches in health research, such as personalized
medicine.
Common standards lead to a mutual understanding and improve information exchange within and across
research communities and are indispensable for collaborative work. In order to setup computer models in
personalized medicine, data integration from heterogeneous and different sources at different times plays a
key role. Consistent documentation of data, models and simulation results based on basic guiding principles[6]
for data management practices, such as FAIR (findable, accessible, interoperable, reusable) or ALCOA
(attributable, legible, contemporaneous, original, accurate), and standards can ensure that the data and
the corresponding metadata (data describing the data and its context), as well as the models, methods and
visualizations, are of reliable high quality.
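As a purely illustrative, non-normative sketch, the FAIR and ALCOA principles above can be reflected in a minimal machine-readable metadata record attached to a model. All field names, identifiers and values below are assumptions chosen for illustration; they are not prescribed by this document:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelMetadata:
    identifier: str   # Findable: a persistent, unique identifier
    access_url: str   # Accessible: where the model can be retrieved
    format: str       # Interoperable: a community exchange format
    license: str      # Reusable: explicit terms of reuse
    author: str       # Attributable (ALCOA)
    created: str      # Contemporaneous: ISO 8601 timestamp
    source_data: list = field(default_factory=list)  # provenance of inputs

record = ModelMetadata(
    identifier="doi:10.0000/example-model",          # hypothetical DOI
    access_url="https://models.example.org/42",      # hypothetical URL
    format="SBML Level 3",
    license="CC-BY-4.0",
    author="Jane Modeller",
    created="2025-07-21T09:00:00Z",
    source_data=["doi:10.0000/example-dataset"],     # hypothetical dataset DOI
)
print(json.dumps(asdict(record), indent=2))
```

Serializing the record as JSON keeps the metadata both human-legible and processable by data-integration tooling.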
Hence, standards for biomedical and clinical data, simulation models and data exchange are a prerequisite[7]
for reliable integration of health-related data. Such standards, together with harmonized ways to describe
their metadata, ensure the interoperability of tools used for data integration and modelling, as well as the
reproducibility of the simulation results. In this sense, modelling standards are agreed ways of consistently
structuring, describing, and associating models and data, their respective parts and their graphical
visualization, as well as the information about applied methods and the outcome of model simulations. Such
standards also assist in describing how constituent parts interact, or are linked together, and how they are
embedded in their physiological context.
Major challenges in the field of personalized medicine are to:
a) harmonize the standardization efforts that refer to different data types, approaches and technologies;
b) make the standards interoperable, so that the data can be compared and integrated into models.
An overall goal is to FAIRify data and processes in order to improve data integration and reuse. An additional
challenge is to ensure a legal and ethical framework enabling interoperability.
This document presents computational modelling requirements and recommendations for research in
the field of personalized medicine, especially with focus on collaborative research, such that health-
related data can be optimally used for translational research and personalized medicine worldwide.
The recommendations are primarily oriented towards the application of computational modelling in
the biotechnology domain (e.g. biomolecular and cellular research, as well as in clinical trials and drug
development), but also can be applied in other fields of personalized medicine research.
Biotechnology — Predictive computational models in
personalized medicine research —
Part 1:
Constructing, verifying and validating models
1 Scope
This document specifies requirements and recommendations for the design, development and
implementation of predictive computational models for research purposes in the field of personalized
medicine and health product development. It addresses the set-up, formatting, validation, simulation, storing
and sharing of computational models used for personalized medicine. Requirements and recommendations
for data used to construct or required for validating such models are also addressed. This includes rules for
formatting, descriptions, annotations, interoperability, integration, access and provenance of such data.
This document does not apply to computational models used for standard routine clinical, diagnostic or
therapeutic purposes.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 20691:2022, Biotechnology — Requirements for data formatting and description in the life sciences
ISO/TS 23494-1:2023, Biotechnology — Provenance information model for biological material and data — Part
1: Design concepts and general requirements
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https:// www .iso .org/ obp
— IEC Electropedia: available at https:// www .electropedia .org/
3.1
artificial intelligence
AI
capability to acquire, process, create and apply knowledge, held in the form of a model, to conduct
one or more given tasks
[SOURCE: ISO/IEC TR 24030:2021, 3.1]
3.2
molecular biomarker
biomarker
molecular marker
detectable and/or quantifiable molecule or group of molecules used to indicate a biological condition, state,
identity or characteristic of an organism
EXAMPLE Nucleic acid sequences, proteins, small molecules such as metabolites, other molecules such as lipids
and polysaccharides.
[SOURCE: ISO 16577:2022, 3.4.28]
3.3
big data in health
high volume, high diversity biological, clinical, environmental, and lifestyle information collected from single
individuals to large cohorts, in relation to their health and wellness status, at one or several time points
[SOURCE: Reference [8]]
3.4
community standard
standard that reflects the results of a standardization effort from a specific user group, and that is created
by individual organizations or communities
3.5
computational model
in silico model
description of a biological system in a mathematical expression and/or graphical form that is implemented
and studied with a computer highlighting objects and their interactions
Note 1 to entry: An object distributed processing (ODP) concept.
Note 2 to entry: The computational model is similar to the OMT and UML notion of a class diagram when using the
graphical form.
[SOURCE: ISO/IEC 16500-8:1999, 3.6, modified — Admitted term added. “biological”, “mathematical
expression and/or”, “that is implemented and studied with a computer” added, “interfaces” changed to
“interactions” and “as such it is similar to the OMT and UML notion of a class Diagram” deleted from the
definition. “An object distributed processing (ODP) concept” moved to Note 1 to entry. Note 2 to entry added.]
3.6
data-driven model
model developed through the use of data derived from tests or from the output of an investigated process or
from real world data or routinely acquired primary care data
[SOURCE: ISO 15746-1:2015, 2.4, modified — “or from real world data or routinely acquired primary care
data” added]
3.7
data harmonization
technical process of bringing together different data types to make them processable in the same
computational framework
3.8
data integration
systematic combining of data from different independent and potentially heterogeneous sources, to create a
more compatible, unified view of these data for research purpose
[SOURCE: ISO 5127:2017, 3.1.11.24]
3.9
genome-wide association studies
GWAS
testing of genetic variants across the genomes of many individuals to identify genotype–phenotype
associations
3.10
in silico clinical trial
use of individualized computer simulation in the development or regulatory evaluation of a medicinal
product, medical device or medical intervention
[SOURCE: Reference [9]]
3.11
in silico approach
computer-executable analyses of mathematical model(s) (3.13) to study and simulate a biological system
3.12
machine learning
ML
computer technology with the ability to automatically learn and improve from experience without being
explicitly programmed
EXAMPLE Speech recognition, predictive text, spam detection, artificial intelligence.
[SOURCE: ISO 20252:2019, 3.52, modified — Abbreviated term “ML” added.]
3.13
mathematical model
set of equations that describes the behaviour of a physical system
[SOURCE: ISO 16730-1:2015, 3.11]
3.14
mechanism-based
approach in computational modelling that aims for a structural representation
3.15
model validation
comparison between the output of the calibrated model and the measured data, independent of the data set
used for calibration
[SOURCE: ISO 14837-1:2005, 3.7]
3.16
model verification
confirmation that the mathematical elements of the model behave as intended
[SOURCE: ISO 14837-1:2005, 3.8]
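The distinction between these two terms can be illustrated, non-normatively, with a toy exponential-decay model: verification (3.16) checks that the implemented mathematics behaves as intended against a known analytic solution, while validation (3.15) compares calibrated output with measured data not used for calibration. All numbers below are fabricated purely for illustration:

```python
import math

def simulate(y0, k, t, steps=10000):
    """Euler integration of dy/dt = -k*y (the model's mathematical element)."""
    y, dt = y0, t / steps
    for _ in range(steps):
        y -= k * y * dt
    return y

# Model verification (3.16): the mathematical elements behave as intended,
# checked here against the known analytic solution y0 * exp(-k*t).
assert abs(simulate(1.0, 0.5, 2.0) - math.exp(-1.0)) < 1e-3

# Model validation (3.15): compare calibrated-model output with "measured"
# data independent of the calibration set (values fabricated for illustration).
independent_measurements = [(1.0, 0.61), (2.0, 0.37), (3.0, 0.22)]
errors = [abs(simulate(1.0, 0.5, t) - y) for t, y in independent_measurements]
print(f"max validation error: {max(errors):.3f}")
```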
3.17
personalized medicine
precision medicine
medical model using characterization of individuals’ phenotypes and genotypes for tailoring the right
therapeutic strategy for the right person at the right time, and/or to determine the predisposition to disease
and/or to deliver timely and targeted prevention
Note 1 to entry: Examples for individuals’ phenotypes and genotypes are molecular profiling, medical imaging and
lifestyle data.
Note 2 to entry: Medical decisions, prevention strategies and therapies in personalized medicine are based on this
individuality.
[SOURCE: EU 2015/C 421/03[10]]
3.18
phenotype
set of observable characteristics of an organism resulting from the interaction of its genotype with the
environment
[SOURCE: ISO 4454:2022, 3.14, Note 1 deleted]
3.19
raw data
data in its originally acquired, direct form from its source before subsequent processing
[SOURCE: ISO 5127:2017, 3.1.10.04]
4 Principles
4.1 General
Research in the field of personalized medicine is highly dependent on the exchange of data from different
sources, as well as harmonized integrative analysis of large-scale personalized medicine data (big data in
health). Computational modelling approaches play a key role for understanding, simulating and predicting
the molecular processes and pathways that characterize human biology. Modelling approaches in biomedical
research also lead to a more profound understanding of the mechanisms and factors that drive disease,
and consequently allow for adapting personalized treatment strategies that are guided by central clinical
questions. Patients can greatly benefit from this development in research that equips personalized medicine
with predictive capabilities to simulate in silico clinically relevant questions, such as the effect of therapies,
the response to drug treatments or the progression of disease.
4.2 Computational models in personalized medicine
4.2.1 General
Computational models have the potential to translate in vitro, non-clinical and clinical results (and their
related uncertainty) into descriptive or predictive expressions. The added value of such models in medicine
and pharmacology has increasingly been recognized by the scientific community[11][12][13][14], as well as by
regulatory bodies such as the European Medicines Agency (e.g. EMA guideline on PBPK reporting[15]), or
the US Food and Drug Administration (FDA)[16][17]. Computational models are integrated in different fields
in medicine as well as in the development of drugs and other health products, expanding from disease
modelling, molecular and physiological biomarker research to assessment of drug and medical device efficacy
and safety[18]. In silico approaches are also expanding in neighbouring fields, such as pharmacoeconomics[19],
analytical chemistry[20][21] and biology[22][23] that are out of scope of this document.
Model creation starts with a clinical question and the collection of data (see Figure 1). The data employed
need harmonized approaches for data integration to start the model construction. The initial model usually
undergoes several refinement and improvement iterations to enhance predictive capabilities. Common
standards (see 4.3.3) should be used for the model building and curation process. Accuracy measurements
and validation processes are key, and should be transparent, while model output and function should ideally
be interpretable or explainable.
A number of computational modelling approaches in pre-clinical and clinical research already address
these questions in detail (see 4.2.2 to 4.2.6) and, therefore, play a leading role for the future development of
personalized medicine.
Figure 1 — Modelling approach for personalized medicine
4.2.2 Cellular systems biology models
4.2.2.1 General
For the simulation of complex dynamic biological processes and networks, models can be either data-driven
(“top-down”) or mechanism-based (“bottom-up”).
Mechanism-based concepts aim for a structural representation of the governing physiological
processes based on model equations with a limited amount of data, which are required for the base model
establishment[24][25][26] or, alternatively, on static interacting networks[11][27]. Data-driven approaches
require sufficiently rich and quantitative (e.g. time-course) data to train and to validate the model. Due to
the often black-box nature of data-driven approaches, the model validation process relies on performance
tests against known results.
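A non-normative toy example contrasts the two approaches on a synthetic decay time course: the mechanism-based model estimates one physically interpretable parameter from a governing equation, while the data-driven stand-in (here a nearest-neighbour lookup, chosen purely for illustration) assumes no mechanism and is judged only by performance tests:

```python
import math

# Synthetic time-course data from a "true" first-order decay with k = 0.8
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
observed = [math.exp(-0.8 * t) for t in times]

# Mechanism-based: one governing equation, y(t) = exp(-k*t), with a single
# interpretable parameter k estimated by grid search against the data.
def mechanistic(t, k):
    return math.exp(-k * t)

best_k = min((k / 100 for k in range(1, 200)),
             key=lambda k: sum((mechanistic(t, k) - y) ** 2
                               for t, y in zip(times, observed)))

# Data-driven stand-in: no assumed mechanism; predictions come directly from
# the available data, so quality rests on performance tests alone.
def data_driven(t):
    return min(zip(times, observed), key=lambda p: abs(p[0] - t))[1]

print(f"estimated k = {best_k:.2f}")
```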
4.2.2.2 Challenges
The challenges are as follows:
— Creation of models that balance the level of abstraction with comprehensiveness to make modelling
efforts reproducible and reusable (abstraction versus size).
— Development of prediction models that can be adopted easily to individual patient profiles.
— Efficient parameter estimation tools to cope with population and disease heterogeneity.
— Overfitting of the model to the experimental/patient data and optimization methods for model predictions
in a realistic parametric uncertainty.
— Flexibility in models to cope with missing data (e.g. diverse patient profiles).
— Scaling from cellular to organ and to organism levels (e.g. high clinical relevance, high hurdles for
regulatory acceptance).
4.2.3 Risk prediction for common diseases
4.2.3.1 General
Predictive models stratify patients into distinct subgroups at different levels of risk for clinical outcomes
(risk prediction for disease). By training the algorithm on clinical data, phenotypic or genotypic, subgroups
can be identified which have identifiably different patterns of clinical markers. By then identifying which
patterns a patient fits best, the model can place a particular patient within the most similar trajectory,
thereby also stratifying the patient to a particular level of risk. Clinical markers used in such models can be
any health feature, tokenized so as to be analysable by the model, from data such as disease history, symptoms,
treatment and other exposure data, family history, laboratory data, etc., to genetic data.
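A non-normative sketch of such stratification: a toy logistic risk score over tokenized clinical markers places a patient into a risk subgroup. The marker names, weights and thresholds are illustrative assumptions only, not derived from real data:

```python
import math

def risk_score(markers, weights, bias=-3.0):
    """Toy logistic model: predicted probability of the clinical outcome."""
    z = bias + sum(weights[name] * value for name, value in markers.items())
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative weights over tokenized clinical markers (not real estimates)
weights = {"age_decades": 0.4, "family_history": 1.2, "biomarker_x": 0.8}

def stratify(p):
    """Place a patient in a risk subgroup by predicted probability."""
    return "high" if p >= 0.5 else "intermediate" if p >= 0.2 else "low"

patient = {"age_decades": 6.5, "family_history": 1, "biomarker_x": 0.9}
p = risk_score(patient, weights)
print(f"risk = {p:.2f} -> {stratify(p)} subgroup")
```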
4.2.3.2 Challenges
The challenges are as follows:
— Understanding the possible implication to patients at an individual level. What can be inferred? How to
test the inference made?
— Limited replication of measurements and analyses (e.g. genetic associations) and poor application of
diverse populations (e.g. too poorly represented to be of interest for specific analyses), specifically of
mixed or non-European ancestry.
— Varying transparency of methodological choices and reproducibility.
— Limited cellular/tissue context and harmonized functional data availability across populations/studies.
— Missing environmental information coupled to genetic data.
4.2.4 Disease course and therapy response prediction
4.2.4.1 General
Prediction of the disease behaviour (mild versus severe, stable versus progressive) early in the disease
course based on specific molecular biomarkers can allow an improved timing of therapy introduction, as
well as the choice of therapy scheme (targeted therapy)[28]. Ideally, these models can provide a prediction
of multi-factorial diseases at unprecedented resolution, in a way that clinicians can use the information in
their daily decision-making.
4.2.4.2 Challenges
The challenges are as follows:
— Harmonization and standardization of clinical information for measuring the disease of interest.
— Developing transparent and quality-controlled workflows for data generation and interpretation in
clinical settings.
— Harmonization and application of existing and upcoming pre-examination workflow standards (including
specimen collection, storage and nucleic acid isolation), as well as developing feasible ring trial formats
and external quality assurance (EQA) schemes for given molecular analysis types.
— Transparent reduction of contents and definition of appropriate marker sets and dynamic models to
foster clinical translation.
— Developing intuitive visualization results and insights into molecular analyses, as well as critical
appraisal of limitations of models by physicians.
4.2.5 Pharmacokinetic/-dynamic modelling and in silico trial simulations
4.2.5.1 General
Pharmacokinetic/pharmacodynamic (PK/PD) models[29][30] can usefully translate in vitro, non-clinical
and clinical PK/PD data into meaningful information to support decision-making. At the individual level,
substance PKs can be described either by non-compartmental analysis and compartmental PK modelling
or by physiologically-based PK (PBPK) modelling. PBPK models are commonly used for interspecies
extrapolations and drug-drug interaction modelling. At the population level, population PK models have
become the most commonly used top-down models, deriving a pharmaco-statistical model from observed
systemic concentrations. PK/PD modelling involves on the one hand a quantification of drug absorption,
disposition, metabolism and excretion (PK) and on the other hand a description of the drug-induced effect
(PD). PK/PD models and quantitative systems pharmacology (QSP) both aim for mechanistic and quantitative
analyses of the interactions between a substance such as a drug and a specific biological system[31].
PK and PBPK modelling are currently used for simulations for virtual patient populations in in silico clinical
trials. The concept is that computer simulations are proposed as an alternative source of evidence to support
drug development to reduce, refine, complement or replace the established data sources including in vitro
experiments, in vivo animal studies and clinical trials in healthy volunteers and patients.
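As an illustration of the compartmental PK modelling mentioned above, the following sketch simulates a minimal one-compartment model with first-order absorption and elimination, integrated with a simple Euler scheme. All parameter values are hypothetical and chosen for illustration only; real PK/PBPK models involve many more compartments and physiological parameters.

```python
# Minimal one-compartment PK model with first-order absorption and
# elimination. All parameter values below are hypothetical.

dose = 100.0   # mg, single oral dose
ka   = 1.0     # 1/h, absorption rate constant
ke   = 0.2     # 1/h, elimination rate constant
vd   = 50.0    # L, volume of distribution
dt   = 0.01    # h, Euler integration step

def simulate(t_end):
    """Return (time, plasma concentration in mg/L) pairs for a single dose."""
    a_gut, a_central = dose, 0.0
    series, t = [], 0.0
    while t <= t_end:
        series.append((round(t, 2), a_central / vd))
        absorbed   = ka * a_gut * dt       # first-order transfer gut -> plasma
        eliminated = ke * a_central * dt   # first-order elimination from plasma
        a_gut     -= absorbed
        a_central += absorbed - eliminated
        t += dt
    return series

profile = simulate(t_end=24.0)
cmax = max(c for _, c in profile)
tmax = max(profile, key=lambda p: p[1])[0]
print(f"Cmax ~ {cmax:.2f} mg/L at t ~ {tmax:.1f} h")
```

For this parameter set the simulated peak agrees with the analytic solution C(t) = dose*ka/(vd*(ka-ke))*(exp(-ke*t) - exp(-ka*t)), which is one simple way to verify that the mathematical elements of the model behave as intended.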
4.2.5.2 Challenges
The challenges are as follows:
— Reliable data sources for systems-related parameters are currently limited.
— Methods for data generation, collection and integration are not standardized.
— Reporting of results is very heterogeneous and inconsistent[32].
— Tools to be used and criteria for model evaluation are very variable across projects.
— Very limited platforms (systems model) are currently considered reliable and qualified for regulatory
submission.
4.2.6 Artificial intelligence models
4.2.6.1 General
Data-driven approaches utilizing artificial intelligence (AI) and machine learning (ML) treat the mechanism
as unknown and aim to model a function that operates on data input to predict the outcome, regardless of
the unknown physiological processes. The mechanisms operating in the complex systems being modelled,
i.e. which factors together drive outcomes, are considered too complex to be determined (e.g. black-box
models). The quality of AI-based models is assessed through the accuracy of their predictions, tested in
a variety of ways. These data-driven models can be applied in a hypothesis-naive way, with no assumptions
made as to which factors drive the causal mechanism.
ML approaches learn the theory automatically from the data through a process of inference, model fitting or
learning from examples[33]. ML can be supervised, unsupervised or partially supervised (see Annex B).
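As a minimal illustration of supervised learning from examples, the following sketch fits a linear decision rule to synthetic labelled data by iterative error correction (a perceptron). The data, labels and learning rate are hypothetical; the sketch only shows the principle of model fitting from labelled examples.

```python
# Minimal supervised learning: a perceptron fitted to synthetic labelled
# examples by iterative error correction. All data are illustrative.

# Each example: (features, label); label 1 = outcome present, 0 = absent.
training = [
    ([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.3, 0.3], 0),
    ([0.8, 0.9], 1), ([0.9, 0.7], 1), ([0.7, 0.8], 1),
]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    """Linear decision rule: outcome present if the weighted sum is positive."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

# Model fitting: adjust weights whenever a training example is misclassified.
for _ in range(100):
    for x, y in training:
        err = y - predict(x)
        if err:
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err

print([predict(x) for x, _ in training])
```

Unsupervised and partially supervised approaches (see Annex B) differ in that labels are absent or only partially available, so the inference step cannot rely on per-example error correction of this kind.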
4.2.6.2 Challenges
The challenges are as follows:
— Imprecise reporting, which makes it difficult to obtain the full benefit of results, navigate biomedical
literature and generate clinically actionable findings.
— Data standardization, since most in silico methods require comparable input data.
— Data based on group associations, or pre-determined understanding of clinical relationships, can bias
and limit AI/ML predictions (inappropriately pre-processed data).
— Different proprietary systems in healthcare information technology (IT) make data extraction, labelling,
interpretation and standardization highly complex procedures (data lockdown).
4.3 Standardization needs for computational models
4.3.1 General
Major challenges in the field of personalized medicine are to harmonize the standardization efforts that
refer to different data types, approaches and technologies, as well as to make the standards interoperable,
so that the data can be compared and integrated into models. Reproducible modelling in personalized
medicine requires a basic understanding of the modelled system, as well as of its biological and physiological
background, and finally of the applied virtual experiments.
Because of the heterogeneous nature of the data in personalized medicine, harmonized strategies for data
integration are required that utilize broadly applicable standards to allow for reproducible data exploitation
to generate new knowledge for medical benefits. Whereas the model simulation process itself can vary
greatly or be even partially unknown, e.g. in AI-driven modelling, making it hard to standardize, the
integration of data into the model (input), as well as the outcome of the model (output) can be standardized
and validated. Extensive model validation, e.g. with a set of standardized and high-quality validation data
as input, can be used to validate the whole modelling process, even if the model simulation itself is not
standardized. The two key components for which broad standardization efforts make most sense in the
model building process are thus data integration and model validation (see Figure 2).
Figure 2 — Data integration and model validation as key factors for standardization requirements
for computational models
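The validation principle described above, comparing model output with an independent, standardized validation data set even when the simulation step itself is not standardized, can be sketched as follows. The linear model, the synthetic validation values and the acceptance threshold are all hypothetical illustrations, not requirements of this document.

```python
# Sketch of validating a calibrated model against a held-out, standardized
# validation data set. All values below are synthetic; the acceptance
# threshold is a hypothetical, project-defined criterion.

def model(x):
    """Calibrated model under validation (here: a fixed linear predictor)."""
    return 2.0 * x + 1.0

# Standardized validation data, independent of the data used for calibration.
validation_inputs  = [0.0, 1.0, 2.0, 3.0]
validation_targets = [1.1, 2.9, 5.2, 6.8]

def rmse(inputs, targets):
    """Root-mean-square error between model output and measured data."""
    errors = [(model(x) - y) ** 2 for x, y in zip(inputs, targets)]
    return (sum(errors) / len(errors)) ** 0.5

threshold = 0.5  # hypothetical project-defined acceptance criterion
score = rmse(validation_inputs, validation_targets)
print(f"RMSE = {score:.3f}; model {'accepted' if score <= threshold else 'rejected'}")
```

Because the validation data are independent of the calibration data, an acceptable error metric provides evidence about the whole modelling process, including any non-standardized simulation step in between.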
4.3.2 Challenges
Although for many different data types used in personalized medicine there are domain-specific annotation
standards and terminologies available (see Annex A), the process of model building possesses the following
variety of challenges:
— High degree of variability regarding data types (structured versus unstructured, molecular, clinical,
laboratory, patient-reported, etc.).
— Differences in coding and calculation within data types (between-machine variability, different
measurements, etc.).
— Heterogeneous utilization of existing data and lack of domain- and data-specific standard methods for
data pre-processing.
— High effort of data harmonization in terms of time, resources and cost.
— Models relevant for clinical use need to be fit for purpose.
— Differences in IT systems used in data generation, e.g. enterprise resource planning systems and
laboratory result software or hardware, at national, regional or clinical centre level.
— Lack of standard workflows (compliant with national and regional regulations and laws) for personal
health data access and processing.
— Lack of training, awareness and empowerment for existing standards and workflows.
— Adoption of different domain-specific terminology standards for health data such as SNOMED CT, NPU
(Nomenclature for Properties and Units) or LOINC (Logical Observation Identifier Names and Codes).
— Differences in implementation of international terminologies such as the International Classification of
Diseases (ICD).
— Long-term variety and dynamics of data and standards.
— Language differences in unstructured text, and other factors.
4.3.3 Common standards relevant for personalized medicine
The use of common standards developed by specific user communities and different stakeholders, as well as
standard-defining organizations, has been enhanced as these standards have been coupled to tools that have
spread in the respective fields of research. Annex A provides an overview of some of these standards currently
in use by different communities.
4.4 Data preparation for integration into computer models
4.4.1 General
Computational models in the life sciences in general as well as in healthcare and personalized medicine
research in particular are increasingly incorporating rich and varied data sets to capture multiple aspects
of the modelled phenomenon. Data types are encoded in technology and subdomain specific formats and the
variety and incompatibility, as well as lack of interoperability, of such data formats have been noted as one of
the major hurdles for data preparation.
To allow for seamless integration of data used for the construction of predictive computational models in
personalized medicine, these data shall:
— include or be annotated with sampling and specimen data that follow the requirements and
recommendations in accordance with the relevant domain-specific standards;
— be formatted using generally accepted and interoperable standard data formats commonly used for the
corresponding data types (in accordance with ISO 20691);
— include or be annotated with descriptive metadata that consider generally accepted domain-specific
minimum information guidelines and describes the metadata attributes and entities using semantic
standards, such as standard terminologies, controlled vocabularies and ontologies (as specified in
ISO 20691:2022, Annex B);
— follow best practice requirements and recommendations of generally accepted domain-specific data
interoperability frameworks;
— be structured in a way that allows integration of the data into a model, together with other data;
— include or be annotated with data provenance information that allows for tracking of the data and source
material throughout the whole data processing and modelling;
— be made accessible via harmonized data access agreements (hDAAs) for controlled access data, if open
access to the data is not possible.
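The requirements listed above can be illustrated by pairing a data set with descriptive metadata, semantic annotations and provenance information before integration into a model. All field names, file names and terminology identifiers below are hypothetical examples of the annotation principle, not a schema prescribed by this document.

```python
import json

# Illustrative, hypothetical record pairing a data set with descriptive
# metadata, semantic annotations and provenance information.
annotated_dataset = {
    "data": {"format": "CSV", "uri": "transcript_counts.csv"},
    "metadata": {
        "assay": "RNA-Seq",
        "organism_term": "NCBITaxon:9606",  # example ontology annotation
        "tissue_term": "UBERON:0002107",    # example ontology annotation
    },
    "provenance": [
        {"step": "specimen_collection", "timestamp": "2025-01-10T09:00:00Z"},
        {"step": "rna_isolation",       "timestamp": "2025-01-10T11:30:00Z"},
        {"step": "sequencing",          "timestamp": "2025-01-12T08:00:00Z"},
    ],
    "access": {"mode": "controlled", "agreement": "hDAA"},
}

def provenance_is_ordered(record):
    """Check that provenance steps are recorded in chronological order."""
    times = [step["timestamp"] for step in record["provenance"]]
    return times == sorted(times)  # ISO 8601 strings sort chronologically

print(provenance_is_ordered(annotated_dataset))
print(json.dumps(annotated_dataset["metadata"], indent=2))
```

Machine-readable records of this kind allow both the data and the source material to be tracked throughout data processing and modelling, and support automated checks before integration.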
4.4.2 Sampling data
Dedicated measures shall be taken for collecting, stabilizing, transporting, storing and processing of
biological specimen/samples, to ensure that profiles of analytes of interest (e.g. gene sequence, transcript,
protein, metabolite) for examination are not changed ex vivo. Without these measures, analyte profiles can
change drastically during and after specimen collection, thus making the outcome from diagnostics or
research unreliable or even impossible, because the subsequent examination cannot determine the situation
in the patient, but determines an artificial profile generated during the pre-examination process.
NOTE Important measures include, for example, times and temperatures of sample transportation not exceeding
the specifications provided in relevant International Standards (e.g. ISO 20916, ISO 20186-1) and International
Technical Specifications (e.g. ISO/TS 20658), giving guidelines on all steps of the pre-examination workflow.
Measurement methods for analysing the specimens should follow standard approaches as much as possible.
For instance, characterization of biological tissues should be done following community consensus
approaches in order for the data to be reliable and accurate enough for modelling purposes (see ISO 20691
for specific metadata and documentation recommendations and requirements).
Conditions applied to a specimen shall be documented in addition to other important metadata, including
but not limited to the content of Table 1.
Table 1 — Important metadata collected during pre-examination workflows

Specimen collection
— ID of responsible person

Information about specimen donor
— ID
— Health status (e.g. healthy, disease type, concomitant disease, demographics such as age and gender)
— Routine medical treatment and special treatment prior to specimen collection (e.g. anaesthetics, medications, surgical or diagnostic procedures, fasting status)
— Appropriate consent from the specimen donor/patient

Information about the specimen, collection from the donor or patient and processing
— Type and purpose of the examination requested
— Specimen collection technique used (e.g. surgery, draw, flush)
— Anatomical location where the specimen was taken from (at body part or organ level, but even relative position, spatial coordinates or genetic locus, if applicable), described following existing standards
— Time and date when the specimen is removed from the body
— Documentation of any additions or modifications to the specimen after removal from the body (e.g. addition of reagents)

Specimen storage and transport
— Temperatures of the collection device’s surroundings

Specimen reception
— ID or name of the person receiving the specimen
— Arrival date,
...
FINAL DRAFT
International Standard
ISO/TC 276
Secretariat: DIN
Voting begins on: 2026-03-24
Voting terminates on: 2026-05-19

Biotechnology — Predictive computational models in personalized medicine research —
Part 1: Constructing, verifying and validating models

Biotechnologie — Modèles informatiques prédictifs dans la recherche sur la médecine personnalisée —
Partie 1: Construction, vérification et validation des modèles

RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.

IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.

Reference number
© ISO 2026
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.

ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Principles
4.1 General
4.2 Computational models in personalized medicine
4.2.1 General
4.2.2 Cellular systems biology models
4.2.3 Risk prediction for common diseases
4.2.4 Disease course and therapy response prediction
4.2.5 Pharmacokinetic/pharmacodynamic modelling and in silico trial simulations
4.2.6 Artificial intelligence systems (AI systems)
4.3 Standardization needs for computational models
4.3.1 General
4.3.2 Challenges
4.3.3 Common standards relevant for personalized medicine
4.4 Data preparation for integration into computer models
4.4.1 General
4.4.2 Sampling data
4.4.3 Data formatting
4.4.4 Data description
4.4.5 Data annotation (semantics)
4.4.6 Data interoperability requirements across subdomains
4.4.7 Data integration
4.4.8 Data provenance information
4.4.9 Data access
4.5 Model formatting
4.6 Model validation
4.6.1 General
4.6.2 Specific recommendations for model validation
4.7 Model simulation
4.7.1 General
4.7.2 Requirements for capturing and sharing simulation set-ups
4.7.3 Requirements for capturing and sharing simulation results
4.8 Requirements for model storing and sharing
4.9 Application of models in clinical trials and research
4.9.1 General
4.9.2 Specific recommendations
4.10 Ethical requirements for modelling in personalized medicine
Annex A (informative) Common standards relevant for personalized medicine and in silico approaches
Annex B (informative) Information on modelling approaches relevant for personalized medicine
Bibliography
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee. International organizations,
governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely
with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types
of ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent
rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a)
patent(s) which may be required to implement this document. However, implementers are cautioned that
this may not represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 276, Biotechnology.
This second edition cancels and replaces the first edition (ISO/TS 9491-1:2023), which has been technically
revised.
The main changes are as follows:
— normative references in Clause 2 have been consolidated, updated and revised;
— update and clarification of terminology including the alignment with the terminology of ISO/TS 9491-2;
— updated to match the latest developments in the domain;
— bibliography has been revised and updated;
— editorial revision and clarification of wording.
A list of all parts in the ISO 9491 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The capacity to generate data in life sciences and health research has greatly increased in the last decade.
In combination with patient/personal-derived data, such as electronic health records, patient registries and
databases, as well as lifestyle information, this big data holds an immense potential for clinical applications,
especially for computer-based models with predictive capacities in personalized medicine. However, and
despite the ever-progressing technological advances in producing data, the exploitation of big data to
generate new knowledge for medical benefits, while guaranteeing data privacy and security, is lagging
behind its full potential. A reason for this obstacle is the inherent heterogeneity of big data and the lack
of broadly accepted standards allowing interoperable integration of heterogeneous health data to perform
analysis and interpretation for predictive modelling approaches in health research, such as personalized
medicine.
Common standards lead to a mutual understanding and improve information exchange within and across
research communities and are indispensable for collaborative work. In order to set up computer models in
personalized medicine, data integration from heterogeneous and different sources at different times plays a
key role. Consistent documentation of data, models and simulation results based on basic guiding principles
for data management practices, such as FAIR (findable, accessible, interoperable, reusable)[6] or ALCOA
(attributable, legible, contemporaneous, original, accurate), and standards can ensure that the data and
the corresponding metadata (data describing the data and its context), as well as the models, methods and
visualizations, are of reliably high quality.
Hence, standards for biomedical and clinical data, simulation models and data exchange are a prerequisite
for reliable integration of health-related data[7]. Such standards, together with harmonized ways to describe
their metadata, ensure the interoperability of tools used for data integration and modelling, as well as the
reproducibility of the simulation results. In this sense, modelling standards are agreed ways of consistently
structuring, describing, and associating models and data, their respective parts and their graphical
visualization, as well as the information about applied methods and the outcome of model simulations. Such
standards also assist in describing how constituent parts interact, or are linked together, and how they are
embedded in their physiological context.
Major challenges in the field of personalized medicine are to:
a) harmonize the standardization efforts that refer to different data types, approaches and technologies;
b) make the standards interoperable, so that the data can be compared and integrated into models.
An overall goal is to FAIRify data and processes in order to improve data integration and reuse. An additional
challenge is to ensure a legal and ethical framework enabling interoperability.
This document presents computational modelling requirements and recommendations for research in
the field of personalized medicine, especially with focus on collaborative research, such that health-
related data can be optimally used for translational research and personalized medicine worldwide.
The recommendations are primarily oriented towards the application of computational modelling in
the biotechnology domain (e.g. biomolecular and cellular research, as well as in clinical trials and drug
development), but also can be applied in other fields of personalized medicine research.
FINAL DRAFT International Standard ISO/FDIS 9491-1:2026(en)
Biotechnology — Predictive computational models in
personalized medicine research —
Part 1:
Constructing, verifying and validating models
1 Scope
This document specifies requirements and recommendations for the design, development and
implementation of predictive computational models for research purposes in the field of personalized
medicine and health product development.
This document addresses the set-up, formatting, validation, simulation, storing and sharing of computational
models used for personalized medicine. Requirements and recommendations for data used to construct
or required for validating such models are also specified. This includes rules for formatting, descriptions,
annotations, interoperability, integration, access and provenance of such data.
This document does not apply to computational models used for standard routine clinical, diagnostic or
therapeutic purposes.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 20691, Biotechnology — Requirements for data formatting and description in the life sciences 1)
ISO 20387:2025, Biotechnology — Biobanking — General requirements for biobanks 2)
ISO 23494-1, Biotechnology — Provenance information model for biological material and data — Part 1: Design
concepts and general requirements
ISO 23494-2, Biotechnology — Provenance information model for biological material and data — Part 2:
Common Provenance Model
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https:// www .iso .org/ obp
— IEC Electropedia: available at https:// www .electropedia .org/
1) https:// fairsharing .org/ 3533
2) Under preparation. Stage at the time of publication: ISO/DIS 20387:2025.
3.1
artificial intelligence
AI
research and development of mechanisms and applications of AI systems (3.2)
Note 1 to entry: Research and development can take place across any number of fields such as computer science, data
science, humanities, mathematics and natural sciences.
[SOURCE: ISO/IEC 22989:2022, 3.1.3]
3.2
artificial intelligence system
AI system
engineered system that generates outputs such as content, forecasts, recommendations or decisions for a
given set of human-defined objectives
Note 1 to entry: The engineered system can use various techniques and approaches related to artificial intelligence to
develop a model to represent data, knowledge, processes, etc. which can be used to conduct tasks.
Note 2 to entry: AI systems are designed to operate with varying levels of automation.
[SOURCE: ISO/IEC 22989:2022, 3.1.4]
3.3
big data
extensive datasets — primarily in the data characteristics of volume, variety, velocity, and/or variability —
that require a scalable technology for efficient storage, manipulation, management, and analysis
Note 1 to entry: Big data is commonly used in many different ways, for example as the name of the scalable technology
used to handle big data extensive datasets.
EXAMPLE High volume, high diversity biological, clinical, environmental, and lifestyle information collected from
single individuals to large cohorts, in relation to their health and wellness status, at one or several time points (see
reference [8] for additional information)
[SOURCE: ISO/TR 24291:2021, 3.2, modified — EXAMPLE added.]
3.4
community consensus standard
standard that reflects the results of a consensus standardization effort from a specific domain-specific
expert group outside of recognized standard defining organizations and their technical committees
Note 1 to entry: Created by domain-specific professional societies, scientific standardization initiatives, individual
organizations or research communities (often in collaboration with industry partners)
Note 2 to entry: Often publicly available, open and not proprietary
3.5
computational model
in silico model
description of a biological system in either a mathematical expression or graphical form, or both, that is
implemented and studied with a computer highlighting objects and their interactions
Note 1 to entry: An object distributed processing (ODP) concept.
[SOURCE: ISO/IEC 16500-8:1999, 3.6, modified — Admitted term added. “biological”, “mathematical
expression or”, “, or both, that is implemented and studied with a computer” added, “interfaces” changed
to “interactions” and “as such it is similar to the OMT and UML notion of a class diagram” deleted from the
definition. “An object distributed processing (ODP) concept” moved to Note 1 to entry.]
3.6
data-driven model
model developed through the use of data derived from tests or from the output of investigated process or
from real world data or routinely acquired primary care data
[SOURCE: ISO 15746-1:2015, 2.4, modified — “or from real world data or routinely acquired primary care
data” added]
3.7
harmonization of data concepts
data harmonization
process of reconciling differences in semantics, structure and syntax of similar data concepts
Note 1 to entry: Harmonization can include the establishment of a single pervasive definition for each data concept
(i.e. standardization), but can also encompass flexible approaches in which definitions can be understood to grow
closer without becoming identical.
[SOURCE: ISO/TR 25100:2012, 2.1.4, modified — “harmonisation” replaced by “harmonization”, “may” in
Note 1 to entry replaced by “can”.]
3.8
data integration
systematic combining of data from different independent and potentially heterogeneous sources, to create a
more compatible, unified view of these data for research purpose
[SOURCE: ISO 5127:2017, 3.1.11.24]
3.9
genome-wide association studies
GWAS
testing of genetic variants across the genomes of many individuals to identify genotype–phenotype
associations
3.10
in silico clinical trial
use of computer modelling and simulation(s) to mimic human experimentation in the development or
regulatory evaluation process of a medicinal product (e.g. medical device) or medical intervention, under
defined conditions using verified and validated models
Note 1 to entry: It is a subdomain of ‘in silico medicine’, the discipline that encompasses the use of individualised
computer simulations in all aspects of the prevention, diagnosis, prognostic assessment, and treatment of disease.
[SOURCE: Reference [9], modified — Note 1 to entry added.]
3.11
in silico approach
computer-executable analyses of mathematical model(s) (3.13) to study and simulate a biological system
3.12
machine learning
ML
computer technology with the ability to automatically learn and improve from experience without being
explicitly programmed
EXAMPLE Speech recognition, predictive text, spam detection, or optimizing model parameters through
computational techniques, such that the model's behaviour reflects the data or experience.
[SOURCE: ISO 20252:2019, 3.52, modified — Abbreviated term “ML” added and EXAMPLES changed to
“Speech recognition, predictive text, spam detection, or optimizing model parameters through computational
techniques, such that the model's behaviour reflects the data or experience.”.]
3.13
mathematical model
set of equations that describes the behaviour of a physical system
[SOURCE: ISO 16730-1:2015, 3.11]
3.14
mechanism-based
approach in computational modelling that aims for a structural representation
3.15
model validation
comparison between the output of the calibrated model and the measured data, independent of the data set
used for calibration
[SOURCE: ISO 14837-1:2005, 3.7]
3.16
model verification
confirmation that the mathematical elements of the model behave as intended
[SOURCE: ISO 14837-1:2005, 3.8]
3.17
molecular biomarker
biomarker
molecular marker
detectable and/or quantifiable molecule or group of molecules used to indicate a biological condition, state,
identity or characteristic of an organism (e.g. an individual)
EXAMPLE Nucleic acid sequences, proteins, small molecules such as metabolites, other molecules such as lipids
and polysaccharides.
[SOURCE: ISO 16577:2022, 3.4.28, modified — “or an” changed to “of an” and “(e.g. an individual)” added to
definition.]
3.18
personalized medicine
precision medicine
medical model using characterization of individuals’ phenotypes and genotypes for tailoring the right
therapeutic strategy for the right person at the right time, and/or to determine the predisposition to disease
and/or to deliver timely and targeted prevention
Note 1 to entry: Examples for individuals’ phenotypes and genotypes are molecular profiling, medical imaging and
lifestyle data.
Note 2 to entry: Medical decisions, prevention strategies and therapies in personalized medicine are based on this
individuality.
[SOURCE: EU 2015/C 421/03[10], modified — Notes 1 and 2 to entry added and “(e.g. molecular profiling,
medical imaging, lifestyle data)” deleted from definition.]
3.19
phenotype
set of observable characteristics of an organism resulting from the interaction of its genotype with the
environment
[SOURCE: ISO 4454:2022, 3.14, modified — Note 1 to entry deleted.]
3.20
raw data
data in its originally acquired, direct form from its source before subsequent processing
[SOURCE: ISO 5127:2017, 3.1.10.04]
4 Principles
4.1 General
Research in the field of personalized medicine is highly dependent on the exchange of data from different
sources, as well as harmonized integrative analysis of large-scale personalized medicine data (big data in
health research). Computational modelling approaches play a key role for understanding, simulating and
predicting the molecular processes and pathways that characterize human biology. Modelling approaches in
biomedical research also lead to a more profound understanding of the mechanisms and factors that drive
diseases, and consequently allow for adapting personalized treatment strategies that are guided by central
clinical questions. Patients can greatly benefit from this development in research that equips personalized
medicine with predictive capabilities to simulate in silico clinically relevant questions, such as the effect of
therapies, the response to drug treatments or the progression of disease.
4.2 Computational models in personalized medicine
4.2.1 General
Computational models have the potential to translate in vitro, non-clinical and clinical results (and their
related uncertainty) into descriptive or predictive expressions. The added value of such models in medicine
and pharmacology has increasingly been recognized by the scientific community[11][12][13][14], as well as by
regulatory bodies such as the European Medicines Agency (e.g. EMA guideline on PBPK reporting[15]), or
the US Food and Drug Administration (FDA)[16][17]. Computational models are integrated in different fields
in medicine as well as in the development of drugs and other health products, expanding from disease
modelling, molecular and physiological biomarker research to assessment of drug and medical device efficacy
and safety[18]. In silico approaches are also expanding in neighbouring fields, such as pharmacoeconomics[19],
analytical chemistry[20][21] and biology[22][23], that are out of scope of this document.
Model creation starts with a clinical question and the collection of data (see Figure 1). The data employed
need harmonized approaches for data integration to start the model construction. The initial model usually
undergoes several refinement and improvement iterations to enhance predictive capabilities. Common
standards (see 4.3.3) should be used for the model building and curation process. Accuracy measurements
and validation processes are key, and should be transparent, while model output and function should ideally
be interpretable or explainable.
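The refinement loop described above can be sketched as follows. This is a non-normative illustration: the one-parameter classifier (a simple decision threshold), the validation data and the accuracy measure are all hypothetical, and serve only to show how a candidate refinement is kept when it improves a transparent accuracy measurement.

```python
# Non-normative sketch: a hypothetical one-parameter model (a decision
# threshold) is refined iteratively against held-out validation data, and a
# candidate refinement is kept only when it improves the measured accuracy.

def accuracy(threshold, validation_set):
    """Fraction of (value, label) pairs classified correctly by the threshold."""
    return sum((v >= threshold) == bool(l) for v, l in validation_set) / len(validation_set)

# Hypothetical held-out validation data: (biomarker value, outcome label).
validation = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.3, 0)]

best_t, best_acc = 0.0, accuracy(0.0, validation)
for step in range(1, 10):                    # refinement iterations
    candidate = step / 10
    acc = accuracy(candidate, validation)
    if acc > best_acc:                       # keep only measurable improvements
        best_t, best_acc = candidate, acc
print(best_t, best_acc)  # 0.5 1.0
```

Keeping the accuracy measurement separate from the model itself is what makes the refinement process transparent and auditable, whatever the internal form of the model.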
A number of computational modelling approaches in pre-clinical and clinical research already address
these questions in detail (see 4.2.2 to 4.2.6) and, therefore, play a leading role for the future development of
personalized medicine.
Figure 1 — Modelling approach for personalized medicine
4.2.2 Cellular systems biology models
4.2.2.1 General
For the simulation of complex dynamic biological processes and networks, models can be either data-driven
(“top-down”) or mechanism-based (“bottom-up”).
Mechanism-based concepts aim for a structural representation of the governing physiological processes based
on model equations with a limited amount of data, which are required for the base model establishment[24] or,
alternatively, on static interacting networks[25][26]. Data-driven approaches[11][27] require sufficiently rich
and quantitative (e.g. time-course) data to train and to validate the model. Due to the occasional black-box
nature of data-driven approaches, the model validation process relies on performance tests against known
results.
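The contrast can be illustrated with a deliberately small mechanism-based model. The equation, parameter names and values below are hypothetical and serve only to show the structural, equation-driven style of modelling:

```python
# Illustrative mechanism-based model (hypothetical): a single protein whose
# concentration x follows the model equation dx/dt = k_syn - k_deg * x.

def simulate_protein(k_syn, k_deg, x0=0.0, dt=0.01, t_end=10.0):
    """Forward-Euler integration of dx/dt = k_syn - k_deg * x."""
    x, trajectory = x0, []
    for _ in range(int(t_end / dt)):
        x += dt * (k_syn - k_deg * x)
        trajectory.append(x)
    return trajectory

# The structure of the equations makes the behaviour interpretable:
# the trajectory approaches the analytical steady state k_syn / k_deg = 4.0.
traj = simulate_protein(k_syn=2.0, k_deg=0.5)
```

A data-driven counterpart would instead fit a generic function to measured time-course data, and would be validated by performance tests against known results rather than by inspection of its structure.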
4.2.2.2 Challenges
The challenges are as follows:
a) the creation of models that balance the level of abstraction with comprehensiveness to make modelling
efforts reproducible and reusable (abstraction versus size);
b) the development of prediction models that can be adapted easily to individual patient profiles;
c) efficient parameter estimation tools to cope with population and disease heterogeneity;
d) overfitting of the model to the experimental/patient data and optimization methods for model
predictions in a realistic parametric uncertainty;
e) flexibility in models to cope with missing data (e.g. diverse patient profiles);
f) scaling from cellular to organ and to organism levels (e.g. high clinical relevance, high hurdles for
regulatory acceptance).
4.2.3 Risk prediction for common diseases
4.2.3.1 General
Predictive models stratify patients into distinct subgroups at different levels of risk for clinical outcomes
(risk prediction for disease). By training the algorithm on clinical data, phenotypic or genotypic subgroups
can be identified which have identifiably different patterns of clinical markers. By then identifying which
patterns a patient fits best, the model can place a particular patient within the most similar trajectory,
thereby also stratifying the patient to a particular level of risk. Clinical markers used in such models can
be any health feature, which can be tokenized to be analysable by the model. These health features range
from disease history symptoms, treatment and other exposure data, family history, laboratory data, etc., to
genetic data.
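As a non-normative sketch, the stratification described above can be reduced to scoring tokenized health features and placing each patient in the subgroup their score fits best. The feature names, weights and cut-offs here are hypothetical; in practice the weights would be learned by training the algorithm on clinical data:

```python
# Illustrative only: stratifying patients into risk subgroups from tokenized
# health features. Feature names and weights are hypothetical placeholders.

RISK_WEIGHTS = {"smoker": 2.0, "family_history": 1.5, "high_ldl": 1.0}

def risk_score(patient):
    """Sum the weights of the risk features present in a patient profile."""
    return sum(w for feat, w in RISK_WEIGHTS.items() if patient.get(feat))

def stratify(patient, low=1.0, high=3.0):
    """Place the patient in the most similar risk subgroup by score."""
    score = risk_score(patient)
    if score < low:
        return "low"
    return "medium" if score < high else "high"

print(stratify({"smoker": True, "high_ldl": True}))   # "high"
print(stratify({"family_history": True}))             # "medium"
```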
4.2.3.2 Challenges
The challenges are as follows:
a) understanding the possible implication to patients at an individual level:
1) What can be inferred?
2) How to test the inference made?
b) limited replication of measurements and analyses (e.g. genetic associations) and poor representation of
diverse populations (e.g. populations too poorly represented to be of interest for specific analyses), specifically
those of mixed or non-European ancestry;
c) varying transparency of methodological choices and reproducibility;
d) limited cellular/tissue context and harmonized functional data availability across populations/studies;
e) missing environmental information coupled to genetic data.
4.2.4 Disease course and therapy response prediction
4.2.4.1 General
Prediction of the disease behaviour (mild versus severe, stable versus progressive) early in the disease
course based on specific molecular biomarkers can allow an improved timing of therapy introduction, as
well as the choice of therapy scheme (targeted therapy)[28]. Ideally, these models can provide a prediction
of multi-factorial diseases at unprecedented resolution, in a way that clinicians can use the information in
their daily decision-making.
4.2.4.2 Challenges
The challenges are as follows:
a) harmonization and standardization of clinical information for measuring the disease of interest;
b) developing transparent and quality-controlled workflows for data generation and interpretation in
clinical settings;
c) harmonization and application of existing and upcoming pre-examination workflow standards
(including specimen collection, storage and nucleic acid isolation), as well as developing feasible ring
trial formats and external quality assurance (EQA) schemes for given molecular analysis types;
d) transparent reduction of contents and definition of appropriate marker sets and dynamic models to
foster clinical translation;
e) developing intuitive visualizations of results and insights from molecular analyses, as well as critical
appraisal of the limitations of models by physicians.
4.2.5 Pharmacokinetic/pharmacodynamic modelling and in silico trial simulations
4.2.5.1 General
Pharmacokinetic/pharmacodynamic (PK/PD) models[29][30] can translate in vitro, non-clinical and clinical
PK/PD data into meaningful information to support decision-making. At the individual level, substance
PKs can either be described by non-compartmental analysis and compartmental PK modelling or by
physiologically-based PK (PBPK) modelling. PBPK models are commonly used for interspecies extrapolations
and drug-drug interactions modelling. At the population level, population PK models have become the
most commonly used top-down models that derive a pharmaco-statistical model from observed systemic
concentrations. PK/PD modelling involves on the one hand a quantification of drug absorption, disposition,
metabolism and excretion (PK) and on the other hand a description of the drug-induced effect (PD). PK/PD
models and quantitative systems pharmacology (QSP) both aim for mechanistic and quantitative analyses of
the interactions between a substance such as a drug and a specific biological system[31].
PK and PBPK modelling are currently used for simulations for virtual patient populations in in silico clinical
trials. The concept is that computer simulations are proposed as an alternative source of evidence to support
drug development to reduce, refine, complement or replace the established data sources including in vitro
experiments, in vivo animal studies and clinical trials in healthy volunteers and patients.
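A minimal, non-normative sketch of such a simulation follows, assuming a hypothetical one-compartment model with first-order absorption (the Bateman equation) and log-normal inter-individual variability on the elimination constant; all parameter values (dose, ka, ke, V) are illustrative, not taken from this document:

```python
import math
import random

def concentration(t, dose, ka, ke, v):
    """Plasma concentration at time t for a one-compartment model with
    first-order absorption (Bateman equation)."""
    return (dose * ka) / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

def virtual_population(n, seed=0):
    """Sample per-subject elimination constants for an in silico trial arm,
    with hypothetical log-normal inter-individual variability."""
    rng = random.Random(seed)
    return [0.2 * math.exp(rng.gauss(0, 0.2)) for _ in range(n)]

# Peak concentration (Cmax) for each virtual subject over a 24 h time grid.
cmax_per_subject = [
    max(concentration(t / 10, dose=100, ka=1.5, ke=ke, v=50) for t in range(1, 241))
    for ke in virtual_population(100)
]
```

Summary statistics over such virtual subjects are the kind of evidence in silico trials propose as a complement to in vitro experiments, animal studies and clinical trials.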
4.2.5.2 Challenges
The challenges are as follows:
a) reliable data sources for systems-related parameters are currently limited;
b) methods for data generation, collection and integration are not standardized;
c) the reporting of results is very heterogeneous and inconsistent[32];
d) tools to be used and criteria for model evaluation are very variable across projects;
e) a very limited number of platforms (systems model) are currently considered reliable and qualified for
regulatory submission.
4.2.6 Artificial intelligence systems (AI systems)
4.2.6.1 General
Data-driven approaches utilizing AI systems and machine learning (ML) treat the mechanism as unknown
and aim to model a function that operates on data input to predict the outcome, regardless of the unknown
physiological processes. The mechanisms operating in the complex systems being modelled, i.e. which
factors together drive outcomes, are considered too complex to be determined (e.g. black-box models). The
quality of AI systems is assessed through the accuracy of their predictions, tested in a variety of ways. These
data-driven models can be applied in a hypothesis-naive way, with no assumption made as to which factors
drive the causal mechanism.
ML approaches learn the theory automatically from the data through a process of inference, model fitting or
learning from examples[33]. ML can be supervised, unsupervised or partially supervised (see Annex B).
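As a minimal, non-normative illustration of the distinction, the same one-dimensional biomarker data can be handled by a supervised learner (labels available) and an unsupervised one (no labels). All data values and function names here are hypothetical:

```python
# Illustrative only: supervised vs. unsupervised learning on 1-D data.

def supervised_threshold(values, labels):
    """Supervised: learn a decision threshold as the midpoint between the
    means of the two labelled classes."""
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def two_means(values, iters=10):
    """Unsupervised: 1-D 2-means clustering discovers two groups without
    any labels, returning the two cluster centres."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return lo, hi

data = [1.0, 1.2, 0.9, 5.1, 4.8, 5.3]
threshold = supervised_threshold(data, [0, 0, 0, 1, 1, 1])  # midpoint ≈ 3.05
centres = two_means(data)                                    # two cluster means
```

Partially supervised approaches sit between the two, using a small labelled subset to guide the grouping of the unlabelled remainder.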
4.2.6.2 Challenges
The challenges are as follows:
a) imprecise reporting, which makes it difficult to obtain the full benefit of results, navigate biomedical
literature and generate clinically actionable findings;
b) data standardization, since most in silico methods require comparable input data;
c) data based on group associations, or pre-determined understanding of clinical relationships, can bias
and limit AI/ML predictions (inappropriately pre-processed data);
d) different proprietary systems in healthcare information technology (IT) make data extraction, labelling,
interpretation and standardization highly complex procedures (data lockdown).
4.3 Standardization needs for computational models
4.3.1 General
Major challenges in the field of personalized medicine are to harmonize the standardization efforts that
refer to different data types, approaches and technologies, as well as to make the standards interoperable,
so that the data can be compared and integrated into models. Reproducible modelling in personalized
medicine requires a basic understanding of the modelled system, as well as of its biological and physiological
background, and finally of the applied virtual experiments.
Because of the heterogeneous nature of the data in personalized medicine, harmonized strategies for data
integration are required that utilize broadly applicable standards to allow for reproducible data exploitation
to generate new knowledge for medical benefits. Whereas the model simulation process itself can vary
greatly or be even partially unknown, e.g. in AI systems, making it hard to standardize, the integration of
data into the model (input), as well as the outcome of the model (output) can be standardized and validated.
Extensive model validation, e.g. with a set of standardized and high-quality validation data as input, can
be used to validate the whole modelling process, even if the model simulation itself is not standardized.
The two key components for which broad standardization efforts make most sense in the model building
process are thus data integration and model validation (see Figure 2).
Figure 2 — Data integration and model validation as key factors for standardization requirements
for computational models
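Even when the simulation step itself cannot be standardized, its input and output can be checked against an agreed schema, as a hedged sketch shows below. The schema, field names and units are hypothetical, not defined by this document:

```python
# Illustrative only: validating model-input records against an agreed schema,
# so that data integration (input) is standardized even if the simulation
# itself is a black box. Field names and units are hypothetical.

INPUT_SCHEMA = {"patient_id": str, "age_years": int, "ldl_mmol_per_l": float}

def validate_record(record, schema=INPUT_SCHEMA):
    """Return a list of schema violations for one model-input record."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    return errors

print(validate_record({"patient_id": "P-001", "age_years": 54, "ldl_mmol_per_l": 3.4}))  # []
print(validate_record({"patient_id": "P-002", "age_years": "54"}))  # two violations
```

The same mechanism applied to model outputs, together with a standardized set of high-quality validation inputs, is what allows the whole modelling process to be validated end to end.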
4.3.2 Challenges
Although domain-specific annotation standards and terminologies are available for many of the data types
used in personalized medicine (see Tables A.1 to A.4), the process of model building poses the
following challenges:
a) high degree of variability regarding data types (structured versus unstructured, molecular, clinical,
laboratory, patient-reported, etc.);
b) differences in coding and calculation within data types (between-machine variability, different
measurements, etc.);
c) heterogeneous utilization of existing data and lack of domain- and data-specific standard methods for
data pre-processing;
d) high effort of harmonization of data concepts in terms of time, resources and cost;
e) models relevant for clinical use need to be fit for purpose;
f) differences in IT systems used in data generation, e.g. enterprise resource planning systems and
laboratory result software or hardware, at national, regional or clinical centre level;
g) lack of standard workflows (compliant with national and regional regulations and laws) for personal
health data access and processing;
h) lack of training, awareness and empowerment for existing standards and workflows;
i) adoption of different domain-specific terminology standards for health data such as SNOMED CT, NPU
(Nomenclature for Properties and Units) or LOINC (Logical Observation Identifier Names and Codes);
j) differences in implementation of international terminologies such as the International Classification of
Diseases (ICD);
k) long-term variety and dynamics of data and standards;
l) language differences in unstructured text, and other factors.
4.3.3 Common standards relevant for personalized medicine
The use of common standards developed by specific user communities and different stakeholders, as well
as standard-defining organizations, has been enhanced as they have been coupled to tools, which have
spread in the respective field of research. Tables A.1 to A.4 provide an overview of some of these standards
currently in use by different communities.
4.4 Data preparation for integration into computer models
4.4.1 General
Computational models in the life sciences in general as well as in healthcare and personalized medicine
research in particular are increasingly incorporating rich and varied data sets to capture multiple aspects
of the modelled phenomenon. Data types are encoded in technology- and subdomain-specific formats, and the
variety and incompatibility, as well as the lack of interoperability, of such data formats have been noted as one of
the major hurdles for data preparation.
To allow for seamless integration of data used for the construction of predictive computational models in
personalized medicine, these data shall:
a) include or be annotated with sampling and specimen data that follow the requirements and
recommendations in accordance with the relevant domain-specific standards;
b) be formatted using generally accepted and interoperable standard data formats commonly used for the
corresponding data types in accordance with ISO 20691;
c) include or be annotated with descriptive metadata that consider generally accepted domain-specific
minimum information guidelines and describe the metadata attributes and entities using semantic
standards, standard terminologies, controlled vocabularies and ontologies as specified in ISO 20691;
d) follow best practice requirem
...
ISO/TC 276/WG 5
Secretariat: DIN
Date: 2026-03-09
Biotechnology — Predictive computational models in personalized
medicine research —
Part 1:
Constructing, verifying and validating models
Biotechnologie — Modèles informatiques prédictifs dans la recherche sur la médecine personnalisée —
Partie 1: Construction, vérification et validation des modèles
FDIS stage
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication
may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying,
or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO
at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland.
Contents
Foreword . iv
Introduction . v
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
4 Principles . 5
4.1 General . 5
4.2 Computational models in personalized medicine . 5
4.2.1 General . 5
Figure 1 — Modelling approach for personalized medicine . 6
4.2.2 Cellular systems biology models . 6
4.2.3 Risk prediction for common diseases . 7
4.2.4 Disease course and therapy response prediction . 7
4.2.5 Pharmacokinetic/pharmacodynamic modelling and in silico trial simulations . 8
4.2.6 Artificial intelligence systems (AI systems) . 8
4.3 Standardization needs for computational models . 9
4.3.1 General . 9
Figure 2 — Data integration and model validation as key factors for standardization
requirements for computational models . 9
4.3.2 Challenges . 10
4.3.3 Common standards relevant for personalized medicine . 10
4.4 Data preparation for integration into computer models . 10
4.4.1 General . 10
4.4.2 Sampling data . 11
Table 1 — Important metadata collected during pre-examination workflows . 11
4.4.3 Data formatting . 12
4.4.4 Data description . 13
4.4.5 Data annotation (semantics) . 14
4.4.6 Data interoperability requirements across subdomains . 14
Table 2 — Summary of metadata properties . 14
4.4.7 Data integration . 15
4.4.8 Data provenance information . 16
4.4.9 Data access . 16
4.5 Model formatting . 17
4.6 Model validation . 17
4.6.1 General . 17
4.6.2 Specific recommendations for model validation . 18
Table 3 — Specific recommendations for model validation . 18
4.7 Model simulation . 19
4.7.1 General . 19
Table 4 — Typical methods for model simulation used for modelling of biochemical systems . 20
4.7.2 Requirements for capturing and sharing simulation set-ups . 21
4.7.3 Requirements for capturing and sharing simulation results . 21
4.8 Requirements for model storing and sharing . 21
4.9 Application of models in clinical trials and research . 22
4.9.1 General . 22
4.9.2 Specific recommendations . 22
Table 5 — Specific recommendations for the application of models in clinical trials and research . 22
4.10 Ethical requirements for modelling in personalized medicine . 23
Annex A (informative) Common standards relevant for personalized medicine and in silico
approaches . 24
Table A.1 — DNA, RNA, protein sequence formats . 24
Table A.2 — Mass spectrometry . 24
Table A.3 — Medical imaging, digital imaging and communications in medicine . 25
Table A.4 — Semantic integrations . 25
Annex B (informative) Information on modelling approaches relevant for personalized
medicine . 27
B.1 Risk prediction for common diseases . 27
B.2 Artificial intelligence systems . 27
B.3 Minimum amount of information to understand a model . 27
Bibliography . 29
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards
bodies (ISO member bodies). The work of preparing International Standards is normally carried out through
ISO technical committees. Each member body interested in a subject for which a technical committee has been
established has the right to be represented on that committee. International organizations, governmental and
non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the
International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described
in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of
ISO document should be noted. This document was drafted in accordance with the editorial rules of the
ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a)
patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent rights
in respect thereof. As of the date of publication of this document, ISO had not received notice of (a) patent(s)
which may be required to implement this document. However, implementers are cautioned that this may not
represent the latest information, which may be obtained from the patent database available at
www.iso.org/patents. ISO shall not be held responsible for identifying any or all such
patent rights.
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions
related to conformity assessment, as well as information about ISO's adherence to the World Trade
Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 276, Biotechnology.
This second edition cancels and replaces the first edition (ISO/TS 9491-1:2023), which has been technically
revised.
The main changes are as follows:
— the normative references in Clause 2 have been consolidated, updated and revised;
— update and clarification of terminology including the alignment with the terminology of ISO/TS 9491-2;
— updated to match the latest developments in the domain;
— the bibliography has been revised and updated;
— editorial revision and clarification of wording.
A list of all parts in the ISO 9491 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The capacity to generate data in life sciences and health research has greatly increased in the last decade. In
combination with patient/personal-derived data, such as electronic health records, patient registries and
databases, as well as lifestyle information, this big data holds an immense potential for clinical applications,
especially for computer-based models with predictive capacities in personalized medicine. However, and
despite the ever-progressing technological advances in producing data, the exploitation of big data to generate
new knowledge for medical benefits, while guaranteeing data privacy and security, is lagging behind its full
potential. A reason for this obstacle is the inherent heterogeneity of big data and the lack of broadly accepted
standards allowing interoperable integration of heterogeneous health data to perform analysis and
interpretation for predictive modelling approaches in health research, such as personalized medicine.
Common standards lead to a mutual understanding and improve information exchange within and across
research communities and are indispensable for collaborative work. In order to set up computer models in
personalized medicine, data integration from heterogeneous and different sources at different times plays a
key role. Consistent documentation of data, models and simulation results based on basic guiding principles
for data management practices, such as FAIR (findable, accessible, interoperable, reusable)[6] or ALCOA
(attributable, legible, contemporaneous, original, accurate), and standards can ensure that the data and the
corresponding metadata (data describing the data and its context), as well as the models, methods and
visualizations, are of reliable high quality.
Hence, standards for biomedical and clinical data, simulation models and data exchange are a prerequisite for
reliable integration of health-related data[7]. Such standards, together with harmonized ways to describe
their metadata, ensure the interoperability of tools used for data integration and modelling, as well as the
reproducibility of the simulation results. In this sense, modelling standards are agreed ways of consistently
structuring, describing, and associating models and data, their respective parts and their graphical
visualization, as well as the information about applied methods and the outcome of model simulations. Such
standards also assist in describing how constituent parts interact, or are linked together, and how they are
embedded in their physiological context.
Major challenges in the field of personalized medicine are to:
a) harmonize the standardization efforts that refer to different data types, approaches and technologies;
b) make the standards interoperable, so that the data can be compared and integrated into models.
An overall goal is to FAIRify data and processes in order to improve data integration and reuse. An additional
challenge is to ensure a legal and ethical framework enabling interoperability.
This document presents computational modelling requirements and recommendations for research in the
field of personalized medicine, especially with focus on collaborative research, such that health-related data
can be optimally used for translational research and personalized medicine worldwide. The recommendations
are primarily oriented towards the application of computational modelling in the biotechnology domain (e.g.
biomolecular and cellular research, as well as in clinical trials and drug development), but also can be applied
in other fields of personalized medicine research.
FINAL DRAFT International Standard ISO/FDIS 9491-1:2026(en)
Biotechnology — Predictive computational models in personalized
medicine research —
Part 1:
Constructing, verifying and validating models
1 Scope
This document specifies requirements and recommendations for the design, development and
implementation of predictive computational models for research purposes in the field of personalized
medicine and health product development.
This document addresses the set-up, formatting, validation, simulation, storing and sharing of computational
models used for personalized medicine. Requirements and recommendations for data used to construct or
required for validating such models are also specified. This includes rules for formatting,
descriptions, annotations, interoperability, integration, access and provenance of such data.
This document does not apply to computational models used for standard routine clinical, diagnostic or
therapeutic purposes.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes
requirements of this document. For dated references, only the edition cited applies. For undated references,
the latest edition of the referenced document (including any amendments) applies.
ISO 20691 1), Biotechnology — Requirements for data formatting and description in the life sciences
ISO 20387 2), Biotechnology — Biobanking — General requirements for biobanks
ISO 23494-1, Biotechnology — Provenance information model for biological material and data — Part 1:
Design concepts and general requirements
ISO 23494-2, Biotechnology — Provenance information model for biological material and data — Part 2:
Common Provenance Model
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
1) https://fairsharing.org/3533
2) Under preparation. Stage at the time of publication: ISO/DIS 20387:2025.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
artificial intelligence
AI
research and development of mechanisms and applications of AI systems (3.2)
Note 1 to entry: Research and development can take place across any number of fields such as computer science, data
science, humanities, mathematics and natural sciences.
[SOURCE: ISO/IEC 22989:2022, 3.1.3]
3.2
artificial intelligence system
AI system
engineered system that generates outputs such as content, forecasts, recommendations or decisions for a
given set of human-defined objectives
Note 1 to entry: The engineered system can use various techniques and approaches related to artificial intelligence to
develop a model to represent data, knowledge, processes, etc. which can be used to conduct tasks.
Note 2 to entry: AI systems are designed to operate with varying levels of automation.
[SOURCE: ISO/IEC 22989:2022, 3.1.4]
3.3
big data
extensive datasets — primarily in the data characteristics of volume, variety, velocity, and/or variability —
that require a scalable technology for efficient storage, manipulation, management, and analysis
Note 1 to entry: Big data is commonly used in many different ways, for example as the name of the scalable technology
used to handle extensive datasets.
EXAMPLE High volume, high diversity biological, clinical, environmental, and lifestyle information collected from
single individuals to large cohorts, in relation to their health and wellness status, at one or several time points (see
Reference [8] for additional information).
[SOURCE: ISO/TR 24291:2021, 3.2, modified — EXAMPLE added.]
3.4
community consensus standard
standard that reflects the results of a consensus standardization effort from a domain-specific expert group outside of recognized standard-defining organizations and their technical committees
Note 1 to entry: Created by domain-specific professional societies, scientific standardization initiatives, individual organizations or research communities (often in collaboration with industry partners).
Note 2 to entry: Often publicly available, open and not proprietary.
ISO/FDIS 9491-1:2026(en)
3.5
computational model
in silico model
description of a biological system in either a mathematical expression or graphical form, or both, that is
implemented and studied with a computer highlighting objects and their interactions
Note 1 to entry: An object distributed processing (ODP) concept.
[SOURCE: ISO/IEC 16500-8:1999, 3.6, modified — Admitted term added. “biological”, “mathematical
expression or”, “, or both, that is implemented and studied with a computer” added, “interfaces” changed to
“interactions” and “as such it is similar to the OMT and UML notion of a class diagram” deleted from the
definition. “An object distributed processing (ODP) concept” moved to Note 1 to entry.]
3.6
data-driven model
model developed through the use of data derived from tests or from the output of investigated process or from
real world data or routinely acquired primary care data
[SOURCE: ISO 15746-1:2015, 2.4, modified — “or from real world data or routinely acquired primary care
data” added]
3.7
data harmonization
harmonization of data concepts
process of reconciling differences in semantics, structure and syntax of similar data concepts
Note 1 to entry: Harmonization can include the establishment of a single pervasive definition for each data concept (i.e.
standardization), but can also encompass flexible approaches in which definitions can be understood to grow closer
without becoming identical.
[SOURCE: ISO/TR 25100:2012, 2.1.4, modified — “harmonisation” replaced by “harmonization” and “may” in Note 1 to entry replaced by “can”.]
3.8
data integration
systematic combining of data from different independent and potentially heterogeneous sources, to create a
more compatible, unified view of these data for research purpose
[SOURCE: ISO 5127:2017, 3.1.11.24]
3.9
genome-wide association studies
GWAS
testing of genetic variants across the genomes of many individuals to identify genotype–phenotype
associations
3.10
in silico clinical trial
use of computer modelling and simulation(s) to mimic human experimentation in the development or
regulatory evaluation process of a medicinal product (e.g. medical device) or medical intervention, under
defined conditions using verified and validated models
Note 1 to entry: It is a subdomain of ‘in silico medicine’, the discipline that encompasses the use of individualised
computer simulations in all aspects of the prevention, diagnosis, prognostic assessment, and treatment of disease.
[SOURCE: Reference [9], modified — Note 1 to entry added.]
3.11
in silico approach
computer-executable analyses of mathematical model(s) (3.13) to study and simulate a biological
system
3.12
machine learning
ML
computer technology with the ability to automatically learn and improve from experience without being
explicitly programmed
EXAMPLE Speech recognition, predictive text, spam detection, or optimizing model parameters through computational techniques, such that the model's behaviour reflects the data or experience.
[SOURCE: ISO 20252:2019, 3.52, modified — Abbreviated term “ML” added and EXAMPLE changed to “Speech recognition, predictive text, spam detection, or optimizing model parameters through computational techniques, such that the model's behaviour reflects the data or experience”.]
3.13
mathematical model
set of equations that describes the behaviour of a physical system
[SOURCE: ISO 16730-1:2015, 3.11]
3.14
mechanism-based
approach in computational modelling that aims for a structural representation
3.15
model validation
comparison between the output of the calibrated model and the measured data, independent of the data set
used for calibration
[SOURCE: ISO 14837-1:2005, 3.7]
3.16
model verification
confirmation that the mathematical elements of the model behave as intended
[SOURCE: ISO 14837-1:2005, 3.8]
3.17
molecular biomarker
biomarker
molecular marker
detectable and/or quantifiable molecule or group of molecules used to indicate a biological condition, state,
identity or characteristic of an organism (e.g. an individual)
EXAMPLE Nucleic acid sequences, proteins, small molecules such as metabolites, other molecules such as lipids and polysaccharides.
[SOURCE: ISO 16577:2022, 3.4.28, modified — “or an” changed to “of an” and “(e.g. an individual)” added to the definition.]
3.18
personalized medicine
precision medicine
medical model using characterization of individuals’ phenotypes and genotypes for tailoring the right
therapeutic strategy for the right person at the right time, and/or to determine the predisposition to disease
and/or to deliver timely and targeted prevention
Note 1 to entry: Examples for individuals’ phenotypes and genotypes are molecular profiling, medical imaging and
lifestyle data.
Note 2 to entry: Medical decisions, prevention strategies and therapies in personalized medicine are based on this
individuality.
[SOURCE: EU 2015/C 421/03[10], modified — Notes 1 and 2 to entry added and “(e.g. molecular profiling, medical imaging, lifestyle data)” deleted from the definition.]
3.19
phenotype
set of observable characteristics of an organism resulting from the interaction of its genotype with the
environment
[SOURCE: ISO 4454:2022, 3.14, modified — Note 1 to entry deleted.]
3.20
raw data
data in its originally acquired, direct form from its source before subsequent processing
[SOURCE: ISO 5127:2017, 3.1.10.04]
4 Principles
4.1 General
Research in the field of personalized medicine is highly dependent on the exchange of data from different
sources, as well as harmonized integrative analysis of large-scale personalized medicine data (big data in
health research). Computational modelling approaches play a key role for understanding, simulating and
predicting the molecular processes and pathways that characterize human biology. Modelling approaches in
biomedical research also lead to a more profound understanding of the mechanisms and factors that drive
diseases, and consequently allow for adapting personalized treatment strategies that are guided by central
clinical questions. Patients can greatly benefit from this development in research that equips personalized
medicine with predictive capabilities to simulate in silico clinically relevant questions, such as the effect of
therapies, the response to drug treatments or the progression of disease.
4.2 Computational models in personalized medicine
4.2.1 General
Computational models have the potential to translate in vitro, non-clinical and clinical results (and their related uncertainty) into descriptive or predictive expressions. The added value of such models in medicine and pharmacology has increasingly been recognized by the scientific community[11][12][13][14], as well as by regulatory bodies such as the European Medicines Agency (e.g. the EMA guideline on PBPK reporting[15]) or the US Food and Drug Administration (FDA)[16][17]. Computational models are integrated in different fields in medicine as well as in the development of drugs and other health products, expanding from disease modelling and molecular and physiological biomarker research to the assessment of drug and medical device efficacy and safety. In silico approaches are also expanding in neighbouring fields, such as pharmacoeconomics[18][19] and analytical chemistry and biology[20][21], which are out of the scope of this document[22][23].
Model creation starts with a clinical question and the collection of data (see Figure 1). The data employed need harmonized approaches for data integration to start the model construction. The initial model usually undergoes several refinement and improvement iterations to enhance its predictive capabilities. Common standards (see 4.3.3) should be used for the model building and curation process. Accuracy measurements and validation processes are key and should be transparent, while model output and function should ideally be interpretable or explainable.
A number of computational modelling approaches in pre-clinical and clinical research already address these questions in detail (see 4.2.2 to 4.2.6) and, therefore, play a leading role for the future development of personalized medicine.
Figure 1 — Modelling approach for personalized medicine
4.2.2 Cellular systems biology models
4.2.2.1 General
For the simulation of complex dynamic biological processes and networks, models can be either mechanism-based (“bottom-up”) or data-driven (“top-down”).
Mechanism-based concepts aim for a structural representation of the governing physiological processes based on model equations[24] with a limited amount of data, which are required for the base model establishment[25][26], or, alternatively, on static interacting networks[11][27]. Data-driven approaches require sufficiently rich and quantitative (e.g. time-course) data to train and to validate the model. Due to the occasional black-box nature of data-driven approaches, the model validation process relies on performance tests against known results.
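As a minimal, non-normative illustration of the mechanism-based side, the sketch below simulates a hypothetical protein-turnover model, dC/dt = k_syn − k_deg·C, with a forward-Euler scheme; the model, parameter names and values are illustrative assumptions, not part of this document.

```python
def steady_state(k_syn, k_deg):
    """Analytical steady state of dC/dt = k_syn - k_deg * C."""
    return k_syn / k_deg

def simulate(k_syn, k_deg, c0, t_end, dt=0.01):
    """Forward-Euler integration of the turnover equation.

    Returns the concentration at t_end. For t_end >> 1/k_deg the result
    approaches the analytical steady state, which can serve as a simple
    verification check for the numerical scheme (see 3.16).
    """
    c, t = c0, 0.0
    while t < t_end:
        c += dt * (k_syn - k_deg * c)
        t += dt
    return c
```

Comparing the simulated long-time behaviour against the analytically known steady state is one example of the performance tests against known results mentioned above.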
4.2.2.2 Challenges
The challenges are as follows:
a) the creation of models that balance the level of abstraction with comprehensiveness to make modelling
efforts reproducible and reusable (abstraction versus size);
b) the development of prediction models that can be adapted easily to individual patient profiles;
c) efficient parameter estimation tools to cope with population and disease heterogeneity;
d) overfitting of the model to the experimental/patient data, and optimization methods for model predictions under realistic parametric uncertainty;
e) flexibility in models to cope with missing data (e.g. diverse patient profiles);
f) scaling from cellular to organ and to organism levels (e.g. high clinical relevance, high hurdles for regulatory acceptance).
4.2.3 Risk prediction for common diseases
4.2.3.1 General
Predictive models stratify patients into distinct subgroups at different levels of risk for clinical outcomes (risk
prediction for disease). By training the algorithm on clinical data, phenotypic or genotypic subgroups can be
identified which have identifiably different patterns of clinical markers. By then identifying which patterns a
patient fits best, the model can place a particular patient within the most similar trajectory, thereby also
stratifying the patient to a particular level of risk. Clinical markers used in such models can be any health feature that can be tokenized to be analysable by the model. These health features range from disease history, symptoms, treatment and other exposure data, family history and laboratory data to genetic data.
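The stratification step described above can be sketched generically as follows; the logistic form, the weights and the risk thresholds are hypothetical illustrations, not requirements of this document.

```python
import math

def risk_score(features, weights, bias):
    """Logistic risk model: maps tokenized clinical markers (features)
    to a probability of the clinical outcome. Weights would normally be
    obtained by training on clinical data; here they are assumed."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def stratify(probability, thresholds=(0.2, 0.6)):
    """Assign a patient to a risk subgroup by probability cut-offs
    (the cut-off values are purely illustrative)."""
    if probability < thresholds[0]:
        return "low"
    if probability < thresholds[1]:
        return "intermediate"
    return "high"
```

In practice the weights are estimated from cohort data and the cut-offs are chosen and validated against clinical outcomes; the sketch only shows the mapping from tokenized features to a risk subgroup.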
4.2.3.2 Challenges
The challenges are as follows:
a) understanding the possible implication to patients at an individual level:
1) What can be inferred?
2) How to test the inference made?
b) limited replication of measurements and analyses (e.g. genetic associations) and poor representation of diverse populations (e.g. populations too poorly represented to be of interest for specific analyses), specifically of mixed or non-European ancestry;
c) varying transparency of methodological choices and reproducibility;
d) limited cellular/tissue context and harmonized functional data availability across populations/studies;
e) missing environmental information coupled to genetic data.
4.2.4 Disease course and therapy response prediction
4.2.4.1 General
Prediction of the disease behaviour (mild versus severe, stable versus progressive) early in the disease course based on specific molecular biomarkers can allow an improved timing of therapy introduction, as well as the choice of therapy scheme (targeted therapy)[28]. Ideally, these models can provide a prediction of multi-factorial diseases at unprecedented resolution, in a way that clinicians can use the information in their daily decision-making.
4.2.4.2 Challenges
The challenges are as follows:
a) harmonization and standardization of clinical information for measuring the disease of interest;
b) developing transparent and quality-controlled workflows for data generation and interpretation in
clinical settings;
c) harmonization and application of existing and upcoming pre-examination workflow standards (including
specimen collection, storage and nucleic acid isolation), as well as developing feasible ring trial formats
and external quality assurance (EQA) schemes for given molecular analysis types;
d) transparent reduction of contents and definition of appropriate marker sets and dynamic models to foster
clinical translation;
e) developing intuitive visualizations of results and insights from molecular analyses, as well as critical appraisal of the limitations of models by physicians.
4.2.5 Pharmacokinetic/pharmacodynamic modelling and in silico trial simulations
4.2.5.1 General
Pharmacokinetic/pharmacodynamic (PK/PD) models[29][30] can translate in vitro, non-clinical and clinical PK/PD data into meaningful information to support decision-making. At the individual level, substance PKs can be described either by non-compartmental analysis and compartmental PK modelling or by physiologically-based PK (PBPK) modelling. PBPK models are commonly used for interspecies extrapolations and drug-drug interaction modelling. At the population level, population PK models have become the most commonly used top-down models that derive a pharmaco-statistical model from observed systemic concentrations. PK/PD modelling involves on the one hand a quantification of drug absorption, disposition, metabolism and excretion (PK) and on the other hand a description of the drug-induced effect (PD). PK/PD models and quantitative systems pharmacology (QSP) both aim for mechanistic and quantitative analyses of the interactions between a substance such as a drug and a specific biological system[31].
PK and PBPK modelling are currently used for simulations for virtual patient populations in in silico clinical
trials. The concept is that computer simulations are proposed as an alternative source of evidence to support
drug development to reduce, refine, complement or replace the established data sources including in vitro
experiments, in vivo animal studies and clinical trials in healthy volunteers and patients.
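As a minimal illustration of compartmental PK modelling, the sketch below implements the textbook one-compartment intravenous-bolus model; all parameter values are hypothetical and the code is not part of the normative text.

```python
import math

def concentration(dose, v_d, k_e, t):
    """One-compartment IV-bolus model: C(t) = (dose / V_d) * exp(-k_e * t),
    where V_d is the volume of distribution and k_e the first-order
    elimination rate constant."""
    return (dose / v_d) * math.exp(-k_e * t)

def half_life(k_e):
    """Elimination half-life: t_1/2 = ln(2) / k_e."""
    return math.log(2.0) / k_e
```

PBPK and population PK models elaborate this basic structure with physiological compartments and inter-individual variability, which is what makes them usable for virtual-patient simulations in in silico trials.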
4.2.5.2 Challenges
The challenges are as follows:
a) reliable data sources for systems-related parameters are currently limited;
b) methods for data generation, collection and integration are not standardized;
c) the reporting of results is very heterogeneous and inconsistent[32];
d) tools to be used and criteria for model evaluation are very variable across projects;
e) a very limited number of platforms (systems model) are currently considered reliable and qualified for
regulatory submission.
4.2.6 Artificial intelligence systems (AI systems)
4.2.6.1 General
Data-driven approaches utilizing AI systems and machine learning (ML) treat the mechanism as unknown and aim to model a function that operates on data input to predict the outcome, regardless of the unknown physiological processes. The mechanisms operating in the complex systems being modelled, i.e. which factors together drive outcomes, are considered too complex to be determined (e.g. black-box models). The quality of AI systems is assessed through the accuracy of their predictions, tested in a variety of ways. These data-driven models can be applied in a hypothesis-naive way, with no assumptions made as to which factors drive the causal mechanism.
ML approaches learn the theory automatically from the data through a process of inference, model fitting or learning from examples[33]. ML can be supervised, unsupervised or partially supervised (see Annex B).
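A minimal sketch of supervised learning from examples is given below (a nearest-centroid classifier; the data and class labels are invented for illustration and do not represent any particular clinical task).

```python
def train_centroids(samples, labels):
    """Supervised learning sketch: fit one centroid (mean vector)
    per class from labelled training examples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign a new sample to the class with the nearest centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda y: dist2(centroids[y], x))
```

Even in this toy form, the black-box character noted above is visible: the fitted centroids predict outcomes without any representation of the underlying mechanism, so quality can only be assessed by testing predictions on held-back examples.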
4.2.6.2 Challenges
The challenges are as follows:
a) imprecise reporting, which makes it difficult to obtain the full benefit of results, navigate biomedical
literature and generate clinically actionable findings;
b) data standardization, since most in silico methods require comparable input data;
c) data based on group associations, or pre-determined understanding of clinical relationships, can bias and
limit AI/ML predictions (inappropriately pre-processed data);
d) different proprietary systems in healthcare information technology (IT) make data extraction, labelling,
interpretation and standardization highly complex procedures (data lockdown).
4.3 Standardization needs for computational models
4.3.1 General
Major challenges in the field of personalized medicine are to harmonize the standardization efforts that refer
to different data types, approaches and technologies, as well as to make the standards interoperable, so that
the data can be compared and integrated into models. Reproducible modelling in personalized medicine
requires a basic understanding of the modelled system, as well as of its biological and physiological
background, and finally of the applied virtual experiments.
Because of the heterogeneous nature of the data in personalized medicine, harmonized strategies for data
integration are required that utilize broadly applicable standards to allow for reproducible data exploitation
to generate new knowledge for medical benefits. Whereas the model simulation process itself can vary greatly or even be partially unknown, e.g. in AI systems, making it hard to standardize, the integration of data into the model (input), as well as the outcome of the model (output), can be standardized and validated. Extensive model validation, e.g. with a set of standardized and high-quality validation data as input, can be used to validate the whole modelling process, even if the model simulation itself is not standardized. The two key components for which broad standardization efforts make most sense in the model building process are thus data integration and model validation (see Figure 2).
Figure 2 — Data integration and model validation as key factors for standardization requirements
for computational models
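The validation principle described above, comparing model output against an independent data set not used for calibration (see 3.15), can be sketched generically; the tolerance criterion shown is an illustrative choice, not a requirement of this document.

```python
def validate(model, validation_data, tolerance):
    """Model validation sketch: compare model predictions with an
    independent validation set (pairs of input and observed output
    never used for calibration) against an absolute error tolerance.

    Returns (passed, max_abs_error)."""
    errors = [abs(model(x) - y_observed) for x, y_observed in validation_data]
    max_err = max(errors)
    return max_err <= tolerance, max_err
```

Because the check treats the model as a callable black box, the same procedure applies whether the simulation inside is a mechanistic ODE system or an unstandardized AI system, which is why validation is a natural target for standardization.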
4.3.2 Challenges
Although for many different data types used in personalized medicine there are domain-specific annotation standards and terminologies available (see Annex A, Tables A.1 to A.4), the process of model building poses the following challenges:
a) high degree of variability regarding data types (structured versus unstructured, molecular, clinical,
laboratory, patient-reported, etc.);
b) differences in coding and calculation within data types (between-machine variability, different
measurements, etc.);
c) heterogeneous utilization of existing data and lack of domain- and data-specific standard methods for data
pre-processing;
d) high effort of harmonization of data concepts in terms of time, resources and cost;
e) models relevant for clinical use need to be fit for purpose;
f) differences in IT systems used in data generation, e.g. enterprise resource planning systems and
laboratory result software or hardware, at national, regional or clinical centre level;
g) lack of standard workflows (compliant with national and regional regulations and laws) for personal
health data access and processing;
h) lack of training, awareness and empowerment for existing standards and workflows;
i) adoption of different domain-specific terminology standards for health data such as SNOMED CT, NPU
(Nomenclature for Properties and Units) or LOINC (Logical Observation Identifier Names and Codes);
j) differences in implementation of international t
...