ETSI GS NFV-REL 003 V1.1.2 (2016-07)
Network Functions Virtualisation (NFV); Reliability; Report on Models and Features for End-to-End Reliability
Network Functions Virtualisation (NFV); Reliability; Report on Models and Features for End-to-End Reliability
RGS/NFV-REL003ed112
General Information
Standards Content (Sample)
ETSI GS NFV-REL 003 V1.1.2 (2016-07)
GROUP SPECIFICATION
Network Functions Virtualisation (NFV);
Reliability;
Report on Models and Features for End-to-End Reliability
Disclaimer
The present document has been produced and approved by the Network Functions Virtualisation (NFV) ETSI Industry
Specification Group (ISG) and represents the views of those members who participated in this ISG.
It does not necessarily represent the views of the entire ETSI membership.
---------------------- Page: 1 ----------------------
2 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
Reference
RGS/NFV-REL003ed112
Keywords
availability, NFV, reliability, resiliency
ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la
Sous-Préfecture de Grasse (06) N° 7803/88
Important notice
The present document can be downloaded from:
http://www.etsi.org/standards-search
The present document may be made available in electronic versions and/or in print. The content of any electronic and/or
print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any
existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the
print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status.
Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx
If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx
Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying
and microfilm except as authorized by written permission of ETSI.
The content of the PDF version shall not be modified without the written authorization of ETSI.
The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2016.
All rights reserved.
TM TM TM
DECT , PLUGTESTS , UMTS and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members.
TM
3GPP and LTE™ are Trade Marks of ETSI registered for the benefit of its Members and
of the 3GPP Organizational Partners.
GSM® and the GSM logo are Trade Marks registered and owned by the GSM Association.
ETSI
---------------------- Page: 2 ----------------------
3 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
Contents
Intellectual Property Rights . 5
Foreword . 5
Modal verbs terminology . 5
1 Scope . 6
2 References . 6
2.1 Normative references . 6
2.2 Informative references . 6
3 Definitions and abbreviations . 8
3.1 Definitions . 8
3.2 Abbreviations . 9
4 Overview . 10
4.1 End-to-End network service chain . 10
4.2 Reliability model of an end-to-end service . 11
4.3 Structure of the present document . 11
5 Generic reliability and availability modelling and estimation. 12
5.1 Introduction . 12
5.2 Reliability models and estimations . 12
5.2.1 Basic equations . 12
5.2.2 Modelling of composed systems . 13
5.3 Software reliability models and estimation . 14
5.3.1 Introduction. 14
5.3.2 Diversity, an approach for fault tolerance . 16
5.3.3 Software reliability evaluation and measurement . 18
6 Reliability/availability methods . 21
6.1 Overview . 21
6.1.1 NFV architecture models . 21
6.1.2 Network service elements . 24
6.1.3 Characteristics of the networks adopting NFV concepts in terms of reliability and availability . 24
6.2 Network function . 27
6.2.1 Introduction. 27
6.2.2 NFVI and NFV-MANO support for VNF reliability and availability . 29
6.2.2.1 Mechanisms overview by fault management cycle phase . 29
6.2.2.1.0 General . 29
6.2.2.1.1 Affinity aware resource placement . 29
6.2.2.1.2 State protection . 30
6.2.2.1.3 Failure detection . 31
6.2.2.1.4 Localization . 31
6.2.2.1.5 Isolation . 32
6.2.2.1.6 Remediation . 32
6.2.2.1.7 Recovery . 33
6.2.2.2 Non-redundant/on-demand redundant VNFC configurations . 33
6.2.2.3 Active-Standby VNFC redundancy configurations . 35
6.2.2.4 Active-Active VNFC redundancy configurations . 40
6.2.3 VNF protection schemes . 43
6.2.3.1 Introduction . 43
6.2.3.2 Active-Standby method . 43
6.2.3.3 Active-Active method . 50
6.2.3.4 Load balancing method . 53
6.3 Reliability methods for virtual links . 59
6.3.1 Introduction. 59
6.3.2 Failure of physical links across NFVI nodes . 59
7 Reliability estimation models . 60
ETSI
---------------------- Page: 3 ----------------------
4 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
7.1 Basic redundancy methods and reliability/availability estimation . 60
7.1.1 Sub network (or a chain of NFs) . 60
7.1.2 Reliability/availability of a sub network . 61
7.1.3 Sub network with redundancy configurations. 61
7.1.4 Reliability/availability of end-to-end paths . 64
7.2 Lifecycle operation . 65
7.2.1 Introduction. 65
7.2.2 Network service without redundancy . 65
7.2.3 Network service with redundancy . 67
7.2.4 Reliability and availability owing to the protection schemes . 68
7.2.5 Reliability and availability during load fluctuation . 71
8 Reliability issues during NFV software upgrade and improvement mechanisms . 73
8.1 Service availability recommendations for software upgrade in NFV . 73
8.2 Implementation of NFV software upgrade and service availability . 74
8.2.1 VNF software upgrade . 74
8.2.1.1 Software upgrade of traditional network functions . 74
8.2.1.2 VNF software upgrade in an NFV environment . 76
8.2.1.3 Software upgrade method using migration avoidance . 78
8.2.2 NFVI software upgrade . 80
8.3 Summary for software upgrade in NFV . 82
9 E2E operation and management of service availability and reliability . 82
9.1 Service flows in a network service . 82
9.2 Metadata of service availability and reliability for NFV descriptors . 84
9.2.1 Introduction. 84
9.2.2 Metadata of service availability and reliability . 84
9.2.3 Service availability level in NFV descriptors . 85
9.3 End-to-end operation and management of service availability and reliability . 87
10 Recommendations/Guidelines . 88
11 Security Considerations . 89
Annex A (informative): Reliability/availability methods in some ETSI NFV PoCs . 90
A.1 Introduction . 90
A.2 PoC#12 Demonstration of multi-location, scalable, stateful Virtual Network Function . 90
A.3 PoC#35 Availability management with stateful fault tolerance . 91
Annex B (informative): Service quality metrics . 94
Annex C (informative): Accountability-oriented end-to-end modelling . 95
C.1 Steady state operation outage downtime . 95
C.2 Lifecycle operation . 95
C.3 Disaster operation objectives . 95
Annex D (informative): Automated diversity . 97
D.1 Randomization approaches . 97
D.2 Integrated approach for diversity . 98
Annex E (informative): Software reliability models . 100
E.1 Exponential times between failures modes . 100
E.2 Non-homogeneous Poisson processes . 101
Annex F (informative): Authors & contributors . 104
Annex G (informative): Change History . 105
History . 106
ETSI
---------------------- Page: 4 ----------------------
5 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web
server (https://ipr.etsi.org/).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web
server) which are, or may be, or may become, essential to the present document.
Foreword
This Group Specification (GS) has been produced by ETSI Industry Specification Group (ISG) Network Functions
Virtualisation (NFV).
Modal verbs terminology
In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and
"cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of
provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.
ETSI
---------------------- Page: 5 ----------------------
6 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
1 Scope
The present document describes the models and methods for end-to-end reliability in NFV environments and software
upgrade from a resilience perspective. The scope of the present document covers the following items:
• Study reliability estimation models for NFV including modelling architecture.
• Study NFV reliability and availability methods.
• Develop reliability estimation models for these methods, including dynamic operational aspects such as impact
of load and life-cycle operations.
• Study reliability issues during NFV software upgrade and develop upgrade mechanisms for improving
resilience.
• Develop guidelines to realise the differentiation of resiliency for different services.
2 References
2.1 Normative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
Referenced documents which are not found to be publicly available in the expected location might be found at
http://docbox.etsi.org/Reference.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are necessary for the application of the present document.
Not applicable.
2.2 Informative references
References are either specific (identified by date of publication and/or edition number or version number) or
non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the
referenced document (including any amendments) applies.
NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee
their long term validity.
The following referenced documents are not necessary for the application of the present document but they assist the
user with regard to a particular subject area.
[i.1] ETSI GS NFV 002: "Network Functions Virtualisation (NFV); Architectural Framework".
[i.2] ETSI GS NFV-REL 001: "Network Functions Virtualisation (NFV); Resiliency Requirements".
[i.3] http://www.crn.com/slide-shows/cloud/240165024/the-10-biggest-cloud-outages-of-2013.htm.
[i.4] http://www.crn.com/slide-shows/cloud/300075204/the-10-biggest-cloud-outages-of-2014.htm.
[i.5] ETSI GS NFV-MAN 001: "Network Functions Virtualisation (NFV); Management and
Orchestration".
[i.6] ETSI GS NFV-SWA 001: "Network Functions Virtualisation (NFV); Virtual Network Functions
Architecture".
ETSI
---------------------- Page: 6 ----------------------
7 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
[i.7] SA Forum SAI-AIS-AMF-B.04.01: "Service Availability Forum Application Interface
Specification".
[i.8] ETSI GS NFV-REL 002: "Network Functions Virtualisation (NFV); Reliability; Report on
Scalable Architectures for Reliability Management".
[i.9] ETSI GS NFV-REL 004: "Network Functions Virtualisation (NFV); Assurance; Report on Active
Monitoring and Failure Detection".
[i.10] ETSI GS NFV-INF 010: "Network Functions Virtualisation (NFV); Service Quality Metrics".
[i.11] ETSI GS NFV-INF 001: "Network Functions Virtualisation (NFV); Infrastructure Overview".
[i.12] IEEE 802.1ax™: "IEEE standard for local and metropolitan area networks -- Link aggregation".
[i.13] IEEE 802.1aq™: "Shortest Path Bridging".
[i.14] ETSI GS NFV-REL 005: Network Functions Virtualisation (NFV); Assurance; Quality
Accountability Framework.
[i.15] QuestForum: "TL 9000 Measurements Handbook", release 5.0, July 2012.
NOTE: Available at http://www.tl9000.org/handbooks/measurements_handbook.html.
[i.16] QuEST Forum: "Quality Measurement of Automated Lifecycle Management Actions," 1.0,
August 18th, 2015.
NOTE: Available at
http://www.tl9000.org/resources/documents/QuEST_Forum_ALMA_Quality_Measurement_150819.pdf.
[i.17] B. Baudry and M. Monperrus: "The multiple facets of software diversity: recent developments in
year 2000 and beyond", 2014. .
[i.18] L.H. Crow: "Reliability analysis for complex repairable systems", in "Reliability and biometry -
statistical analysis of lifelength" (F. Prochan and R.J. Serfling, Eds.), SIAM Philadelphia, 1974,
pp. 379-410.
[i.19] Y. Deswarte, K. Kanoun and J.-C. Laprie: "Diversity against accidental and deliberate faults",
Conf. on Computer Security, Dependability, and Assurance: From Needs to Solutions,
Washington, DC, USA, July 1998.
[i.20] J.T. Duane: "Learning curve approach to reliability monitoring", IEEE™ Transactions on
Aerospace, AS-2, 2, 1964, pp. 563-566.
[i.21] A.L. Goel and K. Okumoto: "Time dependent error detection rate model for software reliability
and other performance measures", IEEE™ Transactions on Reliability, R-28, 1, 1979,
pp. 206-211.
[i.22] Z. Jelinski Z. and P.B. Moranda: "Statistical computer performance evaluation", in "Software
reliability research" (W. Freiberger, Ed.), Academic Press, 1972, pp. 465-497.
[i.23] J.E. Just and M. Cornwell: "Review and analysis of synthetic diversity for breaking
monocultures", ACM Workshop on Rapid Malcode, New York, NY, USA, 2004, pp. 23-32.
[i.24] J.C. Knight: "Diversity", Lecture Notes in Computer Science 6875, 2011.
[i.25] P. Larsen, A. Homescu, S. Brunthaler and M. Franz: "Sok: automated software diversity", IEEE™
Symposium on Security and Privacy, San Jose, CA, USA, May 2014, pp. 276-291.
[i.26] B. Littlewood, P. Popov and L. Strigini: "Modeling software design diversity: a review", ACM
Computing Surveys (CSUR), 33(2), 2001, pp. 177-208.
[i.27] P.B. Moranda: "Event altered rate models for general reliability analysis", IEEE™ Transactions on
Reliability, R-28, 5, 1979, pp. 376-381.
ETSI
---------------------- Page: 7 ----------------------
8 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
[i.28] J.D. Musa and K. Okumoto: "A logarithmic Poisson execution time model for software reliability
measurement", 7th Int. Conf. on Software Engineering, Orlando, FL, USA, March 1984,
-
pp. 230 238.
[i.29] I. Schaefer et al.: "Software diversity: state of the art and perspectives", International Journal on
Software Tools for Technology Transfer, 14, 2012, pp. 477-495.
[i.30] S. Yamada, M. Ohba and S. Osaki: "S-shaped reliability growth modelling for software error
detection", IEEE Transactions on Reliability, R-35, 5, 1983, pp. 475-478.
[i.31] ETSI GS NFV-INF 003: "Network Functions Virtualisation (NFV); Infrastructure; Compute
Domain".
[i.32] ETSI GS NFV 003: "Network Functions Virtualisation (NFV); Terminology for main concepts in
NFV".
[i.33] IEEE Reliability Society: "Recommended Practice on Software Reliability", IEEE™ Std 1633,
2008.
[i.34] NFV PoC#35 final report.
3 Definitions and abbreviations
3.1 Definitions
For the purposes of the present document, the terms and definitions given in ETSI GS NFV 003 [i.32], ETSI
GS NFV-REL 001 [i.2] and the following apply:
fault detection: process of identifying an undesirable condition (fault or symptom) that may lead to the loss of service
from the system or device
fault diagnosis: high confidence level determination of the required repair actions for the components that are
suspected to be faulty
NOTE: Diagnosis actions are generally taken while the component being diagnosed is out of service.
fault isolation: isolation of the failed component(s) from the system
NOTE: The objectives of fault isolation include avoidance of fault propagation to the redundant components
and/or simultaneous un-intended activation of active and backup components in the context of
active-standby redundancy configurations (i.e. "split-brain" avoidance).
fault localization: determining the component that led to the service failure and its location
fault management notification: notification about an event pertaining to fault management
EXAMPLE: Fault management notifications include notifications of fault detection events, entity availability
state changes, and fault management phase related state progression events.
fault recovery: full restoration of the original intended system configuration, including the redundancy configuration
NOTE: For components with protected state, this phase includes bringing the new protecting unit online and
transferring the protected state from the active unit to the new unit.
fault remediation: restoration of the service availability and/or continuity after occurrence of a fault
fault repair: removal of the failed unit from the system configuration and its replacement with an operational unit
NOTE: For the hardware units that pass the full diagnosis, it may be determined that the probable cause was a
transient fault, and the units may be placed back into the operational unit pool without physical repair.
ETSI
---------------------- Page: 8 ----------------------
9 ETSI GS NFV-REL 003 V1.1.2 (2016-07)
state protection: protection of the service availability and/or service continuity relevant portions of system or
subsystem state against faults and failures
NOTE: State protection involves replicating the protected state to a redundant resource.
3.2 Abbreviations
For the purposes of the present document, the following abbreviations apply:
3GPP Third Generation Partnership Project
AIS Application Interface Specification
API Application Programming Interface
ATCA Advanced TCA (Telecom Computing Architecture)
BER Bit Error Rates
CoS Class of Service
COTS Commercial Off-The-Shelf
CP Connection Point
CPU Central Processing Unit
DARPA Defence Advanced Research Projects Agency
DNS Domain Name Service
E2E End-to-End
ECMP Equal-Cost Multi-Path
EM Element Manager
EMS Element Management System
ETBF Exponential Times Between Failures
GTP GPRS Tunnelling Protocol
HPP Homogenous Poisson Process
IMS IP Multimedia Subsystem
IP Internet Protocol
LACP Link Aggregation Control Protocol
LAN Local Area Network
LB Load Balancer
LCM Life Cycle Management
LOC Lines of Code
MAN Metropolitan Area Network
MOS Mean Opinion Score
MPLS Multiprotocol Label Switching
MTBF Mean Time Between Failure
MTD Moving Target Defense
MTTF Mean time To Failure
MTTR Mean Time To Repair
NF Network Function
NFP Network Forwarding Path
NFVI Network Functions Virtualisation Infrastructure
NFV-MANO Network Functions Virtualisation Management and Orchestration
NFVO Network Functions Virtualisation Orchestrator
NHPP Non-Homogenous Poisson
...
Questions, Comments and Discussion
Ask us and Technical Secretary will try to provide an answer. You can facilitate discussion about the standard in here.