Understanding Stochastic Modeling Approach for Container-Based SFC Service Analysis
• Yingsi Zhao1,*, Yaru Li2, and Han-Chieh Chao3

Human-centric Computing and Information Sciences volume 12, Article number: 45 (2022)
https://doi.org/10.22967/HCIS.2022.12.045

Abstract

Network function virtualization technology enables the dynamic deployment of all virtualized network functions (VNFs) in a service function chain (SFC), thus improving SFC service availability and strengthening the capability of SFC service operators to manage SFC services. VNFs in an SFC can be deployed in different containers, which can run on different operating systems (OS). No software can avoid software aging after long, continuous execution, and aging can reduce service availability and even cause service failure. Rejuvenation techniques such as failover and live container migration can effectively alleviate the negative impact of VNF and OS aging. However, both the failover trigger interval and the migration trigger interval affect the effectiveness of rejuvenation techniques. This paper develops a semi-Markov process-based analytical approach to quantitatively investigate the availability of a container-based SFC service consisting of any number of VNFs. Through sensitivity analysis, we identified the impact of different system parameters on SFC service availability. Through numerical experiments, we analyzed the impact of the number of VNFs and OS on SFC service availability. In addition, we jointly determined the optimal failover trigger interval and migration trigger interval that achieve approximately maximum SFC service availability.

Keywords

Availability, Container, Semi-Markov Process, Service Function Chain, Software Aging

Introduction

Network function virtualization (NFV) has garnered popularity among cloud service providers with the advantages of reducing investment and increasing network flexibility [1]. By 2024, the market for NFV is projected to reach US$363 billion [2]. As a recently emerging paradigm, NFV technology transforms network functions from dedicated hardware devices to virtual network functions (VNFs), which cloud service providers can deploy in the cloud or at the edge to provide flexible and scalable services [1, 3, 4]. In the NFV environment, multiple ordered VNFs in the network are interconnected to achieve complete end-to-end service delivery in the form of a service function chain (SFC) [3]. A container is a virtualization environment for executing VNF software with high resource utilization and fast migration [5]. VNFs in an SFC can be deployed on different containers, which can run on different operating systems (OS).

However, VNF and OS, as key software components of the container-based SFC system, will inevitably encounter software aging [6], which can eventually lead to a reduction in the availability of SFC service and even to service interruption [7]. Rejuvenation techniques such as failover and container migration can be adopted in the container-based SFC system to reduce the negative impacts of VNF and OS aging on SFC service availability [8]. The rejuvenation trigger intervals affect the effectiveness of rejuvenation techniques [9]. Specifically, triggering a software rejuvenation technique immediately when aging occurs does not always yield maximum SFC service availability [9], as SFC services remain available for a period of time after aging occurs. Moreover, the failover trigger and migration trigger intervals, as shown in Fig. 1, affect each other. Therefore, jointly determining the optimal failover trigger and migration trigger intervals to maximize SFC service availability is a challenging problem.

Fig. 1. Failover trigger and migration trigger intervals.

Quantitative evaluation can help investigate the impact of software rejuvenation techniques on SFC service availability, and then explore how to maximize service availability. Analytical modeling is an effective approach to quantitative evaluation [10]. Various analytical models have been proposed for evaluating availability in virtualized systems. They often assumed that the interval times of all events followed exponential distributions [11–13]. Many studies such as [11, 12, 14–19] ignored the fact that software components can suffer from aging after long, continuous execution. In addition, the authors in [19–21] ignored the varying number of components (VNFs and OS) in the SFC system. In particular, none of the existing analytical models analyzed the time-dependent behaviors between components, such as a VNF and its operating environment.

This paper aims to study a container-based SFC system that deploys the rejuvenation techniques of failover, container migration, software restart, and OS reboot. Each rejuvenation technique is detailed in [21]. We use a 2n-dimensional semi-Markov process (SMP) model to quantitatively investigate the impact of software rejuvenation techniques on the availability of an SFC consisting of n VNF instances, namely an n-sized SFC service. To the best of our knowledge, this is the first quantitative analysis of SFC service availability in the above-mentioned container-based SFC system. In contrast to the existing studies, our model allows both failure and recovery times (including failover, migration, OS rebooting, VNF restarting, and fixing times) to follow any probability distribution. Moreover, we consider the time-dependent aging, failure, and recovery behaviors between components (VNFs and OS) in the container-based SFC system.
The main contributions are summarized as follows:

• We proposed a multi-dimensional SMP model to describe the time-dependent aging, failure, and recovery behaviors between VNF and OS, as well as among VNFs or OS themselves, in a container-based SFC system that consists of any number of VNFs and deploys failover, container migration, software restart, and OS reboot techniques.
• We derived a general closed-form formula for calculating the availability of an SFC service composed of any number of VNFs. This closed-form formula can help service providers identify bottlenecks in improving SFC service availability.
• We carried out experiments to validate the proposed model and formulas. We also conducted numerical experiments to (1) analyze the impact of various system parameters and the number of VNFs and OS on SFC service availability and (2) jointly determine the optimal failover trigger interval and migration trigger interval that maximize SFC service availability, thereby helping cloud service providers maximize the benefits.

In the rest of this paper, Section 2 discusses the related work, Section 3 presents the SMP model constructed in this paper, and Section 4 presents the experiment results. Lastly, Section 5 concludes the paper and outlines future work.

Related Work

Analytical modeling, as one of the main methods of quantitative evaluation, has been widely adopted to evaluate availability in virtualized systems. Analytical models are generally divided into three types: non-state space models (including reliability block diagrams (RBD), reliability diagrams, and fault trees), state space models, and multi-level models [10].

Researchers have explored non-state space models to evaluate SFC service availability. Fan et al. [14] considered redundancy models that use a backup VNF to protect the primary VNF, and used the RBD model to evaluate SFC service availability. Moualla et al. [15] proposed an algorithm to solve the problem of placing SFCs under resource and availability constraints, and calculated SFC service availability based on RBD for different SFC placements. Wang et al. [16] proposed five backup models for SFC service availability improvement, such as the VNF backup and path backup models, and analyzed SFC service availability based on RBD. However, non-state space models assume that components are statistically independent, so it is difficult to use them to capture the time-dependent behaviors between components in an actual system.

In recent years, availability evaluation based on state space models, for example Markov models, has been widely adopted. Di Mauro et al. [11] used continuous-time Markov chains (CTMC) to model SFC and computed SFC service availability. Tola et al. [12] presented an availability model of an NFV-enabled network service based on the stochastic activity network (SAN) model, and analyzed the service availability under different availability modes. Based on their previous work, they considered software aging and rejuvenation in [13], and carried out a sensitivity analysis. These studies [11–13] developed Markov models and assumed that the interval times of all events followed an exponential distribution with a constant failure rate. However, studies have shown that the failure rate of most equipment increases with time [22]. Our work relaxes the assumption of exponentially distributed failure and recovery times to correctly capture the behaviors of the container-based SFC system.

In addition, many studies are based on non-Markovian models, which relax the restriction of exponential distributions when analyzing service availability. The authors in [20] studied the service availability and job completion time in a virtualized system that deploys live virtual machine (VM) migration based on the SMP model. Moreover, in [21], they built a Markov regenerative process (MRGP) model for a virtualization system that deploys rejuvenation techniques at the application service (AS), VM, and virtual machine monitor (VMM) levels, respectively, and then determined the optimal inspection time interval to maximize availability. The authors in [20, 21] constructed the availability model for a single service. These models ignored the variation in the number of components in an SFC system and are therefore not applicable to SFC services. In contrast to these studies, we constructed a multi-dimensional SMP model of an SFC system that consists of any number of VNFs. Moreover, the resilience of vehicle-platooning services was analyzed in [23], and the impact of VNF aging on SFC service dependability was studied in [24]. These models did not capture the host OS behaviors. In contrast, our model can describe the time-dependent behaviors between VNF and the host OS, as well as among VNFs or host OS themselves.

Various studies on availability evaluation were also conducted based on multi-level models. Di Mauro et al. [17] proposed a two-level model to evaluate SFC availability, where a stochastic reward net (SRN) model described the probabilistic behavior of a single VNF while RBD described the dependencies between components in an SFC. In [18], they further adopted RBD and SRN models to analyze the availability of an IP multimedia subsystem (IMS), which is one of the NFV use cases. Shoyari et al. [19] proposed a composed availability model combining RBD and CTMC to evaluate the availability of an OpenStack private cloud in different scenarios. However, these studies ignored the time-dependent aging, failure, and recovery behaviors between components in the SFC system.
In contrast to these studies, our model can capture the aging, failure, and recovery behaviors of each VNF and OS in an SFC system, along with the time-dependent behaviors between VNFs, as well as between VNFs and OS or among OS themselves. It should be noted that based on the proposed model, service providers can design an optimization algorithm for maximizing the benefits [25–31]. A comparison of the above-mentioned related studies and our work is shown in Table 1.

Table 1. Comparison of existing models

| Study | SFC: aging & recovery behavior a) | SFC: time-dependent behaviors between components b) | SFC: varying number of components c) | State space model d) | Distribution e) |
|---|---|---|---|---|---|
| [14], [15], [16] | X | X | √ | X | X |
| [11] | X | X | √ | CTMC | E |
| [12] | X | X | √ | SAN | E |
| [13] | √ | X | √ | SAN | E |
| [20] | √ | X | X | SMP | G |
| [21] | √ | X | X | MRGP | G |
| [23] | √ | X | √ | SMP | G |
| [24] | √ | X | √ | SMP | G |
| [17], [18] | X | X | √ | RBD+SRN | E |
| [19] | X | X | X | RBD+CTMC | E |
| Our work | √ | √ | √ | SMP | G |

a) Whether aging and recovery behaviors of components (VNFs and OS) in the SFC system are considered.
b) Whether the time-dependent behaviors between VNFs and OS, as well as among VNFs or OS themselves in the SFC system, are considered.
c) Whether the change in the number of components (VNFs and OS) in the SFC system is considered.
d) Whether a state space model is constructed and which state space model is used.
e) The type of probability distribution of event time in the model. E and G denote exponential and general, respectively.

System Model

This section first introduces the container-based SFC system considered in this paper. Then we present the SMP model and the process of calculating the availability of the n-sized SFC service.

System Description

We abstracted the n-sized container-based SFC system that deploys failover, container migration, software restart, and OS reboot as a system composed of an SFC control plane and n primary and backup hosts. Each primary host includes an OS running one active container that deploys one VNF and multiple backup containers for supporting the failover technique.
The corresponding backup host is used to support container migration. The requests are processed sequentially by n VNFs in an SFC. We assume that the backup resources in the cloud are sufficient, with a backup container and backup host always available at any given time. It should be noted that the backup resources can suffer from software aging and failure. If the backup resources suffer from software aging, they will be restarted/rebooted immediately. If the backup resources suffer from failure, they will be fixed and restarted/rebooted immediately. Fig. 2 illustrates the SFC system architecture consisting of n VNFs studied in this paper. Fig. 3 shows the process of the container-based SFC system that performs an SFC service.

Fig. 2. Container-based SFC system architecture.

The blue dashed lines and frames in Fig. 3 show the process of an SFC service being successfully completed without suffering from software aging and failure.

The orange solid lines and frames in Fig. 3 show the process in which a single component (VNF or OS) suffers from software aging and recovers through rejuvenation techniques (failover or container migration). During service execution, both VNF software and an OS can suffer from software aging and from failure caused by software aging. If software aging of an active VNF is detected during request processing, the failover technique is triggered after a certain failover trigger interval time, after which the backup container on the same primary host takes charge of processing requests. Then the container with the aging VNF is restarted to its initial state. We assume that OS rebooting and VNF restarting times are far less than the software aging time. If software aging of a primary host OS is detected, the container migration technique is triggered after a certain container migration trigger interval time, with the backup OS taking charge of processing requests. Then the aging OS is rebooted to its initial state.

Fig. 3. Process of the container-based SFC system performing SFC service.

The violet dash-dotted lines and frames in Fig. 3 show the process in which, during the aging or recovery process of a single component, other components suffer from aging. If both an active VNF and the OS that executes it suffer from software aging, this OS is rebooted. If more than one primary host OS, or both an active VNF and an OS executing another VNF, suffer from software aging, all OS in the SFC system are rebooted. If more than one active VNF suffers from software aging, all VNFs in the SFC system are restarted.

The grey dash-double-dotted lines and frames in Fig. 3 show the process in which, during the aging or recovery process of a single component, the aging component fails. The component failure interrupts SFC services and causes service failures.

State and Variable Definitions

We define a 2n-tuple index ($i_{o1}$, $i_{o2}$, $i_{o3}$, ..., $i_{on}$, $j_{v1}$, $j_{v2}$, $j_{v3}$, ..., $j_{vn}$) to denote the system state. That is, the active containers running n VNFs are hosted by n OS. Here, $j_{vn}$ and $i_{on}$ denote the states of the $n^{th}$ VNF and the $n^{th}$ primary host OS, respectively. Each component can be in one of six states: healthy, degradation, migration, failover, restart/reboot, and failed, denoted as H, D, M, G, R, and F, respectively. A description of each state is given as follows.

Healthy (H): The VNF (OS) is robust and the service can be performed normally. Rejuvenation techniques can bring the aging VNF (OS) back to this state.
Degradation (D): The VNF (OS) at this state can work but suffers from software aging.
Migration (M): At this state, the container is ready to move from the primary host to the backup host via live container migration.
Failover (G): At this state, the VNF is ready to move from an active container to a backup container via the failover technique.
Restart/Reboot (R): The VNF (OS) at this state is restarted (rebooted).
Failed (F): This state is a failure state caused by a VNF (OS) failure due to software aging.

We then get a total of $6^{2n}$ system states, of which $6^{2n}-5n-4$ are meaningless and can be ignored. Take system state ($D_{o1}$, $D_{o2}$, $H_{o3}$, ..., $H_{on}$, $H_{v1}$, $H_{v2}$, $H_{v3}$, ..., $H_{vn}$) as an example: if two primary host OS suffer from software aging, all OS in the SFC system are rebooted and all components enter the R state. Therefore, this system state is meaningless. Table 2 defines the 5n+4 meaningful system states of the container-based SFC system that performs an n-sized SFC service.

Table 2. Definition of meaningful states

| No. | System state | State of the 1st OS | … | State of the nth OS | State of the 1st VNF | … | State of the nth VNF | SFC service availability status |
|---|---|---|---|---|---|---|---|---|
| $S_{sc0}$ | ($H_{o1}$,...,$H_{on}$,$H_{v1}$,...,$H_{vn}$) | Healthy | … | Healthy | Healthy | … | Healthy | Yes |
| $S_{sc1}$ | ($F_{o1}$,...,$F_{on}$,$F_{v1}$,...,$F_{vn}$) | Failed | … | Failed | Failed | … | Failed | No |
| $S_{sc2}$ | ($R_{o1}$,...,$R_{on}$,$R_{v1}$,...,$R_{vn}$) | Reboot | … | Reboot | Restart | … | Restart | No |
| $S_{sc3}$ | ($H_{o1}$,...,$H_{on}$,$R_{v1}$,...,$R_{vn}$) | Healthy | … | Healthy | Restart | … | Restart | No |
| $S_{sc4}$ | ($R_{o1}$,...,$H_{on}$,$R_{v1}$,...,$H_{vn}$) | Reboot | … | Healthy | Restart | … | Healthy | No |
| … | … | … | … | … | … | … | … | … |
| $S_{sc(n+3)}$ | ($H_{o1}$,...,$R_{on}$,$H_{v1}$,...,$R_{vn}$) | Healthy | … | Reboot | Healthy | … | Restart | No |
| $S_{sc(n+4)}$ | ($D_{o1}$,...,$H_{on}$,$H_{v1}$,...,$H_{vn}$) | Degradation | … | Healthy | Healthy | … | Healthy | Yes |
| … | … | … | … | … | … | … | … | … |
| $S_{sc(3n+3)}$ | ($H_{o1}$,...,$H_{on}$,$H_{v1}$,...,$D_{vn}$) | Healthy | … | Healthy | Healthy | … | Degradation | Yes |
| $S_{sc(3n+4)}$ | ($M_{o1}$,...,$H_{on}$,$H_{v1}$,...,$H_{vn}$) | Migration | … | Healthy | Healthy | … | Healthy | Yes |
| … | … | … | … | … | … | … | … | … |
| $S_{sc(5n+3)}$ | ($H_{o1}$,...,$H_{on}$,$H_{v1}$,...,$G_{vn}$) | Healthy | … | Healthy | Healthy | … | Failover | Yes |

SMP Model

Based on the description in the previous sections, we can use a 2n-dimensional SMP model to describe the behaviors of the n-sized SFC system from the onset of software aging until recovery through rejuvenation techniques. We define {$Z_X$(t)|t≥0} as a stochastic process. Let T={T0, T1, T2, T3, …} be the Markov renewal moments (the occurrences of active VNF aging, primary host OS aging, failover, container migration, failure, VNF restarting, and primary host OS rebooting), and let X={X0, X1, X2, X3, …} be the sequence of system states of the stochastic process at these moments. This sequence satisfies the Markov property and thus forms a discrete-time Markov chain, which is called the embedded discrete-time Markov chain (EDTMC) [10]. The sojourn time distribution $H_{S_{sci}}(t)$ at state $S_{sci}$ follows a general distribution. Therefore, the stochastic process {$Z_X$(t)|t≥0} is called an SMP [10]. Fig. 4 shows the SMP model for the 3-sized SFC system, where the green ellipses denote unavailable states and the other states are available ones. Tables 3 and 4 show the definitions of variables used in the model.

Fig. 4. SMP model for the SFC system with three VNFs.

Table 3. Definitions of variables denoting minimum time

| Symbol | Definition | Distribution a) | Type b) | Default values c) |
|---|---|---|---|---|
| $T_{dc}$ | A variable with distribution function $F_{dc}(t)$ denoting the minimum holding time of each primary host OS from healthy to degradation state during the restart of all VNFs. | E | A | - |
| $T_{dni}$ | A variable with distribution function $F_{dni}(t)$ denoting the minimum holding time of other components from healthy to degradation state after the ith VNF and the ith OS suffer from aging. | E | A | - |
| $T_{dsi}$ | A variable with distribution function $F_{dsi}(t)$ denoting the minimum holding time of other VNFs from healthy to degradation state after the ith VNF suffers from software aging. | E | A | - |
| $T_{dci}$ | A variable with distribution function $F_{dci}(t)$ denoting the minimum holding time of OS (other than the ith OS) from healthy to degradation state after the ith VNF suffers from aging. | E | A | - |
| $T_{dai}$ | A variable with distribution function $F_{dai}(t)$ denoting the minimum holding time of other components (other than the VNF running on the ith OS) from healthy to degradation state after the ith primary host OS suffers from aging. | E | A | - |

a) Type of distribution function that a random variable follows. E denotes exponential.
b) Type of holding time denoted by the random variable. A denotes aging time.
c) Setting of the variables depends on the other variables.

Table 4. Definitions of variables denoting holding time

| Symbol | Definition | Distribution a) | Type b) | Default values |
|---|---|---|---|---|
| $T_{avi}$ | A variable with distribution function $F_{avi}(t)$ denoting the holding time of the ith VNF from healthy to degradation state. | E | A | 15–16 months |
| $T_{aoi}$ | A variable with distribution function $F_{aoi}(t)$ denoting the holding time of the ith primary host OS from healthy to degradation state. | E | A | 17–18 months |
| $T_{fvi}$ | A variable with distribution function $F_{fvi}(t)$ denoting the holding time of the ith VNF from degradation to failed state. | G | Fa | 14–15 months |
| $T_{foi}$ | A variable with distribution function $F_{foi}(t)$ denoting the holding time of the ith primary host OS from degradation to failed state. | G | Fa | 15–16 months |
| $T_{rvi}$ | A variable with distribution function $F_{rvi}(t)$ denoting the holding time of the ith VNF from failover to healthy state. | G | Fa | 8–10 seconds [13] |
| $T_{roi}$ | A variable with distribution function $F_{roi}(t)$ denoting the holding time of the ith primary host OS from migration to healthy state. | G | M | 30–40 seconds |
| $T_{Roi}$ | A variable with distribution function $F_{Roi}(t)$ denoting the holding time of rebooting the ith primary host OS. | G | O | 1–1.5 minutes |
| $T_{RS}$ | A variable with distribution function $F_{RS}(t)$ denoting the holding time of restarting all VNFs. | G | V | 10–20 seconds [13] |
| $T_{RO}$ | A variable with distribution function $F_{RO}(t)$ denoting the holding time of rebooting all OS. | G | O | 1–2 minutes |
| $T_R$ | A variable with distribution function $F_R(t)$ denoting the holding time of the SFC system from ($F_{o1}$,…,$F_{on}$,$F_{v1}$,…,$F_{vn}$) to ($H_{o1}$,…,$H_{on}$,$H_{v1}$,…,$H_{vn}$). | G | Fi | 0.8–1.2 hours [13] |
| $T_{uvi}$ | A variable with distribution function $F_{uvi}(t)=u(t-a_{vi})$ denoting the holding time of the ith VNF from degradation to failover state. | U | I | 0–2 months |
| $T_{uoi}$ | A variable with distribution function $F_{uoi}(t)=u(t-a_{oi})$ denoting the holding time of the ith OS from degradation to migration state. | U | I | 0–2 months |

a) Type of distribution function that a random variable follows. E, G, and U denote exponential, general, and unit step function, respectively.
b) Type of holding time denoted by the random variable. A, Fa, M, O, Fi, I, and V denote aging, failure, migration, OS rebooting, fixing, interval, and VNF restarting times, respectively.
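The count of 5n+4 meaningful states in Table 2 can be checked with a short enumeration. The sketch below is illustrative only: the function name `meaningful_states` is hypothetical, and the ordering of the states that Table 2 elides with "…" (all OS degradation states before all VNF degradation states, all migration states before all failover states) is an assumption chosen to agree with the indices that Table 2 does list.

```python
def meaningful_states(n):
    """Return [(name, os_states, vnf_states, available)] for an n-sized SFC."""
    healthy = ["H"] * n

    def one(symbol, i):
        # All-healthy vector with component i (0-based) set to `symbol`.
        vec = healthy.copy()
        vec[i] = symbol
        return vec

    rows = [
        (healthy, healthy, True),        # S_sc0: every component healthy
        (["F"] * n, ["F"] * n, False),   # S_sc1: system failed
        (["R"] * n, ["R"] * n, False),   # S_sc2: all OS reboot, all VNFs restart
        (healthy, ["R"] * n, False),     # S_sc3: all VNFs restart
    ]
    rows += [(one("R", i), one("R", i), False) for i in range(n)]  # i-th OS reboot + VNF restart
    rows += [(one("D", i), healthy, True) for i in range(n)]       # i-th OS degraded
    rows += [(healthy, one("D", i), True) for i in range(n)]       # i-th VNF degraded
    rows += [(one("M", i), healthy, True) for i in range(n)]       # i-th OS migrating
    rows += [(healthy, one("G", i), True) for i in range(n)]       # i-th VNF failing over
    return [(f"S_sc{k}", tuple(o), tuple(v), up)
            for k, (o, v, up) in enumerate(rows)]


if __name__ == "__main__":
    n = 3
    states = meaningful_states(n)
    # 5n+4 meaningful states, and 6^(2n) - 5n - 4 ignored ones.
    print(len(states), 6 ** (2 * n) - len(states))  # 19 46637
```

For n=3 this gives the 19 states of the model in Fig. 4, of which 4n+1=13 are marked available.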

Formulas for Calculating Steady-State Availability of n-Sized SFC Service
This section describes the process of calculating steady-state availability of n-sized SFC service. The details are shown as follows. The steady-state availability of n-sized SFC service can be defined by summing the steady-state probabilities of all the system states at which the system is available. Thus, we can get the steady-state availability π_sc of n-sized SFC service by using Equation (1) below.

(1)

where $v_{S_{sci}}$ is the steady-state probability of the EDTMC at system state $S_{sci}$, while $h_{S_{sci}}$ is the mean sojourn time at system state $S_{sci}$. First, we use Equations (2)–(10) to calculate the mean sojourn time $h_{S_{sci}}$ at system state $S_{sci}$.

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

where A={k|1≤k≤n} and i∈[1,n]. We then calculate the steady-state probability $v_{S_{sci}}$ of the EDTMC at system state $S_{sci}$ by using Equations (11)–(17).

(11)

(12)

(13)

(14)

(15)

(16)

(17)

where i∈[4,n+3], j∈[n+4,3n+3], k∈[3n+4,5n+3] and p* can be calculated by using Equations (18)–(43).

(18)

(19)

(20)

(21)

(22)

(23)

(24)

(25)

(26)

(27)

(28)

(29)

(30)

(31)

(32)

(33)

(34)

(35)

(36)

(37)

(38)

(39)

(40)

(41)

(42)

(43)

where A={k|1≤k≤n}, A_i'={k|1≤k≤n,k≠i} and i∈[1,n].
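The overall procedure of this section (solve the EDTMC for its steady-state probabilities, weight them by the mean sojourn times, and sum over the available states) can be sketched in a few lines. This is a minimal sketch using the standard SMP steady-state result, not the paper's closed-form expressions. The four-state chain in the usage example (healthy, degraded, rejuvenating, failed), its branching probability `q`, and its sojourn times are hypothetical values chosen only to exercise the code.

```python
def edtmc_stationary(P):
    """Solve v = v P with sum(v) = 1 by Gauss-Jordan elimination (pure Python)."""
    n = len(P)
    # Row i holds the balance equation sum_j v_j P[j][i] - v_i = 0.
    A = [[P[j][i] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n                      # replace one equation by normalization
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def smp_availability(P, h, available):
    """Steady-state availability: sum of v_i * h_i over available states,
    divided by the same sum over all states."""
    v = edtmc_stationary(P)
    total = sum(vi * hi for vi, hi in zip(v, h))
    return sum(v[i] * h[i] for i in available) / total


if __name__ == "__main__":
    q = 0.1                                # hypothetical: P(failure before trigger)
    P = [
        [0.0, 1.0, 0.0, 0.0],              # healthy      -> degraded
        [0.0, 0.0, 1.0 - q, q],            # degraded     -> rejuvenating or failed
        [1.0, 0.0, 0.0, 0.0],              # rejuvenating -> healthy
        [1.0, 0.0, 0.0, 0.0],              # failed       -> healthy
    ]
    h = [100.0, 10.0, 0.5, 5.0]            # hypothetical mean sojourn times
    print(smp_availability(P, h, available=[0, 1]))
```

For the full model, `P` would hold the transition probabilities p* of Equations (18)–(43), `h` the mean sojourn times of Equations (2)–(10), and `available` the indices of the states marked "Yes" in Table 2.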

Numerical Results

In this section, we first verified the approximate accuracy of the proposed model and formulas, i.e., Equations (1)–(43), by comparing numerical and simulation results. We then performed a numerical sensitivity analysis to examine the impact of various system parameters on SFC service availability. Finally, through numerical experiments, we analyzed the impact of the number of VNFs and OS on SFC service availability and jointly determined the optimal failover trigger interval and migration trigger interval that achieve approximately maximum SFC service availability. Experiment configurations are introduced in Section 4.1, while Sections 4.2–4.5 show the numerical results.

Experimental Configurations
In this paper, the failure time is assumed to follow a hypo-exponential distribution, the trigger interval time is assumed to follow the unit step function, and the time intervals of other events are assumed to follow an exponential distribution.

It should be noted that the exponential and hypo-exponential distributions are just one set of examples; other distributions can also be used for numerical experiments. Tables 3 and 4 give the definition of each variable. Some parameters are set according to [13], while the remaining parameters are set in order to demonstrate the effectiveness of the proposed model. We developed a simulator in Maple to verify the approximate accuracy of the proposed model and formulas. The simulation experiments were run 40,000,000 times, and the reported results are averages over these runs. Based on the formulas derived in the previous section, we performed numerical experiments. Both the simulation and the numerical experiments were conducted in Maple [32]. It should be noted that, unlike the existing models, our model captures the aging and recovery behaviors of all VNFs and host OS in a container-based SFC system composed of any number of VNFs. Thus, we do not make a comparison with the existing models.
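For readers who want to experiment with these three distribution families, the following sketch gives their textbook cumulative distribution functions and a numerical mean holding time. It is an illustration in Python rather than the Maple code used in the paper. The two-stage form of the hypo-exponential distribution is an assumption (suggested by the two failure parameters per component in Table 5), and all function names and rate values are hypothetical.

```python
import math


def exp_cdf(t, rate):
    """Exponential CDF with the given rate (mean 1/rate)."""
    return 1.0 - math.exp(-rate * t) if t >= 0 else 0.0


def hypoexp_cdf(t, rate1, rate2):
    """Two-stage hypo-exponential CDF (sum of two exponentials, rate1 != rate2)."""
    if t < 0:
        return 0.0
    return 1.0 - (rate2 * math.exp(-rate1 * t)
                  - rate1 * math.exp(-rate2 * t)) / (rate2 - rate1)


def unit_step_cdf(t, a):
    """Unit step u(t - a): a deterministic trigger interval of length a."""
    return 1.0 if t >= a else 0.0


def mean_holding_time(cdf, upper, steps=100_000):
    """Approximate E[T] as the integral of 1 - F(t) over [0, upper] (midpoint rule)."""
    dt = upper / steps
    return sum(1.0 - cdf((k + 0.5) * dt) for k in range(steps)) * dt


if __name__ == "__main__":
    # Hypothetical rates; the hypo-exponential mean should be 1/rate1 + 1/rate2.
    print(mean_holding_time(lambda t: exp_cdf(t, 2.0), upper=40.0))
    print(mean_holding_time(lambda t: hypoexp_cdf(t, 1.0, 2.0), upper=40.0))
    print(mean_holding_time(lambda t: unit_step_cdf(t, 3.0), upper=40.0))
```

Unlike the exponential distribution, the hypo-exponential distribution has an increasing failure rate, which is why it is a natural choice for aging-induced failure times.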

Validation of Our Proposed Model and Formulas
Fig. 5 shows a comparison between the simulation and numerical results for 3-sized SFC service availability. The numerical and corresponding simulation results are denoted by “n=3_num” and “n=3_sim,” respectively. The close agreement between the numerical and corresponding simulation results validates the approximate accuracy of the proposed model and formulas. All simulation results are reported at a 95% confidence level.

Fig. 5. Comparison between simulation results and numerical results.
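The validation idea (compare an analytical availability against a simulated one) can be illustrated on a deliberately small example. The sketch below is not the paper's Maple simulator or its 5n+4-state model. It assumes a single component that ages after an exponential time, then either reaches a deterministic trigger interval `a` and is rejuvenated or fails first after a two-stage hypo-exponential time and is fixed. All names and parameter values are hypothetical.

```python
import math
import random


def analytic_availability(h_aging, a, l1, l2, h_rejuv, h_fix):
    """Renewal-reward availability of the single-component toy model."""
    survive = (l2 * math.exp(-l1 * a) - l1 * math.exp(-l2 * a)) / (l2 - l1)
    # Mean time spent degraded: E[min(T_fail, a)] = integral of the survival function.
    up_degraded = (l2 / l1 * (1.0 - math.exp(-l1 * a))
                   - l1 / l2 * (1.0 - math.exp(-l2 * a))) / (l2 - l1)
    up = h_aging + up_degraded
    down = survive * h_rejuv + (1.0 - survive) * h_fix
    return up / (up + down)


def simulated_availability(h_aging, a, l1, l2, h_rejuv, h_fix, cycles, seed=1):
    """Monte Carlo estimate: total uptime divided by total elapsed time."""
    rng = random.Random(seed)
    up = down = 0.0
    for _ in range(cycles):
        up += rng.expovariate(1.0 / h_aging)                  # healthy until aging
        t_fail = rng.expovariate(l1) + rng.expovariate(l2)    # hypo-exponential failure
        if t_fail < a:
            up += t_fail                                      # failed before the trigger
            down += rng.expovariate(1.0 / h_fix)
        else:
            up += a                                           # trigger fired first
            down += rng.expovariate(1.0 / h_rejuv)
    return up / (up + down)


if __name__ == "__main__":
    params = dict(h_aging=10.0, a=0.5, l1=1.0, l2=2.0, h_rejuv=0.1, h_fix=1.0)
    print(analytic_availability(**params))
    print(simulated_availability(**params, cycles=100_000))
```

With enough cycles the two values agree to within the simulation's sampling error, which is the same kind of agreement that Fig. 5 reports for the full model.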

Impact of Various System Parameters on SFC Service Availability
This section illustrates the impact of various system parameters on SFC service availability. We compute the scaled sensitivity of the steady-state availability Y with respect to each parameter σ. Table 5 shows the sensitivity of SFC service availability to different parameters of both the first OS and the first VNF under n=3, with the parameters of other components being fixed.

Table 5. Sensitivity of SFC service availability to different parameters

| Parameter | Sensitivity value | Parameter | Sensitivity value |
|---|---|---|---|
| αv1 | -9.23E-07 | βv1 | 2.12E-13 |
| αo1 | -6.86E-07 | γo1 | 2.82E-12 |
| λv11 | -4.45E-07 | δ1 | 4.14E-08 |
| λv12 | -8.41E-07 | θ | 5.25E-07 |
| λo11 | -3.49E-07 | μ | 3.41E-08 |
| λo12 | -5.12E-07 | σ | 3.99E-06 |
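A scaled sensitivity of this kind can be estimated numerically for any availability function. The sketch below assumes the commonly used definition, (σ/Y)·∂Y/∂σ, which may differ in form from the expression used in the paper, and approximates the derivative by a central difference. The two-state `toy_availability` model and its rates are hypothetical and serve only to check the function against a known closed form.

```python
def scaled_sensitivity(availability, params, name, rel_step=1e-4):
    """Central-difference estimate of (sigma / Y) * dY/dsigma for parameter `name`."""
    sigma = params[name]
    step = sigma * rel_step
    y_hi = availability(**{**params, name: sigma + step})
    y_lo = availability(**{**params, name: sigma - step})
    y = availability(**params)
    return sigma / y * (y_hi - y_lo) / (2.0 * step)


def toy_availability(failure_rate, repair_rate):
    # Hypothetical two-state up/down model, used only to exercise the function.
    return repair_rate / (failure_rate + repair_rate)


if __name__ == "__main__":
    params = {"failure_rate": 1e-3, "repair_rate": 1.0}
    for name in params:
        print(name, scaled_sensitivity(toy_availability, params, name))
```

In this toy model the sign pattern matches Table 5: availability has a negative scaled sensitivity to the rate at which things go wrong and a positive one to the rate at which they are recovered.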

We observe that SFC service availability increases with parameters βv1, γo1, δ1, θ, μ, and σ (positive sensitivity values in Table 5) and decreases as all other parameters increase (negative sensitivity values). Parameters αv1 and σ have the greatest impact on SFC service availability. That is, parameters related to VNF-aging and system-fixing times deserve more attention in improving SFC service availability. This can be explained as follows. Increases in migration, failover, OS rebooting, VNF restarting, and system-fixing times all lead to an increase in the probability of system failure, resulting in decreased availability. In the numerical experiment, these times are assumed to follow exponential distributions whose means are the reciprocals of the corresponding rate parameters. Therefore, availability increases with these parameters, as shown by the positive values in Table 5. Also, an increase in aging and failure times leads to an increase in the time when the system is available, resulting in increased availability. The aging times are assumed to follow exponential distributions with means of 1/αo1 and 1/αv1, while the failure times are assumed to follow hypo-exponential distributions whose means decrease as parameters λv11, λv12, λo11, and λo12 increase. Therefore, availability decreases as parameters αv1, αo1, λv11, λv12, λo11, and λo12 increase, as shown by the negative values in Table 5.

Impact of Rejuvenation Trigger Intervals on SFC Service Availability
This section illustrates the impact of both VNF failover trigger and container migration trigger intervals on SFC service availability. Fig. 6 shows how SFC service availability varies with the failover trigger interval of the first VNF and the migration trigger interval of the container running on the first OS under n=3, with the parameters of other components being fixed.

Fig. 6. SFC service availability over failover trigger and migration trigger intervals.

We can observe that SFC service availability increases gradually with an increasing failover trigger interval av1 and migration trigger interval ao1, respectively, and decreases gradually after reaching a maximum value of 0.9999900224 at ao1=0.0149 hours and av1=4.1513 hours. This can be explained by the fact that the SFC service is still available for a certain period of time after VNF or OS aging occurs. Therefore, while the trigger intervals are smaller than the optimal point, the time during which the system is available increases as the trigger interval increases, leading to increased availability. When the trigger interval becomes too large, the probability of system failure grows with it, which decreases availability.
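The trade-off described above can be reproduced qualitatively with a single-component toy model and a grid search over the trigger interval. This sketch is not the paper's two-dimensional optimization over av1 and ao1. It assumes one component with exponential aging, a two-stage hypo-exponential failure time after aging, and a deterministic trigger interval `a`. All parameter values and time units are hypothetical.

```python
import math


def availability(a, h_aging=10.0, l1=1.0, l2=2.0, h_rejuv=0.1, h_fix=1.0):
    """Availability of the toy model as a function of the trigger interval a."""
    # Probability that the trigger fires before the aged component fails.
    survive = (l2 * math.exp(-l1 * a) - l1 * math.exp(-l2 * a)) / (l2 - l1)
    # Mean time spent degraded but still available: E[min(T_fail, a)].
    up_degraded = (l2 / l1 * (1.0 - math.exp(-l1 * a))
                   - l1 / l2 * (1.0 - math.exp(-l2 * a))) / (l2 - l1)
    up = h_aging + up_degraded
    down = survive * h_rejuv + (1.0 - survive) * h_fix
    return up / (up + down)


def best_trigger(grid):
    """Return the grid point with the highest availability."""
    return max(grid, key=availability)


if __name__ == "__main__":
    grid = [k * 1e-4 for k in range(501)]      # trigger intervals from 0 to 0.05
    a_opt = best_trigger(grid)
    print(a_opt, availability(a_opt), availability(0.0), availability(grid[-1]))
```

Even in this reduced setting the optimum is interior: triggering immediately wastes the time during which the aged component still serves requests, while waiting too long lets the costly failure path dominate. The full model optimizes both intervals jointly because, as noted in the Introduction, they affect each other.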

Impact of the Number of VNFs and OS on SFC Service Availability
This section illustrates the impact of the number of VNFs and OS on SFC service availability. Fig. 7 shows the numerical results of SFC service availability under different numbers of VNFs and OS and different fixing times. We can observe that SFC service availability decreases as the number of VNFs and OS increases. This is because an increase in the number of VNFs and OS that could fail leads to an increase in the time that the SFC system stays in failure states, resulting in decreased availability.

Fig. 7. SFC service availability over the number of VNFs and OS.

Conclusion and Future Work

In this paper, we developed a multi-dimensional SMP model to describe the aging and recovery behaviors of all VNFs and host OS in a container-based SFC system comprising n VNFs. Specifically, our model can describe the time-dependent behaviors between a VNF and host OS, as well as among VNFs or host OS themselves, in a situation where the failure and recovery times follow general distributions. We derived the formulas for calculating the n-sized SFC service availability, and analyzed the impact of various system parameters and the number of VNFs and OS on SFC service availability. We also jointly determined the optimal failover trigger interval and migration trigger interval that maximize SFC service availability, thereby helping cloud service providers maximize the benefits. However, we assumed that the backup resources are sufficient. In reality, backup resources may not always be sufficient, and the aging and failure of backup resources can affect SFC service availability. We will extend our model to analyze the impact of backup resources on SFC service availability. In addition, the proposed model assumes that the aging times follow exponential distributions, whereas in practice the aging times can follow any type of distribution. In future work, we will extend our model to investigate this situation.

Authors’ Contributions

Conceptualization: ZYS, LYR, CHC; investigation and methodology: ZYS, LYR, CHC; writing-original draft preparation: ZYS, LYR; writing-review and editing: ZYS, CHC; software: ZYS, LYR; validation: ZYS, LYR. All the authors have proofread the final version.

Funding

None.

Competing Interests

The authors declare that they have no competing interests.

Author Biography

Name : Yingsi Zhao
Affiliation : School of Economics and Management, Beijing Jiaotong University
Biography : Yingsi Zhao received the bachelor’s degree in Engineering in 2007 and, after completing coursework in Communication Engineering at Beijing Jiaotong University, Beijing, China, the master’s degree in Engineering in 2009. She received the doctoral degree in Management in 2014 and has worked as a teacher in the School of Economics and Management of Beijing Jiaotong University since 2014. Her research interests are in the area of enterprise management, including but not limited to human resources, complex networks, innovation performance, and blockchain technology.

Name : Yaru Li
Affiliation : School of Computer and Information Technology, Beijing Jiaotong University
Biography : Yaru Li is a master’s student at the School of Computer and Information Technology, Beijing Jiaotong University. Her research interest is network security.

Name : Han-Chieh Chao
Affiliation : Department of Electrical Engineering, National Dong Hwa University
Biography : Han-Chieh Chao received the M.S. and Ph.D. degrees in electrical engineering from Purdue University, West Lafayette, IN, USA, in 1989 and 1993, respectively. He has been with the Department of Electrical Engineering of National Dong Hwa University, Hualien City, Taiwan, since February 2016. He has published nearly 500 peer-reviewed professional research papers. His research interests include high-speed networks, wireless networks, IPv6-based networks, and artificial intelligence. Dr. Chao is the Editor-in-Chief of the Journal of Internet Technology. He has served as a Guest Editor for ACM MONET, the IEEE Journal on Selected Areas in Communications, IEEE Communications Magazine, IEEE Systems Journal, Computer Communications, IEEE Proceedings Communications, Wireless Personal Communications, and Wireless Communications and Mobile Computing. He is a Fellow of IET.

References

[1] C. Wang, Q. Hu, D. Yu, and X. Cheng, “Proactive deployment of chain-based VNF backup at the edge using online bandit learning,” in Proceedings of 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), Washington DC, 2021, pp. 740-750.
[2] Markets and Markets, “Network function virtualization (NFV) market by component (solutions, orchestration and automation, and professional services), virtualized network function, application (virtual appliance and core network), end user, and region - global forecast to 2024,” 2020 [Online]. Available: https://www.marketsandmarkets.com/Market-Reports/network-function-virtualization-market-93929190.html.
[3] M. Ozdem and M. Alkan, “Subscriber aware dynamic service function chaining,” Computer Networks, vol. 194, article no. 108138, 2021. https://doi.org/10.1016/j.comnet.2021.108138
[4] L. Jiang, X. Chang, J. Misic, V. B. Misic, and J. Bai, “Understanding MEC empowered vehicle task offloading performance in 6G networks,” Peer-to-Peer Networking and Applications, vol. 15, no. 2, pp. 1090-1104, 2022.
[5] Y. Mansouri and M. A. Babar, “A review of edge computing: Features and resource virtualization,” Journal of Parallel and Distributed Computing, vol. 150, pp. 155-183, 2021.
[6] R. Pietrantuono and S. Russo, “A survey on software aging and rejuvenation in the cloud,” Software Quality Journal, vol. 28, no. 3, pp. 7-38, 2020.
[7] H. Zhu, J. Bai, X. Chang, J. Misic, V. Misic, and Y. Yang, “Stochastic model-based quantitative analysis of edge UPF service dependability,” in Algorithms and Architectures for Parallel Processing. Cham, Switzerland: Springer, 2020, pp. 619-632.
[8] D. Cotroneo, L. De Simone, and R. Natella, “NFV-bench: a dependability benchmark for network function virtualization systems,” IEEE Transactions on Network and Service Management, vol. 14, no. 4, pp. 934-948, 2017.
[9] F. Machida, J. Xiang, K. Tadano, and Y. Maeno, “Lifetime extension of software execution subject to aging,” IEEE Transactions on Reliability, vol. 66, no. 1, pp. 123-134, 2017.
[10] K. S. Trivedi and A. Bobbio, Reliability and Availability Engineering: Modeling, Analysis, and Applications. Cambridge, UK: Cambridge University Press, 2017.
[11] M. Di Mauro, M. Longo, and F. Postiglione, “Availability evaluation of multi-tenant service function chaining infrastructures by multidimensional universal generating function,” IEEE Transactions on Services Computing, vol. 14, no. 5, pp. 1320-1332, 2021.
[12] B. Tola, G. Nencioni, B. E. Helvik, and Y. Jiang, “Modeling and evaluating NFV-enabled network services under different availability modes,” in Proceedings of 2019 15th International Conference on the Design of Reliable Communication Networks (DRCN), Coimbra, Portugal, 2019, pp. 1-5.
[13] B. Tola, Y. Jiang, and B. E. Helvik, “Model-driven availability assessment of the NFV-MANO with software rejuvenation,” IEEE Transactions on Network and Service Management, vol. 18, no. 3, pp. 2460-2477, 2021.
[14] J. Fan, C. Guan, Y. Zhao, and C. Qiao, “Availability-aware mapping of service function chains,” in Proceedings of IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, Atlanta, GA, 2017, pp. 1-9.
[15] G. Moualla, T. Turletti, and D. Saucez, “An availability-aware SFC placement algorithm for fat-tree data centers,” in Proceedings of 2018 IEEE 7th International Conference on Cloud Networking (CloudNet), Tokyo, Japan, 2018, pp. 1-4.
[16] M. Wang, B. Cheng, S. Wang, and J. Chen, “Availability- and traffic-aware placement of parallelized SFC in data center networks,” IEEE Transactions on Network and Service Management, vol. 18, no. 1, pp. 182-194, 2021.
[17] M. Di Mauro, M. Longo, F. Postiglione, G. Carullo, and M. Tambasco, “Service function chaining deployed in an NFV environment: An availability modeling,” in Proceedings of 2017 IEEE Conference on Standards for Communications and Networking (CSCN), Helsinki, Finland, 2017, pp. 42-47.
[18] M. Di Mauro, G. Galatro, M. Longo, F. Postiglione, and M. Tambasco, “IP multimedia subsystem in an NFV environment: availability evaluation and sensitivity analysis,” in Proceedings of 2018 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), Verona, Italy, 2018, pp. 1-6.
[19] M. F. Shoyari, E. Ataie, R. Entezari-Maleki, and A. Movaghar, “Availability modeling in redundant OpenStack private clouds,” Software: Practice and Experience, vol. 51, no. 6, pp. 1218-1241, 2021.
[20] J. Bai, X. Chang, F. Machida, K. S. Trivedi, and Z. Han, “Analyzing software rejuvenation techniques in a virtualized system: service provider and user views,” IEEE Access, vol. 8, pp. 6448-6459, 2020.
[21] J. Bai, X. Chang, G. Ning, Z. Zhang, and K. S. Trivedi, “Service availability analysis in a virtualized system: a Markov regenerative model approach,” IEEE Transactions on Cloud Computing, 2020. https://doi.org/10.1109/TCC.2020.3028648
[22] M. Nadjafi, M. A. Farsi, E. Zio, and A. K. Mousavi, “Fault trees analysis using expert opinion based on fuzzy-bathtub failure rates,” Quality and Reliability Engineering International, vol. 34, no. 6, pp. 1142-1157, 2018.
[23] J. Bai, X. Chang, K. S. Trivedi, and Z. Han, “Resilience-driven quantitative analysis of vehicle platooning service,” IEEE Transactions on Vehicular Technology, vol. 70, no. 6, pp. 5378-5389, 2021.
[24] J. Bai, X. Chang, F. Machida, L. Jiang, Z. Han, and K. S. Trivedi, “Impact of service function aging on the dependability for MEC service function chain,” IEEE Transactions on Dependable and Secure Computing, 2022. https://doi.org/10.1109/TDSC.2022.3150782
[25] S. Basheer, S. Bhatia, and S. B. Sakri, “Computational modeling of dementia prediction using deep neural network: analysis on OASIS dataset,” IEEE Access, vol. 9, pp. 42449-42462, 2021.
[26] K. Chakraborty, S. Bhatia, S. Bhattacharyya, J. Platos, R. Bag, and A. E. Hassanien, “Sentiment analysis of COVID-19 tweets by deep learning classifiers: a study to show how popularity is affecting accuracy in social media,” Applied Soft Computing, vol. 97, article no. 106754, 2020. https://doi.org/10.1016/j.asoc.2020.106754
[27] R. A. Sheikh, S. Bhatia, S. G. Metre, and A. Y. A. Faqihi, “Strategic value realization framework from learning analytics: a practical approach,” Journal of Applied Research in Higher Education, vol. 14, no. 2, pp. 693-713, 2022.
[28] M. T. Quasim, A. Shaikh, M. Shuaib, A. Sulaiman, S. Alam, and Y. Asiri, “Smart healthcare management evaluation using fuzzy decision making method,” 2021 [Online]. Available: https://www.researchsquare.com/article/rs-424702/v1.
[29] K. Bhalla, D. Koundal, S. Bhatia, M. K. I. Rahmani, and M. Tahir, “Fusion of infrared and visible images using fuzzy based siamese convolutional network,” Computers, Materials & Continua, vol. 70, no. 3, pp. 5503-5518, 2022.
[30] S. Bhatia, S. Alam, M. Shuaib, M. H. Alhameed, F. Jeribi, and R. I. Alsuwailem, “Retinal vessel extraction via assisted multi-channel feature map and U-Net,” Frontiers in Public Health, vol. 10, article no. 858327, 2022. https://doi.org/10.3389/fpubh.2022.858327
[31] P. Kaur, S. Harnal, R. Tiwari, S. Upadhyay, S. Bhatia, A. Mashat, and A. M. Alabdali, “Recognition of leaf disease using hybrid convolutional neural network by applying feature reduction,” Sensors, vol. 22, no. 2, article no. 575, 2022. https://doi.org/10.3390/s22020575
[32] Maplesoft, “Maple,” 2017 [Online]. Available: http://www.maplesoft.com/products/maple.
