Bert Zwart - Academia.edu
Papers by Bert Zwart
Eurandom, Sep 4, 2012
We investigate a computer network consisting of two layers occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network. At the second layer, active servers at these nodes now act as customers who are served by a common CPU. Our main result shows a separation of time scales in heavy traffic: the main source of randomness occurs at the (aggregate) CPU layer; the interactions between different types of nodes at the other layer are shown to converge to a fixed point at a faster time scale; this also yields a state-space collapse property. Apart from these fundamental insights, we also obtain an explicit approximation for the joint law of the number of jobs in the system, which is provably accurate for heavily loaded systems and performs numerically well for moderately loaded systems. The obtained results for the model under consideration can be applied to thread-pool dimensioning in application servers, while the technique seems applicable to other layered systems too.
Journal of Applied Probability, 2020
We investigate the probability that an insurance portfolio gets ruined within a finite time period under the assumption that the r largest claims are (partly) reinsured. We show that for regularly varying claim sizes the probability of ruin after reinsurance is also regularly varying in terms of the initial capital, and derive an explicit asymptotic expression for the latter. We establish this result by leveraging recent developments on sample-path large deviations for heavy tails. Our results allow, on the asymptotic level, for an explicit comparison between two well-known large-claim reinsurance contracts, namely LCR and ECOMOR. Finally, we assess the accuracy of the resulting approximations using state-of-the-art rare event simulation techniques.
We study large deviations asymptotics for unbounded additive functionals of one-dimensional Langevin diffusions with sub-linear gradient drifts. Our results allow us to obtain parametric insights on the speed and the rate functions in terms of the growth rate of the drift and the growth rate of the additive functional. We find a critical value in terms of these growth parameters which dictates regions of sub-linear speed for our large deviations asymptotics. Our proof technique hinges on various constructions of independent interest, including a suitable decomposition of the diffusion process in terms of alternating renewal cycles and a decomposition of the paths on excursion. The key to the sub-linear behavior is a heavy-tailed large deviations phenomenon arising from the fact that at each regeneration cycle the accumulated area of the diffusion process displays a semi-exponential behavior.
We study extreme values in certain fork-join queueing networks: consider N identical queues with a common arrival process and independent service processes. All arrival and service processes are deterministic with random perturbations following Brownian motions. We prove that as N → ∞, the scaled maximum of N steady-state queue lengths converges in distribution to a normally distributed random variable. We explore repercussions of this result for original equipment manufacturers (OEMs) that assemble a large number of components, each produced using specialized equipment, into complex systems. Component production capacity is subject to fluctuations, causing high risk of shortages of at least one component, which results in costly system production delays. OEMs hedge this risk by investing in a combination of excess production capacity and component inventories. We formulate a stylized model of the OEM that enables us to study the resulting trade-off between shortage risk, inventory c...
We consider the sample average of a centered random walk in ℝ^d with regularly varying step size distribution. For the first exit time from a compact convex set A not containing the origin, we show that its tail is of lognormal type. Moreover, we show that the typical way for a large exit time to occur is by having a number of jumps growing logarithmically in the scaling parameter.
We develop a non-parametric procedure for ranking transmission lines in a power system according to the probability that they will overload due to stochastic renewable generation or demand-side load fluctuations, and compare this procedure to several benchmark approaches. Using the IEEE 39-bus test network, we provide evidence that our approach, which statistically estimates the rate function for each line, is highly promising relative to alternative methods which count overload events or use incorrect parametric assumptions.
We propose a method for determining the most likely cause, in terms of conventional generator outages and renewable fluctuations, of power system frequency reaching a predetermined level that is deemed unacceptable to the system operator. Our parsimonious model of system frequency incorporates primary and secondary control mechanisms, and supposes that conventional outages occur according to a Poisson process and renewable fluctuations follow a diffusion process. We utilize a large deviations theory based approach that outputs the most likely cause of a large excursion of frequency from its desired level. These results yield the insight that current levels of renewable power generation do not significantly increase system vulnerability in terms of frequency deviations relative to conventional failures. However, for a large range of model parameters it is possible that such vulnerabilities may arise as renewable penetration increases.
In the present paper, we study the evolution of an overloaded cyclic polling model that starts empty. Exploiting a connection with multitype branching processes, we derive fluid asymptotics for the joint queue length process. Under passage to the fluid dynamics, the server switches between the queues infinitely many times in any finite time interval causing frequent oscillatory behavior of the fluid limit in the neighborhood of zero. Moreover, the fluid limit is random. Additionally, we suggest a method that establishes finiteness of moments of the busy period in an M/G/1 queue.
We consider a two-node tandem queueing network in which the upstream queue is M/G/1 and each job reuses its upstream service requirement when moving to the downstream queue. Both servers employ the first-in-first-out policy. We investigate the amount of work in the second queue at certain embedded arrival time points, namely when the upstream queue has just emptied. We focus on the case of infinite-variance service times and obtain a heavy traffic process limit for the embedded Markov chain.
In many applications, significant correlations between arrivals of load-generating events make the numerical evaluation of the load of a system a challenging problem. Here, we construct very accurate approximations of the workload distribution of the MAP/G/1 queue that capture the tail behavior of the exact workload distribution and provide a small relative error. Motivated by statistical analysis, we assume that the service times are a mixture of a phase-type and a heavy-tailed distribution. With the aid of perturbation analysis, we derive our approximations as a sum of the workload distribution of the MAP/PH/1 queue and a heavy-tailed component that depends on the perturbation parameter. We refer to our approximations as corrected phase-type approximations, and we exhibit their performance with a numerical study.
We consider a queuing model with the workload evolving between consecutive i.i.d. exponential timers {e_q^(i)}, i = 1, 2, ..., according to a spectrally positive Lévy process Y_i(t) that is reflected at zero, and where the environment i equals 0 or 1. When the exponential clock e_q^(i) ends, the workload, as well as the Lévy input process, are modified; this modification may depend on the current value of the workload, the maximum and the minimum workload observed during the previous cycle, and the environment i of the Lévy input process itself during the previous cycle. We analyse the steady-state workload distribution for this model. The main theme of the analysis is the systematic application of non-trivial functionals, derived within the framework of fluctuation theory of Lévy processes, to workload and queuing models.
Reflected Brownian motion (RBM) in a wedge is a 2-dimensional stochastic process Z whose state space in ℝ^2 is given in polar coordinates by S = {(r, θ) : r ≥ 0, 0 ≤ θ ≤ ξ} for some 0 < ξ < 2π. Let α = (θ_1 + θ_2)/ξ, where −π/2 < θ_1, θ_2 < π/2 are the directions of reflection of Z off each of the two edges of the wedge as measured from the corresponding inward facing normal. We prove that in the case of 1 < α < 2, RBM in a wedge is a Dirichlet process. Specifically, its unique Doob-Meyer type decomposition is given by Z = X + Y, where X is a two-dimensional Brownian motion and Y is a continuous process of zero energy. Furthermore, we show that for p > α, the strong p-variation of the sample paths of Y is finite on compact intervals, and, for 0 < p ≤ α, the strong p-variation of Y is infinite on [0,T] whenever Z has been started from the origin. We also show that on excursion intervals of Z away from the origi...
Numerical evaluation of ruin probabilities in the classical risk model is an important problem. If claim sizes are heavy-tailed, then such evaluations are challenging. To overcome this, an attractive way is to approximate the claim sizes with a phase-type distribution. What is not clear though is how many phases are enough in order to achieve a specific accuracy in the approximation of the ruin probability. The goals of this paper are to investigate the number of phases required so that we can achieve a pre-specified accuracy for the ruin probability and to provide error bounds. Also, in the special case of a completely monotone claim size distribution we develop an algorithm to estimate the ruin probability by approximating the excess claim size distribution with a hyperexponential one. Finally, we compare our approximation with the heavy traffic and heavy tail approximations.
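As a point of reference for the phase-type idea above, the simplest phase-type claim distribution is a single exponential phase, for which the classical (Cramér-Lundberg) ruin probability has a well-known closed form. The sketch below evaluates that baseline only; it is not the algorithm of the paper, and the function name and parameter names (lam for the Poisson claim rate, mean_claim, premium_rate) are illustrative assumptions.

```python
import math


def ruin_probability_exponential(u, lam, mean_claim, premium_rate):
    """Infinite-horizon ruin probability in the classical risk model
    with Poisson(lam) claim arrivals and exponential claim sizes.

    u            -- initial capital (>= 0)
    lam          -- claim arrival rate
    mean_claim   -- mean of the exponential claim size
    premium_rate -- premium income per unit time
    """
    # Net profit condition: premiums must exceed expected claims per unit time.
    if premium_rate <= lam * mean_claim:
        raise ValueError("net profit condition violated: ruin is certain")
    rho = lam * mean_claim / premium_rate             # ruin probability at u = 0
    adjustment = 1.0 / mean_claim - lam / premium_rate  # exponential decay rate
    return rho * math.exp(-adjustment * u)


if __name__ == "__main__":
    for capital in (0.0, 1.0, 5.0, 10.0):
        print(capital, ruin_probability_exponential(capital, 1.0, 1.0, 2.0))
```

With heavy-tailed claims no such closed form is available in general, which is why approximations with more phases, and bounds on their error, are of interest.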
We consider the steady-state distribution of the sojourn time of a job entering an M/GI/1 queue with the foreground-background scheduling policy in heavy traffic. The growth rate of its mean, as well as the limiting distribution, are derived under broad conditions. Assumptions commonly used in extreme value theory play a key role in both the analysis and the results.
Random multiple-access protocols of type ALOHA are used to regulate networks with a star configuration where client nodes talk to the hub node at the same frequency (finding a wide range of applications among telecommunication systems, including mobile telephone networks and WiFi networks). Such protocols control who talks at what time sharing the common idea "try to send your data and, if your message collides with another transmission, try resending later". In the present paper, we consider a time-slotted ALOHA model where users are allowed to renege before transmission completion. We focus on the scenario that leads to overload in the absence of impatience. Under mild assumptions, we show that the fluid (or law-of-large-numbers) limit of the system workload coincides a.s. with the unique solution to a certain integral equation. We also demonstrate that the fluid limits for distinct initial conditions converge to the same value as time tends to infinity.
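The "try to send, and resend later on collision" idea can be illustrated with a minimal slotted ALOHA sketch. It assumes a fixed population of n_users, each transmitting independently with probability p in every slot, and it does not model reneging or the workload process studied in the paper; all names are illustrative assumptions.

```python
import random


def slot_success_probability(n_users: int, p: float) -> float:
    """Probability that exactly one of n_users transmits in a slot,
    i.e. that the slot carries a collision-free transmission."""
    if n_users == 0:
        return 0.0
    return n_users * p * (1.0 - p) ** (n_users - 1)


def simulate_success_fraction(n_users: int, p: float, n_slots: int, seed: int = 0) -> float:
    """Fraction of slots with exactly one transmission in a simulation."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_slots):
        # Each user independently decides whether to transmit in this slot.
        transmitting = sum(rng.random() < p for _ in range(n_users))
        if transmitting == 1:  # zero means idle slot, two or more means collision
            successes += 1
    return successes / n_slots


if __name__ == "__main__":
    n, p = 10, 0.1
    print("formula:   ", slot_success_probability(n, p))
    print("simulation:", simulate_success_fraction(n, p, 100_000))
```

In this simplified setting the per-slot success probability is largest at p = 1/n_users, which hints at why a growing backlog of retransmitting users degrades throughput.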
Motivated by a capacity allocation problem within a finite planning period, we conduct a transient analysis of a single-server queue with Lévy input. From a cost minimization perspective, we investigate the error induced by using stationary congestion measures as opposed to time-dependent measures. Invoking recent results from fluctuation theory of Lévy processes, we derive a refined cost function that accounts for transient effects. This leads to a corrected capacity allocation rule for the transient single-server queue. Extensive numerical experiments indicate that the cost reductions achieved by this correction can be significant.
We consider a GI/GI/1 queue with the shortest remaining processing time discipline (SRPT) and light-tailed service times. Our interest is focused on the tail behavior of the sojourn-time distribution. We obtain a general expression for its large-deviations decay rate. The value of this decay rate critically depends on whether there is mass in the endpoint of the service-time distribution or not. An auxiliary priority queue, for which we obtain some new results, plays an important role in our analysis. We apply our SRPT results to compare SRPT with FIFO from a large-deviations point of view.
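To make the two disciplines being compared concrete, the sketch below computes per-job sojourn times under FIFO and under preemptive SRPT for a fixed list of (arrival_time, size) pairs on a single server. It only illustrates the scheduling rules on a sample path, not the large-deviations analysis; the function names and the job list are illustrative assumptions.

```python
import heapq


def fifo_sojourn_times(jobs):
    """jobs: list of (arrival_time, size) sorted by arrival_time."""
    t = 0.0
    sojourn = []
    for arrival, size in jobs:
        t = max(t, arrival) + size       # start when server frees up, run to completion
        sojourn.append(t - arrival)
    return sojourn


def srpt_sojourn_times(jobs):
    """Preemptive SRPT: always serve the job with the least remaining work."""
    n = len(jobs)
    sojourn = [0.0] * n
    heap = []                            # entries are (remaining_work, job_index)
    t = 0.0
    i = 0                                # index of the next job yet to arrive
    while i < n or heap:
        if not heap:                     # server idle: jump to the next arrival
            t = max(t, jobs[i][0])
        while i < n and jobs[i][0] <= t:  # admit everything that has arrived
            heapq.heappush(heap, (jobs[i][1], i))
            i += 1
        remaining, j = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        if remaining <= next_arrival - t:
            t += remaining               # job j finishes before the next arrival
            sojourn[j] = t - jobs[j][0]
        else:
            # Serve until the next arrival, then reconsider who has least work left.
            heapq.heappush(heap, (remaining - (next_arrival - t), j))
            t = next_arrival
    return sojourn


if __name__ == "__main__":
    jobs = [(0, 10), (1, 1), (2, 1)]     # one long job followed by two short ones
    print("FIFO:", fifo_sojourn_times(jobs))
    print("SRPT:", srpt_sojourn_times(jobs))
```

On this sample path SRPT lets the two short jobs overtake the long one, lowering the mean sojourn time while lengthening the long job's stay; effects of this kind are what a tail comparison between the two disciplines has to quantify.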
We study the convergence of the M/G/1 processor-sharing queue length process in the heavy traffic regime, in the finite variance case. To do so, we combine results pertaining to Lévy processes, branching processes and queuing theory. These results yield the convergence of long excursions of the queue length processes toward excursions obtained from those of some reflected Brownian motion with drift, after taking the image of their local time process by the Lamperti transformation. We also show, via excursion theoretic arguments, that this entails the convergence of the entire processes to some (other) reflected Brownian motion with drift. Along the way, we prove various invariance principles for homogeneous, binary Crump-Mode-Jagers processes. In the last section we discuss potential implications of the state space collapse property, well known in the queuing literature, to branching processes.
For a GI/GI/1 queue, we show that the average sojourn time under the (blind) Randomized Multilevel Feedback algorithm is no worse than that under the Shortest Remaining Processing Time algorithm times a logarithmic function of the system load. Moreover, it is verified that this bound is tight in heavy traffic, up to a constant multiplicative factor. We obtain this result by combining techniques from two disparate areas: competitive analysis and applied probability.