Pareto Distribution Research Papers - Academia.edu

The Z-value is an attempt to estimate the statistical significance of a Smith-Waterman dynamic alignment score (SW-score) through the use of a Monte-Carlo process. It partly reduces the bias induced by the composition and length of the sequences. This paper is not a theoretical study on the distribution of SW-scores and Z-values. Rather, it presents a statistical analysis of Z-values on large datasets of protein sequences, leading to a law of probability that the experimental Z-values follow. First, we determine the relationships between the computed Z-value, an estimation of its variance, and the number of randomizations in the Monte-Carlo process. Next, we illustrate that Z-values are less correlated with sequence lengths than SW-scores. Finally, we show that pairwise alignments performed on 'quasi-real' sequences (i.e., randomly shuffled sequences of the same length and amino acid composition as the real ones) lead to Z-value distributions that statistically fit the extreme value distribution.
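A minimal sketch of the Monte-Carlo idea described above, assuming a toy Smith-Waterman scorer with made-up scoring parameters (match = 2, mismatch = -1, linear gap = -1); the actual studies use full substitution matrices and affine gap penalties:

```python
import random

def sw_score(a, b, match=2, mismatch=-1, gap=-1):
    """Toy Smith-Waterman local alignment score (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

def z_value(query, subject, n_shuffles=100, seed=0):
    """Monte-Carlo Z-value: compare the real SW-score with scores obtained
    after shuffling the query (same length and composition)."""
    rng = random.Random(seed)
    s0 = sw_score(query, subject)
    scores = []
    for _ in range(n_shuffles):
        shuffled = list(query)
        rng.shuffle(shuffled)               # preserves length and composition
        scores.append(sw_score("".join(shuffled), subject))
    mean = sum(scores) / n_shuffles
    var = sum((s - mean) ** 2 for s in scores) / (n_shuffles - 1)
    return (s0 - mean) / var ** 0.5

print(z_value("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MKTAYIAKQRQISFVK"))
```

The relationship the paper studies between the computed Z-value, its variance estimate, and the number of randomizations falls out of exactly this setup: the fewer the shuffles, the noisier `mean` and `var`, and hence the Z-value itself.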

The beta rank function (BRF), the normalized and continuous rank of an observation, has wide applications in fitting real-world data. The underlying probability density function (pdf) is not expressible in terms of elementary functions except for specific parameter values. We show, however, that it is approximately a unimodal skewed two-sided power law, or double-Pareto, or log-Laplacian distribution. We give closed-form expressions for both pdfs in terms of Fox H-functions and propose numerical algorithms to approximate them. We suggest a way to determine whether a data set follows a one-sided power law, a lognormal, a two-sided power law, or a BRF. Finally, we illustrate the usefulness of these distributions in data analysis through a few examples.
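As a concrete illustration of rank-function fitting, here is a sketch under the assumption that the BRF takes the commonly used form x(u) = A(1-u)^b / u^a with normalized rank u = r/(N+1); the fit is ordinary least squares in log space:

```python
import numpy as np

def fit_brf(values):
    """Fit x(u) = A * (1 - u)**b / u**a, u = r/(N+1), by log-space regression."""
    x = np.sort(np.asarray(values, dtype=float))[::-1]   # rank 1 = largest value
    n = len(x)
    u = np.arange(1, n + 1) / (n + 1)                    # normalized continuous rank
    # log x = log A - a*log(u) + b*log(1 - u)
    X = np.column_stack([np.ones(n), -np.log(u), np.log(1 - u)])
    (logA, a, b), *_ = np.linalg.lstsq(X, np.log(x), rcond=None)
    return np.exp(logA), a, b

# sanity check on synthetic data drawn from a known BRF
rng = np.random.default_rng(1)
u = rng.uniform(0.01, 0.99, 5000)
data = 3.0 * (1 - u) ** 0.4 / u ** 0.7
print(fit_brf(data))        # expect roughly (3.0, 0.7, 0.4)
```

The two exponents a and b control the two tails separately, which is why the pdf behaves like a two-sided power law.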

In 2003, the Roadmapping Initiative of the European Center for Power Electronics (ECPE) was started, based on a future vision of society in 2020, in order to define the future role of power electronics, identify technological barriers, and prepare new technologies well in advance. In the framework of this initiative, a new mathematically supported approach to roadmapping in power electronics has been developed. As described in this paper, the procedure relies on comprehensive mathematical modeling and subsequent multi-objective optimization of a converter system. The relationship between the technological base and the performance of the system then exists as a mathematical representation, whose optimization assures the best possible exploitation of the available degrees of freedom and technologies. Thus an objective Technology Node of a system is obtained, whereby physical limits are implicitly taken into account. Furthermore, the sensitivity of the system performance with ...
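The "Technology Node" rests on locating the Pareto front of all feasible designs. A minimal sketch, with made-up (efficiency, power-density) samples standing in for the outputs of the converter models:

```python
import numpy as np

def pareto_front(points):
    """Return the non-dominated points (both objectives maximized)."""
    pts = np.asarray(points)
    keep = [i for i, p in enumerate(pts)
            if not np.any(np.all(pts >= p, axis=1) & np.any(pts > p, axis=1))]
    return pts[keep]

# hypothetical converter designs: efficiency [%] vs power density [kW/dm^3]
rng = np.random.default_rng(0)
density = rng.uniform(1, 20, 200)
efficiency = 99 - 0.15 * density - rng.exponential(0.5, 200)
front = pareto_front(np.column_stack([efficiency, density]))
print(f"{len(front)} non-dominated designs out of 200")
```

Every point on that front is the best achievable trade-off under the modeled technologies, which is what lets the roadmap treat physical limits implicitly.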

A memetic algorithm for tackling multiobjective optimization problems is presented. The algorithm employs the proven local search strategy used in the Pareto archived evolution strategy (PAES) and combines it with the use of a population and recombination. Verification of the new M-PAES (memetic PAES) algorithm is carried out by testing it on a set of multiobjective 0/1 knapsack problems. On each problem instance, a comparison is made between the new memetic algorithm, the (1+1)-PAES local searcher, and the strength Pareto evolutionary algorithm (SPEA).
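The heart of PAES is a bounded archive of non-dominated solutions. Here is a simplified sketch of the acceptance rule; the real algorithm evicts from the most crowded region of an adaptive grid, whereas this placeholder simply drops the oldest entry:

```python
import random

def dominates(a, b):
    """a dominates b (maximization) if a >= b everywhere and > somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def archive_update(archive, candidate, max_size=20):
    """PAES-style archive update, simplified."""
    if any(dominates(a, candidate) for a in archive):
        return archive                                    # candidate rejected
    archive = [a for a in archive if not dominates(candidate, a)]
    archive.append(candidate)
    if len(archive) > max_size:
        archive.pop(0)     # placeholder for PAES's crowding-grid eviction
    return archive

# feed the archive random objective vectors (e.g. profits of two knapsacks)
rng = random.Random(0)
archive = []
for _ in range(500):
    archive = archive_update(archive, (rng.uniform(0, 100), rng.uniform(0, 100)))
print(len(archive), "non-dominated solutions kept")
```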

A common problem in many areas of water resources engineering is that of analyzing hydrological and meteorological events for planning and design projects. For these purposes, information is required on rainfall events, flow depths, discharges, evapotranspiration levels, etc. that can be expected for a selected probability or return period. In the paper, the software tool RAINBOW is presented, which is designed to carry out such frequency analyses.
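A sketch of the kind of frequency analysis such a tool performs (this is not RAINBOW's actual implementation): fit an extreme-value distribution to hypothetical annual maxima and read off the event magnitude for a chosen return period T via x_T = F^{-1}(1 - 1/T):

```python
import numpy as np
from scipy import stats

# hypothetical annual maximum daily rainfall (mm), one value per year
rng = np.random.default_rng(42)
annual_max = stats.gumbel_r.rvs(loc=60, scale=15, size=40, random_state=rng)

loc, scale = stats.gumbel_r.fit(annual_max)
for T in (2, 5, 10, 50, 100):
    x_T = stats.gumbel_r.ppf(1 - 1 / T, loc, scale)   # T-year design event
    print(f"{T:>3}-year event: {x_T:.1f} mm")
```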

We consider the problem of maximum likelihood estimation of the parameters of the Pareto Type II (Lomax) distribution. We show that in a certain parametrization, and after modification of the parameter space to include the exponential distribution as a special case, the MLEs of the parameters always exist. Moreover, the MLEs have a non-standard asymptotic distribution in the exponential case due to the lack of regularity. Further, we develop a likelihood ratio test for exponentiality versus the Pareto II distribution. We emphasize that this problem is non-standard, and the limiting null distribution of the deviance statistic is not chi-square. We derive the relevant asymptotic theory as well as a convenient computational formula for the critical values of the test. An empirical power study and power comparisons with other tests are also provided. A problem from climatology involving precipitation data from hundreds of meteorological stations across North America provides a motivation for and an illustration of the new test.
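A sketch of the deviance computation behind such a likelihood ratio test, using scipy's Lomax and exponential fitters; note, as the abstract stresses, that the critical values must not be taken from a chi-square table:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = stats.lomax.rvs(c=2.5, scale=3.0, size=500, random_state=rng)

# alternative: Pareto II (Lomax), location fixed at 0
c_hat, _, scale_hat = stats.lomax.fit(data, floc=0)
ll_lomax = stats.lomax.logpdf(data, c_hat, scale=scale_hat).sum()

# null: exponential, the boundary case of the enlarged parameter space
_, mean_hat = stats.expon.fit(data, floc=0)
ll_exp = stats.expon.logpdf(data, scale=mean_hat).sum()

deviance = 2 * (ll_lomax - ll_exp)
print(f"deviance = {deviance:.2f}")
# The null distribution of this deviance is NOT chi-square (lack of
# regularity at the boundary); use the paper's asymptotics or simulation.
```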

This paper describes a simulation tool to aid the design of nutrient monitoring programmes in coastal waters. The tool is developed by using time series of water quality data from a Smart Buoy, an in situ monitoring device. The tool models the seasonality and temporal dependence in the data and then filters out these features to leave a white noise series. New data sets are then simulated by sampling from the white noise series and re-introducing the modelled seasonality and temporal dependence. Simulating many independent realisations allows us to study the performance of different monitoring designs and assessment methods. We illustrate the approach using total oxidised nitrogen (TOxN) and chlorophyll data from Liverpool Bay, U.K. We consider assessments of whether the underlying mean concentrations of these water quality variables are sufficiently low; i.e. below specified assessment concentrations. We show that for TOxN, even when mean concentrations are at background, daily data from a Smart Buoy or multi-annual sampling from a research vessel would be needed to obtain adequate power.
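A toy version of this pipeline on a synthetic stand-in for the buoy series, assuming one annual harmonic for seasonality and an AR(1) model for temporal dependence:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(3 * 365)                                  # three years, daily
obs = 10 + 4 * np.sin(2 * np.pi * t / 365) + 2 * rng.standard_normal(t.size)

# 1) model and remove seasonality (one annual harmonic)
X = np.column_stack([np.ones(t.size),
                     np.sin(2 * np.pi * t / 365), np.cos(2 * np.pi * t / 365)])
beta, *_ = np.linalg.lstsq(X, obs, rcond=None)
resid = obs - X @ beta

# 2) model and remove temporal dependence (AR(1) whitening)
phi = np.corrcoef(resid[:-1], resid[1:])[0, 1]
white = resid[1:] - phi * resid[:-1]                    # ~ white noise

# 3) simulate: resample the noise, re-colour, re-add seasonality
def simulate():
    e = rng.choice(white, size=t.size, replace=True)
    r = np.zeros(t.size)
    for i in range(1, t.size):
        r[i] = phi * r[i - 1] + e[i]
    return X @ beta + r

sims = [simulate() for _ in range(100)]   # ensemble for power calculations
```

Each simulated series can then be subsampled according to a candidate monitoring design to estimate the power of the corresponding assessment.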

Vilfredo Pareto (1848–1923) studied the inequality of welfare distribution in Italy during the nineteenth century and developed a useful tool named "the 80:20 principle", which was later adopted in many fields to explain that a small number of causes can be responsible for a large percentage of effects. The principle can be applied to indicate the priority of problem solving and to determine the direction of business drivers' development. By separating the vital few from the trivial many, the management staff can improve firm performance. This paper, in particular, links the Pareto principle's postulates to decision-making techniques and proposes different points of view for improving the purchasing process in a particular firm. The firm Mix Metal is a small trader in iron scrap that has been present on the Croatian market since 2004. In this case, the Pareto principle is adopted to rationalize the purchasing process and ensure better long-term sales margins. The aim of the paper is to develop several points of view from which the root causes arise and problems can be interpreted. In particular, the paper tries to identify the suppliers and the types of material for which the purchasing process must be reviewed or even ended. The case is developed and presented using the case study methodology.
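A minimal sketch of the vital-few/trivial-many separation, with made-up annual spend figures per supplier (the paper's actual supplier data are not reproduced here):

```python
# hypothetical annual purchase spend per scrap supplier (EUR)
spend = {"S1": 410_000, "S2": 250_000, "S3": 120_000, "S4": 90_000,
         "S5": 55_000, "S6": 30_000, "S7": 25_000, "S8": 20_000}

total = sum(spend.values())
cum = 0.0
for name, value in sorted(spend.items(), key=lambda kv: -kv[1]):
    cum += value
    flag = "  <- vital few" if cum / total <= 0.8 else ""
    print(f"{name}: {value:>8,} EUR  ({cum / total:5.1%} cumulative){flag}")
```

Suppliers flagged before the cumulative share crosses 80% are the "vital few" that merit management attention first; the remaining tail is where a purchasing relationship is a candidate for review.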

During the rainy season of 2007, international institutions (e.g., the WFP) and news agencies reported floods in the Sahel. Especially in August and September, some news reports gave the impression that the whole Sahel was flooded, in contrast to the droughts more frequently reported for that region. But it is well known that the precipitation patterns in the Sahel are characterized by a ...

We propose a new method for statistical analysis of functional magnetic resonance imaging (fMRI) data. The discrete wavelet transformation is employed as a tool for efficient and robust signal representation. We use structural magnetic resonance imaging (MRI) and fMRI to empirically estimate the distribution of the wavelet coefficients of the data both across individuals and spatial locations. An anatomical subvolume probabilistic atlas is used to tessellate the structural and functional signals into smaller regions, each of which is processed separately. A frequency-adaptive wavelet shrinkage scheme is employed to obtain essentially optimal estimates of the signals in the wavelet space. The empirical distributions of the signals on all the regions are computed in a compressed wavelet space. These are modeled by heavy-tailed distributions because their histograms exhibit slower tail decay than the Gaussian. We discovered that the Cauchy, Bessel K forms, and Pareto distributions provide the most accurate asymptotic models for the distribution of the wavelet coefficients of the data. Finally, we propose a new model for statistical analysis of functional MRI data using this atlas-based wavelet space representation. In the second part of our investigation, we will apply this technique to analyze a large fMRI dataset involving repeated presentation of sensory-motor response stimuli in young, elderly, and demented subjects.
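A toy one-dimensional sketch of the two ingredients, using PyWavelets: level-wise soft-threshold shrinkage, and a comparison of a Gaussian versus a heavy-tailed (Cauchy) model for the detail coefficients. The threshold rule here is a simple placeholder, not the paper's frequency-adaptive scheme, and real fMRI data are 3-D and atlas-tessellated:

```python
import numpy as np
import pywt                      # PyWavelets
from scipy import stats

rng = np.random.default_rng(0)
signal = np.repeat(rng.standard_cauchy(64), 16)    # heavy-tailed toy signal
noisy = signal + rng.standard_normal(signal.size)

# wavelet shrinkage: soft-threshold each detail level
coeffs = pywt.wavedec(noisy, "db4", level=4)
shrunk = [coeffs[0]] + [pywt.threshold(c, value=np.std(c), mode="soft")
                        for c in coeffs[1:]]
denoised = pywt.waverec(shrunk, "db4")

# Gaussian vs heavy-tailed model for the wavelet coefficients
detail = np.concatenate(coeffs[1:])
ll_gauss = stats.norm.logpdf(detail, *stats.norm.fit(detail)).sum()
ll_cauchy = stats.cauchy.logpdf(detail, *stats.cauchy.fit(detail)).sum()
print(f"log-likelihood  Gaussian: {ll_gauss:.1f}   Cauchy: {ll_cauchy:.1f}")
```

On coefficients like these the Cauchy model typically attains the higher likelihood, mirroring the paper's finding that Gaussian models understate the tails.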