Pareto Distribution Research Papers - Academia.edu

The Z-value is an attempt to estimate the statistical significance of a Smith-Waterman dynamic alignment score (SW-score) through the use of a Monte-Carlo process. It partly reduces the bias induced by the composition and length of the sequences. This paper is not a theoretical study on the distribution of SW-scores and Z-values. Rather, it presents a statistical analysis of Z-values on large datasets of protein sequences, leading to a law of probability that the experimental Z-values follow. First, we determine the relationships between the computed Z-value, an estimation of its variance, and the number of randomizations in the Monte-Carlo process. Next, we illustrate that Z-values are less correlated with sequence lengths than SW-scores. Finally, we show that pairwise alignments performed on 'quasi-real' sequences (i.e., randomly shuffled sequences of the same length and amino acid composition as the real ones) lead to Z-value distributions that statistically fit the extreme value distribution.
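A minimal sketch of the Monte-Carlo idea described above, assuming a toy Smith-Waterman scorer with made-up scoring parameters (match = 2, mismatch = -1, linear gap = -1); the actual studies use full substitution matrices and affine gap penalties:

```python
import random

def sw_score(a, b, match=2, mismatch=-1, gap=-1):
    """Toy Smith-Waterman local alignment score (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

def z_value(query, subject, n_shuffles=100, seed=0):
    """Monte-Carlo Z-value: compare the real SW-score with scores obtained
    after shuffling the query (same length and composition)."""
    rng = random.Random(seed)
    s0 = sw_score(query, subject)
    scores = []
    for _ in range(n_shuffles):
        shuffled = list(query)
        rng.shuffle(shuffled)               # preserves length and composition
        scores.append(sw_score("".join(shuffled), subject))
    mean = sum(scores) / n_shuffles
    var = sum((s - mean) ** 2 for s in scores) / (n_shuffles - 1)
    return (s0 - mean) / var ** 0.5

print(z_value("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MKTAYIAKQRQISFVK"))
```

The relationship the paper studies between the computed Z-value, its variance estimate, and the number of randomizations falls out of exactly this setup: the fewer the shuffles, the noisier `mean` and `var`, and hence the Z-value itself.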

The beta rank function (BRF), the normalized and continuous rank of an observation, has wide applications in fitting real-world data. The underlying probability density function (pdf) is not expressible in terms of elementary functions except for specific parameter values. We show, however, that it is approximately a unimodal skewed two-sided power law, or double-Pareto, or log-Laplacian distribution. We give closed-form expressions for both pdfs in terms of Fox H-functions and propose numerical algorithms to approximate them. We suggest a way to determine whether a data set follows a one-sided power law, a lognormal, a two-sided power law, or a BRF. Finally, we illustrate the usefulness of these distributions in data analysis through a few examples.
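As a concrete illustration of rank-function fitting, here is a sketch under the assumption that the BRF takes the commonly used form x(u) = A(1-u)^b / u^a with normalized rank u = r/(N+1); the fit is ordinary least squares in log space:

```python
import numpy as np

def fit_brf(values):
    """Fit x(u) = A * (1 - u)**b / u**a, u = r/(N+1), by log-space regression."""
    x = np.sort(np.asarray(values, dtype=float))[::-1]   # rank 1 = largest value
    n = len(x)
    u = np.arange(1, n + 1) / (n + 1)                    # normalized continuous rank
    # log x = log A - a*log(u) + b*log(1 - u)
    X = np.column_stack([np.ones(n), -np.log(u), np.log(1 - u)])
    (logA, a, b), *_ = np.linalg.lstsq(X, np.log(x), rcond=None)
    return np.exp(logA), a, b

# sanity check on synthetic data drawn from a known BRF
rng = np.random.default_rng(1)
u = rng.uniform(0.01, 0.99, 5000)
data = 3.0 * (1 - u) ** 0.4 / u ** 0.7
print(fit_brf(data))        # expect roughly (3.0, 0.7, 0.4)
```

The two exponents a and b control the two tails separately, which is why the pdf behaves like a two-sided power law.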

In 2003, the Roadmapping Initiative of the European Center for Power Electronics (ECPE) was started, based on a future vision of society in 2020, in order to define the future role of power electronics, identify technological barriers, and prepare new technologies well in advance. In the framework of this initiative, a new mathematically supported approach to roadmapping in power electronics has been developed. As described in this paper, the procedure relies on comprehensive mathematical modeling and subsequent multi-objective optimization of a converter system. The relationship between the technological base and the performance of the system then exists as a mathematical representation, whose optimization assures the best possible exploitation of the available degrees of freedom and technologies. Thus an objective Technology Node of a system is obtained, whereby physical limits are implicitly taken into account. Furthermore, the sensitivity of the system performance with ...
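The "Technology Node" rests on locating the Pareto front of all feasible designs. A minimal sketch, with made-up (efficiency, power-density) samples standing in for the outputs of the converter models:

```python
import numpy as np

def pareto_front(points):
    """Return the non-dominated points (both objectives maximized)."""
    pts = np.asarray(points)
    keep = [i for i, p in enumerate(pts)
            if not np.any(np.all(pts >= p, axis=1) & np.any(pts > p, axis=1))]
    return pts[keep]

# hypothetical converter designs: efficiency [%] vs power density [kW/dm^3]
rng = np.random.default_rng(0)
density = rng.uniform(1, 20, 200)
efficiency = 99 - 0.15 * density - rng.exponential(0.5, 200)
front = pareto_front(np.column_stack([efficiency, density]))
print(f"{len(front)} non-dominated designs out of 200")
```

Every point on that front is the best achievable trade-off under the modeled technologies, which is what lets the roadmap treat physical limits implicitly.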

A memetic algorithm for tackling multiobjective optimization problems is presented. The algorithm employs the proven local search strategy used in the Pareto archived evolution strategy (PAES) and combines it with the use of a population and recombination. Verification of the new M-PAES (memetic PAES) algorithm is carried out by testing it on a set of multiobjective 0/1 knapsack problems. On each problem instance, a comparison is made between the new memetic algorithm, the (1+1)-PAES local searcher, and the strength Pareto evolutionary algorithm (SPEA).
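The heart of PAES is a bounded archive of non-dominated solutions. Here is a simplified sketch of the acceptance rule; the real algorithm evicts from the most crowded region of an adaptive grid, whereas this placeholder simply drops the oldest entry:

```python
import random

def dominates(a, b):
    """a dominates b (maximization) if a >= b everywhere and > somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def archive_update(archive, candidate, max_size=20):
    """PAES-style archive update, simplified."""
    if any(dominates(a, candidate) for a in archive):
        return archive                                    # candidate rejected
    archive = [a for a in archive if not dominates(candidate, a)]
    archive.append(candidate)
    if len(archive) > max_size:
        archive.pop(0)     # placeholder for PAES's crowding-grid eviction
    return archive

# feed the archive random objective vectors (e.g. profits of two knapsacks)
rng = random.Random(0)
archive = []
for _ in range(500):
    archive = archive_update(archive, (rng.uniform(0, 100), rng.uniform(0, 100)))
print(len(archive), "non-dominated solutions kept")
```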

A common problem in many areas of water resources engineering is that of analyzing hydrological and meteorological events for planning and design projects. For these purposes, information is required on rainfall events, flow depths, discharges, evapotranspiration levels, etc. that can be expected for a selected probability or return period. In the paper, the software tool RAINBOW is presented, which is designed to carry out such frequency analyses.
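A sketch of the kind of frequency analysis such a tool performs (this is not RAINBOW's actual implementation): fit an extreme-value distribution to hypothetical annual maxima and read off the event magnitude for a chosen return period T via x_T = F^{-1}(1 - 1/T):

```python
import numpy as np
from scipy import stats

# hypothetical annual maximum daily rainfall (mm), one value per year
rng = np.random.default_rng(42)
annual_max = stats.gumbel_r.rvs(loc=60, scale=15, size=40, random_state=rng)

loc, scale = stats.gumbel_r.fit(annual_max)
for T in (2, 5, 10, 50, 100):
    x_T = stats.gumbel_r.ppf(1 - 1 / T, loc, scale)   # T-year design event
    print(f"{T:>3}-year event: {x_T:.1f} mm")
```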

We consider the problem of maximum likelihood estimation of the parameters of the Pareto Type II (Lomax) distribution. We show that in a certain parametrization, and after modification of the parameter space to include the exponential distribution as a special case, the MLEs of the parameters always exist. Moreover, the MLEs have a non-standard asymptotic distribution in the exponential case due to the lack of regularity. Further, we develop a likelihood ratio test for exponentiality versus the Pareto II distribution. We emphasize that this problem is non-standard, and the limiting null distribution of the deviance statistic is not chi-square. We derive the relevant asymptotic theory as well as a convenient computational formula for the critical values of the test. An empirical power study and power comparisons with other tests are also provided. A problem from climatology involving precipitation data from hundreds of meteorological stations across North America provides a motivation for and an illustration of the new test.
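A sketch of the deviance computation behind such a likelihood ratio test, using scipy's Lomax and exponential fitters; note, as the abstract stresses, that the critical values must not be taken from a chi-square table:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = stats.lomax.rvs(c=2.5, scale=3.0, size=500, random_state=rng)

# alternative: Pareto II (Lomax), location fixed at 0
c_hat, _, scale_hat = stats.lomax.fit(data, floc=0)
ll_lomax = stats.lomax.logpdf(data, c_hat, scale=scale_hat).sum()

# null: exponential, the boundary case of the enlarged parameter space
_, mean_hat = stats.expon.fit(data, floc=0)
ll_exp = stats.expon.logpdf(data, scale=mean_hat).sum()

deviance = 2 * (ll_lomax - ll_exp)
print(f"deviance = {deviance:.2f}")
# The null distribution of this deviance is NOT chi-square (lack of
# regularity at the boundary); use the paper's asymptotics or simulation.
```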

This paper describes a simulation tool to aid the design of nutrient monitoring programmes in coastal waters. The tool is developed by using time series of water quality data from a Smart Buoy, an in situ monitoring device. The tool models the seasonality and temporal dependence in the data and then filters out these features to leave a white noise series. New data sets are then simulated by sampling from the white noise series and re-introducing the modelled seasonality and temporal dependence. Simulating many independent realisations allows us to study the performance of different monitoring designs and assessment methods. We illustrate the approach using total oxidised nitrogen (TOxN) and chlorophyll data from Liverpool Bay, U.K. We consider assessments of whether the underlying mean concentrations of these water quality variables are sufficiently low; i.e. below specified assessment concentrations. We show that for TOxN, even when mean concentrations are at background, daily data from a Smart Buoy or multi-annual sampling from a research vessel would be needed to obtain adequate power.
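A toy version of this pipeline on a synthetic stand-in for the buoy series, assuming one annual harmonic for seasonality and an AR(1) model for temporal dependence:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(3 * 365)                                  # three years, daily
obs = 10 + 4 * np.sin(2 * np.pi * t / 365) + 2 * rng.standard_normal(t.size)

# 1) model and remove seasonality (one annual harmonic)
X = np.column_stack([np.ones(t.size),
                     np.sin(2 * np.pi * t / 365), np.cos(2 * np.pi * t / 365)])
beta, *_ = np.linalg.lstsq(X, obs, rcond=None)
resid = obs - X @ beta

# 2) model and remove temporal dependence (AR(1) whitening)
phi = np.corrcoef(resid[:-1], resid[1:])[0, 1]
white = resid[1:] - phi * resid[:-1]                    # ~ white noise

# 3) simulate: resample the noise, re-colour, re-add seasonality
def simulate():
    e = rng.choice(white, size=t.size, replace=True)
    r = np.zeros(t.size)
    for i in range(1, t.size):
        r[i] = phi * r[i - 1] + e[i]
    return X @ beta + r

sims = [simulate() for _ in range(100)]   # ensemble for power calculations
```

Each simulated series can then be subsampled according to a candidate monitoring design to estimate the power of the corresponding assessment.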

Vilfredo Pareto (1848–1923) studied the inequality of welfare distribution in Italy during the nineteenth century and developed a useful tool named "the 80:20 principle", which was later adopted in many fields to explain that a small number of causes can be responsible for a large percentage of effects. The principle can be applied to indicate the priority of problem solving and to determine the direction of business drivers' development. By separating the vital few from the trivial many, the management staff can improve firm performance. This paper, in particular, links the Pareto principle's postulates to decision-making techniques and proposes different points of view for improving the purchasing process in a particular firm. The firm Mix Metal is a small trader in iron scrap that has been present on the Croatian market since 2004. In this case, the Pareto principle is adopted to rationalize the purchasing process and ensure better long-term sales margins. The aim of the paper is to develop several points of view from which the root causes arise and problems can be interpreted. In particular, the paper tries to identify the suppliers and the types of material for which the purchasing process must be reviewed or even ended. The case is developed and presented using the case study methodology.
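A minimal sketch of the vital-few/trivial-many separation, with made-up annual spend figures per supplier (the paper's actual supplier data are not reproduced here):

```python
# hypothetical annual purchase spend per scrap supplier (EUR)
spend = {"S1": 410_000, "S2": 250_000, "S3": 120_000, "S4": 90_000,
         "S5": 55_000, "S6": 30_000, "S7": 25_000, "S8": 20_000}

total = sum(spend.values())
cum = 0.0
for name, value in sorted(spend.items(), key=lambda kv: -kv[1]):
    cum += value
    flag = "  <- vital few" if cum / total <= 0.8 else ""
    print(f"{name}: {value:>8,} EUR  ({cum / total:5.1%} cumulative){flag}")
```

Suppliers flagged before the cumulative share crosses 80% are the "vital few" that merit management attention first; the remaining tail is where a purchasing relationship is a candidate for review.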

During the rainy season of 2007, international institutions (e.g., the WFP) and news agencies reported floods in the Sahel. Especially in August and September, some news reports gave the impression that the whole Sahel was flooded, in contrast to the droughts more frequently reported for that region. But it is well known that the precipitation patterns in the Sahel are characterized by a ...

We propose a new method for statistical analysis of functional magnetic resonance imaging (fMRI) data. The discrete wavelet transformation is employed as a tool for efficient and robust signal representation. We use structural magnetic resonance imaging (MRI) and fMRI to empirically estimate the distribution of the wavelet coefficients of the data both across individuals and spatial locations. An anatomical subvolume probabilistic atlas is used to tessellate the structural and functional signals into smaller regions, each of which is processed separately. A frequency-adaptive wavelet shrinkage scheme is employed to obtain essentially optimal estimates of the signals in the wavelet space. The empirical distributions of the signals on all the regions are computed in a compressed wavelet space. These are modeled by heavy-tailed distributions because their histograms exhibit slower tail decay than the Gaussian. We discovered that the Cauchy, Bessel K forms, and Pareto distributions provide the most accurate asymptotic models for the distribution of the wavelet coefficients of the data. Finally, we propose a new model for statistical analysis of functional MRI data using this atlas-based wavelet space representation. In the second part of our investigation, we will apply this technique to analyze a large fMRI dataset involving repeated presentation of sensory-motor response stimuli in young, elderly, and demented subjects.
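A toy one-dimensional sketch of the two ingredients, using PyWavelets: level-wise soft-threshold shrinkage, and a comparison of a Gaussian versus a heavy-tailed (Cauchy) model for the detail coefficients. The threshold rule here is a simple placeholder, not the paper's frequency-adaptive scheme, and real fMRI data are 3-D and atlas-tessellated:

```python
import numpy as np
import pywt                      # PyWavelets
from scipy import stats

rng = np.random.default_rng(0)
signal = np.repeat(rng.standard_cauchy(64), 16)    # heavy-tailed toy signal
noisy = signal + rng.standard_normal(signal.size)

# wavelet shrinkage: soft-threshold each detail level
coeffs = pywt.wavedec(noisy, "db4", level=4)
shrunk = [coeffs[0]] + [pywt.threshold(c, value=np.std(c), mode="soft")
                        for c in coeffs[1:]]
denoised = pywt.waverec(shrunk, "db4")

# Gaussian vs heavy-tailed model for the wavelet coefficients
detail = np.concatenate(coeffs[1:])
ll_gauss = stats.norm.logpdf(detail, *stats.norm.fit(detail)).sum()
ll_cauchy = stats.cauchy.logpdf(detail, *stats.cauchy.fit(detail)).sum()
print(f"log-likelihood  Gaussian: {ll_gauss:.1f}   Cauchy: {ll_cauchy:.1f}")
```

On coefficients like these the Cauchy model typically attains the higher likelihood, mirroring the paper's finding that Gaussian models understate the tails.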