Theory and Properties of a New Pareto-Gumbel Distribution
Akinyele T.W., Ogunwale O. D., Odukoya E.A., Olayinka K. P.
Department of Statistics, Faculty of Science, Ekiti State University, Ado-Ekiti, Ekiti State, Nigeria
DOI: https://doi.org/10.51244/IJRSI.2025.120700025
Received: 18 June 2025; Accepted: 26 June 2025; Published: 30 July 2025
A new family of continuous distributions, called the Pareto-Gumbel distribution (PGD), is developed using the Transformed-Transformer (T-X) technique. The Pareto and Gumbel distributions are combined theoretically to form a new probability distribution that addresses the limitations of each parent distribution. The Pareto distribution is often used to model the tails of another distribution, and its shape parameter ξ governs tail behavior: distributions whose tails decrease exponentially are modeled with ξ = 0, distributions whose tails decrease polynomially are modeled with a positive shape parameter, and distributions with finite tails are modeled with a negative shape parameter. The Gumbel extreme value distribution is widely applied in extreme value analysis, but it has certain drawbacks: it is not heavy-tailed, and it is characterized by constant skewness and kurtosis. It nevertheless remains one of the most widely used statistical distributions in the frequency analysis of extreme events, mainly because of its simple parameter estimation expressions and the simple, accessible expressions for its statistical properties. Many distributions described in the literature were developed using the T-X method, which was proposed by Alzaatreh et al. (2013) and has also been applied by Adewusi et al. (2019) and Ajewole et al. (2025). This study combines the Pareto and Gumbel distributions using the T-X technique to form the Pareto-Gumbel distribution.
Several expressions for distribution theory and properties were explored and obtained; the maximum likelihood estimation approach was used to estimate the distribution parameters, with simulations conducted to assess the asymptotic behavior of these estimates.
Keywords: Pareto-Gumbel, Transformed-Transformer technique, Statistical properties, Skewness, Kurtosis, Maximum Likelihood Estimation.
The Pareto distribution is used to describe social, scientific, quality control, actuarial and geophysical phenomena. It is widely known for modelling phenomena in which a small proportion of occurrences accounts for the majority of the effect. The distribution is often linked to the Pareto Principle, also called the 80/20 rule, which states that roughly 80% of effects come from 20% of causes. Empirical observation has shown that this 80-20 pattern fits a wide range of cases, including natural phenomena and human activities. The distribution is often used to model the tails of other distributions. In its generalized form it is specified by three parameters: location μ, scale β, and shape ξ. Sometimes it is specified by only the scale and shape parameters, and sometimes only by its shape parameter. Some of the literature on generalized extreme value distributions writes the shape parameter as K = −ξ. The distribution is equivalent to the exponential distribution when both μ = 0 and ξ = 0, and it is equivalent to the Pareto distribution when μ = β/ξ and ξ > 0. Distributions whose tails decrease exponentially are modeled with shape ξ = 0, distributions whose tails decrease polynomially are modeled with a positive shape parameter, and distributions with finite tails are modeled with a negative shape parameter. The Pareto principle does not require that the input and output shares sum to 100 percent. The Pareto distribution also has a critical limitation in characterizing discrete data, and it assumes that the distribution of causes and effects is static and unchanging (Dunindu Tennakon 2023). The Gumbel distribution (also known as the type-I generalized extreme value distribution) is used to model the distribution of the maximum (or the minimum) of a number of samples of various distributions.
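The two special cases of the generalized Pareto distribution mentioned above can be checked numerically. The following is a minimal sketch, assuming the usual survival function S(x) = (1 + ξ(x − μ)/β)^(−1/ξ) for the generalized Pareto distribution; the function names are illustrative and are not taken from the paper.

```python
import math

def gpd_survival(x, mu=0.0, beta=1.0, xi=0.0):
    """Survival function P(X > x) of the generalized Pareto distribution
    (assumed standard parameterization: location mu, scale beta, shape xi)."""
    z = (x - mu) / beta
    if z < 0.0:
        return 1.0                    # below the location parameter
    if xi == 0.0:
        return math.exp(-z)           # exponential tail
    base = 1.0 + xi * z
    if base <= 0.0:
        return 0.0                    # past the finite endpoint when xi < 0
    return base ** (-1.0 / xi)        # polynomial tail when xi > 0

def pareto_survival(x, x_min, alpha):
    """Survival function of the classical (type I) Pareto distribution."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

# Case 1: mu = 0 and xi = 0 gives the exponential distribution with mean beta.
print(gpd_survival(2.0, mu=0.0, beta=1.5, xi=0.0), math.exp(-2.0 / 1.5))

# Case 2: mu = beta / xi with xi > 0 gives a Pareto distribution with
# minimum beta / xi and tail index 1 / xi.
beta, xi = 2.0, 0.5
print(gpd_survival(10.0, mu=beta / xi, beta=beta, xi=xi),
      pareto_survival(10.0, x_min=beta / xi, alpha=1.0 / xi))
```

In each pair the two printed values should agree, which illustrates how the sign and value of ξ select the tail type.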
The potential applicability of the Gumbel distribution to represent the distribution of maxima relates to extreme value theory, which indicates that it is likely to be useful if the distribution of the underlying sample data is of the normal or exponential type. The Gumbel extreme value distribution is widely applied in extreme value analysis, but it has certain drawbacks: it is not heavy-tailed, and it is characterized by constant skewness and kurtosis. Its applications are diverse; it is mainly used for frequency analysis of maximum flows and maximum precipitation and for the construction of intensity-duration-frequency curves. The advantage of the Gumbel distribution is the simplicity and accessibility of its expressions and relationships. Its main disadvantage is its limited flexibility in modeling different degrees of skewness, which generally limits its application, since its statistical indicators take constant values.
Many distributions described in the literature were developed using the Transformed-Transformer (T-X) method. This method was proposed by Alzaatreh et al. (2013) and has also been applied by Adewusi et al. (2019) and Ajewole et al. (2025). Other related work includes the following. Akarawak et al. (2017) introduced the Gamma-Rayleigh distribution as a new member of the Gamma-X family of generalized distributions, using the Transformed-Transformer method to combine the Gamma and Rayleigh distributions. Oguntunde et al. (2015) introduced a three-parameter probability model called the Weibull-exponential distribution using the Weibull generalized family of distributions. Yazar et al. (2015) introduced a family of generalized gamma distributions, the T-gamma family, using the T-R{Y} framework. Marcelo et al. (2015) developed a four-parameter model within this class, named the exponentiated generalized Gumbel distribution, using T-X techniques. Mohieddine and Ayman (2018) introduced a new two-parameter lifetime distribution, called a new generalization of the exponential-logarithmic distribution. Adamidis K. et al. (2005) developed an extension of the exponential-geometric distribution. See also Almetwally M. et al. (2020).
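The constant skewness and kurtosis of the Gumbel distribution noted above can be illustrated numerically. The sketch below assumes the standard Gumbel (maximum) density with location `mu` and scale `beta`, and integrates it with a simple trapezoidal rule; the function names and the integration limits are illustrative choices.

```python
import math

def gumbel_pdf(x, mu, beta):
    """Gumbel (maximum) density with location mu and scale beta."""
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta

def gumbel_skew_kurt(mu, beta, n=20000):
    """Skewness and kurtosis obtained by trapezoidal integration."""
    lo, hi = mu - 10.0 * beta, mu + 40.0 * beta   # density is negligible outside
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    ws = [gumbel_pdf(x, mu, beta) * h for x in xs]
    ws[0] *= 0.5                                   # trapezoid end weights
    ws[-1] *= 0.5
    total = sum(ws)
    mean = sum(w * x for w, x in zip(ws, xs)) / total
    m2, m3, m4 = (sum(w * (x - mean) ** k for w, x in zip(ws, xs)) / total
                  for k in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# The two numbers are the same for every choice of location and scale.
for mu, beta in [(0.0, 1.0), (5.0, 0.5), (-3.0, 4.0)]:
    print(mu, beta, gumbel_skew_kurt(mu, beta))
```

Every parameter pair should give a skewness of about 1.14 and a kurtosis of about 5.4, which is the inflexibility that motivates combining the Gumbel distribution with another family.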
In this study, we propose and explore a new distribution, called the Pareto-Gumbel distribution, using the T-X technique.
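To make the T-X idea concrete, the following is a minimal sketch of the general construction of Alzaatreh et al. (2013), in which the new CDF is G(x) = R(W(F(x))), where R is the CDF of the transformer T and F is the CDF of the transformed variable X. Several choices here are assumptions made only for illustration and may differ from the paper's derivation: the link W(F) = −log(1 − F), a Lomax (Pareto type II) transformer with shape `alpha` and scale `theta`, and a Gumbel baseline with location `mu` and scale `beta`. The resulting density is therefore not claimed to be the PG density of equation (16).

```python
import math

def gumbel_pdf(x, mu, beta):
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta

def gumbel_sf(x, mu, beta):
    """Gumbel survival function 1 - F(x), using expm1 for accuracy."""
    z = (x - mu) / beta
    return -math.expm1(-math.exp(-z))

def lomax_cdf(t, alpha, theta):
    """CDF of the assumed transformer T (Lomax, support [0, inf))."""
    return 1.0 - (1.0 + t / theta) ** (-alpha)

def lomax_pdf(t, alpha, theta):
    return (alpha / theta) * (1.0 + t / theta) ** (-alpha - 1.0)

def tx_cdf(x, alpha, theta, mu, beta):
    """G(x) = R(W(F(x))) with the assumed link W(F) = -log(1 - F)."""
    w = -math.log(gumbel_sf(x, mu, beta))
    return lomax_cdf(w, alpha, theta)

def tx_pdf(x, alpha, theta, mu, beta):
    """g(x) = r(W(F(x))) * f(x) / (1 - F(x)), the derivative of tx_cdf."""
    s = gumbel_sf(x, mu, beta)
    w = -math.log(s)
    return lomax_pdf(w, alpha, theta) * gumbel_pdf(x, mu, beta) / s

# Evaluate the illustrative density and CDF at a few points
# (x is kept in a moderate range to avoid floating-point overflow).
for x in (-2.0, 0.0, 2.0, 10.0):
    print(x, tx_pdf(x, 3.0, 2.0, 0.0, 1.0), tx_cdf(x, 3.0, 2.0, 0.0, 1.0))
```

The same pattern applies to any transformer and baseline pair: only `lomax_cdf`, `lomax_pdf` and the Gumbel functions would need to be replaced.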
Theorem 1: Let be continuous independent random variable such that;
follows a Pareto distribution and let
and
be the probability density function and cumulative distribution function of Pareto distribution given as:
(1)
Cumulative distribution function for Pareto distribution is given as:
(2)
Theorem 2: Let be continuous independent random variable such that;
follows Gumbel distribution and, let
and
be the Probability density function and Cumulative distribution function of Gumbel distribution given as:
where
(3)
Cumulative Distribution function
(4)
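For reference, the textbook forms of the two baseline distributions are given below in one common parameterization. The symbols (Pareto shape α and scale θ, Gumbel location μ and scale β) are assumptions made here and may differ from the notation used in equations (1) to (4).

```latex
% Pareto (type I), assumed shape \alpha > 0 and scale \theta > 0
f_P(x) = \frac{\alpha\,\theta^{\alpha}}{x^{\alpha+1}}, \qquad
F_P(x) = 1 - \left(\frac{\theta}{x}\right)^{\alpha}, \qquad x \ge \theta .

% Gumbel (maximum), assumed location \mu and scale \beta > 0
f_G(x) = \frac{1}{\beta}\exp\!\left\{-\left(z + e^{-z}\right)\right\}, \qquad
F_G(x) = \exp\!\left(-e^{-z}\right), \qquad z = \frac{x-\mu}{\beta},
\quad -\infty < x < \infty .
```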
The Formulation of the New Pareto-Gumbel Distribution
Theorem 3
Let be a continuous independent random variable such that
and let
and
be the probability density function and cumulative distribution function of the Pareto-Gumbel distribution, given as:
(5)
Proof
(6)
and
(7)
(8)
Substituting equation (7) into equation (8) using the T-X technique gives
(9)
=
(10)
(11)
(12)
(10)
(12)
(13)
.
(14)
.
(15)
(16)
Equation (16) is the PDF of the new PG distribution. The distribution has two parameters: a shape parameter and a scale parameter.
Statistical properties
In this section, the statistical properties of the Pareto-Gumbel distribution are obtained, in particular the first four moments, the variance, the coefficient of variation, the moment generating function, the characteristic function, the skewness, and the kurtosis.
B. Moment
Theorem 2: If X is a random variable distributed as a PG with its shape and scale parameters, then the r-th non-central moment of X is given by:
Proof:
(17)
(18)
(19)
(20)
(21)
Substituting r = 1, 2, 3 and 4 in the r-th moment expression above gives the first (mean), second, third and fourth moments of the PG distribution; the variance then follows from the relation between the first two moments.
(ii) Mean = (22)
(iii) =
(23)
(24)
(25)
(iv) (26)
Then the third and fourth moments are given as:
(v)
(27)
(vi) =
respectively (28)
C. Moment generating function
Theorem 3: If X is a continuous random variable distributed as a PG, then the moment generating function is given as
(29)
(30)
Let,
then
,
so that (30) is reduced to
(31)
(32)
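Because the moment generating function is the expectation E[exp(tX)], a closed-form expression can be cross-checked by numerical integration of the density. The sketch below illustrates the check on the Gumbel baseline rather than on the PG density itself, using the standard result that the Gumbel MGF is Γ(1 − βt)·exp(μt) for βt < 1. The helper `mgf_numeric` and the integration limits are illustrative assumptions; any density, including the PG density, could be passed in place of the Gumbel one.

```python
import math

def gumbel_pdf(x, mu, beta):
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta

def mgf_numeric(pdf, t, lo, hi, n=20000):
    """Trapezoidal approximation of E[exp(t X)] over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (pdf(lo) * math.exp(t * lo) + pdf(hi) * math.exp(t * hi))
    for i in range(1, n):
        x = lo + i * h
        total += pdf(x) * math.exp(t * x)
    return total * h

mu, beta, t = 1.0, 2.0, 0.2          # beta * t = 0.4 < 1, so the MGF exists
numeric = mgf_numeric(lambda x: gumbel_pdf(x, mu, beta), t,
                      mu - 10.0 * beta, mu + 80.0 * beta)
closed_form = math.gamma(1.0 - beta * t) * math.exp(mu * t)
print(numeric, closed_form)          # the two values should agree closely
```

The same comparison can be repeated for several values of t as a sanity check on a derived formula.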
D. Characteristic Function (CF)
Theorem 5: If X is a random variable distributed as a PG, then the characteristic function is defined as
Proof:
(33)
(34)
Let ,
,
,
(35)
(36)
(37)
(38)
E. Coefficient of Variation (C.V)
The coefficient of variation is a standardized measure of dispersion of a probability distribution and is given as:
(39)
(40)
(41)
F. Skewness and Kurtosis
Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean and is given as:
(42)
(43)
Kurtosis is a descriptor of the shape of a probability distribution and is given as;
(45)
(46)
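The variance, coefficient of variation, skewness and kurtosis all follow from the first four non-central moments through the standard moment relations. The following is a minimal sketch of that conversion; the function name `summary_from_raw_moments` is illustrative, and the example uses the unit exponential distribution, whose raw moments are r!, rather than the PG moments.

```python
import math

def summary_from_raw_moments(m1, m2, m3, m4):
    """Convert the first four non-central moments into summary measures."""
    var = m2 - m1 ** 2                                   # variance
    sd = math.sqrt(var)
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3             # third central moment
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return {
        "variance": var,
        "cv": sd / m1,                                   # coefficient of variation
        "skewness": mu3 / sd ** 3,
        "kurtosis": mu4 / var ** 2,
    }

# Unit exponential distribution: E[X^r] = r!, so the moments are 1, 2, 6, 24.
print(summary_from_raw_moments(1.0, 2.0, 6.0, 24.0))
```

For the unit exponential this gives variance 1, coefficient of variation 1, skewness 2 and kurtosis 9, which are the known values for that distribution.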
G. Cumulative Distribution Function (CDF)
The cumulative distribution function of a random variable X evaluated at x is the probability that X will take a value less than or equal to x and is defined as;
(47)
Theorem 6: If X is a continuous random variable from the Pareto-Gumbel distribution, the cumulative distribution function (CDF) is defined by
(48)
Proof:
(49)
(50)
(51)
(52)
(53)
(54)
(55)
(56)
Fig 1. PDF plots of the PGD
Fig 2. CDF plots of the PGD
Figure 1 shows that the PGD density function can take various shapes, including approximately symmetric, skewed, and unimodal forms. The tail of the distribution becomes longer or shorter on the right and left sides for different combinations of the parameter values. This demonstrates the flexibility of the PG distribution, which makes it suitable for various real data sets. The CDF plot in Figure 2 shows that the CDF of the PGD starts from zero and tends to one as x increases. This indicates that the PGD is a valid distribution, because it satisfies the basic properties of a probability distribution: the probability of any event is greater than or equal to zero, and the total probability is equal to one.
H. Reliability Function
The reliability function, also known as the survival function, measures the probability that a patient, device, or other object of interest survives beyond a specified time, and is defined as:
(57)
where F(x) is the cumulative distribution function of X. Substituting the CDF of the PG distribution gives
(58)
I. Hazard Function
The hazard function, also called the force of mortality, instantaneous failure rate, instantaneous death rate, or age-specific failure rate, is the instantaneous risk that the event of interest occurs within a very narrow time frame, and is defined as:
(59)
where and
are pdf and survival function of PG then,
Hazard function of PG (60)
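The reliability and hazard functions follow mechanically from any density and CDF pair, as the sketch below illustrates. The helper names `survival` and `hazard` are illustrative, and the exponential and Gumbel distributions are used only as stand-ins; the PG density and CDF would be passed in the same way.

```python
import math

def survival(cdf):
    """Return the reliability (survival) function S(x) = 1 - F(x)."""
    return lambda x: 1.0 - cdf(x)

def hazard(pdf, cdf):
    """Return the hazard function h(x) = f(x) / S(x)."""
    s = survival(cdf)
    return lambda x: pdf(x) / s(x)

# Stand-in 1: exponential distribution with rate 2 has a constant hazard of 2.
rate = 2.0
exp_hazard = hazard(lambda x: rate * math.exp(-rate * x),
                    lambda x: 1.0 - math.exp(-rate * x))
print(exp_hazard(0.3), exp_hazard(1.7))

# Stand-in 2: standard Gumbel distribution; its hazard rises toward 1 / beta.
gum_hazard = hazard(lambda x: math.exp(-(x + math.exp(-x))),
                    lambda x: math.exp(-math.exp(-x)))
print([round(gum_hazard(x), 4) for x in (-1.0, 0.0, 2.0, 10.0)])
```

Plotting the resulting hazard over a grid of x values is a quick way to see whether a fitted model implies an increasing, decreasing or non-monotone failure rate.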
J. The Maximum Likelihood Estimator
Theorem 7: Let x_1, x_2, ..., x_n be a random sample of size n from the Pareto-Gumbel (PG) distribution with pdf
(61)
By taking the natural logarithm of (61), the log-likelihood function is obtained as;
(62)
Therefore, the MLE, which maximizes (62), must satisfy the following likelihood (score) equations:
(63)
(64)
Therefore
(65)
Differentiating equation (62) with respect to λ and the other model parameter and setting the derivatives to zero gives a nonlinear system of equations whose solution is the maximum likelihood estimate of the model parameters. Solving (63), (64), and (65) analytically is very cumbersome, so the parameters are estimated numerically. The numerical solution can be obtained directly from a data set in Python, although other programming languages could also be used.
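As an illustration of solving likelihood equations numerically in Python, the sketch below fits the Gumbel baseline, which is used here only as a stand-in for the PG model. For the Gumbel distribution the two score equations reduce to a single equation in the scale β, which is solved by bisection, after which the location μ follows in closed form. The function names, the bisection bracket, the tolerance, and the simulated data are all assumptions made for this example.

```python
import math
import random

def gumbel_mle(data, tol=1e-10):
    """Maximum likelihood estimates (mu, beta) for a Gumbel sample."""
    n = len(data)
    xbar = sum(data) / n
    xmin = min(data)

    def weights(beta):
        # exp(-x / beta), shifted by xmin so every exponent is <= 0
        return [math.exp(-(x - xmin) / beta) for x in data]

    def score(beta):
        # Profile score equation: beta - xbar + weighted mean of x = 0
        w = weights(beta)
        return beta - xbar + sum(x * wi for x, wi in zip(data, w)) / sum(w)

    # score is increasing in beta, negative near 0 and >= 0 at xbar - xmin
    lo, hi = 1e-6 * (xbar - xmin), xbar - xmin
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    mu = xmin - beta * math.log(sum(weights(beta)) / n)
    return mu, beta

def gumbel_sample(n, mu, beta, rng):
    """Draw n Gumbel variates by inverting the CDF."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:                      # avoid log(0)
            out.append(mu - beta * math.log(-math.log(u)))
    return out

rng = random.Random(1)
data = gumbel_sample(5000, mu=2.0, beta=1.5, rng=rng)
print(gumbel_mle(data))                  # estimates should be near (2.0, 1.5)
```

For a model such as the PG, where no such reduction is available, a general-purpose optimizer applied to the negative log-likelihood would play the role of the bisection step.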
Table 1: Empirical means and standard deviations (in parentheses) of the MLEs for the PGD.
Parameters | n = 20 | | n = 50 | | n = 100 | |
1, 0.5, 0.5 | 1.1527 | 0.288 | 1.0551 | 0.7736 | 1.0721 | 0.7111 |
 | (0.551) | (0.2057) | (0.2887) | (0.3372) | (0.3222) | (0.2466) |
0.5, 0.5, 0.5 | 0.5267 | 0.5426 | 0.5011 | 0.5129 | 0.3755 | 0.612 |
 | (0.2622) | (0.3451) | (0.1254) | (0.1771) | (0.1111) | (0.1288) |
2, 2, 2 | 2.0666 | 1.6561 | 2.0116 | 1.5552 | 2.0154 | 1.5531 |
 | (0.2655) | (0.2205) | (0.2712) | (0.1725) | (0.3113) | (0.1252) |
3, 0.1, 2 | 3.0265 | 0.0656 | 3.0055 | 0.1011 | 3.114 | 0.1025 |
 | (0.2647) | (0.1157) | (0.2002) | (0.0281) | (0.1157) | (0.0222) |
1, 2, 3 | 1.2061 | 1.5588 | 1.0516 | 2.0235 | 1.0512 | 1.6555 |
 | (0.2122) | (0.2654) | (0.2507) | (0.2012) | (0.2077) | (0.153) |
4, 3, 2 | 4.0175 | 3.0045 | 4.015 | 3.0126 | 4.0156 | 3.0162 |
 | (0.1576) | (0.1574) | (0.0516) | (0.0515) | (0.0347) | (0.0624) |
Simulations were conducted to assess the accuracy of the maximum likelihood estimators (MLEs) of the PGD parameters. The aim is to determine whether the MLEs converge toward the true parameter values as the sample size increases. In this simulation study, 1000 samples were generated for each of the sample sizes n = 20, 50, and 100. The performance of the estimates is evaluated through the bias of the MLEs; the empirical means and standard deviations of the parameter estimates are reported in Table 1.
The simulation study indicates that, with larger sample sizes, the empirical means generally move closer to the true parameter values and the estimates become more stable, as shown by standard deviations that decrease in most cases. These findings suggest that the maximum likelihood method is effective for estimating the parameters of the Pareto-Gumbel distribution (PGD).
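The structure of such a simulation study can be sketched as follows. Because a PG sampler is not reproduced here, the sketch uses the classical Pareto distribution with known scale 1 as a stand-in, for which the MLE of the shape has the closed form n / Σ log(x_i). The function names, the seed, and the true shape value of 2 are illustrative assumptions; the sample sizes and the 1000 replications follow the design described above.

```python
import math
import random
import statistics

def pareto_alpha_mle(sample):
    """Closed-form MLE of the Pareto shape when the scale is known to be 1."""
    return len(sample) / sum(math.log(x) for x in sample)

def simulate(alpha, sizes=(20, 50, 100), reps=1000, seed=0):
    """Empirical mean and standard deviation of the MLE for each sample size."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        estimates = [
            pareto_alpha_mle([rng.paretovariate(alpha) for _ in range(n)])
            for _ in range(reps)
        ]
        results[n] = (statistics.mean(estimates), statistics.stdev(estimates))
    return results

for n, (mean, sd) in simulate(alpha=2.0).items():
    print(f"n = {n:3d}  mean = {mean:.4f}  sd = {sd:.4f}")
```

For this stand-in, the empirical mean should approach the true shape and the standard deviation should shrink as n grows, which is the pattern a consistent estimator is expected to show.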
A new distribution, called the Pareto-Gumbel distribution, was developed using the T-X technique. The new distribution has two parameters. Several expressions for its distribution theory and properties were derived, including the first four moments, the moment generating function, the characteristic function, the cumulative distribution function, the skewness, the kurtosis, the hazard function, and the reliability function. The maximum likelihood approach was used to estimate the parameters of the distribution, and the simulation study indicates that, as the sample size increases, the empirical means move toward the true parameter values and the biases and mean squared errors (MSEs) decrease. The standard deviations also generally decreased with larger sample sizes, suggesting that the Pareto-Gumbel distribution provides stable and reliable parameter estimates.