[This lecture was partially created with the support of Artificial Intelligence tools (ChatGPT)]
Introduction
Non-stationary extreme value distributions extend classical extreme value theory to situations where the statistical properties of extremes change over time or depend on external factors. In many real-world systems, the assumption of stationarity—constant distribution parameters over time—is unrealistic. Climate change, urbanization, economic development, and technological shifts can all introduce trends or variability in the behavior of extreme events. For example, the intensity of rainfall extremes may increase with rising temperatures, or financial market crashes may become more frequent during periods of economic instability.
To account for such changes, non-stationary models allow the parameters of extreme value distributions to vary as functions of time or covariates, like for instance temperature or CO2 concentration. Commonly used extreme value models include the Generalized Extreme Value (GEV) distribution for block maxima and the Generalized Pareto Distribution (GPD) for threshold exceedances. In a non-stationary framework, parameters such as location, scale, or shape can be expressed as functions of explanatory variables, such as time, climate indices, or other environmental drivers.
This approach enables researchers to capture evolving risk patterns and better understand how extremes respond to underlying processes. Non-stationary extreme value models are widely used in fields such as hydrology, climate science, engineering, finance, and environmental risk assessment. They are particularly important for estimating return levels—values expected to be exceeded on average once every specified number of years—under changing conditions.
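The return-level concept can be made concrete with a minimal base-R sketch for the Gumbel distribution, which will be used later in this lecture (the function name and the parameter values are illustrative, not from any package):

```r
#Return level x_T for a Gumbel distribution with location mu and scale sigma:
#x_T solves F(x_T) = 1 - 1/T, hence x_T = mu - sigma*log(-log(1 - 1/T))
gumbel_return_level=function(T, mu, sigma){
  mu - sigma*log(-log(1 - 1/T))
}
#Illustrative values: the 100-year level exceeds the 50-year level, as expected
gumbel_return_level(100, mu = 100, sigma = 20)
gumbel_return_level(50, mu = 100, sigma = 20)
```

In a non-stationary setting, \(\mu\) and \(\sigma\) themselves become functions of time, so the return level changes from year to year.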
However, modeling non-stationarity introduces additional challenges. Researchers must select appropriate covariates, ensure sufficient data to estimate time-varying parameters, and avoid overfitting complex models. Careful statistical diagnostics and model validation are therefore essential.
Overall, non-stationary extreme value analysis provides a powerful framework for studying extremes in dynamic systems. By relaxing the assumption of constant statistical behavior, it allows scientists and practitioners to produce more realistic assessments of rare but impactful events in a changing world.
An interesting experiment: a "homemade" non-stationary Gumbel distribution
Let's now try an interesting experiment. With the series of annual maxima, therefore in the domain of the block maxima method, let's try to estimate the parameters of a non-stationary Gumbel distribution by:
- Computing the frequency of non-exceedance of each value and assuming it is a function of time (besides being a function of the value of the random variable);
- Picking random trial values for the parameters of the Gumbel distribution, assuming that they are also functions of time, and computing the probability of each value accordingly. Note: assuming that scale and location depend on time means introducing additional parameters into the body of the distribution;
- Finding the parameter values that minimise the sum of squared differences between the frequency of non-exceedance and the probability (both depending on time).
In doing so, we are again approximating a maximum likelihood estimation as we did here.
Let's try to make this experiment by working with simulated data, with the following R code:
#Define the maxima vector
maxg=c()
#Generate random samples with mean and standard deviation increasing in time, and extract the maxima
for (i in 1:1000)
maxg[i]=max(rnorm(100,0+0.0004*i,1+0.0004*i))
#Compute the frequency of non-exceedance of the maxima
#Note: maxima are kept in the original order, as reordering would destroy the dependence on time
freqne=ppoints(maxg)
#Assign each maximum its plotting position while preserving the time order
freqne[order(maxg)]=freqne
Let's now fit to the simulated maxima a Gumbel distribution with \(\mu\) and \(\sigma\) increasing linearly with time, therefore given by:
\(\mu=\mu_0+\mu_1 t\)
and
\(\sigma=\sigma_0+\sigma_1 t\)
We therefore obtain a distribution with four parameters, \(\mu_0\), \(\mu_1\), \(\sigma_0\) and \(\sigma_1\), to which we assign trial values and then estimate their optimal values by optimising our "homemade" objective, namely minimising the sum of squared differences between the empirical frequencies and the theoretical probabilities.
#Define vector of probability of pmax and parameters
pmax=c()
#Set trial values for the parameters describing the variability of mu and sigma on time
trial=c(mean(maxg),0.1,sd(maxg),0.1)
#Define the function to compute the probabilities with the Gumbel distribution and compute the objective function to be optimised
stgumbel=function(trial,maxg)
{
#Compute probability of maxima according to the Gumbel
for(i in 1:1000)
{
#Set parameters varying in time
mu=trial[1]+trial[2]*i
sigma=trial[3]+trial[4]*i
pmax[i]=exp(-exp(-(maxg[i]-mu)/sigma))
}
diffp=sum((freqne-pmax)^2)
return(diffp)
}
#Find the vector trial that minimises diffp
paramgumbel=optim(trial,fn=stgumbel,maxg=maxg)
Let's check the fit graphically and compute KS statistic.
#Compute probability of maxima according to the estimated Gumbel
for(i in 1:1000)
{
#Set parameters varying in time
mu=paramgumbel$par[1]+paramgumbel$par[2]*i
sigma=paramgumbel$par[3]+paramgumbel$par[4]*i
pmax[i]=exp(-exp(-(maxg[i]-mu)/sigma))
}
#Compare sample probability with theoretical probability graphically
plot(freqne,pmax,xlab="Sample frequency",ylab="Theoretical probability")
grid()
abline(0,1)
diffp=freqne-pmax
#Compute KS statistic
KS=max(abs(diffp))
For the sake of comparison let's now estimate a stationary Gumbel.
#Open a new plot window
dev.new()
#Define vector of probability of pmax and parameters
pmax=c()
#Set trial values for the parameters describing the variability of mu and sigma on time
trial=c(mean(maxg),sd(maxg))
#Define the function to compute the probabilities with the Gumbel distribution and compute the objective function to be optimised
stgumbel=function(trial,maxg)
{
#Compute probability of maxima according to the Gumbel
for(i in 1:1000)
{
#Set parameters varying in time
mu=trial[1]
sigma=trial[2]
pmax[i]=exp(-exp(-(maxg[i]-mu)/sigma))
}
diffp=sum((freqne-pmax)^2)
return(diffp)
}
#Find the vector trial that minimises diffp
paramgumbel1=optim(trial,fn=stgumbel,maxg=maxg)
#Let's check the fit graphically and compute KS statistic
#Compute probability of maxima according to the estimated Gumbel
for(i in 1:1000)
{
#Set parameters varying in time
mu=paramgumbel1$par[1]
sigma=paramgumbel1$par[2]
pmax[i]=exp(-exp(-(maxg[i]-mu)/sigma))
}
#Compare sample probability with theoretical probability graphically
plot(freqne,pmax,xlab="Sample frequency",ylab="Theoretical probability")
grid()
abline(0,1)
diffp1=freqne-pmax
#Compute KS statistic
KS1=max(abs(diffp1))
#Compare KS and KS1 with the critical value for p=0.05
crit=1.36/sqrt(1000)
KS<crit
KS1<crit
After gaining experience with this first application of a non-stationary Gumbel, let's now explore the literature for the general case of the non-stationary GEV.
GEV distribution with time-varying parameters
As we know, the standard GEV distribution is characterized by three parameters: location, scale, and shape. The location parameter determines the central tendency of the extreme values. The scale parameter controls the spread or variability of extremes. The shape parameter governs the behavior of the tail and determines whether extremes are bounded or heavy-tailed.
In a time-varying (non-stationary) GEV model, one or more parameters change as a function of time. Typically, the location parameter is modeled as a linear or nonlinear function of time or external covariates. For example, increasing temperatures associated with Climate Change may lead to a gradual shift in the location parameter of extreme temperature distributions. Similarly, the scale parameter may vary to reflect increasing variability in extremes. Allowing parameters to evolve over time enables the model to capture trends and structural changes in extreme events.
Allowing such variability in time means including additional parameters in the structure of the GEV distribution, namely, we need to add the parameters controlling the change along time of the classical GEV parameters. For instance,
\(\mu(t)=\beta_0+\beta_1t\)
\(\sigma(t)=\exp(\gamma_0+\gamma_1 t)\)
\(\xi(t)=\xi_0\)
Here, the location parameter evolves linearly with time, allowing the center of the distribution of extremes to shift. The exponential form for the scale parameter ensures that \(\sigma(t)>0\) while allowing the variability of extremes to change gradually. In many applications, the shape parameter is kept constant because it is difficult to estimate reliably, although it may also be modeled as time-dependent.
More generally, parameters can depend on external covariates \(z(t)\), such as temperature or climate indices:
\(\mu(t)=\beta_0+\beta_1 z(t)\)
\(\sigma(t)=\exp(\gamma_0+\gamma_1 z(t))\)
In the above equations, \(\beta_0\), \(\beta_1\), \(\gamma_0\) and \(\gamma_1\) are additional parameters to be estimated. Increasing the number of parameters increases the variance of the estimates and, therefore, uncertainty. The extension to the non-stationary case should thus be carefully evaluated depending on data availability, the characteristics of the design project and the underlying knowledge about climate change.
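To make the structure explicit, here is a minimal base-R sketch of the non-stationary GEV cumulative distribution function with the parameterisation above (the function name and the parameter values are illustrative, not from any package):

```r
#Non-stationary GEV CDF: location linear in the covariate z, scale log-linear, shape constant
pgev_ns=function(x, z, beta0, beta1, gamma0, gamma1, xi){
  mu=beta0 + beta1*z            #covariate-dependent location
  sigma=exp(gamma0 + gamma1*z)  #exponential link keeps the scale positive
  s=(x - mu)/sigma
  if(abs(xi) < 1e-8) return(exp(-exp(-s)))   #Gumbel limit for xi -> 0
  arg=1 + xi*s
  if(arg <= 0) return(ifelse(xi > 0, 0, 1))  #outside the support
  exp(-arg^(-1/xi))
}
#Example with illustrative values: the same value x becomes less "extreme"
#as the location shifts upward with the covariate
pgev_ns(10, z=0,  beta0=5, beta1=0.1, gamma0=0, gamma1=0, xi=0.1)
pgev_ns(10, z=50, beta0=5, beta1=0.1, gamma0=0, gamma1=0, xi=0.1)
```

The same value of the variable thus corresponds to a smaller non-exceedance probability as the covariate grows, which is exactly the evolving risk the non-stationary model is meant to capture.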
Operationally, the estimation of a non-stationary GEV can easily be carried out with the help of software. In R we can use the code that follows.
install.packages("ismev")
library(ismev)
# Data. We make reference to the Po river case
poflow=read.table("Po-pontelagoscuro1920-2026.txt", header = FALSE, sep = " ", col.names = c("Date", "Flow"), stringsAsFactors = FALSE, skip=1)
# Convert Date column to Date format
poflow$Date=as.Date(poflow$Date, format = "%d/%m/%Y")
#Extract annual maxima
poflow$Year=format(poflow$Date, "%Y")
annual_max=aggregate(Flow ~ Year, data = poflow, max)
annual_max$Year=as.numeric(annual_max$Year)
flow=annual_max$Flow
year=annual_max$Year
# Fit a non-stationary GEV with time-dependent location parameter. Scale and shape remain constant.
# Note: ydat must be a matrix of covariates; mul=1 makes the location depend on its first column
fit=gev.fit(annual_max$Flow, ydat = matrix(year, ncol = 1), mul = 1)
# mul=1 allows the location parameter to change in time
# View results
fit$mle
fit$se
# Allow both location and scale to vary with time
fit1=gev.fit(annual_max$Flow, ydat = matrix(as.numeric(annual_max$Year), ncol = 1), mul = 1, sigl = 1)
# sigl=1 allows the scale to depend on the year as well
# Diagnostic plots
gev.diag(fit)
This produces:
- QQ plot;
- return level plot;
- residual diagnostics.
Now let's assess the variability of the location parameter in time:
beta0=fit$mle[1]
beta1=fit$mle[2]
mu_t=beta0 + beta1 * year
plot(year, mu_t, type="l",xlab="Year",ylab="Location parameter",main="Time-varying GEV location parameter")
An analogous assessment can be made with fit1 to check the variability in time of the scale parameter.
In summary, a possible modelling strategy may be:
- Fit stationary GEV;
- Fit non-stationary GEV;
- Compare models using likelihood ratio test or AIC.
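The likelihood ratio test in the last step can be sketched in a few lines of base R (assuming, as for ismev's gev.fit, that each fitted object stores the negative log-likelihood; the function name and the numbers are illustrative):

```r
#Likelihood ratio test between a nested (stationary) and a wider (non-stationary) model
#nllh0, nllh1: negative log-likelihoods of the simpler and of the wider model
#df: number of extra parameters in the wider model
lr_test=function(nllh0, nllh1, df){
  D=2*(nllh0 - nllh1)   #deviance statistic, approx. chi-squared with df degrees of freedom
  p=1 - pchisq(D, df)
  c(statistic = D, p.value = p)
}
#Illustrative numbers: a drop of 1.92 in the negative log-likelihood for one extra parameter
lr_test(nllh0 = 500, nllh1 = 498.08, df = 1)  #borderline at the 5% level
```

A small p-value indicates that the extra non-stationary parameters significantly improve the fit.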
Box: the Akaike Information Criterion
The Akaike Information Criterion (AIC) is a statistical measure used for model selection. It helps choose the model that best balances good fit and parsimonious structure, in terms of the number of parameters. Parsimony means reduced uncertainty and is therefore a desirable feature in technical applications.
Suppose we have a statistical model fitted by maximum likelihood. Let \(L\) be the maximum value of the likelihood function for the model and let \(k\) be the number of estimated parameters in the model itself. The AIC is defined as:
\(AIC = 2k - 2\ln(L)\).
The first term \(2k\) penalizes model complexity, while the second term \(-2\ln(L)\) rewards goodness of fit. A model with higher likelihood has a smaller \(-2\ln(L)\), but adding many parameters increases \(k\) and therefore the penalty, thus discouraging overfitting.
When comparing models, the one with the lowest AIC may be preferable, as it seeks a compromise between goodness of fit and robustness. AIC does not measure absolute model quality; it rather measures relative quality.
For small samples, a corrected version is often used, given by:
\(AIC_c = AIC + \frac{2k(k+1)}{n-k-1}\)
As an example of application to the above estimation of the GEV for the Po River, the AIC of the stationary and non-stationary fits can be computed and compared directly from the fitted objects.
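A minimal sketch, assuming the objects fit (time-varying location) and fit1 (time-varying location and scale) returned by ismev's gev.fit above, which store the negative log-likelihood in $nllh (the helper function name is illustrative):

```r
#AIC from a negative log-likelihood: AIC = 2k + 2*nllh, since nllh = -ln(L)
aic_from_nllh=function(nllh, k) 2*k + 2*nllh
#For the Po River fits above (run after the ismev code):
#aic_from_nllh(fit$nllh, k = 4)   #mu0, mu1, sigma, xi
#aic_from_nllh(fit1$nllh, k = 5)  #mu0, mu1, sigma0, sigma1, xi
#The model with the lower AIC is preferred; with an illustrative nllh of 350:
aic_from_nllh(nllh = 350, k = 4)
```

Note how each extra parameter must reduce the negative log-likelihood by at least one unit to pay for itself in AIC terms.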
An alternative approach to fitting a non-stationary GEV distribution is given by the application of Generalized Additive Models for Location, Scale and Shape (GAMLSS), which allow nonlinear trends in time instead of just linear ones. In R this is implemented in the package "gamlss". This approach lets you model each parameter of the GEV (location \(\mu\), scale \(\sigma\), shape \(\xi\)) as smooth functions of time. The code is:
install.packages(c("gamlss","gamlss.dist"))
library(gamlss)
library(gamlss.dist)
# Prepare the data
datapo=data.frame(Flow = annual_max$Flow, Year = as.numeric(annual_max$Year))
#Fit a nonlinear time-dependent GEV. Here the location parameter varies smoothly with time
fit_ns=gamlss(Flow ~ pb(Year), sigma.formula = ~ 1, nu.formula = ~ 1, family = GEV, data = datapo)
#This corresponds to location being a smooth function of year, with scale and shape constant
#If we want to allow the scale to vary with time as well:
fit_ns=gamlss(Flow ~ pb(Year), sigma.formula = ~ pb(Year), nu.formula = ~ 1, family = GEV, data = datapo)
Now, let's move forward with model diagnostic:
plot(fit_ns)
wp(fit_ns) # worm plot diagnostics
#Plot the estimated time-varying location parameter
mu_hat = fitted(fit_ns, "mu")
plot(year, mu_hat, type="l", xlab="Year", ylab="Estimated location parameter", main="Non-stationary GEV location parameter")
# Compare stationary vs non-stationary models (the stationary fit holds all parameters constant)
fit_stat=gamlss(annual_max$Flow ~ 1, sigma.formula = ~ 1, nu.formula = ~ 1, family = GEV, data = datapo)
AIC(fit_stat, fit_ns)
#Lower AIC indicates the better model.
Time varying return level
In the non-stationary GEV model, the return level varies with time (e.g., the 50-year flood changes from year to year). This is often the key result of a non-stationary extreme-value study. The code is as follows.
mu_t=fitted(fit_ns, "mu")
sigma_t=fitted(fit_ns, "sigma")
xi_t=fitted(fit_ns, "nu")
year=annual_max$Year
#Compute return levels. Example for a 50-year return level:
T=50
p=1 - 1/T
RL50=qGEV(p, mu = mu_t, sigma = sigma_t, nu = xi_t)
#Here qGEV() comes from gamlss.dist
#Plot the time-varying return level
plot(year, RL50, type = "l", lwd = 2, xlab = "Year", ylab = "50-year return level", main = "Time-varying 50-year flood estimate")
This plot shows how the estimated 50-year extreme flow changes through time. Now, let's compare different return periods.
RL20 = qGEV(1 - 1/20, mu_t, sigma_t, xi_t)
RL100 = qGEV(1 - 1/100, mu_t, sigma_t, xi_t)
plot(year, RL50, type="l", lwd=2, ylim=range(c(RL20,RL100)), xlab="Year", ylab="Return level", main="Time-varying return levels")
lines(year, RL20, lty=2)
lines(year, RL100, lty=3)
legend("topleft", legend=c("20-year","50-year","100-year"), lty=c(2,1,3), bty="n")
If the curve increases with time, extreme floods are becoming larger; if it is flat, extremes look stationary; if it decreases, extremes are weakening. This approach is widely used in non-stationary flood frequency analysis.
Last modified on March 17, 2026