
# Gaussian copula

Article Id: WHEBN0020996532

Title: Gaussian copula. Author: World Heritage Encyclopedia. Language: English. Publisher: World Heritage Encyclopedia.


In probability theory and statistics, a copula is a multivariate probability distribution for which the marginal probability of each variable is uniformly distributed. Copulas are used to describe the dependence between random variables. They are named for their resemblance to grammatical copulas in linguistics.

Sklar's Theorem states that any multivariate joint distribution can be written in terms of univariate marginal distribution functions and a copula which describes the dependence structure between the variables.

Copulas are popular in statistical applications because they allow one to model and estimate the distribution of a random vector by estimating the marginals and the copula separately. Many parametric copula families are available, usually with parameters that control the strength of dependence. Some popular parametric copula models are outlined below. The Gaussian copula formula was also adopted on Wall Street, where it took on a life of its own as a tool for estimating the probability distribution of losses on pools of loans or bonds. It was used to estimate risk, creating "evaluation cultures" that took the predictions of the formula as hard probabilities on which to base risk assessments.

## Mathematical definition

Consider a random vector $(X_1,X_2,\dots,X_d)$. Suppose its margins are continuous, i.e. the marginal CDFs $F_i(x) = \mathbb{P}[X_i\leq x]$ are continuous functions. By applying the probability integral transform to each component, the random vector

$(U_1,U_2,\dots,U_d)=\left(F_1(X_1),F_2(X_2),\dots,F_d(X_d)\right)$

has uniform margins.

The copula of $\left(X_1,X_2,\dots,X_d\right)$ is defined as the joint cumulative distribution function of $\left(U_1,U_2,\dots,U_d\right)$:

$C(u_1,u_2,\dots,u_d)=\mathbb{P}[U_1\leq u_1,U_2\leq u_2,\dots,U_d\leq u_d] .$

The copula C contains all information on the dependence structure between the components of $\left(X_1,X_2,\dots,X_d\right)$ whereas the marginal cumulative distribution functions $F_i$ contain all information on the marginal distributions.

The importance of the above is that the reverse of these steps can be used to generate pseudo-random samples from general classes of multivariate probability distributions. That is, given a procedure to generate a sample $\left(U_1,U_2,\dots,U_d\right)$ from the copula distribution, the required sample can be constructed as

$(X_1,X_2,\dots,X_d) = \left(F_1^{-1}(U_1),F_2^{-1}(U_2),\dots,F_d^{-1}(U_d)\right).$

The inverses $F_i^{-1}$ are unproblematic as the $F_i$ were assumed to be continuous. The above formula for the copula function can be rewritten to correspond to this as:

$C(u_1,u_2,\dots,u_d)=\mathbb{P}\left[X_1\leq F_1^{-1}(u_1),X_2\leq F_2^{-1}(u_2),\dots,X_d\leq F_d^{-1}(u_d)\right] .$
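The two directions described above can be checked numerically for a single margin. The following is a minimal sketch, assuming a unit-rate exponential margin as an illustrative choice; the helper names `exp_cdf` and `exp_quantile` are hypothetical and do not come from the text. Applying $F$ to draws of $X$ should give approximately uniform values, and applying $F^{-1}$ should recover the draws.

```python
import math
import random


def exp_cdf(x, rate=1.0):
    """CDF of an exponential distribution: F(x) = 1 - exp(-rate * x) for x > 0."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def exp_quantile(u, rate=1.0):
    """Inverse CDF: F^{-1}(u) = -log(1 - u) / rate for u in [0, 1)."""
    return -math.log(1.0 - u) / rate


rng = random.Random(0)  # fixed seed so the sketch is reproducible
xs = [rng.expovariate(1.0) for _ in range(20000)]

# Probability integral transform: U = F(X) should be roughly uniform on [0, 1].
us = [exp_cdf(x) for x in xs]
mean_u = sum(us) / len(us)                                  # near 1/2
frac_below_quarter = sum(u <= 0.25 for u in us) / len(us)   # near 1/4

# Reverse direction: X = F^{-1}(U) recovers the original draws.
roundtrip = [exp_quantile(u) for u in us]
max_roundtrip_error = max(abs(x - r) for x, r in zip(xs, roundtrip))
```

The same pattern applies to each margin of a random vector; only the dependence between the resulting uniforms is left to the copula.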

## Definition

In probabilistic terms, $C:\left[0,1\right]^d\rightarrow \left[0,1\right]$ is a d-dimensional copula if C is a joint cumulative distribution function of a d-dimensional random vector on the unit cube $\left[0,1\right]^d$ with uniform marginals.

In analytic terms, $C:\left[0,1\right]^d\rightarrow \left[0,1\right]$ is a d-dimensional copula if

• $C(u_1,\dots,u_{i-1},0,u_{i+1},\dots,u_d)=0$, the copula is zero if one of the arguments is zero,
• $C(1,\dots,1,u,1,\dots,1)=u$, the copula is equal to u if one argument is u and all others 1,
• C is d-increasing, i.e., for each hyperrectangle $B=\prod_{i=1}^{d}[x_i,y_i]\subseteq [0,1]^d$ the C-volume of B is non-negative:
$\int_B dC(u) =\sum_{\mathbf z\in \times_{i=1}^{d}\{x_i,y_i\}} (-1)^{N(\mathbf z)} C(\mathbf z)\ge 0,$
where $N(\mathbf z)=\#\{k : z_k=x_k\}$ is the number of coordinates of $\mathbf z$ that equal the lower endpoint $x_k$.

For instance, in the bivariate case, $C:\left[0,1\right]\times\left[0,1\right]\rightarrow \left[0,1\right]$ is a bivariate copula if $C\left(0,u\right) = C\left(u,0\right) = 0$, $C\left(1,u\right) = C\left(u,1\right) = u$ and $C\left(u_2,v_2\right)-C\left(u_2,v_1\right)-C\left(u_1,v_2\right)+C\left(u_1,v_1\right) \geq 0$ for all $0 \leq u_1 \leq u_2 \leq 1$ and $0 \leq v_1 \leq v_2 \leq 1$.
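The three bivariate conditions can be tested numerically on a grid. The sketch below is illustrative only: the function name `is_copula_on_grid`, the grid size and the tolerance are assumptions, and passing the check is necessary but not sufficient, since only grid points are inspected.

```python
def is_copula_on_grid(C, n=50, tol=1e-12):
    """Check the bivariate copula conditions on an (n+1) x (n+1) grid."""
    grid = [i / n for i in range(n + 1)]
    for u in grid:
        # Zero if one argument is zero: C(0, u) = C(u, 0) = 0.
        if abs(C(0.0, u)) > tol or abs(C(u, 0.0)) > tol:
            return False
        # Uniform margins: C(1, u) = C(u, 1) = u.
        if abs(C(1.0, u) - u) > tol or abs(C(u, 1.0) - u) > tol:
            return False
    # 2-increasing: each grid cell must have non-negative C-volume.
    # Larger grid rectangles are sums of cells, so cells suffice on the grid.
    for u1, u2 in zip(grid, grid[1:]):
        for v1, v2 in zip(grid, grid[1:]):
            volume = C(u2, v2) - C(u2, v1) - C(u1, v2) + C(u1, v1)
            if volume < -tol:
                return False
    return True


def independence(u, v):
    """Product copula C(u, v) = u * v."""
    return u * v


def not_a_copula(u, v):
    """max(u, v) violates C(0, u) = 0, so it is not a copula."""
    return max(u, v)


independence_ok = is_copula_on_grid(independence)
counterexample_ok = is_copula_on_grid(not_a_copula)
```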

## Sklar's theorem

Sklar's theorem, named after Abe Sklar, provides the theoretical foundation for the application of copulas. Sklar's theorem states that a multivariate cumulative distribution function

$H(x_1,\dots,x_d)=\mathbb{P}[X_1\leq x_1,\dots,X_d\leq x_d]$

of a random vector $(X_1,X_2,\dots,X_d)$ with marginals $F_i(x) = \mathbb{P}[X_i\leq x]$ can be written as

$H(x_1,\dots,x_d) = C\left(F_1(x_1),\dots,F_d(x_d) \right),$

where $C$ is a copula.

The theorem also states that, given $H$, the copula is unique on $\operatorname{Ran}(F_1)\times\cdots\times \operatorname{Ran}(F_d)$, which is the Cartesian product of the ranges of the marginal cdf's. This implies that the copula is unique if the marginals $F_i$ are continuous.

The converse is also true: given a copula $C:[0,1]^d\rightarrow [0,1]$ and margins $F_i(x)$, then $C\left(F_1(x_1),\dots,F_d(x_d) \right)$ defines a d-dimensional cumulative distribution function.
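The converse direction can be sketched in a few lines: plug arbitrary margins into a copula to obtain a joint CDF. The choices below are assumptions made purely for illustration: an Ali-Mikhail-Haq copula with $\theta = 0.5$, a standard normal first margin and a unit-rate exponential second margin. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def amh_copula(u, v, theta=0.5):
    """Ali-Mikhail-Haq copula, valid for theta in [-1, 1)."""
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))


def normal_cdf(x):
    """First margin F_1: standard normal CDF."""
    return NormalDist().cdf(x)


def exponential_cdf(y, rate=1.0):
    """Second margin F_2: exponential CDF."""
    return 1.0 - math.exp(-rate * y) if y > 0 else 0.0


def joint_cdf(x, y):
    """Joint CDF built as H(x, y) = C(F_1(x), F_2(y))."""
    return amh_copula(normal_cdf(x), exponential_cdf(y))
```

Letting one argument grow large recovers the other margin, for example `joint_cdf(0.3, 50.0)` agrees with `normal_cdf(0.3)` up to rounding, which is the uniform-margin property of the copula seen through the margins.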

## Fréchet–Hoeffding copula bounds

The Fréchet–Hoeffding theorem (after Maurice René Fréchet and Wassily Hoeffding) states that for any copula $C:\left[0,1\right]^d\rightarrow \left[0,1\right]$ and any $\left(u_1,\dots,u_d\right)\in\left[0,1\right]^d$ the following bounds hold:

$W\left(u_1,\dots,u_d\right) \leq C\left(u_1,\dots,u_d\right) \leq M\left(u_1,\dots,u_d\right).$

The function W is called the lower Fréchet–Hoeffding bound and is defined as

$W(u_1,\ldots,u_d) = \max\left\{1-d+\sum\limits_{i=1}^d u_i , 0 \right\}.$

The function M is called the upper Fréchet–Hoeffding bound and is defined as

$M(u_1,\ldots,u_d) = \min \{u_1,\dots,u_d\}.$

The upper bound is sharp: M is always a copula, and it corresponds to comonotone random variables.

The lower bound is point-wise sharp, in the sense that for fixed u, there is a copula $\tilde{C}$ such that $\tilde{C}(u) = W(u)$. However, W is a copula only in two dimensions, in which case it corresponds to countermonotonic random variables.

In two dimensions, i.e. the bivariate case, the Fréchet–Hoeffding Theorem states

$\max(u+v-1,0) \leq C(u,v) \leq \min\{u,v\}.$
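As a quick numerical illustration of the bounds, the sketch below evaluates W and M in three dimensions and checks that the independence (product) copula lies between them at randomly chosen points. The function names, the dimension and the sample size are illustrative assumptions.

```python
import random
from math import prod


def frechet_lower(u):
    """Lower bound W(u) = max(1 - d + sum(u), 0)."""
    return max(1 - len(u) + sum(u), 0.0)


def frechet_upper(u):
    """Upper bound M(u) = min(u_1, ..., u_d)."""
    return min(u)


def independence(u):
    """Product copula C(u) = u_1 * ... * u_d."""
    return prod(u)


rng = random.Random(1)
points = [[rng.random() for _ in range(3)] for _ in range(1000)]

tol = 1e-12  # guards against floating-point rounding only
bounds_hold = all(
    frechet_lower(u) - tol <= independence(u) <= frechet_upper(u) + tol
    for u in points
)
```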

## Families of copulas

Several families of copulae have been described.

### Gaussian copula

The Gaussian copula is a distribution over the unit cube $[0,1]^d$. It is constructed from a multivariate normal distribution over $\mathbb{R}^d$ by using the probability integral transform.

For a given correlation matrix $R\in\mathbb{R}^{d\times d}$, the Gaussian copula with parameter matrix $R$ can be written as

$C_R^{\text{Gauss}}(u) = \Phi_R\left(\Phi^{-1}(u_1),\dots, \Phi^{-1}(u_d) \right),$

where $\Phi^{-1}$ is the inverse cumulative distribution function of a standard normal and $\Phi_R$ is the joint cumulative distribution function of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix $R$.

The density can be written as

$c_R^{\text{Gauss}}(u) = \frac{1}{\sqrt{\det R}}\exp\left(-\frac{1}{2} \begin{pmatrix}\Phi^{-1}(u_1)\\ \vdots \\ \Phi^{-1}(u_d)\end{pmatrix}^T \cdot \left(R^{-1}-\mathbf{I}\right) \cdot \begin{pmatrix}\Phi^{-1}(u_1)\\ \vdots \\ \Phi^{-1}(u_d)\end{pmatrix} \right),$

where $\mathbf{I}$ is the identity matrix.
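For $d = 2$ the density formula can be written out by hand. With $R$ having off-diagonal entry $\rho$, one has $\det R = 1-\rho^2$ and the quadratic form reduces to $\left(\rho^2(x^2+y^2) - 2\rho x y\right)/(1-\rho^2)$, where $x=\Phi^{-1}(u_1)$ and $y=\Phi^{-1}(u_2)$. The following minimal sketch evaluates this special case; the function name is hypothetical and the restriction to two dimensions is a simplifying assumption.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u1, u2, rho):
    """Bivariate Gaussian copula density for R = [[1, rho], [rho, 1]].

    Requires 0 < u1, u2 < 1 and |rho| < 1.
    """
    x = NormalDist().inv_cdf(u1)
    y = NormalDist().inv_cdf(u2)
    det = 1.0 - rho * rho
    # z^T (R^{-1} - I) z written out for the 2 x 2 case.
    quad = (rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / det
    return math.exp(-0.5 * quad) / math.sqrt(det)
```

With $\rho = 0$ the density is identically 1, which is the independence copula. For $\rho > 0$ the density is larger near the corners $(0,0)$ and $(1,1)$ than near $(0,1)$ and $(1,0)$.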

### Archimedean copulas

Archimedean copulas are an associative class of copulas. Most common Archimedean copulas admit an explicit formula, something not possible for instance for the Gaussian copula. In practice, Archimedean copulas are popular because they allow modeling dependence in arbitrarily high dimensions with only one parameter, governing the strength of dependence.

A copula C is called Archimedean if it admits the representation

$C(u_1,\dots,u_d;\theta) = \psi^{[-1]}\left(\psi(u_1;\theta)+\cdots+\psi(u_d;\theta);\theta\right) \,$

where $\psi\!:[0,1]\times\Theta \rightarrow [0,\infty)$ is a continuous, strictly decreasing and convex function such that $\psi(1;\theta)=0$. $\theta$ is a parameter within some parameter space $\Theta$. $\psi$ is the so-called generator function and $\psi^{[-1]}$ is its pseudo-inverse defined by

$\psi^{[-1]}(t;\theta) = \begin{cases} \psi^{-1}(t;\theta) & \text{if } 0 \leq t \leq \psi(0;\theta) \\ 0 & \text{if } \psi(0;\theta) \leq t \leq\infty. \end{cases}$

Moreover, the above formula for C yields a copula for $\psi^{-1}$ if and only if $\psi^{-1}$ is d-monotone on $[0,\infty)$. That is, if it is $d-2$ times differentiable and the derivatives satisfy

$(-1)^k\psi^{-1,(k)}(t;\theta) \geq 0$

for all $t\geq 0$ and $k=0,1,\dots,d-2$ and $(-1)^{d-2}\psi^{-1,(d-2)}(t;\theta)$ is nonincreasing and convex.

The following table highlights the most prominent bivariate Archimedean copulas with their corresponding generator. Note that not all of them are completely monotone, i.e. d-monotone for all $d\in\mathbb{N}$ or d-monotone for certain $\theta \in \Theta$ only.

Table with the most important generators

| Name | Bivariate copula $C_\theta(u,v)$ | Generator $\psi_\theta(t)$ | Generator inverse $\psi_\theta^{-1}(t)$ | Parameter $\theta$ |
|---|---|---|---|---|
| Clayton | $\left[\max\left\{u^{-\theta}+v^{-\theta}-1,\,0\right\}\right]^{-1/\theta}$ | $\frac{1}{\theta}\,(t^{-\theta}-1)$ | $(1+\theta t)^{-1/\theta}$ | $\theta\in[-1,\infty)\backslash\{0\}$ |
| Ali-Mikhail-Haq | $\frac{uv}{1-\theta(1-u)(1-v)}$ | $\log\!\left(\frac{1-\theta(1-t)}{t}\right)$ | $\frac{1-\theta}{\exp(t)-\theta}$ | $\theta\in[-1,1)$ |
| Gumbel | $\exp\!\left(-\left((-\log(u))^\theta+(-\log(v))^\theta\right)^{1/\theta}\right)$ | $(-\log(t))^\theta$ | $\exp\!\left(-t^{1/\theta}\right)$ | $\theta\in[1,\infty)$ |
| Frank | $-\frac{1}{\theta}\log\!\left(1+\frac{(\exp(-\theta u)-1)(\exp(-\theta v)-1)}{\exp(-\theta)-1}\right)$ | $-\log\!\left(\frac{\exp(-\theta t)-1}{\exp(-\theta)-1}\right)$ | $-\frac{1}{\theta}\,\log\left(1+\exp(-t)(\exp(-\theta)-1)\right)$ | $\theta\in\mathbb{R}\backslash\{0\}$ |
| Joe | $1-\left((1-u)^\theta+(1-v)^\theta-(1-u)^\theta(1-v)^\theta\right)^{1/\theta}$ | $-\log\!\left(1-(1-t)^\theta\right)$ | $1-\left(1-\exp(-t)\right)^{1/\theta}$ | $\theta\in[1,\infty)$ |
| Independence | $uv$ | $-\log(t)$ | $\exp(-t)$ | |
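The Archimedean representation can be turned into code directly: a bivariate copula is obtained by composing a generator with its inverse. The sketch below uses the Clayton and Gumbel generators with $\theta = 2$ as an illustrative choice. For these parameter values $\psi(0;\theta)=\infty$, so the pseudo-inverse coincides with the ordinary inverse; the helper `archimedean` is a hypothetical name and does not handle the truncated case.

```python
import math


def archimedean(psi, psi_inv):
    """Build C(u, v) = psi_inv(psi(u) + psi(v)) from a generator and its inverse."""
    def copula(u, v):
        return psi_inv(psi(u) + psi(v))
    return copula


theta = 2.0  # illustrative parameter value

# Clayton: psi(t) = (t^-theta - 1) / theta, inverse (1 + theta * s)^(-1/theta).
clayton = archimedean(
    lambda t: (t ** -theta - 1.0) / theta,
    lambda s: (1.0 + theta * s) ** (-1.0 / theta),
)

# Gumbel: psi(t) = (-log t)^theta, inverse exp(-s^(1/theta)).
gumbel = archimedean(
    lambda t: (-math.log(t)) ** theta,
    lambda s: math.exp(-(s ** (1.0 / theta))),
)


def gumbel_closed_form(u, v):
    """Closed-form bivariate Gumbel copula, used as a cross-check."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))
```

Arguments are assumed to lie in $(0, 1]$. Because $\psi(1;\theta)=0$, setting one argument to 1 returns the other, e.g. `clayton(0.4, 1.0)` is 0.4 up to rounding.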

## Empirical copulas

When studying multivariate data, one might want to investigate the underlying copula. Suppose we have observations

$\left(X_1^i,X_2^i,\dots,X_d^i\right), \, i=1,\dots,n$

from a random vector $\left(X_1,X_2,\dots,X_d\right)$ with continuous margins. The corresponding "true" copula observations would be

$(U_1^i,U_2^i,\dots,U_d^i)=\left(F_1(X_1^i),F_2(X_2^i),\dots,F_d(X_d^i)\right), \, i=1,\dots,n.$

However, the marginal distribution functions $F_i$ are usually not known. Therefore, one can construct pseudo copula observations by using the empirical distribution functions

$F_k^n(x)=\frac{1}{n} \sum_{i=1}^n \mathbf{1}(X_k^i\leq x)$

instead. Then, the pseudo copula observations are defined as

$(\tilde{U}_1^i,\tilde{U}_2^i,\dots,\tilde{U}_d^i)=\left(F_1^n(X_1^i),F_2^n(X_2^i),\dots,F_d^n(X_d^i)\right), \, i=1,\dots,n.$

The corresponding empirical copula is then defined as

$C^n(u_1,\dots,u_d) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\left(\tilde{U}_1^i\leq u_1,\dots,\tilde{U}_d^i\leq u_d\right).$

The components of the pseudo copula samples can also be written as $\tilde{U}_k^i=R_k^i/n$, where $R_k^i$ is the rank of the observation $X_k^i$:

$R_k^i=\sum_{j=1}^n \mathbf{1}(X_k^j\leq X_k^i)$

Therefore, the empirical copula can be seen as the empirical distribution of the rank transformed data.
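The rank-based construction is short enough to sketch in plain Python. The data set below is made up for illustration, and the function names are hypothetical. The rank computation is quadratic in $n$, which is acceptable for a small example but not for large samples.

```python
def ranks(values):
    """R^i = number of observations less than or equal to values[i]."""
    return [sum(v <= x for v in values) for x in values]


def pseudo_observations(data):
    """Map each d-dimensional observation to its componentwise ranks divided by n."""
    n = len(data)
    column_ranks = [ranks(column) for column in zip(*data)]
    return [tuple(r[i] / n for r in column_ranks) for i in range(n)]


def empirical_copula(pseudo, u):
    """C^n(u) = fraction of pseudo observations that are <= u in every component."""
    hits = sum(all(p_k <= u_k for p_k, u_k in zip(p, u)) for p in pseudo)
    return hits / len(pseudo)


# Illustrative data: both components increase together (a comonotone sample).
data = [(1.2, 10.0), (0.5, 7.0), (3.1, 12.5), (2.0, 11.0)]
pseudo = pseudo_observations(data)
value_at_half = empirical_copula(pseudo, (0.5, 0.5))
```

Since the two components of this sample have identical ranks, the empirical copula coincides with $\min\{u_1,u_2\}$ at the grid points $k/n$.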

## Monte Carlo integration for copula models

In statistical applications, many problems can be formulated in the following way. One is interested in the expectation of a response function $g:\mathbb{R}^d\rightarrow\mathbb{R}$ applied to some random vector $(X_1,\dots,X_d)$. If we denote the cdf of this random vector with $H$, the quantity of interest can thus be written as

$\mathbb{E} \left[ g(X_1,\dots,X_d) \right] = \int_{\mathbb{R}^d} g(x_1,\dots,x_d) \, dH(x_1,\dots,x_d).$

If $H$ is given by a copula model, i.e.,

$H\left(x_1,\dots,x_d\right)=C\left(F_1\left(x_1\right),\dots,F_d\left(x_d\right)\right)$

this expectation can be rewritten as

$\mathbb{E}\left[g(X_1,\dots,X_d)\right]=\int_{[0,1]^d}g\left(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\right) \, dC(u_1,\dots,u_d).$

In case the copula C is absolutely continuous, i.e. C has a density c, this equation can be written as

$\mathbb{E}\left[g(X_1,\dots,X_d)\right]=\int_{[0,1]^d}g\left(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\right)c(u_1,\dots,u_d) \, du_1\cdots du_d.$

If copula and margins are known (or if they have been estimated), this expectation can be approximated through the following Monte Carlo algorithm:

1. Draw a sample $(U_1^k,\dots,U_d^k)\sim C\;\;(k=1,\dots,n)$ of size n from the copula C.
2. By applying the inverse marginal cdf's, produce a sample of $(X_1,\dots,X_d)$ by setting $(X_1^k,\dots,X_d^k)=\left(F_1^{-1}(U_1^k),\dots,F_d^{-1}(U_d^k)\right)\sim H\;\;(k=1,\dots,n)$.
3. Approximate $\mathbb{E}\left[g(X_1,\dots,X_d)\right]$ by its empirical value:
$\mathbb{E}\left[g(X_1,\dots,X_d)\right]\approx \frac{1}{n}\sum_{k=1}^n g(X_1^k,\dots,X_d^k)$
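The three steps can be sketched for a concrete case. Everything specific below is an assumption chosen for illustration: a bivariate Gaussian copula with correlation 0.7, exponential margins with rates 1 and 2, and the response function $g(x_1,x_2)=x_1+x_2$. The copula sample is drawn by correlating two standard normals and applying $\Phi$ to each; all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(rho, n, rng):
    """Step 1: draw n pairs (U1, U2) from a bivariate Gaussian copula."""
    phi = NormalDist().cdf
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second normal with correlation rho to the first.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((phi(z1), phi(z2)))
    return pairs


def exp_quantile(u, rate):
    """Inverse CDF of an exponential margin, for u in [0, 1)."""
    return -math.log1p(-u) / rate


def monte_carlo_expectation(g, rho, rates, n=20000, seed=0):
    """Approximate E[g(X1, X2)] under the copula model by a sample average."""
    rng = random.Random(seed)
    total = 0.0
    for u1, u2 in sample_gaussian_copula(rho, n, rng):
        # Step 2: transform the uniforms with the inverse marginal CDFs.
        x1 = exp_quantile(u1, rates[0])
        x2 = exp_quantile(u2, rates[1])
        # Step 3: accumulate the response for the empirical mean.
        total += g(x1, x2)
    return total / n


estimate = monte_carlo_expectation(lambda x1, x2: x1 + x2, rho=0.7, rates=(1.0, 2.0))
```

This response function was picked because its expectation depends only on the margins: the exact value is $1/1 + 1/2 = 1.5$ whatever the copula, so the estimate can be sanity-checked against it. Response functions that depend on the dependence structure, such as $g(x_1,x_2)=x_1x_2$, are handled by the same code.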

## Applications

### Quantitative finance

In risk/portfolio management, copulas are used to perform stress-tests and robustness checks that are especially important during “downside/crisis/panic regimes” where extreme downside events may occur (e.g., the global financial crisis of 2008–2009). During a downside regime, a large number of investors who have held positions in riskier assets such as equities or real estate may seek refuge in ‘safer’ investments such as cash or bonds. This is known as a flight-to-quality effect: investors tend to exit their positions in riskier assets in large numbers in a short period of time. As a result, during downside regimes, correlations across equities are greater on the downside than on the upside, and this may have disastrous effects on the economy. For example, anecdotally, we often read financial news headlines reporting the loss of hundreds of millions of dollars on the stock exchange in a single day; however, we rarely read reports of stock market gains of the same magnitude in the same short time frame.

Copulas are useful in portfolio/risk management and help us analyse the effects of downside regimes by allowing the marginals and the dependence structure of a multivariate probability model to be modelled separately. For example, consider the stock exchange as a market consisting of a large number of traders, each operating with his/her own strategies to maximize profits. The individualistic behaviour of each trader can be described by modelling the marginals. However, as all traders operate on the same exchange, each trader's actions have an interaction effect with those of other traders. This interaction effect can be described by modelling the dependence structure. Therefore, copulas allow us to analyse the interaction effects, which are of particular interest during downside regimes as investors tend to herd in their trading behaviour and decisions.

Previously, scalable copula models for large dimensions only allowed the modelling of elliptical dependence structures (e.g., Gaussian and Student-t copulas), which do not allow for correlation asymmetries where correlations differ between the upside and downside regimes. However, the recent development of vine copulas (also known as pair copulas) enables the flexible modelling of the dependence structure for portfolios of large dimensions. The Clayton canonical vine copula allows for the occurrence of extreme downside events and has been successfully applied in portfolio choice and risk management applications. The model is able to reduce the effects of extreme downside correlations and produces improved statistical and economic performance compared to scalable elliptical dependence copulas such as the Gaussian and Student-t copula. Other models developed for risk management applications are panic copulas, which are glued with market estimates of the marginal distributions to analyze the effects of panic regimes on the portfolio profit and loss distribution. Panic copulas are created by Monte Carlo simulation, mixed with a re-weighting of the probability of each scenario.

As far as derivatives pricing is concerned, dependence modelling with copula functions is widely used in applications of financial risk assessment and actuarial analysis, for example in the pricing of collateralized debt obligations (CDOs). Some believe the methodology of applying the Gaussian copula to credit derivatives to be one of the reasons behind the global financial crisis of 2008–2009. Despite this perception, there are documented attempts by the financial industry, made before the crisis, to address the limitations of the Gaussian copula and of copula functions more generally, specifically the lack of dependence dynamics and the poor representation of extreme events. There have also been attempts to propose models rectifying some of the copula limitations.

While the application of copulas in credit has seen both popularity and misfortune during the global financial crisis of 2008–2009, it is arguably an industry standard model for pricing CDOs. Copulas have also been applied to other asset classes as a flexible tool for analyzing multi-asset derivative products. The first such application outside credit was to use a copula to construct an implied basket volatility surface, taking into account the volatility smile of basket components. Copulas have since gained popularity in the pricing and risk management of multi-asset options in the presence of volatility smile/skew, in the equity, foreign exchange and fixed income derivative businesses. Some typical example applications of copulas are listed below:

• Analyzing and pricing volatility smile/skew of exotic baskets, e.g. best/worst of;
• Analyzing and pricing volatility smile/skew of a less liquid FX cross, which is effectively a basket: C = S1/S2 or C = S1*S2.

### Civil engineering

Recently, copula functions have been successfully applied to the database formulation for the reliability analysis of highway bridges, and to various multivariate simulation studies in civil, mechanical and offshore engineering.

### Medicine

Copula functions have been successfully applied to the analysis of spike counts in neuroscience. 

### Weather research

Copulas have been extensively used in climate and weather related research.

### Random vector generation

Large synthetic traces of vectors and stationary time series can be generated using the empirical copula while preserving the entire dependence structure of small datasets. Such empirical traces are useful in various simulation-based performance studies.