Out of Sample Value-at-Risk and Backtesting with the Standardized Pearson Type-IV Skewed Distribution

An early version of this paper was presented at the Annual International Conference on Macroeconomic Analysis and International Finance, May 24-26, 2012, at the University of Crete, Rethymno. The authors would like to thank the discussant Georgios Tsiotas, the chair of the session Sophie Béreau, and the conference participants for many helpful comments and discussions.

In the last decades a variety of financial crises have taken place, such as the worldwide market collapse in 1987, the Mexican crisis in 1995, the Asian and Russian financial crises in 1997-1998, the Orange County default, the collapse of Barings Bank, the dot-com bubble, the Long Term Capital Management bankruptcy, and the financial crisis of 2007-2009, which led several banks to bankruptcy, with Lehman Brothers being the most notable case. Such financial uncertainty has increased the likelihood that financial institutions will suffer substantial losses as a result of their exposure to unpredictable market changes, and financial regulators as well as the supervisory committees of banks have favored quantitative risk techniques that can be used for the evaluation of the potential loss.
The Basle I Accord, introduced in the late 1980s, was the first main vehicle in setting up the regulatory framework as a consequence of the aforementioned financial disasters. Its main point was the risk classification of assets, forcing banks to provide sufficient capital adequacy against these assets, based on their respective risks. However, since in this framework banks were given an incentive to move risky assets off their balance sheets, and since it was possible for banks to treat insured assets as government securities with zero risk, the attempt had adverse effects, because Basle I put a low risk weight on loans by banks to financial institutions. Attempting to remedy some of the problems created by the implementation of the Basel I Accord, Basel II was introduced in the 1990s and put in full implementation in 2007. A central feature of the modified Basel II Accord was to allow banks to develop and use their own internal risk management models, under the condition that these models were "back tested" and "stress tested" under extreme circumstances.
Value-at-Risk (VaR), defined by Fred Stambaugh (1996) and Philippe Jorion (2000) as the amount that can be lost on a portfolio of financial assets with a given probability over a fixed number of days, has become a standard tool used by financial analysts to measure market risk, because it quantifies market risk by a single number. Since VaR is a probabilistic concept, several approaches to estimating the profit and loss distribution of portfolio returns have been developed in the last decades, and a substantial literature of empirical applications has emerged, providing overall support for VaR as an appropriate measure of risk. Initially the focus was on the left tail of the distribution, which corresponds to negative returns and to the computation of VaR for a long trading position, but more recent approaches model VaR for both the long and the short trading position.
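In the usual probabilistic notation (a standard formulation consistent with these definitions, written here for completeness rather than quoted from the cited papers), the one-day VaR at coverage level a for the long and the short position satisfies:

```latex
% One-day-ahead VaR at coverage level a, conditional on the information set Omega_t
\Pr\!\left[ r_{t+1} \le \mathrm{VaR}^{\mathrm{long}}_{t+1}(a) \,\middle|\, \Omega_t \right] = a ,
\qquad
\Pr\!\left[ r_{t+1} \ge \mathrm{VaR}^{\mathrm{short}}_{t+1}(a) \,\middle|\, \Omega_t \right] = a .
```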
A stylized fact in the literature is that stock returns for mature and emerging stock markets behave as martingale processes with leptokurtic distributions (Benoit Mandelbrot 1963; Eugene F. Fama 1965) and conditionally heteroskedastic errors (Mandelbrot 1967; Bruce D. Fielitz 1971). According to Paul De Grauwe (2009), the Basle Accords have failed to provide stability to the banking sector because the risks linked with universal banks are tail risks associated with bubbles and crises. From the probabilistic point of view, the precise prediction of the tail probability of an asset's return is an important issue in VaR, because the extreme movements in the tail provide critical information on the data generating stochastic process. Although there is a variety of empirical models that account for volatility clustering and conditional heteroskedasticity, such as GARCH (Tim Bollerslev 1986), IGARCH (Robert F. Engle and Bollerslev 1986), EGARCH (Daniel B. Nelson 1991), TARCH (Lawrence R. Glosten, Ravi Jagannathan, and David E. Runkle 1993), APARCH (Zhuanxin Ding, Clive William John Granger, and Engle 1993), FIGARCH (Richard T. Baillie, Bollerslev, and Hans Ole Mikkelsen 1996), FIGARCHC (C. F. Chung 1999), FIEGARCH (Bollerslev and Mikkelsen 1996), FIAPARCH (Yiu Kuen Tse 1998), FIAPARCHC (Chung 1999), and HYGARCH (James Davidson 2004), there are few options for the financial analyst regarding the probability density function (PDF) schemes that can be used. These include the standard normal distribution (Engle 1982), which is symmetric and does not account for fat tails; the Student-t distribution (Bollerslev 1987), which is fat-tailed but symmetric; and the Generalized Error Distribution (GED), introduced by Mikhail F. Subbotin (1923) and applied by Nelson (1991), which is more flexible than the Student-t, accommodating both fat and thin tails. However, taking into account that in the VaR framework both the long and the short position should be considered, Pierre Giot and Sébastien Laurent (2003) have shown that models which rely on a symmetric density for the error term underperform, because the PDF of asset returns is non-symmetric, and the skewed Student-t distribution, in the sense of Carmen Fernandez and Mark F. J. Steel (1998), has been used instead (Philippe Lambert and Sébastien Laurent 2000). Recently skewed distributions have also been studied by Georgios Tsiotas (2012).
The aim of this paper is to reconsider Value-at-Risk when the returns and the volatility clustering are modelled via a typical GARCH(1,1) model and the innovations process follows a standardized form of the Pearson type-IV distribution. The model and the distribution are fitted to the data by maximizing the log-likelihood (maximum likelihood estimation, MLE). As a case study we consider the last 5000 returns of the Dow Jones Industrial Average (DJIA) up to 31 December 2010, including the recent 2007-2009 financial crisis. We examine the in sample and out of sample efficiency of the model for both the long and the short trading position, and VaR backtesting is performed with the success-failure ratio, the Kupiec likelihood-ratio (LR) test, the Christoffersen independence and conditional coverage tests, the expected shortfall with related measures, and the dynamic quantile test of Engle and Manganelli. The results, compared with the skewed Student-t distribution in the sense of Fernandez and Steel (1998), indicate that the Pearson type-IV distribution improves the value of the maximized log-likelihood and gives accurate VaR results. The remainder of the paper is organized as follows. Section 1 briefly reviews the Pearson type-IV distribution. In Section 2 we present the financial data used and the econometric methodology followed. Section 3 reports on the VaR analysis, Section 4 provides the in sample and out of sample procedure and the VaR results, and Section 5 discusses the concluding remarks.

The Pearson Type-IV Distribution
The Pearson system of distributions generalizes the differential equation that leads to the Gaussian distribution to the differential equation

\frac{1}{p(x)}\,\frac{dp(x)}{dx} = -\frac{x + c}{c_0 + c_1 x + c_2 x^2} . (1)

Such an approach indicated a way to construct probability distributions in which the skewness (standardized third cumulant) and kurtosis (standardized fourth cumulant) could be adjusted equally freely, in order to fit theoretical models to datasets that exhibit skewness. In a series of papers Karl Pearson (1893, 1895, 1901, 1916) classified seven types of distributions arising from Equation (1), among them the Gaussian distribution (Pearson type-0), the Beta distribution (Pearson type-I), the Gamma distribution (Pearson type-III), the Beta prime distribution (Pearson type-VI), and the Student-t distribution (Pearson type-VII), while some extra classes IX-XII are also discussed (Pearson 1916). In the case where the discriminant of the quadratic in the denominator of Equation (1) is negative, rearrangement of the terms leads to the Pearson type-IV distribution in its recent form in the literature (Yuichi Nagahara 1999, 2004, 2007):

f(x) = k\left[1 + \left(\frac{x-\lambda}{a}\right)^2\right]^{-m} \exp\left[-\nu \arctan\left(\frac{x-\lambda}{a}\right)\right] , (2)

where k is the normalization constant; the parameters of Equation (2) and its first two moments, Equations (3) and (4), are described in the Appendix. Application to financial time series using the method of moments has been performed by Kurt Brännäs and Niklas Nordman (2003), Gamini Premaratne and Anil K. Bera (2005), and Malay Bhattacharyya, Abhishek Chaudhary, and Gaurav Yadav (2008), and a review is provided by Michael A. Magdalinos and George P. Mitsopoulos (2007). Since the variable domain of the Pearson type-IV distribution is (-∞, +∞), Shaohua Chen and Hong Nie (2008) proposed a lognormal sum approximation using a variant of the Pearson distribution to account for the (0, +∞) domain. R. Willink (2008) provided a closed-form expression for the Pearson type-IV distribution function. David Ashton and Mark Tippett (2006) derived the Pearson type-IV distribution from a stochastic differential equation with standard Markov properties, and commented on the distributional properties of selected time series. Matteo Grigoletto and Francesco Lisi (2009, 2011) incorporated constant and dynamic conditional skewness and kurtosis into a GARCH-type structure with the Pearson type-IV distribution, and performed in and out of sample VaR with the Kupiec and Christoffersen tests. Fabio Pizzutilo (2012) analyzed the European market using the Pearson system of continuous distributions, and compared the results across the different types of the Pearson system.
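To illustrate how the normalization constant k of Equation (2) can be evaluated in practice, the following Python sketch (our illustration, not the authors' code; the function name is ours) computes k from the squared modulus of the complex Gamma ratio given by Heinrich (2004) and checks it against direct numerical integration of the unnormalized density:

```python
import numpy as np
from scipy.special import gammaln, loggamma, betaln
from scipy.integrate import quad

def pearson4_norm_const(m, nu, a=1.0):
    """Normalization constant k of the Pearson type-IV density
    k * [1 + ((x-lam)/a)^2]^(-m) * exp(-nu*arctan((x-lam)/a)),
    via the squared modulus of the complex Gamma ratio (Heinrich 2004)."""
    log_ratio = loggamma(complex(m, nu / 2.0)) - gammaln(m)  # log Gamma(m + i nu/2) - log Gamma(m)
    return np.exp(2.0 * log_ratio.real - betaln(m - 0.5, 0.5)) / a

# sanity check against direct numerical integration of the unnormalized kernel
m, nu, a = 4.0, 1.5, 1.0
kernel = lambda x: (1.0 + (x / a) ** 2) ** (-m) * np.exp(-nu * np.arctan(x / a))
integral, _ = quad(kernel, -np.inf, np.inf)
print(pearson4_norm_const(m, nu, a), 1.0 / integral)  # the two values should agree
```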
The cumulative distribution function (CDF) needed for the calculation of the quantiles at the chosen confidence levels was recently obtained in closed form by Joel Heinrich (2004) in terms of the Gauss hypergeometric function (GHF); the resulting expressions, Equations (5) and (6), give the quantiles used below for the long and for the short position, respectively.

The Data
If the value of an asset has been recorded for a sufficiently long time, a common way to analyze the time evolution of the returns is through successive differences of the natural logarithm of the price P_t, expressed in percentage terms, r_t = 100(\ln P_t - \ln P_{t-1}). As a case study we consider the last 5000 returns of the Dow Jones Industrial Average (DJIA) up to 31 December 2010.
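For concreteness, the return series can be constructed as in the following minimal Python sketch; the file name and the price column are illustrative assumptions, not part of the original study:

```python
import numpy as np
import pandas as pd

# hypothetical file with a 'Close' column of daily DJIA closing prices
prices = pd.read_csv("djia_daily.csv", index_col=0, parse_dates=True)["Close"]

# continuously compounded percentage returns: r_t = 100 * (ln P_t - ln P_{t-1})
returns = 100.0 * np.log(prices).diff().dropna()

# keep the last 5000 observations up to 31 December 2010
returns = returns.loc[:"2010-12-31"].tail(5000)
```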

The Model
We consider a univariate GARCH(1,1) model where the innovations follow a Pearson type-IV distribution (Stavros Stavroyiannis et al. 2012):

y_t = \mu + \varepsilon_t , \qquad \varepsilon_t = \sigma_t z_t , (7)

\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 , (8)

where the standardized innovations z_t are independently and identically distributed according to the Pearson type-IV density. To keep the ARCH tradition, it is important to express the density in terms of the mean and the variance of the distribution, rather than, as has been done so far, in terms of the location and scale coefficients, in order to obtain a standardized distribution with zero mean and unit variance that satisfies the martingale hypothesis used in financial time series. Setting the mean in Equation (3) to zero and the variance in Equation (4) to unity, solving for \lambda and a, and substituting into the PDF gives the density of the standardized variable; replacing the tail coefficient of the Pearson type-IV equations with a Student-like version, using the properties of the Gamma function, and integrating the PDF to obtain the normalization constant, we arrive at the standardized form of the Pearson type-IV distribution. The log-likelihood of the proposed model then follows by summing, over the sample, the logarithm of the standardized density evaluated at z_t = (y_t - \mu)/\sigma_t minus \ln\sigma_t (Equation (13)). The squared ratio of the complex Gamma function (Milton Abramowitz and Irene A. Stegun 1965) appearing in Equation (13) was calculated by transcribing the C++ source code of Heinrich (2004) to Matlab®. Within an error of the order O(10^{-10}), instead of using Equations (5) and (6) at the cost of evaluating the complex GHF, the quantiles at the confidence levels can also be computed, to speed up computation, with an adaptive quadrature (Walter Gander and Walter Gautschi 2000; Lawrence F. Shampine 2008) based on a Gauss-Kronrod pair (15th and 7th order formulas), via numerical integration of the normalized PDF. The recurrence in Equation (8) is started from the sample mean of the squared residuals, and the numerical optimization uses the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update of the inverse Hessian.
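A minimal Python sketch of the estimation step is given below. It is not the authors' Matlab implementation: for simplicity the normalization constant of the standardized density is obtained by numerical quadrature (the route mentioned above as an alternative to the complex-Gamma expression), all function names are ours, and only crude admissibility checks are performed; in practice a constrained optimizer or a parameter transformation would be preferable.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def pearson4_std_logpdf(z, m, nu):
    """Log-density of the Pearson type-IV distribution standardized to zero mean
    and unit variance; a and lam follow from Equations (3)-(4), and the
    normalization constant is obtained by quadrature."""
    a = np.sqrt((2.0 * m - 3.0) / (1.0 + (nu / (2.0 * m - 2.0)) ** 2))  # unit-variance scale
    lam = a * nu / (2.0 * m - 2.0)                                      # zero-mean location
    kernel = lambda x: (1.0 + ((x - lam) / a) ** 2) ** (-m) * np.exp(-nu * np.arctan((x - lam) / a))
    norm, _ = quad(kernel, -np.inf, np.inf)
    t = (z - lam) / a
    return -m * np.log1p(t ** 2) - nu * np.arctan(t) - np.log(norm)

def neg_loglik(theta, y):
    """Negative log-likelihood of y_t = mu + sigma_t z_t with a GARCH(1,1) variance."""
    mu, omega, alpha, beta, m, nu = theta
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1 or m <= 1.5:
        return np.inf                                   # crude admissibility check
    eps = y - mu
    sig2 = np.empty_like(y)
    sig2[0] = np.mean(eps ** 2)                         # start the recursion (Equation (8))
    for t in range(1, len(y)):
        sig2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sig2[t - 1]
    z = eps / np.sqrt(sig2)
    return -np.sum(pearson4_std_logpdf(z, m, nu) - 0.5 * np.log(sig2))

# hypothetical usage, with `returns` a numpy array of percentage log-returns:
# start = np.array([0.05, 0.01, 0.05, 0.90, 5.0, 0.5])
# fit = minimize(neg_loglik, start, args=(returns,), method="BFGS")
```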
The optimization results for the in sample case, namely the constant in mean \mu, the constant in variance \omega, the ARCH term \alpha, the GARCH term \beta, the persistence of the model (\alpha + \beta), the tail coefficient m, the asymmetry coefficient \nu, the associated t-statistics in parentheses, and the value of the maximized log-likelihood, are shown in Table 1 for the Pearson type-IV and the skewed Student-t distributions. The Pearson type-IV distribution appears to describe the asset return distribution better, leading to an improved value of the maximized log-likelihood.

Value-at-Risk Models
Having estimated the unknown parameters of the model, the VaR for the a-percentile of the assumed distribution can be calculated in a straightforward manner (Ta-Lun Tang and Shwu-Jane Shieh 2006), which under the Pearson type-IV distribution for the long and the short position is

\mathrm{VaR}^{\mathrm{long}}_{t} = \mu + F^{-1}(a)\,\sigma_t , \qquad \mathrm{VaR}^{\mathrm{short}}_{t} = \mu + F^{-1}(1-a)\,\sigma_t ,

where F^{-1} denotes the inverse of the cumulative distribution function of the standardized Pearson type-IV distribution at the specific confidence level. Each time an observation exceeds the VaR border it is called a VaR violation, or VaR breach, or VaR break. Verifying the accuracy of risk models used in setting the market risk capital requirements demands backtesting (David G. McMillan and Dimos Kambouroudis 2009; Anastassios A. Drakos, Georgios P. Kouretas, and Zarangas 2010; Nikolaos Giannellis, Angelos Kanas, and Athanasios P. Papadopoulos 2010; Panayiotis F. Diamandis et al. 2011), and over the last decade a variety of tests have been proposed that can be used to investigate the fundamental properties of a proposed VaR model. The accuracy of these VaR estimates is of concern to both financial institutions and their regulators. As noted by Francis X. Diebold and Jose A. Lopez (1996), it is unlikely that forecasts from a model will exhibit all the properties of accurate forecasts. Thus, evaluating VaR estimates solely upon whether a specified property is present may yield only limited information regarding their accuracy (Yu Chuan Huang and Bor-Jing Lin 2004). In this work we consider five accuracy measures: the success-failure ratio, the Kupiec LR test, the Christoffersen independence and conditional coverage tests, the expected shortfall with related measures, and the dynamic quantile test of Engle and Manganelli.
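The quantiles F^{-1}(a) and F^{-1}(1-a) can be obtained by numerically inverting the standardized CDF, in line with the quadrature route mentioned in the previous section; the sketch below is our illustration under that assumption (the exact implementation in the paper uses the closed-form CDF of Heinrich 2004):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def pearson4_std_quantile(p, m, nu):
    """Quantile F^{-1}(p) of the standardized (zero-mean, unit-variance) Pearson
    type-IV distribution, via numerical integration and root finding."""
    a = np.sqrt((2.0 * m - 3.0) / (1.0 + (nu / (2.0 * m - 2.0)) ** 2))  # unit-variance scale
    lam = a * nu / (2.0 * m - 2.0)                                      # zero-mean location
    kernel = lambda x: (1.0 + ((x - lam) / a) ** 2) ** (-m) * np.exp(-nu * np.arctan((x - lam) / a))
    norm = quad(kernel, -np.inf, np.inf)[0]
    cdf = lambda q: quad(kernel, -np.inf, q)[0] / norm
    return brentq(lambda q: cdf(q) - p, -50.0, 50.0)

def var_forecast(mu, sigma, m, nu, level=0.01):
    """VaR for the long (left tail) and short (right tail) position at coverage `level`."""
    return (mu + pearson4_std_quantile(level, m, nu) * sigma,
            mu + pearson4_std_quantile(1.0 - level, m, nu) * sigma)
```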

Success -Failure Ratio
A typical way to examine a VaR model is to count the number of VaR violations, i.e. the days on which portfolio losses exceed the VaR estimates. An accurate VaR approach produces a number of VaR breaks as close as possible to the number implied by the chosen confidence level. If the number of violations is greater than the selected confidence level would indicate, the model underestimates the risk; if the number of violations is smaller, the model overestimates the risk. The measure is computed as x/T, where T is the total number of observations and x is the number of violations at the specific confidence level.

Kupiec LR Test
However, it is rarely the case that exactly the number of violations suggested by the confidence level is observed; therefore, the question is whether the observed number of violations is reasonable before a model is accepted or rejected. The most widely known test based on failure rates is the Proportion of Failures (POF) test of Paul H. Kupiec (1995). It measures whether the number of violations is consistent with the confidence level; under the null hypothesis that the model is correct, the number of violations follows the binomial distribution. The Kupiec test (unconditional coverage) is conducted as a likelihood-ratio (LR) test with statistic

LR_{POF} = -2\ln\left[\frac{(1-p)^{T-x}\,p^{x}}{\left(1-\frac{x}{T}\right)^{T-x}\left(\frac{x}{T}\right)^{x}}\right] ,

where T is the total number of observations, x is the number of violations, and p is the specified coverage level. Under the null hypothesis that the model is correct, LR_{POF} is asymptotically \chi^2 distributed with one degree of freedom. If the value of the LR_{POF} statistic exceeds the critical value of the \chi^2 distribution, the null hypothesis is rejected and the model is considered to be inaccurate. Therefore, the risk model is rejected if it generates too many or too few violations; however, on this criterion alone a model that generates dependent exceptions can still be accepted as accurate.
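As a worked illustration of the statistic above (our helper function, assuming a boolean series of violations has already been recorded):

```python
import numpy as np
from scipy.stats import chi2

def kupiec_pof(violations, p):
    """Kupiec (1995) proportion-of-failures LR test.
    violations: boolean array, True when the return exceeds the VaR limit.
    p: the VaR coverage level (e.g. 0.01 for a 99% VaR)."""
    T = len(violations)
    x = int(np.sum(violations))
    pi_hat = x / T
    # log-likelihoods under the null coverage p and under the observed failure rate
    ll_null = (T - x) * np.log(1.0 - p) + x * np.log(p)
    ll_alt = (T - x) * np.log(1.0 - pi_hat) + x * np.log(pi_hat) if 0 < x < T else 0.0
    lr = -2.0 * (ll_null - ll_alt)
    return lr, chi2.sf(lr, df=1)   # statistic and asymptotic p-value
```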

Christoffersen Independence, and Conditional Coverage Tests
In order to check whether the exceptions are spread evenly over time or form clusters, the Christoffersen interval forecast test is used (Peter F. Christoffersen 1998). This Markov test examines whether or not the likelihood of a VaR violation depends on whether or not a VaR violation occurred on the previous day. If the VaR measure accurately reflects the underlying risk, then the chance of violating today's VaR should be independent of whether or not yesterday's VaR was violated. Assigning an indicator that takes the value 1 if VaR is exceeded and 0 otherwise, and defining n_{ij} as the number of days on which state j occurred while state i occurred on the previous day, the results can be displayed in a 2 × 2 contingency table.
Letting \pi_i denote the probability of observing a violation conditional on state i on the previous day, with \pi_0 = n_{01}/(n_{00}+n_{01}), \pi_1 = n_{11}/(n_{10}+n_{11}), and \pi = (n_{01}+n_{11})/(n_{00}+n_{01}+n_{10}+n_{11}), the independence test statistic is

LR_{ind} = -2\ln\left[\frac{(1-\pi)^{n_{00}+n_{10}}\,\pi^{\,n_{01}+n_{11}}}{(1-\pi_0)^{n_{00}}\,\pi_0^{\,n_{01}}\,(1-\pi_1)^{n_{10}}\,\pi_1^{\,n_{11}}}\right] ,

and it is asymptotically \chi^2 distributed with one degree of freedom. In the case where n_{11} = 0, indicating no violation clustering either because of few observations or rather high confidence levels, the test is conducted as (Christoffersen and Denis Pelletier 2004)

LR_{ind} = -2\ln\left[\frac{(1-\pi)^{n_{00}+n_{10}}\,\pi^{\,n_{01}}}{(1-\pi_0)^{n_{00}}\,\pi_0^{\,n_{01}}}\right] ,

which avoids the NaN (not a number) values that appear in several works in the literature.
Combining the two criteria, the Kupiec unconditional coverage test and the Christoffersen independence test, yields the Christoffersen conditional coverage (CC) test. The test statistic for conditional coverage,

LR_{CC} = LR_{POF} + LR_{ind} ,

is asymptotically \chi^2 distributed with two degrees of freedom.
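A compact Python sketch of both Christoffersen statistics, combining the transition counts above with the Kupiec statistic (our illustration; the guards against empty counts mirror the n_{11} = 0 discussion):

```python
import numpy as np
from scipy.stats import chi2

def christoffersen_tests(violations, p):
    """Christoffersen (1998) independence and conditional coverage LR tests.
    violations: boolean array of VaR breaks; p: the VaR coverage level."""
    v = np.asarray(violations, dtype=int)
    prev, curr = v[:-1], v[1:]
    n00 = np.sum((prev == 0) & (curr == 0))
    n01 = np.sum((prev == 0) & (curr == 1))
    n10 = np.sum((prev == 1) & (curr == 0))
    n11 = np.sum((prev == 1) & (curr == 1))
    pi0 = n01 / max(n00 + n01, 1)
    pi1 = n11 / max(n10 + n11, 1)
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)

    def bll(n_no, n_yes, prob):
        # binomial log-likelihood, skipping terms with zero counts (safe at prob 0 or 1)
        out = 0.0
        if n_no:
            out += n_no * np.log(1.0 - prob)
        if n_yes:
            out += n_yes * np.log(prob)
        return out

    lr_ind = -2.0 * (bll(n00 + n10, n01 + n11, pi)
                     - bll(n00, n01, pi0) - bll(n10, n11, pi1))
    # unconditional coverage (Kupiec) part, combined into the conditional coverage statistic
    T, x = len(v), int(v.sum())
    lr_uc = -2.0 * (bll(T - x, x, p) - bll(T - x, x, x / T))
    lr_cc = lr_uc + lr_ind
    return {"LR_ind": lr_ind, "p_ind": chi2.sf(lr_ind, 1),
            "LR_cc": lr_cc, "p_cc": chi2.sf(lr_cc, 2)}
```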

Expected Shortfall and Tail Measures
In the sense of Philippe Artzner et al. (1997, 1999), VaR is not a coherent measure of risk because, among the properties a coherent risk measure must satisfy on an appropriate probabilistic space, the sub-additivity property does not hold in all cases. Specific portfolios can be constructed in which the risk of a portfolio with two assets is greater than the sum of the individual risks, violating sub-additivity and, in general, the diversification principle (Olivier Scaillet 2000). Expected shortfall is a coherent measure of risk and is defined as the expected value of the losses conditional on the loss being larger than the VaR. One expected shortfall measure associated with a confidence level 1 - p, denoted \xi_p, is the Tail Conditional Expectation (TCE) of a loss given that the loss is larger than the VaR, that is

\xi_p = E\left[\,L \mid L > \mathrm{VaR}_p\,\right] ,

where L denotes the loss. Darryll Hendricks (1996) indicates that two related measures can be constructed: ESF1, the expected value of the loss exceeding the VaR level, and ESF2, the expected value of the loss exceeding the VaR level divided by the associated VaR value.
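A possible implementation of the two measures reads as follows; since the verbal definitions above admit more than one reading, the sketch interprets ESF1 as the average excess of the loss over the VaR limit on violation days, and ESF2 as that excess relative to the corresponding VaR values (an interpretation on our part, not a quotation of the authors' code):

```python
import numpy as np

def shortfall_measures(returns, var, long_position=True):
    """ESF1: average loss beyond the VaR level on violation days;
    ESF2: the same average expressed relative to the associated VaR values."""
    returns, var = np.asarray(returns), np.asarray(var)
    if long_position:
        breach = returns < var            # losses beyond the long VaR (left tail)
    else:
        breach = returns > var            # losses beyond the short VaR (right tail)
    if not breach.any():
        return 0.0, 0.0
    excess = np.abs(returns[breach] - var[breach])
    esf1 = excess.mean()
    esf2 = np.mean(excess / np.abs(var[breach]))
    return esf1, esf2
```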

Dynamic Quantile Test of Engle-Manganelli
Engle and Simone Manganelli (1999, 2004) suggest using a linear regression model linking current violations to past violations in order to test the conditional efficiency hypothesis. Let Hit_t(a) = I\left(y_t < \mathrm{VaR}_t(a)\right) - a be the demeaned violation process associated with the a% VaR, and consider the regression model

Hit_t(a) = \delta + \sum_{k=1}^{K}\beta_k\, Hit_{t-k}(a) + \sum_{k=1}^{K}\gamma_k\, g\!\left(z_{t-k}\right) + \varepsilon_t ,

where \varepsilon_t is an i.i.d. process and g(\cdot) is a function of past violations and of variables z_{t-k} from the information set \Omega_{t-1}. Whatever the chosen specification, the null hypothesis of conditional efficiency corresponds to testing the joint nullity of the coefficients \beta_k, \gamma_k, and of the constant \delta:

H_0: \delta = \beta_1 = \dots = \beta_K = \gamma_1 = \dots = \gamma_K = 0 .

Under this hypothesis the current VaR violations are uncorrelated with past violations (the independence hypothesis), whereas the unconditional coverage hypothesis is verified when \delta = 0. The associated Wald statistic, denoted DQ_{CC}, is asymptotically \chi^2 distributed with the number of tested restrictions as degrees of freedom.
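The following sketch implements the regression form of the test with a constant and lagged hits only; including further instruments g(z_{t-k}), as allowed by the general specification above, is straightforward. The function name and the default number of lags are our choices, not taken from the paper:

```python
import numpy as np
from scipy.stats import chi2

def dq_test(violations, p, lags=4):
    """Dynamic quantile test (Engle and Manganelli 2004), regression version:
    the demeaned hit sequence is regressed on a constant and its own lags,
    and the joint nullity of all coefficients is tested with a Wald statistic."""
    hit = np.asarray(violations, dtype=float) - p        # Hit_t = I(violation) - p
    T = len(hit)
    y = hit[lags:]
    X = np.column_stack([np.ones(T - lags)]
                        + [hit[lags - k:T - k] for k in range(1, lags + 1)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    dq = beta @ (X.T @ X) @ beta / (p * (1.0 - p))       # Wald statistic
    return dq, chi2.sf(dq, df=X.shape[1])                # statistic and asymptotic p-value
```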

In Sample and Out of Sample Procedure and VaR Results
We examine the validity and accuracy of the econometric model by performing the aforementioned statistical tests, and the results are compared with those of the skewed Student-t distribution (Jurgen A. Doornik 2009). As a rule of thumb, we consider a result to be better if there is a change at the second decimal place. A better result is indicated in bold font in the tables; an equal result, meaning that the results coincide or differ only beyond the second decimal place, is indicated in italics; and a worse result is indicated in regular font.

In Sample VaR Results
We use the estimation results to compute the one-step-ahead VaR for the long and the short trading position for several confidence levels ranging from 5% to 0.1%. The results are shown in Table 2, which includes the success/failure ratio, the Kupiec likelihood ratio and p-value, the Christoffersen independence likelihood ratio and p-value, the Christoffersen joint test (conditional coverage) likelihood ratio and p-value, the expected shortfall measures ESF1 and ESF2, and the statistic and p-value of the dynamic quantile test.

Out of Sample VaR Results
The testing methodology of the previous subsection is equivalent to backtesting the model on the estimation sample. In the literature it is argued that this favors the tested model, and that out-of-sample forecasts, in which the model is estimated on the known returns and the VaR forecast is made for the subsequent period of length h, where h is the time horizon of the forecasts, should be preferred. In our implementation the testing procedure for the long and short VaR assumes h = 1 day, and we use the approach described in Giot and Laurent (2003). The first estimation sample is the complete sample for which data are available less the last five years. The predicted one-day-ahead VaR (both for the long and the short position) is then compared with the observed return, and both results are recorded for later assessment using the statistical tests. At the i-th iteration, where i runs from 2 to 5·252 (five years of data), the estimation sample is augmented to include one more day and the VaR forecasts are again computed and recorded. Whenever i is a multiple of 50, the model is re-estimated to update the Pearson type-IV GARCH parameters. Therefore, the model parameters are updated every 50 trading days and a "stability window" of 50 days for the parameters is assumed. The procedure is iterated until all days (less the last one) have been included in the estimation sample. The corresponding failure rates are then computed by comparing the forecasted long and short VaR for day t+1 with the observed return y_{t+1} for all days in the five-year period. The results of applying the aforementioned statistical tests to the out of sample VaR are shown in Table 3.
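The rolling procedure can be summarized in code as follows; this is a schematic sketch that reuses the hypothetical neg_loglik and var_forecast helpers from the earlier listings and abstracts from the details of the original implementation:

```python
import numpy as np
from scipy.optimize import minimize

def rolling_var(returns, start_params, n_out=5 * 252, refit_every=50, a=0.01):
    """Out-of-sample VaR following the procedure described above: the estimation
    sample grows by one day per iteration and the parameters are re-estimated
    every `refit_every` trading days."""
    T = len(returns)
    theta = np.asarray(start_params, dtype=float)
    out = []
    for i in range(n_out):
        sample = returns[: T - n_out + i]                 # estimation sample up to day t
        if i % refit_every == 0:                          # update parameters every 50 days
            theta = minimize(neg_loglik, theta, args=(sample,), method="BFGS").x
        mu, omega, alpha, beta, m, nu = theta
        eps = sample - mu
        sig2 = np.mean(eps ** 2)                          # start of the variance recursion
        for e in eps:                                     # filter the GARCH(1,1) variance
            sig2 = omega + alpha * e ** 2 + beta * sig2
        sigma = np.sqrt(sig2)                             # one-day-ahead volatility forecast
        var_long, var_short = var_forecast(mu, sigma, m, nu, a)
        out.append((returns[T - n_out + i], var_long, var_short))
    return np.array(out)                                  # observed return, long VaR, short VaR
```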

Discussion and Conclusions
In this work we have presented the implementation of an econometric model where the volatility clustering is modeled by a GARCH(1,1) process and the innovations follow a Pearson type-IV distribution. The model was tested in sample and out of sample, and its accuracy was examined with a variety of statistical tests: the success/failure ratio, the Kupiec LR test, the two Christoffersen tests accounting for independence and conditional coverage, the ESF1 and ESF2 measures, and the dynamic quantile test of Engle and Manganelli. The main findings are the following. The Pearson type-IV distribution improves the maximized log-likelihood in all cases we have studied, compared with the skewed Student-t distribution, for both the in sample (Stavroyiannis et al. 2011) and the out of sample case. This indicates that it better captures the skewness and leptokurtosis of the PDF of financial asset returns and, therefore, the underlying data generating process. Another point is that, in contrast to the skewed Student-t distribution, which is constructed by piecing together two scaled halves of a symmetric density, the Pearson type-IV distribution describes the whole PDF with a single function resulting from a well-defined differential equation, capable of reducing, depending on its parameters, to the most common distributional schemes.
The Pearson type-IV distribution appears to perform better than the skewed Student-t in the Kupiec LR test and the joint test of Christoffersen. However, due to the small number of out of sample observations, it is difficult to judge the Christoffersen independence test at high confidence levels, which has therefore been left out of the comparison. For the expected shortfall measures and the DQ test, the proposed model performs very well in the out of sample case, as shown in Table 3.
In conclusion, the VaR and statistical test results indicate that the model is accurate within the general financial risk modeling perspective, and it provides the financial analyst with an additional distributional scheme to be used in econometric modeling.

Appendix
In Equation (2) the parameters are described as follows: a > 0 is the scale parameter, \lambda is the location parameter, m > 1/2 controls the kurtosis, and \nu controls the asymmetry of the distribution; for \nu > 0 the distribution is negatively skewed, while for \nu = 0 it reduces to the symmetric Pearson type-VII (Student-t like) family. The mean and the variance of the distribution, which are needed for the standardization, are given by

E[x] = \lambda - \frac{a\nu}{2m-2} , (3)

Var[x] = \frac{a^2}{2m-3}\left[1 + \left(\frac{\nu}{2m-2}\right)^2\right] . (4)



Table 2
In Sample Results with the Standardized Pearson Type-IV Distribution
Source: Authors' calculations.

Table 3
Out of Sample Results with the Standardized Pearson Type-IV Distribution
Source: Authors' calculations.