Variance-Gamma approximation via Stein's method

Robert E. Gaunt

Introduction

In 1972, Stein introduced a powerful method for deriving bounds for normal approximation. Since then, this method has been extended to many other distributions, such as the Poisson, Gamma, Exponential and Laplace distributions. Through the use of differential or difference equations, and various coupling techniques, Stein’s method enables many types of dependence structures to be treated, and also gives explicit bounds for distances between distributions.

At the heart of Stein’s method lies a characterisation of the target distribution and a corresponding characterising differential or difference equation. For example, Stein’s method for normal approximation rests on the following characterisation of the normal distribution, which can be found in Stein, namely $Z\sim N(\mu,\sigma^{2})$ if and only if

$$\mathbb{E}[\sigma^{2}f^{\prime}(Z)-(Z-\mu)f(Z)]=0\qquad(1.1)$$

for all sufficiently smooth $f$. This gives rise to the following inhomogeneous differential equation, known as the Stein equation:

$$\sigma^{2}f^{\prime}(x)-(x-\mu)f(x)=h(x)-\mathbb{E}h(Z),\qquad(1.2)$$

where $Z\sim N(\mu,\sigma^{2})$, and the test function $h$ is a real-valued function. For any bounded test function, a solution $f$ to (1.2) exists (see Lemma 2.4 of Chen et al.). There are a number of techniques for obtaining Stein equations, such as the density approach of Stein et al., the scope of which has recently been extended by Ley and Swan. Another commonly used technique is a generator approach, introduced by Barbour. This approach involves recognising the target distribution as the stationary distribution of a Markov process and then using the theory of generators of stochastic processes to arrive at a Stein equation; for a detailed overview of this method see Reinert. Luk used this approach to obtain the following Stein equation for the $\Gamma(r,\lambda)$ distribution:

$$xf^{\prime\prime}(x)+(r-\lambda x)f^{\prime}(x)=h(x)-\mathbb{E}h(X),\qquad(1.3)$$

where $X\sim\Gamma(r,\lambda)$.

The next essential ingredient of Stein’s method is smoothness estimates for the solution of the Stein equation. These can often be obtained by solving the Stein equation using standard solution methods for differential equations and then using direct calculations to bound the required derivatives of the solution (Stein used this approach to bound the first two derivatives of the solution to the normal Stein equation (1.2)). The generator approach is also often used to obtain smoothness estimates. The use of probabilistic arguments to bound the derivatives of the solution often makes it easier to arrive at smoothness estimates than through the use of analytical techniques. Luk and Pickett used the generator approach to bound $k$-th order derivatives of the solution of the $\Gamma(r,\lambda)$ Stein equation (1.3). Pickett’s bounds are as follows

In this paper we obtain the key ingredients required to extend Stein’s method to the class of Variance-Gamma distributions. The Variance-Gamma distributions are defined as follows (this parametrisation is similar to that given in Finlay and Seneta).

The Variance-Gamma distributions were introduced to the financial literature by Madan and Seneta. For certain parameter values the Variance-Gamma distributions have semi-heavy tails that decay slower than the tails of the normal distribution, and therefore are often appropriate for financial modelling.

The class of Variance-Gamma distributions includes the Laplace distribution as a special case and in the appropriate limits reduces to the normal and Gamma distributions. This family of distributions also contains many other distributions that are of interest, which we list in the following proposition (the proof is given in Appendix A). As far as the author is aware, this is the first list of characterisations of the Variance-Gamma distributions to appear in the literature.

The representations of the Variance-Gamma distributions given in Proposition 1.2 enable us to determine a number of statistics that may have asymptotic Variance-Gamma distributions.

One of the main results of this paper (see Lemma 3.1) is the following Stein equation for the Variance-Gamma distributions:

In Section 3, we analyse the Stein equation (1.7). In particular, we show that the normal Stein equation (1.2) and Gamma Stein equation (1.3) are special cases. As a Stein equation for a given distribution is not unique (see Barbour), the fact that in the appropriate limit the Variance-Gamma Stein equation (1.7) reduces to the known normal and Gamma Stein equations is an attractive feature.
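The two classical Stein operators referred to above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the normal operator $\sigma^{2}f^{\prime}(x)-(x-\mu)f(x)$ and the Gamma operator $xf^{\prime\prime}(x)+(r-\lambda x)f^{\prime}(x)$ with $\lambda$ a rate parameter, and uses the test function $f=\sin$, the parameter values and the trapezoid rule purely for illustration. Both expectations should vanish up to quadrature error.

```python
import math

def trapezoid(g, a, b, n):
    """Composite trapezoid rule for the integral of g over [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

def normal_stein_expectation(f, df, mu, sigma, n=20000):
    """Approximate E[sigma^2 f'(Z) - (Z - mu) f(Z)] for Z ~ N(mu, sigma^2)."""
    def integrand(x):
        density = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        return (sigma ** 2 * df(x) - (x - mu) * f(x)) * density
    # Twelve standard deviations either side captures essentially all the mass.
    return trapezoid(integrand, mu - 12.0 * sigma, mu + 12.0 * sigma, n)

def gamma_stein_expectation(df, d2f, r, lam, n=40000):
    """Approximate E[X f''(X) + (r - lam X) f'(X)] for X ~ Gamma(r, lam).

    Assumes lam is a rate parameter and r > 1, so the density vanishes at 0.
    """
    log_norm = r * math.log(lam) - math.lgamma(r)
    def integrand(x):
        if x <= 0.0:
            return 0.0
        density = math.exp(log_norm + (r - 1.0) * math.log(x) - lam * x)
        return (x * d2f(x) + (r - lam * x) * df(x)) * density
    return trapezoid(integrand, 0.0, 60.0 / lam, n)

# Illustrative smooth test function f = sin and its first two derivatives.
f = math.sin
df = math.cos
d2f = lambda x: -math.sin(x)

normal_value = normal_stein_expectation(f, df, mu=0.7, sigma=1.3)
gamma_value = gamma_stein_expectation(df, d2f, r=3.0, lam=2.0)
print(normal_value, gamma_value)
```

Deterministic quadrature is used here rather than simulation so that the cancellation is visible to many decimal places.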

Stein’s method has also recently been extended to the Laplace distribution (see Pike and Ren and Döbler), although the Laplace Stein equation obtained there differs from the Laplace Stein equation that arises as a special case of (1.7); see Section 3.1.1 for a more detailed discussion. Another special case of the Stein equation (1.7) is a Stein equation for the product of two independent central normal random variables, which is in agreement with the Stein equation for products of independent central normal random variables that was recently obtained by Gaunt. Therefore, the results from this paper allow the existing literature for Stein’s method for normal, Gamma, Laplace and product normal approximation to be considered in a more general framework.

More importantly, our development of Stein’s method for the Variance-Gamma distributions allows a number of new situations to be treated by Stein’s method. In Section 4, we illustrate our method by obtaining a bound for the distance between the statistic

local approach couplings, and symmetry arguments, that were introduced by Pickett, we obtain an $O(m^{-1}+n^{-1})$ bound for smooth test functions. A similar phenomenon was observed in chi-square approximation by Pickett, and also by Goldstein and Reinert, who obtained $O(n^{-1})$ convergence rates in normal approximation, for smooth test functions, under the assumption of vanishing third moments. For non-smooth test functions we would, however, expect an $O(m^{-1/2}+n^{-1/2})$ convergence rate (cf. the Berry-Esséen Theorem (Berry and Esséen)) to hold; see Remark 4.11.

The rest of this paper is organised as follows. In Section 2, we introduce the Variance-Gamma distributions and state some of their standard properties. In Section 3, we obtain a characterising lemma for the Variance-Gamma distributions and a corresponding Stein equation. We also obtain the unique bounded solution of the Stein equation, and present uniform bounds for the first four derivatives of the solution for the case $\theta=0$ . In Section 4, we use Stein’s method for Variance-Gamma approximation to bound the distance between the statistic (1.8) and its limiting Variance-Gamma distribution. We then apply this bound to an application of binary sequence comparison, which is a simple special case of the more general problem of word sequence comparison. In Appendix A, we include the proofs of some technical lemmas that are required in this paper. Appendix B provides a list of some elementary properties of modified Bessel functions that we make use of in this paper.

The class of Variance-Gamma distributions

In this section we present the Variance-Gamma distributions and some of their standard properties. Throughout this paper we will make use of two different parametrisations of the Variance-Gamma distributions; the first parametrisation was given in Section 1, and making the change of variables

leads to another useful parametrisation. This parametrisation can be found in Eberlein and Hammerstein.

The first parametrisation leads to simple characterisations of the Variance-Gamma distributions in terms of normal and Gamma distributions, and therefore in many cases it allows us to recognise statistics that will have an asymptotic Variance-Gamma distribution. For this reason, we state our main results in terms of this parametrisation. However, the second parametrisation proves to be very useful in simplifying the calculations of Section 3, as the solution of the Variance-Gamma Stein equation has a simpler representation for this parametrisation. We can then state the results in terms of the first parametrisation by using (2.1).

The Variance-Gamma distributions have moments of arbitrary order (see Eberlein and Hammerstein); in particular, the mean and variance (for both parametrisations) of a random variable $X$ with a Variance-Gamma distribution are given by

The following proposition, which can be found in Bibby and Sørensen, shows that the class of Variance-Gamma distributions is closed under convolution, provided that the random variables have common values of $\theta$ and $\sigma$ (or, equivalently, common values of $\alpha$ and $\beta$ in the second parametrisation).

Variance-Gamma random variables can be characterised in terms of independent normal and Gamma random variables. This characterisation is given in the following proposition, which can be found in Barndorff-Nielsen et al.

Using Proposition 2.4 we can establish the following useful representation of the Variance-Gamma distributions, which appears to be a new result. Indeed, the representation allows us to see that the statistic (1.8) has an asymptotic Variance-Gamma distribution.
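A normal-Gamma characterisation of this kind lends itself to simulation. The sketch below is illustrative only and rests on an assumption, since the statement of the proposition is not reproduced here: it takes the representation to be $\mu+\theta V+\sigma\sqrt{V}Z$ with $V\sim\Gamma(r/2,1/2)$ (rate $1/2$) independent of $Z\sim N(0,1)$. Under that assumption the laws of total expectation and total variance give mean $\mu+r\theta$ and variance $r(\sigma^{2}+2\theta^{2})$, which the sample moments should approximate. The function name `sample_vg`, the seed and the parameter values are hypothetical.

```python
import math
import random

def sample_vg(r, theta, sigma, mu, rng):
    """Draw mu + theta*V + sigma*sqrt(V)*Z, with V ~ Gamma(shape r/2, rate 1/2)
    independent of Z ~ N(0, 1). This mixture form is an assumption of the sketch."""
    # random.gammavariate takes (shape, scale); scale 2.0 corresponds to rate 1/2.
    v = rng.gammavariate(r / 2.0, 2.0)
    z = rng.gauss(0.0, 1.0)
    return mu + theta * v + sigma * math.sqrt(v) * z

rng = random.Random(12345)  # fixed seed so the run is reproducible
r, theta, sigma, mu = 3.0, 0.5, 1.0, -1.0
draws = [sample_vg(r, theta, sigma, mu, rng) for _ in range(200000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / (len(draws) - 1)

# Moments of the mixture: E[V] = r and Var(V) = 2r for V ~ Gamma(r/2, rate 1/2).
expected_mean = mu + r * theta
expected_var = r * (sigma ** 2 + 2.0 * theta ** 2)
print(sample_mean, expected_mean)
print(sample_var, expected_var)
```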

Let $X_{1},X_{2},...,X_{r}$ and $Y_{1},Y_{2},...,Y_{r}$ be sequences of independent standard normal random variables. Then $X_{i}^{2}$, $i=1,2,...,r$, has a $\chi_{(1)}^{2}$ distribution, that is a $\Gamma(1/2,1/2)$ distribution. Define

Stein’s method for Variance-Gamma distributions

Note that the tails are in general not symmetric.

Firstly, we consider $I_{1}$ . Let $A(x)=x$ , $B(x)=2\nu+1+2\beta x$ and $C(x)=(2\nu+1)\beta-(1-\beta^{2})x$ . Then applying integration by parts twice gives

as $K_{\nu}(x)$ is a solution of the modified Bessel differential equation (see (B.10)).

Using formula (B.7) to differentiate $K_{\nu}(x)$ gives

We now calculate the limit in the above expression. We first consider the case $\nu>0$ . Applying the asymptotic formula (B.2) gives

since $\nu\Gamma(\nu)=\Gamma(\nu+1)$ . Now consider the case $\nu=0$ . We use the fact that $K_{1}(x)=K_{-1}(x)$ to obtain

since $\Gamma(1)=\Gamma(2)$ . Therefore we have

Finally, we consider the case $-1/2<\nu<0$ . We use the fact that $K_{-\lambda}(x)=K_{\lambda}(x)$ to obtain

As $f$ is continuous, it follows that $I_{1}=-I_{2}$ (and so $I_{1}+I_{2}=0$ ), which completes the proof of necessity.

This solution and its first derivative are bounded (see Lemma 3.3), and the solution is piecewise twice differentiable. As $f_{z}$ and $f_{z}^{\prime}$ are bounded, they satisfy the condition (3.3) (with $\alpha=1$) and $f_{z}^{\prime\prime}$ must also satisfy the condition because, from (3.5),

for some constants $A$ and $B$ . Hence, if (3.2) holds for all piecewise twice continuously differentiable functions satisfying (3.3) (with $\alpha=1$ ), then by (3.5),

which we recognise as the $\Gamma(s,\lambda)$ Stein equation (1.3) of Luk (up to a multiplicative factor).

which in the limit $r\rightarrow\infty$ is the classical $N(\mu,\sigma^{2})$ Stein equation.

Taking $r=1$, $\sigma=\sigma_{X}\sigma_{Y}$ and $\mu=0$ in (1.7) gives the following Stein equation for the distribution of the product of independent $N(0,\sigma_{X}^{2})$ and $N(0,\sigma_{Y}^{2})$ random variables (see part (iii) of Proposition 1.2):

This Stein equation is in agreement with the Stein equation for the product of two independent, zero mean normal random variables that was obtained by Gaunt.

They have also solved (3.9) and have obtained uniform bounds for the solution and its first three derivatives. Their characterisation was obtained by a repeated application of the density method, and is similar to the characterisation for the Exponential distribution that results from the density method (see Stein et al., Example 1.6), which leads to the Stein equation

3.1.2 Applications of Lemma 3.1

The main application of Lemma 3.1 that is considered in this paper involves the use of the resulting Stein equation in the proofs of the limit theorems of Section 4. There are, however, other interesting results that follow from Lemma 3.1. We consider a couple here.

Solving this equation subject to the condition $M(0)=1$ then gives that the moment generating function of the Variance-Gamma distribution with $\mu=0$ is

which in terms of the first parametrisation is

We have that $M_{0}=1$ and $M_{1}=(2\nu+1)\beta/(\alpha^{2}-\beta^{2})=r\theta$ (see (2.3)), and thus we can solve these recurrence equations by forward substitution to obtain the moments of the Variance-Gamma distributions. As far as the author is aware, these recurrence equations are new, although Scott et al. have already established a formula for the moments of general order of the Variance-Gamma distributions.

3.2 Smoothness estimates for the solution of the Stein equation

In the following lemma we give the solution to the Stein equation. The proof is given in Appendix A.

is very useful when it comes to obtaining smoothness estimates for the solution to the Stein equation. The equality ensures that we can restrict our attention to bounding the derivatives in the region $x\geq 0$ , provided we obtain these bounds for both positive and negative $\beta$ .

The bounds given in Lemma 3.5 are of order $\nu^{-1/2}$ as $\nu\rightarrow\infty$ , except when $2\nu$ is not equal to an integer, but is sufficiently close to an integer that

Gaunt remarked that the rogue $1/\sin(\pi\nu)$ term appeared to be an artefact of the analysis that was used to obtain the bounds.

Limit theorems for Symmetric Variance-Gamma distributions

We now consider the Symmetric Variance-Gamma ($\theta=0$) limit theorem that we discussed in the introduction. Let $\mathbf{X}$ be an $m\times r$ matrix of independent and identically distributed random variables $X_{ik}$ with zero mean and unit variance. Similarly, we let $\mathbf{Y}$ be an $n\times r$ matrix of independent and identically distributed random variables $Y_{jk}$ with zero mean and unit variance, where the $Y_{jk}$ are independent of the $X_{ik}$. Then the statistic

We first consider the case $r=1$ ; the general $r$ case follows easily as $W_{r}$ is a linear sum of independent $W_{1}$ . For ease of reading, in the statement of the following theorem and in its proof we shall set $X_{i}\equiv X_{i1}$ , $Y_{j}\equiv Y_{j1}$ and $W\equiv W_{1}$ . Then we have the following:

Notice that the statistic $W=\frac{1}{\sqrt{mn}}\sum_{i,j=1}^{m,n}X_{i}Y_{j}$ is symmetric in $m$ and $n$ and the random variables $X_{i}$ and $Y_{j}$ , and yet the bound (4.1) of Theorem 4.1 is not symmetric in $m$ and $n$ and the moments of $X$ and $Y$ . This asymmetry is a consequence of the local couplings that we used to obtain the bound.
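The proof that follows relies on writing $W$ as a product of two standardised sums. A minimal numerical sketch of that factorisation is given here, under the assumption that the standardised sums are $S=m^{-1/2}\sum_{i}X_{i}$ and $T=n^{-1/2}\sum_{j}Y_{j}$; the helper names and the choice of Rademacher ($\pm 1$) variables, which have zero mean and unit variance, are illustrative rather than taken from the paper.

```python
import math
import random

def standardised_sum(values):
    """Sum of the values divided by the square root of their number."""
    return sum(values) / math.sqrt(len(values))

def w_statistic(xs, ys):
    """W = (mn)^{-1/2} * sum over all pairs (i, j) of X_i * Y_j."""
    m, n = len(xs), len(ys)
    return sum(x * y for x in xs for y in ys) / math.sqrt(m * n)

def leave_one_out(values, index):
    """Standardised sum with one term removed, still scaled by sqrt(len(values))."""
    return (sum(values) - values[index]) / math.sqrt(len(values))

rng = random.Random(2024)  # fixed seed so the run is reproducible
xs = [rng.choice([-1.0, 1.0]) for _ in range(40)]
ys = [rng.choice([-1.0, 1.0]) for _ in range(25)]

s = standardised_sum(xs)
t = standardised_sum(ys)
w = w_statistic(xs, ys)
s_0 = leave_one_out(xs, 0)  # the sum S_i for the first index, independent of X_1

print(w, s * t)                                   # the two values agree
print(s * t - s_0 * t, xs[0] * t / math.sqrt(len(xs)))  # ST - S_i T = X_i T / sqrt(m)
```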

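Before proving Theorem 4.1, we introduce some notation and preliminary lemmas. We define the standardised sums $S$ and $T$ by

$$S=\frac{1}{\sqrt{m}}\sum_{i=1}^{m}X_{i}\quad\text{and}\quad T=\frac{1}{\sqrt{n}}\sum_{j=1}^{n}Y_{j},$$

and we have that $W=ST$. In our proof we shall make use of the sums

$$S_{i}=S-\frac{1}{\sqrt{m}}X_{i}\quad\text{and}\quad T_{j}=T-\frac{1}{\sqrt{n}}Y_{j},$$

which are independent of $X_{i}$ and $Y_{j}$, respectively. We therefore have the following formulas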

In the proof of Theorem 4.1 we use the following lemma, which can be found in Pickett, Lemma 4.3.

Due to the independence of the $X_{i}$ and $Y_{j}$ variables, we are in the realm of the local approach coupling. We Taylor expand $f(W)$ about $S_{i}T$ to obtain

As $ST-S_{i}T=\frac{1}{\sqrt{m}}X_{i}T$ , we obtain

We begin by bounding $R_{1}$ and $R_{2}$ . Taylor expanding $f^{\prime\prime}(S_{i}T)$ about $W$ and using (4.2) gives

The bound for $R_{2}$ is immediate. We have

Taylor expanding $f^{\prime\prime}(W)$ about $S_{i}T$ gives

where we used independence and that the $X_{i}$ have zero mean to obtain the final inequality. Putting this together we have that

Noting that $T^{2}=\frac{1}{\sqrt{n}}\sum_{j=1}^{n}Y_{j}T=\frac{1}{\sqrt{n}}\sum_{j=1}^{n}Y_{j}(\frac{1}{\sqrt{n}}Y_{j}+T_{j})$ , we may write $N_{1}$ as

We first consider $R_{4}$ . Taylor expanding $f^{\prime}(W)$ about $ST_{j}$ and using that $ST-ST_{j}=\frac{1}{\sqrt{n}}Y_{j}S$ gives

Putting this together we have the following bound for $R_{4}$ :

Using independence and that the $Y_{j}$ have zero mean and then Taylor expanding $f^{(3)}(ST_{j})$ about $W$ gives

To bound $R_{7}$ we Taylor expand $f^{\prime\prime}(W)$ about $ST_{j}$ and use independence and that the $Y_{j}$ have zero mean to obtain

4.1.2 Proof Part II: Symmetry Argument for Optimal Rate

We begin by considering the bivariate standard normal Stein equation (see, for example, Goldstein and Rinott) with test functions $g_{1}(s,t)=sf^{\prime\prime}(st)$, $g_{2}(s,t)=st^{2}f^{\prime\prime}(st)$ and $g_{3}(s,t)=t^{3}f^{\prime\prime}(st)$. The bivariate standard normal Stein equation with test function $g_{k}(s,t)$, $k=1,2,3$, and solution $\psi_{k}$ is given by

where $Z_{1}$ and $Z_{2}$ are independent standard normal random variables.

with $\phi_{1}$ , $\phi_{2}$ , $\phi_{3}$ , $\phi_{4}$ $\in(0,1)$ .

Before we bound the remainder terms, we need bounds for the third order partial derivatives of the solution $\psi_{k}$ in terms of the derivatives of $f$ . We achieve this task by using the following lemma, the proof of which is given in Appendix A. Before stating the lemma, we define the double factorial function. The double factorial of a positive integer $n$ is given by

and we define $(-1)!!=0!!=1$ (Arfken, p. 547).
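For concreteness, the double factorial can be computed as in the short sketch below. It assumes the usual definition $n!!=n(n-2)(n-4)\cdots$, with the product ending at $1$ or $2$, together with the convention $(-1)!!=0!!=1$ stated above; the function name is illustrative.

```python
def double_factorial(n):
    """Return n!! = n * (n - 2) * (n - 4) * ..., with (-1)!! = 0!! = 1."""
    if n < -1:
        raise ValueError("double factorial is only defined here for n >= -1")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

# Small values, including the two conventions for n = -1 and n = 0.
for n in (-1, 0, 1, 2, 5, 6):
    print(n, double_factorial(n))
```

Double factorials arise naturally in this setting because the even moments of a standard normal random variable $Z$ satisfy $\mathbb{E}Z^{2k}=(2k-1)!!$.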

With these bounds it is straightforward to bound the remainder terms. The following lemma allows us to easily deduce bounds for the remainder terms $R_{8}^{k}$ , $R_{9}^{k}$ , $R_{10}^{k}$ and $R_{11}^{k}$ , $k=1,2,3$ .

We prove that the bound for $R_{8}^{k}$ holds; the bound for $R_{9}^{k}$ then follows by symmetry. We begin by defining $S_{i}^{*}=S_{i}+\frac{\phi_{1}}{\sqrt{m}}X_{i}$ . We note the following simple bound for $|S_{i}^{*}|^{p}$ , for $p\geq 1$ :

Using our bound (4.5) for the third order partial derivative of $\psi$ with respect to $s$ , we have

We can bound $R_{8}^{k}$ , $R_{9}^{k}$ , $R_{10}^{k}$ and $R_{11}^{k}$ by using the bounds in Lemma 4.8. We illustrate the argument by bounding $R_{11}^{1}$ . In this case we have $g_{1}(s,t)=sf^{\prime\prime}(st)$ , that is $a=1$ and $b=0$ . We have

4.2 Extension to the case $r>1$

For the case of $r>1$ , we have the following generalisation of Theorem 4.1:

Since $\|g^{(n)}(x+c)\|=\|g^{(n)}(x)\|$ for any constant $c$ , we may use bound (4.1) from Theorem 4.1 and the bounds of Theorem 3.6 for the derivatives of the solution of the $VG_{1}(r,0,1,0)$ Stein equation to bound the above expression, which yields (4.8). ∎

The terms $M_{r,1}^{k}(h)$, for $k=2,3,4$, are of order $r^{-1/2}$ as $r\rightarrow\infty$ (recall Theorem 3.6), and therefore the bound of Theorem 4.9 is of order $r^{1/2}(m^{-1}+n^{-1})$. This is in agreement with the bound of Theorem 4.7 of Pickett for chi-square approximation, which is of order $d^{1/2}m^{-1}$.

4.3 Application: Binary Sequence Comparison

We now consider a straightforward application of Theorem 4.1 to binary sequence comparison. This example is a simple special case of a more general problem of word sequence comparison, which is of particular importance to biological sequence comparisons. One way of comparing the sequences uses $k$-tuples (a sequence of letters of length $k$). If two sequences are closely related, we would expect the $k$-tuple content of both sequences to be very similar. A statistic for sequence comparison based on $k$-tuple content, known as the $D_{2}$ statistic, was suggested by Blaisdell (for other statistics based on $k$-tuple content see Reinert et al.). Letting $\mathcal{A}$ denote an alphabet of size $d$, and $X_{\mathbf{w}}$ and $Y_{\mathbf{w}}$ the number of occurrences of the word $\mathbf{w}\in\mathcal{A}^{k}$ in the first and second sequences, respectively, then the $D_{2}$ statistic is defined by

$$D_{2}=\sum_{\mathbf{w}\in\mathcal{A}^{k}}X_{\mathbf{w}}Y_{\mathbf{w}}.$$

Due to the complicated dependence structure at both the local and global level (for a detailed account of the dependence structure see Reinert et al.), approximating the asymptotic distribution of $D_{2}$ is a difficult problem. However, for certain parameter regimes $D_{2}$ has been shown to be asymptotically normal and Poisson; see Lippert et al. for a detailed account of the asymptotic distributions of $D_{2}$ for different parameter values.
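To make the definition of the (unstandardised) $D_{2}$ statistic concrete, the following sketch computes it for two short binary strings. It assumes that $D_{2}$ is the sum over words $\mathbf{w}$ of the product $X_{\mathbf{w}}Y_{\mathbf{w}}$ and that occurrences are counted with overlaps, sliding a window of length $k$ one position at a time; the function names and example sequences are hypothetical.

```python
from collections import Counter

def k_tuple_counts(sequence, k):
    """Count every length-k window of the sequence (overlapping occurrences included)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def d2_statistic(first, second, k):
    """Sum over words w of (count of w in first) * (count of w in second)."""
    counts_first = k_tuple_counts(first, k)
    counts_second = k_tuple_counts(second, k)
    # Words absent from either sequence contribute zero, so only shared words matter.
    return sum(counts_first[w] * counts_second[w] for w in counts_first if w in counts_second)

# Binary alphabet {0, 1}: the 2-tuples of "0110" are 01, 11, 10 and those of
# "1011" are 10, 01, 11, so each shared word contributes 1 * 1.
print(d2_statistic("0110", "1011", 2))
# With k = 1 the statistic reduces to (#0 in first)(#0 in second) + (#1 in first)(#1 in second).
print(d2_statistic("0000", "0011", 1))
```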

We now consider the standardised $D_{2}$ statistic,

References

Appendix A Proofs from the text

Here we prove the lemmas that we stated in the main text without proof.

(ii) This follows by applying the formula $K_{\frac{1}{2}}(x)=\sqrt{\frac{\pi}{2x}}e^{-x}$ to the density (1.5).
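The closed form for $K_{\frac{1}{2}}(x)$ used in part (ii) can be checked numerically. The sketch below is an illustration rather than part of the proof: it evaluates $K_{\nu}(x)$ through the standard integral representation $K_{\nu}(x)=\int_{0}^{\infty}e^{-x\cosh t}\cosh(\nu t)\,dt$, valid for $x>0$, using a truncated trapezoid rule, and compares the result with $\sqrt{\pi/(2x)}\,e^{-x}$. The function name, truncation point and step count are assumptions chosen for this example.

```python
import math

def bessel_k(nu, x, upper=20.0, n=20000):
    """Approximate K_nu(x) for x > 0 via the integral of exp(-x cosh t) cosh(nu t)
    over t in [0, upper], using the composite trapezoid rule with n panels."""
    h = upper / n
    def integrand(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, n):
        total += integrand(i * h)
    return total * h

def k_half_closed_form(x):
    """Closed form K_{1/2}(x) = sqrt(pi / (2x)) * exp(-x)."""
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)

# Compare the quadrature value with the closed form at a few points.
for x in (0.5, 1.0, 3.0):
    print(x, bessel_k(0.5, x), k_half_closed_form(x))
```

The integrand decays doubly exponentially in $t$, so truncating the integral at a moderate upper limit loses essentially nothing for the values of $x$ shown.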

(iv) Taking $\theta=0$ in Corollary 2.5 leads to the general representation. The representation for the Laplace distribution now follows from part (ii).

(v) This follows on letting $\sigma\rightarrow 0$ in Proposition 2.4 and then using the fact that if $Y\sim\Gamma(\alpha,\beta)$ then $kY\sim\Gamma(\alpha,\beta/k)$ .

(vi) Theorem 6 of Holm and Alouini gives the following formula for the probability density function of $Z=U-V$ :

We can write the density of $Z$ as follows

A.2 Proof of Lemma 3.3

We begin by proving that there is at most one bounded solution to the Variance-Gamma Stein equation (3.7) when $\nu\geq 0$. Suppose $u$ and $v$ are solutions to the Stein equation that satisfy $\|u^{(k)}\|,\ \|v^{(k)}\|<\infty$. Define $w=u-v$. Then $w$ satisfies $\|w^{(k)}\|=\|u^{(k)}-v^{(k)}\|\leq\|u^{(k)}\|+\|v^{(k)}\|<\infty$, and is a solution to the following differential equation

This homogeneous differential equation has general solution

From the asymptotic formula (B.3) for $I_{\nu}(x)$ , it follows that to have a bounded solution we must take $B=0$ . From the asymptotic formula (B.2) for $K_{\nu}(x)$ , we see that $w(x)$ has a singularity at the origin if $\nu\geq 0$ . Therefore if $\nu\geq 0$ , then for $w(x)$ to be bounded we must take $A=0$ , and therefore $w=0$ and so $u=v$ .

We now use variation of parameters (see Collins) to solve the Stein equation (3.7). The method allows us to solve differential equations of the form

Suppose $v_{1}(x)$ and $v_{2}(x)$ are linearly independent solutions of the homogeneous equation

Then the general solution to the inhomogeneous equation is given by

where $a$ and $b$ are arbitrary constants and $W(t)=W(v_{1},v_{2})=v_{1}v_{2}^{\prime}-v_{2}v_{1}^{\prime}$ is the Wronskian.

It is easy to verify that a pair of linearly independent solutions to the homogeneous equation

Formula (B.6) states that $K_{\nu}(-x)=(-1)^{\nu}K_{\nu}(x)-\pi iI_{\nu}(x)$ and therefore

where we used (B.5) to obtain the equality in the above display. Therefore the general solution to the inhomogeneous equation is given by

This solution is clearly bounded everywhere except possibly for $x=0$ or in the limits $x\rightarrow\pm\infty$ . We therefore choose $a$ and $b$ so that our solution is bounded at these points and thus for all real $x$ . To ensure the solution is bounded at the origin we must take $a=0$ . We choose $b$ so that the solution is bounded in the limits $x\rightarrow\pm\infty$ . If we take $b=\infty$ then we obtain solution (3.3). Taking $b=-\infty$ would lead to the same solution (see Remark 3.4).

A.3 Proof of Lemma 4.5

Taylor expanding $f^{\prime\prime}(W)$ about $ST_{j}$ gives

Taylor expanding the $f^{(3)}(ST_{j})$ about $W$ allows us to write $N_{1}$ as

Putting this together, we have shown that

Rearranging and applying the triangle inequality now gives

and summing up the remainder terms completes the proof.

A.4 Proof of Lemma 4.7

We prove that inequality (4.5) holds; inequality (4.6) then follows by symmetry. We begin by obtaining a formula for the third order partial derivative of $\psi$ with respect to $s$ . Using a straightforward generalisation of the proof of Lemma 3.2 of Raič it can be shown that

We now use the simple inequality that $|p+q|^{n}\leq 2^{n-1}(|p|^{n}+|q|^{n})$ to obtain the following bound on $z_{s}$

and a similar inequality holds for $z_{t}$ . With these inequalities we have the following bound

Applying this bound to equation (A.3) gives the following bound on the third order partial derivative of $\psi$ with respect to $s$ :

Appendix B Elementary properties of modified Bessel functions

Here we list standard properties of modified Bessel functions that are used throughout this paper. All these formulas can be found in Olver et al., except for the inequalities, which are given in Gaunt.

B.2 Basic properties

B.3 Asymptotic expansions

B.4 Identities

B.5 Differentiation

B.6 Modified Bessel differential equation

The modified Bessel differential equation is

The general solution is $f(x)=AI_{\nu}(x)+BK_{\nu}(x).$

B.7 Inequalities

Let $-1<\beta<1$ and $n=0,1,2,\ldots$ , then for $x\geq 0$ we have

Acknowledgements

During the course of this research the author was supported by an EPSRC DPhil Studentship and an EPSRC Doctoral Prize. The author would like to thank Gesine Reinert for the valuable guidance she provided on this project. The author would also like to thank two anonymous referees for their helpful comments, which have led to a substantial improvement in the presentation of this paper.