Variance-Gamma approximation via Stein's method

Robert E. Gaunt

Introduction

In 1972, Stein introduced a powerful method for deriving bounds for normal approximation. Since then, this method has been extended to many other distributions, such as the Poisson, Gamma, Exponential, and Laplace distributions. Through the use of differential or difference equations, and various coupling techniques, Stein’s method enables many types of dependence structures to be treated, and also gives explicit bounds for distances between distributions.

At the heart of Stein’s method lies a characterisation of the target distribution and a corresponding characterising differential or difference equation. For example, Stein’s method for normal approximation rests on the following characterisation of the normal distribution, which can be found in Stein, namely $Z\sim N(\mu,\sigma^{2})$ if and only if

for all sufficiently smooth $f$. This gives rise to the following inhomogeneous differential equation, known as the Stein equation:

where $Z\sim N(\mu,\sigma^{2})$, and the test function $h$ is a real-valued function. For any bounded test function, a solution $f$ to (1.2) exists (see Lemma 2.4 of Chen et al.). There are a number of techniques for obtaining Stein equations, such as the density approach of Stein et al., the scope of which has recently been extended by Ley and Swan. Another commonly used technique is a generator approach, introduced by Barbour. This approach involves recognising the target distribution as the stationary distribution of a Markov process and then using the theory of generators of stochastic processes to arrive at a Stein equation; for a detailed overview of this method see Reinert. Luk used this approach to obtain the following Stein equation for the $\Gamma(r,\lambda)$ distribution:
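As a numerical sanity check of the characterisations discussed above, the following minimal Python sketch evaluates the corresponding Stein identities by quadrature. It assumes the standard forms of the normal characterisation, $E[\sigma^{2}f'(Z)-(Z-\mu)f(Z)]=0$, and of the Gamma Stein operator, $xf''(x)+(r-\lambda x)f'(x)$ with $\lambda$ a rate parameter. The helper names, the test function $f=\sin$ and the parameter values are illustrative choices and are not taken from the paper.

```python
import math

def trapezoid(g, a, b, n):
    """Composite trapezoid rule for the integral of g over [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

def normal_stein_expectation(f, df, mu, sigma):
    """Approximate E[sigma^2 f'(Z) - (Z - mu) f(Z)] for Z ~ N(mu, sigma^2)."""
    def integrand(x):
        z = (x - mu) / sigma
        density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        return (sigma ** 2 * df(x) - (x - mu) * f(x)) * density
    # The normal density is negligible beyond 12 standard deviations.
    return trapezoid(integrand, mu - 12.0 * sigma, mu + 12.0 * sigma, 6000)

def gamma_stein_expectation(df, d2f, r, lam):
    """Approximate E[X f''(X) + (r - lam*X) f'(X)] for X ~ Gamma(shape r, rate lam).

    The form of the operator is an assumption (the standard Gamma Stein operator).
    """
    log_norm = r * math.log(lam) - math.lgamma(r)
    def integrand(x):
        if x <= 0.0:
            return 0.0  # the density vanishes at the origin when r > 1
        density = math.exp(log_norm + (r - 1.0) * math.log(x) - lam * x)
        return (x * d2f(x) + (r - lam * x) * df(x)) * density
    return trapezoid(integrand, 0.0, 60.0 / lam, 20000)

# Illustrative test function f = sin, so f' = cos and f'' = -sin.
normal_residual = normal_stein_expectation(math.sin, math.cos, mu=1.0, sigma=2.0)
gamma_residual = gamma_stein_expectation(math.cos, lambda x: -math.sin(x), r=3.0, lam=1.5)
print(normal_residual, gamma_residual)  # both should be close to zero
```

Both residuals should vanish up to quadrature error, which is the "only if" direction of each characterisation for this particular $f$.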

The next essential ingredient of Stein’s method is smoothness estimates for the solution of the Stein equation. These can often be obtained by solving the Stein equation using standard solution methods for differential equations and then using direct calculations to bound the required derivatives of the solution (Stein used this approach to bound the first two derivatives of the solution to the normal Stein equation (1.2)). The generator approach is also often used to obtain smoothness estimates. The use of probabilistic arguments to bound the derivatives of the solution often makes it easier to arrive at smoothness estimates than through the use of analytical techniques. Luk and Pickett used the generator approach to bound $k$-th order derivatives of the solution of the $\Gamma(r,\lambda)$ Stein equation (1.3). Pickett’s bounds are as follows

In this paper we obtain the key ingredients required to extend Stein’s method to the class of Variance-Gamma distributions. The Variance-Gamma distributions are defined as follows (this parametrisation is similar to that given in Finlay and Seneta).

The Variance-Gamma distributions were introduced to the financial literature by Madan and Seneta. For certain parameter values the Variance-Gamma distributions have semi-heavy tails that decay more slowly than the tails of the normal distribution, and are therefore often appropriate for financial modelling.

The class of Variance-Gamma distributions includes the Laplace distribution as a special case and in the appropriate limits reduces to the normal and Gamma distributions. This family of distributions also contains many other distributions that are of interest, which we list in the following proposition (the proof is given in Appendix A). As far as the author is aware, this is the first list of characterisations of the Variance-Gamma distributions to appear in the literature.

The representations of the Variance-Gamma distributions given in Proposition 1.2 enable us to determine a number of statistics that may have asymptotic Variance-Gamma distributions.

One of the main results of this paper (see Lemma 3.1) is the following Stein equation for the Variance-Gamma distributions:

In Section 3, we analyse the Stein equation (1.7). In particular, we show that the normal Stein equation (1.2) and Gamma Stein equation (1.3) are special cases. As a Stein equation for a given distribution is not unique (see Barbour), the fact that in the appropriate limit the Variance-Gamma Stein equation (1.7) reduces to the known normal and Gamma Stein equations is an attractive feature.

Stein’s method has also recently been extended to the Laplace distribution (see Pike and Ren, and Döbler), although the Laplace Stein equation obtained there differs from the Laplace Stein equation that arises as a special case of (1.7); see Section 3.1.1 for a more detailed discussion. Another special case of the Stein equation (1.7) is a Stein equation for the product of two independent central normal random variables, which is in agreement with the Stein equation for products of independent central normal random variables that was recently obtained by Gaunt. Therefore, the results from this paper allow the existing literature for Stein’s method for normal, Gamma, Laplace and product normal approximation to be considered in a more general framework.

More importantly, our development of Stein’s method for the Variance-Gamma distributions allows a number of new situations to be treated by Stein’s method. In Section 4, we illustrate our method by obtaining a bound for the distance between the statistic

local approach couplings, and symmetry arguments, that were introduced by Pickett, we obtain an $O(m^{-1}+n^{-1})$ bound for smooth test functions. A similar phenomenon was observed in chi-square approximation by Pickett, and also by Goldstein and Reinert, who obtained $O(n^{-1})$ convergence rates in normal approximation, for smooth test functions, under the assumption of vanishing third moments. For non-smooth test functions we would, however, expect an $O(m^{-1/2}+n^{-1/2})$ convergence rate (cf. the Berry-Esseen Theorem (Berry and Esseen)) to hold; see Remark 4.11.

The rest of this paper is organised as follows. In Section 2, we introduce the Variance-Gamma distributions and state some of their standard properties. In Section 3, we obtain a characterising lemma for the Variance-Gamma distributions and a corresponding Stein equation. We also obtain the unique bounded solution of the Stein equation, and present uniform bounds for the first four derivatives of the solution for the case $\theta=0$. In Section 4, we use Stein’s method for Variance-Gamma approximation to bound the distance between the statistic (1.8) and its limiting Variance-Gamma distribution. We then apply this bound to an application of binary sequence comparison, which is a simple special case of the more general problem of word sequence comparison. In Appendix A, we include the proofs of some technical lemmas that are required in this paper. Appendix B provides a list of some elementary properties of modified Bessel functions that we make use of in this paper.

The class of Variance-Gamma distributions

In this section we present the Variance-Gamma distributions and some of their standard properties. Throughout this paper we will make use of two different parametrisations of the Variance-Gamma distributions; the first parametrisation was given in Section 1, and making the change of variables

leads to another useful parametrisation. This parametrisation can be found in Eberlein and Hammerstein.

The first parametrisation leads to simple characterisations of the Variance-Gamma distributions in terms of normal and Gamma distributions, and therefore in many cases it allows us to recognise statistics that will have an asymptotic Variance-Gamma distribution. For this reason, we state our main results in terms of this parametrisation. However, the second parametrisation proves to be very useful in simplifying the calculations of Section 3, as the solution of the Variance-Gamma Stein equation has a simpler representation for this parametrisation. We can then state the results in terms of the first parametrisation by using (2.1).

The Variance-Gamma distributions have moments of arbitrary order (see Eberlein and Hammerstein); in particular, the mean and variance (for both parametrisations) of a random variable $X$ with a Variance-Gamma distribution are given by

The following proposition, which can be found in Bibby and Sørensen, shows that the class of Variance-Gamma distributions is closed under convolution, provided that the random variables have common values of $\theta$ and $\sigma$ (or, equivalently, common values of $\alpha$ and $\beta$ in the second parametrisation).

Variance-Gamma random variables can be characterised in terms of independent normal and Gamma random variables. This characterisation is given in the following proposition, which can be found in Barndorff-Nielsen et al.
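To make the normal and Gamma characterisation concrete, here is a minimal simulation sketch. It assumes the representation takes the form $X=\mu+\theta V+\sigma\sqrt{V}\,U$ with $U\sim N(0,1)$ and $V\sim\Gamma(r/2,1/2)$ independent (second Gamma parameter a rate), which is consistent with the first moment $r\theta$ quoted later for $\mu=0$. Under that assumption the mean is $\mu+r\theta$ and the variance is $r(\sigma^{2}+2\theta^{2})$. The function names, seed and parameter values are illustrative.

```python
import math
import random

def sample_variance_gamma(r, theta, sigma, mu, rng):
    """Draw one sample via the assumed normal-Gamma mixture representation."""
    # random.gammavariate takes (shape, scale); rate 1/2 corresponds to scale 2.
    v = rng.gammavariate(r / 2.0, 2.0)
    u = rng.gauss(0.0, 1.0)
    return mu + theta * v + sigma * math.sqrt(v) * u

def empirical_mean_and_variance(samples):
    """Sample mean and unbiased sample variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, var

rng = random.Random(2024)  # fixed seed so the run is reproducible
r, theta, sigma, mu = 3.0, 0.5, 1.0, -1.0
draws = [sample_variance_gamma(r, theta, sigma, mu, rng) for _ in range(200_000)]
mean, var = empirical_mean_and_variance(draws)

# Moments implied by the assumed representation.
expected_mean = mu + r * theta
expected_var = r * (sigma ** 2 + 2.0 * theta ** 2)
print(mean, expected_mean)
print(var, expected_var)
```

The empirical values should agree with the implied moments up to Monte Carlo error.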

Using Proposition 2.4 we can establish the following useful representation of the Variance-Gamma distributions, which appears to be a new result. Indeed, the representation allows us to see that the statistic (1.8) has an asymptotic Variance-Gamma distribution.
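A quick consistency check of such a representation in the symmetric case can be sketched as follows. It assumes that, for $\theta=0$ and $\mu=0$, the representation is $\sigma\sum_{k=1}^{r}X_{k}Y_{k}$ with independent standard normal $X_{k}$, $Y_{k}$, and compares its exact fourth moment with that of the assumed mixture form $\sigma\sqrt{V}\,U$, $V\sim\Gamma(r/2,1/2)$. Both the form of the representation and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def fourth_moment_sum_of_products(r, sigma):
    """Exact E[(sigma * sum_{k<=r} X_k Y_k)^4] for independent standard normals."""
    second = Fraction(1)  # E[(X Y)^2] = E[X^2] E[Y^2] = 1
    fourth = Fraction(9)  # E[(X Y)^4] = E[X^4] E[Y^4] = 3 * 3
    # For r i.i.d. mean-zero terms: E[(sum)^4] = r*fourth + 3*r*(r-1)*second^2.
    return sigma ** 4 * (r * fourth + 3 * r * (r - 1) * second ** 2)

def fourth_moment_mixture(r, sigma):
    """Exact E[(sigma * sqrt(V) * U)^4], U ~ N(0,1), V ~ Gamma(shape r/2, rate 1/2)."""
    shape, rate = Fraction(r, 2), Fraction(1, 2)
    second_moment_v = shape * (shape + 1) / rate ** 2  # E[V^2]
    return sigma ** 4 * 3 * second_moment_v            # E[U^4] = 3

sigma = Fraction(3, 2)
for r in range(1, 6):
    print(r, fourth_moment_sum_of_products(r, sigma), fourth_moment_mixture(r, sigma))
```

Agreement of the two columns is only a necessary condition for equality in distribution; the argument in the text is what establishes the representation.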

Let $X_{1},X_{2},\ldots,X_{r}$ and $Y_{1},Y_{2},\ldots,Y_{r}$ be sequences of independent standard normal random variables. Then $X_{i}^{2}$, $i=1,2,\ldots,r$, has a $\chi_{(1)}^{2}$ distribution, that is a $\Gamma(1/2,1/2)$ distribution. Define

Stein’s method for Variance-Gamma distributions

Note that the tails are in general not symmetric.

Firstly, we consider $I_{1}$. Let $A(x)=x$, $B(x)=2\nu+1+2\beta x$ and $C(x)=(2\nu+1)\beta-(1-\beta^{2})x$. Then applying integration by parts twice gives

as $K_{\nu}(x)$ is a solution of the modified Bessel differential equation (see (B.10)).

Using formula (B.7) to differentiate $K_{\nu}(x)$ gives

We now calculate the limit in the above expression. We first consider the case $\nu>0$. Applying the asymptotic formula (B.2) gives

since $\nu\Gamma(\nu)=\Gamma(\nu+1)$. Now consider the case $\nu=0$. We use the fact that $K_{1}(x)=K_{-1}(x)$ to obtain

since $\Gamma(1)=\Gamma(2)$. Therefore we have

Finally, we consider the case $-1/2<\nu<0$. We use the fact that $K_{-\lambda}(x)=K_{\lambda}(x)$ to obtain

As $f$ is continuous, it follows that $I_{1}=-I_{2}$ (and so $I_{1}+I_{2}=0$), which completes the proof of necessity.

This solution and its first derivative are bounded (see Lemma 3.3), and the solution is piecewise twice differentiable. As $f_{z}$ and $f_{z}^{\prime}$ are bounded, they satisfy the condition (3.3) (with $\alpha=1$) and $f_{z}^{\prime\prime}$ must also satisfy the condition because, from (3.5),

for some constants $A$ and $B$. Hence, if (3.2) holds for all piecewise twice continuously differentiable functions satisfying (3.3) (with $\alpha=1$), then by (3.5),

which we recognise as the $\Gamma(s,\lambda)$ Stein equation (1.3) of Luk (up to a multiplicative factor).

which in the limit $r\rightarrow\infty$ is the classical $N(\mu,\sigma^{2})$ Stein equation.

Taking $r=1$, $\sigma=\sigma_{X}\sigma_{Y}$ and $\mu=0$ in (1.7) gives the following Stein equation for the distribution of the product of independent $N(0,\sigma_{X}^{2})$ and $N(0,\sigma_{Y}^{2})$ random variables (see part (iii) of Proposition 1.2):

This Stein equation is in agreement with the Stein equation for the product of two independent, zero mean normal random variables that was obtained by Gaunt.

They have also solved (3.9) and have obtained uniform bounds for the solution and its first three derivatives. Their characterisation was obtained by a repeated application of the density method, and is similar to the characterisation for the Exponential distribution that results from the density method (see Stein et al., Example 1.6), which leads to the Stein equation

3.1.2 Applications of Lemma 3.1

The main application of Lemma 3.1 that is considered in this paper involves the use of the resulting Stein equation in the proofs of the limit theorems of Section 4. There are, however, other interesting results that follow from Lemma 3.1. We consider a couple here.

Solving this equation subject to the condition $M(0)=1$ then gives that the moment generating function of the Variance-Gamma distribution with $\mu=0$ is

which in terms of the first parametrisation is

We have that $M_{0}=1$ and $M_{1}=(2\nu+1)\beta/(\alpha^{2}-\beta^{2})=r\theta$ (see (2.3)), and thus we can solve these recurrence equations by forward substitution to obtain the moments of the Variance-Gamma distributions. As far as the author is aware, these recurrence equations are new, although Scott et al. have already established a formula for the moments of general order of the Variance-Gamma distributions.
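The first moment quoted above can be cross-checked numerically from a moment generating function. The sketch below assumes that, in the first parametrisation with $\mu=0$, the moment generating function is $M(t)=(1-2\theta t-\sigma^{2}t^{2})^{-r/2}$ (the form implied by a normal variance-mean mixture with a $\Gamma(r/2,1/2)$ mixing variable), and differentiates $\log M$ at the origin by central differences. The function names and the step size are illustrative choices.

```python
import math

def vg_log_mgf(t, r, theta, sigma):
    """log M(t) for the assumed form M(t) = (1 - 2*theta*t - sigma^2*t^2)^(-r/2)."""
    shift = -2.0 * theta * t - sigma ** 2 * t ** 2
    if shift <= -1.0:
        raise ValueError("t lies outside the region where the assumed MGF is finite")
    return -0.5 * r * math.log1p(shift)

def first_two_cumulants(r, theta, sigma, h=1e-4):
    """Mean and variance from central differences of log M at t = 0."""
    k_plus = vg_log_mgf(h, r, theta, sigma)
    k_zero = vg_log_mgf(0.0, r, theta, sigma)
    k_minus = vg_log_mgf(-h, r, theta, sigma)
    mean = (k_plus - k_minus) / (2.0 * h)
    variance = (k_plus - 2.0 * k_zero + k_minus) / h ** 2
    return mean, variance

r, theta, sigma = 3.0, 0.5, 1.0
mean, variance = first_two_cumulants(r, theta, sigma)
print(mean, r * theta)                                # first moment M_1 = r * theta
print(variance, r * (sigma ** 2 + 2.0 * theta ** 2))  # variance under the assumed MGF
```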

3.2 Smoothness estimates for the solution of the Stein equation

In the following lemma we give the solution to the Stein equation. The proof is given in Appendix A.

is very useful when it comes to obtaining smoothness estimates for the solution to the Stein equation. The equality ensures that we can restrict our attention to bounding the derivatives in the region $x\geq 0$, provided we obtain these bounds for both positive and negative $\beta$.

The bounds given in Lemma 3.5 are of order $\nu^{-1/2}$ as $\nu\rightarrow\infty$, except when $2\nu$ is not equal to an integer, but is sufficiently close to an integer that

Gaunt remarked that the rogue $1/\sin(\pi\nu)$ term appeared to be an artefact of the analysis that was used to obtain the bounds.

Limit theorems for Symmetric Variance-Gamma distributions

We now consider the Symmetric Variance-Gamma ($\theta=0$) limit theorem that we discussed in the introduction. Let $\mathbf{X}$ be an $m\times r$ matrix of independent and identically distributed random variables $X_{ik}$ with zero mean and unit variance. Similarly, we let $\mathbf{Y}$ be an $n\times r$ matrix of independent and identically distributed random variables $Y_{jk}$ with zero mean and unit variance, where the $Y_{jk}$ are independent of the $X_{ik}$. Then the statistic

We first consider the case $r=1$; the general $r$ case follows easily as $W_{r}$ is a sum of independent copies of $W_{1}$. For ease of reading, in the statement of the following theorem and in its proof we shall set $X_{i}\equiv X_{i1}$, $Y_{j}\equiv Y_{j1}$ and $W\equiv W_{1}$. Then we have the following:

Notice that the statistic $W=\frac{1}{\sqrt{mn}}\sum_{i,j=1}^{m,n}X_{i}Y_{j}$ is symmetric in $m$ and $n$ and the random variables $X_{i}$ and $Y_{j}$, and yet the bound (4.1) of Theorem 4.1 is not symmetric in $m$ and $n$ and the moments of $X$ and $Y$. This asymmetry is a consequence of the local couplings that we used to obtain the bound.
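The $O(m^{-1}+n^{-1})$ rate for smooth test functions can be illustrated with an exact calculation in a toy case. The sketch below takes the $X_{i}$ and $Y_{j}$ to be Rademacher variables (uniform on $\{-1,+1\}$, an illustrative choice), uses the factorisation of $W$ into the product of the two independent standardised sums, and compares $E[W^{4}]$ with $E[(Z_{1}Z_{2})^{4}]=9$ for independent standard normal $Z_{1}$, $Z_{2}$. The test function $w^{4}$ is chosen only because its expectation is available in closed form; it is not claimed to lie in the class of test functions covered by Theorem 4.1.

```python
from fractions import Fraction
from itertools import product

def fourth_moment_rademacher_sum(m):
    """Exact E[S^4] for S = m^(-1/2) * (X_1 + ... + X_m), X_i uniform on {-1, +1}."""
    total = Fraction(0)
    for signs in product((-1, 1), repeat=m):
        total += Fraction(sum(signs) ** 4, m ** 2)  # S^4 = (sum of signs)^4 / m^2
    return total / 2 ** m

def fourth_moment_gap(m, n):
    """|E[W^4] - 9|, using W = S*T with S and T independent."""
    return abs(fourth_moment_rademacher_sum(m) * fourth_moment_rademacher_sum(n) - 9)

for size in (2, 4, 8):
    gap = fourth_moment_gap(size, size)
    # Here gap = 12/size - 4/size^2, so size * gap stays bounded as size grows.
    print(size, gap, size * gap)
```

In this example the gap equals $6/m+6/n-4/(mn)$ exactly, which is symmetric in $m$ and $n$ and of order $m^{-1}+n^{-1}$.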

Before proving Theorem 4.1, we introduce some notation and preliminary lemmas. We define the standardised sums $S$ and $T$ by

and we have that $W=ST$. In our proof we shall make use of the sums

which are independent of $X_{i}$ and $Y_{j}$, respectively. We therefore have the following formulas

In the proof of Theorem 4.1 we use the following lemma, which can be found in Pickett, Lemma 4.3.

Due to the independence of the $X_{i}$ and $Y_{j}$ variables, we are in the realms of the local approach coupling. We Taylor expand $f(W)$ about $S_{i}T$ to obtain

As $ST-S_{i}T=\frac{1}{\sqrt{m}}X_{i}T$, we obtain

We begin by bounding $R_{1}$ and $R_{2}$. Taylor expanding $f^{\prime\prime}(S_{i}T)$ about $W$ and using (4.2) gives

The bound for $R_{2}$ is immediate. We have

Taylor expanding $f^{\prime\prime}(W)$ about $S_{i}T$ gives

where we used independence and that the $X_{i}$ have zero mean to obtain the final inequality. Putting this together we have that

Noting that $T^{2}=\frac{1}{\sqrt{n}}\sum_{j=1}^{n}Y_{j}T=\frac{1}{\sqrt{n}}\sum_{j=1}^{n}Y_{j}(\frac{1}{\sqrt{n}}Y_{j}+T_{j})$, we may write $N_{1}$ as

We first consider $R_{4}$. Taylor expanding $f^{\prime}(W)$ about $ST_{j}$ and using that $ST-ST_{j}=\frac{1}{\sqrt{n}}Y_{j}S$ gives

Putting this together we have the following bound for $R_{4}$:

Using independence and that the $Y_{j}$ have zero mean and then Taylor expanding $f^{(3)}(ST_{j})$ about $W$ gives

To bound $R_{7}$ we Taylor expand $f^{\prime\prime}(W)$ about $ST_{j}$ and use independence and that the $Y_{j}$ have zero mean to obtain

4.1.2 Proof Part II: Symmetry Argument for Optimal Rate

We begin by considering the bivariate standard normal Stein equation (see, for example, Goldstein and Rinott) with test functions $g_{1}(s,t)=sf^{\prime\prime}(st)$, $g_{2}(s,t)=st^{2}f^{\prime\prime}(st)$ and $g_{3}(s,t)=t^{3}f^{\prime\prime}(st)$. The bivariate standard normal Stein equation with test function $g_{k}(s,t)$, $k=1,2,3$, and solution $\psi_{k}$ is given by

where $Z_{1}$ and $Z_{2}$ are independent standard normal random variables.

with $\phi_{1},\phi_{2},\phi_{3},\phi_{4}\in(0,1)$.

Before we bound the remainder terms, we need bounds for the third order partial derivatives of the solution $\psi_{k}$ in terms of the derivatives of $f$. We achieve this task by using the following lemma, the proof of which is given in Appendix A. Before stating the lemma, we define the double factorial function. The double factorial of a positive integer $n$ is given by

and we define $(-1)!!=0!!=1$ (Arfken, p.547).
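For concreteness, the double factorial with the conventions just stated can be computed as in the following short sketch. It assumes the usual definition $n!!=n(n-2)(n-4)\cdots$, terminating at $1$ or $2$; the function name is illustrative.

```python
def double_factorial(n):
    """Return n!! = n * (n - 2) * (n - 4) * ..., with (-1)!! = 0!! = 1 by convention."""
    if n < -1:
        raise ValueError("double factorial is only defined here for n >= -1")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

# Even moments of a standard normal Z satisfy E[Z^(2k)] = (2k - 1)!!.
for k in range(5):
    print(2 * k, double_factorial(2 * k - 1))
```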

With these bounds it is straightforward to bound the remainder terms. The following lemma allows us to easily deduce bounds for the remainder terms $R_{8}^{k}$, $R_{9}^{k}$, $R_{10}^{k}$ and $R_{11}^{k}$, $k=1,2,3$.

We prove that the bound for $R_{8}^{k}$ holds; the bound for $R_{9}^{k}$ then follows by symmetry. We begin by defining $S_{i}^{*}=S_{i}+\frac{\phi_{1}}{\sqrt{m}}X_{i}$. We note the following simple bound for $|S_{i}^{*}|^{p}$, for $p\geq 1$:

Using our bound (4.5) for the third order partial derivative of $\psi$ with respect to $s$, we have

We can bound $R_{8}^{k}$, $R_{9}^{k}$, $R_{10}^{k}$ and $R_{11}^{k}$ by using the bounds in Lemma 4.8. We illustrate the argument by bounding $R_{11}^{1}$. In this case we have $g_{1}(s,t)=sf^{\prime\prime}(st)$, that is $a=1$ and $b=0$. We have

4.2 Extension to the case $r>1$

For the case of $r>1$, we have the following generalisation of Theorem 4.1:

Since $\|g^{(n)}(x+c)\|=\|g^{(n)}(x)\|$ for any constant $c$, we may use bound (4.1) from Theorem 4.1 and the bounds of Theorem 3.6 for the derivatives of the solution of the $VG_{1}(r,0,1,0)$ Stein equation to bound the above expression, which yields (4.8). ∎

The terms $M_{r,1}^{k}(h)$, for $k=2,3,4$, are of order $r^{-1/2}$ as $r\rightarrow\infty$ (recall Theorem 3.6), and therefore the bound of Theorem 4.9 is of order $r^{1/2}(m^{-1}+n^{-1})$. This is in agreement with the bound of Theorem 4.7 of Pickett for chi-square approximation, which is of order $d^{1/2}m^{-1}$.

4.3 Application: Binary Sequence Comparison

We now consider a straightforward application of Theorem 4.1 to binary sequence comparison. This example is a simple special case of a more general problem of word sequence comparison, which is of particular importance to biological sequence comparisons. One way of comparing the sequences uses $k$-tuples (a sequence of letters of length $k$). If two sequences are closely related, we would expect the $k$-tuple content of both sequences to be very similar. A statistic for sequence comparison based on $k$-tuple content, known as the $D_{2}$ statistic, was suggested by Blaisdell (for other statistics based on $k$-tuple content see Reinert et al.). Letting $\mathcal{A}$ denote an alphabet of size $d$, and $X_{\mathbf{w}}$ and $Y_{\mathbf{w}}$ the number of occurrences of the word $\mathbf{w}\in\mathcal{A}^{k}$ in the first and second sequences, respectively, then the $D_{2}$ statistic is defined by

Due to the complicated dependence structure at both the local and global level (for a detailed account of the dependence structure see Reinert et al.), approximating the asymptotic distribution of $D_{2}$ is a difficult problem. However, for certain parameter regimes $D_{2}$ has been shown to be asymptotically normal and Poisson; see Lippert et al. for a detailed account of the asymptotic distributions of $D_{2}$ for different parameter values.
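To fix ideas, the $D_{2}$ statistic can be computed directly from word counts. The sketch below assumes the usual definition $D_{2}=\sum_{\mathbf{w}\in\mathcal{A}^{k}}X_{\mathbf{w}}Y_{\mathbf{w}}$ and counts overlapping occurrences of each $k$-tuple; the function names and the example sequences are illustrative.

```python
from collections import Counter

def k_tuple_counts(sequence, k):
    """Count (overlapping) occurrences of every k-tuple in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def d2_statistic(first, second, k):
    """Sum over words w of (count of w in first) * (count of w in second)."""
    counts_x = k_tuple_counts(first, k)
    counts_y = k_tuple_counts(second, k)
    # Words absent from the second sequence contribute zero (Counter returns 0).
    return sum(counts_x[w] * counts_y[w] for w in counts_x)

# Binary example with k = 2: "0101" contains 01 twice and 10 once,
# while "1010" contains 10 twice and 01 once.
print(d2_statistic("0101", "1010", 2))
```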

We now consider the standardised $D_{2}$ statistic,

References

Appendix A Proofs from the text

Here we prove the lemmas that we stated in the main text without proof.

(ii) This follows by applying the formula $K_{\frac{1}{2}}(x)=\sqrt{\frac{\pi}{2x}}e^{-x}$ to the density (1.5).
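The half-order formula used in part (ii) can be checked numerically without any special-function library. The sketch below evaluates $K_{\nu}(x)$ for $x>0$ through the standard integral representation $K_{\nu}(x)=\int_{0}^{\infty}e^{-x\cosh t}\cosh(\nu t)\,dt$ with the trapezoid rule; the function name, truncation point and number of panels are illustrative choices.

```python
import math

def bessel_k(nu, x, upper=20.0, panels=8000):
    """Approximate K_nu(x), x > 0, from its integral representation."""
    if x <= 0.0:
        raise ValueError("this representation requires x > 0")
    def integrand(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    h = upper / panels
    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, panels):
        total += integrand(i * h)
    return total * h

for x in (0.5, 1.0, 3.0):
    closed_form = math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)
    print(x, bessel_k(0.5, x), closed_form)
```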

(iv) Taking $\theta=0$ in Corollary 2.5 leads to the general representation. The representation for the Laplace distribution now follows from part (ii).

(v) This follows on letting $\sigma\rightarrow 0$ in Proposition 2.4 and then using the fact that if $Y\sim\Gamma(\alpha,\beta)$ then $kY\sim\Gamma(\alpha,\beta/k)$.

(vi) Theorem 6 of Holm and Alouini gives the following formula for the probability density function of $Z=U-V$:

We can write the density of $Z$ as follows

A.2 Proof of Lemma 3.3

We begin by proving that there is at most one bounded solution to the Variance-Gamma Stein equation (3.7) when $\nu\geq 0$. Suppose $u$ and $v$ are solutions to the Stein equation that satisfy $\|u^{(k)}\|,\,\|v^{(k)}\|<\infty$. Define $w=u-v$. Then $w$ satisfies $\|w^{(k)}\|=\|u^{(k)}-v^{(k)}\|\leq\|u^{(k)}\|+\|v^{(k)}\|<\infty$, and is a solution to the following differential equation

This homogeneous differential equation has general solution

From the asymptotic formula (B.3) for $I_{\nu}(x)$, it follows that to have a bounded solution we must take $B=0$. From the asymptotic formula (B.2) for $K_{\nu}(x)$, we see that $w(x)$ has a singularity at the origin if $\nu\geq 0$. Therefore if $\nu\geq 0$, then for $w(x)$ to be bounded we must take $A=0$, and therefore $w=0$ and so $u=v$.

We now use variation of parameters (see Collins) to solve the Stein equation (3.7). The method allows us to solve differential equations of the form

Suppose $v_{1}(x)$ and $v_{2}(x)$ are linearly independent solutions of the homogeneous equation

Then the general solution to the inhomogeneous equation is given by

where $a$ and $b$ are arbitrary constants and $W(t)=W(v_{1},v_{2})=v_{1}v_{2}^{\prime}-v_{2}v_{1}^{\prime}$ is the Wronskian.

It is easy to verify that a pair of linearly independent solutions to the homogeneous equation

Formula (B.6) states that $K_{\nu}(-x)=(-1)^{\nu}K_{\nu}(x)-\pi iI_{\nu}(x)$ and therefore

where we used (B.5) to obtain the equality in the above display. Therefore the general solution to the inhomogeneous equation is given by

This solution is clearly bounded everywhere except possibly for $x=0$ or in the limits $x\rightarrow\pm\infty$. We therefore choose $a$ and $b$ so that our solution is bounded at these points and thus for all real $x$. To ensure the solution is bounded at the origin we must take $a=0$. We choose $b$ so that the solution is bounded in the limits $x\rightarrow\pm\infty$. If we take $b=\infty$ then we obtain solution (3.3). Taking $b=-\infty$ would lead to the same solution (see Remark 3.4).

A.3 Proof of Lemma 4.5

Taylor expanding $f^{\prime\prime}(W)$ about $ST_{j}$ gives

Taylor expanding $f^{(3)}(ST_{j})$ about $W$ allows us to write $N_{1}$ as

Putting this together, we have shown that

Rearranging and applying the triangle inequality now gives

and summing up the remainder terms completes the proof.

A.4 Proof of Lemma 4.7

We prove that inequality (4.5) holds; inequality (4.6) then follows by symmetry. We begin by obtaining a formula for the third order partial derivative of $\psi$ with respect to $s$. Using a straightforward generalisation of the proof of Lemma 3.2 of Raič it can be shown that

We now use the simple inequality $|p+q|^{n}\leq 2^{n-1}(|p|^{n}+|q|^{n})$ to obtain the following bound on $z_{s}$

and a similar inequality holds for $z_{t}$. With these inequalities we have the following bound

Applying this bound to equation (A.3) gives the following bound on the third order partial derivative of $\psi$ with respect to $s$:

Appendix B Elementary properties of modified Bessel functions

Here we list standard properties of modified Bessel functions that are used throughout this paper. All these formulas can be found in Olver et al., except for the inequalities, which are given in Gaunt.

B.2 Basic properties

B.3 Asymptotic expansions

B.4 Identities

B.5 Differentiation

B.6 Modified Bessel differential equation

The modified Bessel differential equation is

The general solution is $f(x)=AI_{\nu}(x)+BK_{\nu}(x)$.
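As an illustration, one can check numerically that $K_{\nu}$ satisfies the modified Bessel differential equation, taken here in its standard form $x^{2}f''(x)+xf'(x)-(x^{2}+\nu^{2})f(x)=0$ (an assumption about the displayed equation). The sketch differentiates the integral representation $K_{\nu}(x)=\int_{0}^{\infty}e^{-x\cosh t}\cosh(\nu t)\,dt$ under the integral sign, so that $K_{\nu}'$ and $K_{\nu}''$ are integrals with extra factors of $-\cosh t$ and $\cosh^{2}t$; the helper names and quadrature settings are illustrative.

```python
import math

def bessel_k_moment(nu, x, power, upper=20.0, panels=8000):
    """Trapezoid approximation of int_0^upper cosh(t)^power exp(-x cosh t) cosh(nu t) dt."""
    def integrand(t):
        c = math.cosh(t)
        return c ** power * math.exp(-x * c) * math.cosh(nu * t)
    h = upper / panels
    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, panels):
        total += integrand(i * h)
    return total * h

def bessel_ode_residual(nu, x):
    """x^2 K'' + x K' - (x^2 + nu^2) K, with derivatives taken under the integral."""
    k0 = bessel_k_moment(nu, x, 0)    # K_nu(x)
    k1 = -bessel_k_moment(nu, x, 1)   # K_nu'(x)
    k2 = bessel_k_moment(nu, x, 2)    # K_nu''(x)
    return x ** 2 * k2 + x * k1 - (x ** 2 + nu ** 2) * k0

for nu, x in ((0.0, 1.0), (0.3, 1.2), (2.0, 0.7)):
    print(nu, x, bessel_ode_residual(nu, x))  # residuals should be close to zero
```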

B.7 Inequalities

Let $-1<\beta<1$ and $n=0,1,2,\ldots$, then for $x\geq 0$ we have

Acknowledgements

During the course of this research the author was supported by an EPSRC DPhil Studentship and an EPSRC Doctoral Prize. The author would like to thank Gesine Reinert for the valuable guidance she provided on this project. The author would also like to thank two anonymous referees for their helpful comments which have led to a substantial improvement in the presentation of this paper.