next up previous
Next: Robust Bootstrap Up: Approach Previous: The Empirical Approximation of

   
Bootstrap

Since its introduction by Efron (1979), the bootstrap method has received considerable attention in the literature (see Efron, 1979, 1982; Efron and Tibshirani, 1993; Hall, 1986a, 1990, 1992; and Wu, 1986). Given a random sample ${\bf x}'=\left(x_1, \ldots, x_n \right)$ and a statistic $R \left( {\bf x}, F \right)$ that depends on the sample and possibly on the underlying distribution $F$, the aim is to estimate the distribution of $R$,

\begin{displaymath}{\cal L}_F \left( R \left( {\bf x}, F \right) \right),
\end{displaymath}

by that of

\begin{displaymath}{\cal L}_{F_n} \left( R \left( {\bf x}^*, F_n \right) \right),
\end{displaymath}

where ${\bf x}^*$ denotes a random sample taken from the distribution $F_n$. The replacement of $F$ by $F_n$ is called the ``plug-in'' principle (see Efron and Tibshirani, 1993). Efron (1979) showed that ${\cal L}_{F_n} \left( R \left( {\bf x}^*, F_n \right) \right)$ is a reasonable estimate in some simple cases and established the validity of the principle for a general class of statistics when the sample space is finite. When applying this principle to a particular statistic $R \left( {\bf x}, F \right)$ we should check whether the approximation is ``good''. The definition of ``goodness'' may depend on the problem at hand. For example, the criteria could differ depending on whether we want to estimate moments or percentiles of the distribution ${\cal L}_F \left( R \left( {\bf x}, F \right) \right)$. In some cases we are interested in the asymptotic distribution of

 \begin{displaymath}
n^\alpha \left( R \left( {\bf x}, F \right) - \theta
\left( F \right) \right)
\end{displaymath} (19)

where $\alpha$ is a real number and $\theta \left( F \right)$ denotes a parameter of the distribution $F$. If we want to estimate the limiting distribution of this sequence by the plug-in principle, we should show that the sequence

\begin{displaymath}n^\alpha \left( R \left( {\bf x}^*, F_n \right) - \theta
\left( F_n \right) \right)
\end{displaymath}

has the same limiting distribution as (19). Bickel and Freedman (1981) give some general asymptotic theory to answer this question. They prove that, under certain regularity conditions, the bootstrap approximation to the asymptotic distribution works for sample means, von Mises functionals (Fernholz, 1983), quantiles, and trimmed means, among others. In particular, they establish the following two theorems for means and smooth functions of means, respectively.

Let ${\bf x}_1, \ldots, {\bf x}_n$ be $n$ independent vector-valued random variables in ${\mbox{I}\!\mbox{R}}^p$ with common cumulative distribution function $F$. Let $\bar{{\bf x}}_n$ denote their sample mean and $F_n$ their empirical distribution function. Similarly, let ${\bf x}^*_1, \ldots, {\bf x}^*_m$ be $m$ independent vector-valued random variables in ${\mbox{I}\!\mbox{R}}^p$ with common cumulative distribution function $F_n$. Let $\bar{{\bf x}}^*_m$ denote their sample mean. Assume that $E \left[ \left\Vert {\bf x}_1 \right\Vert^2 \right] < \infty$ and let $\Sigma$ be the covariance matrix of ${\bf x}_1$. We have the following theorem (see Bickel and Freedman, 1981):

Theorem 2 (Bickel and Freedman)   For almost all sample sequences, given ${\bf x}_1, \ldots, {\bf x}_n$, as $n$ and $m$ tend to infinity the conditional distribution of $\sqrt{m} \left( \bar{{\bf x}}^*_m - \bar{{\bf x}}_n \right)$ converges weakly to $N \left( 0, \Sigma \right)$.
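Theorem 2 can be checked numerically in a simple case. The following is a minimal sketch, not part of the original development: it assumes the scalar case $p = 1$ with $m = n$, uses simulated exponential data as a stand-in for a real sample, and the function name `bootstrap_root_draws` is hypothetical. It compares the variance of the bootstrap draws of $\sqrt{m} \left( \bar{x}^*_m - \bar{x}_n \right)$ with the variance of $F_n$, which estimates $\Sigma$.

```python
import math
import random
import statistics


def bootstrap_root_draws(sample, m, n_boot, rng):
    """Return n_boot draws of sqrt(m) * (mean(x*) - mean(x)).

    Each x* is a sample of size m drawn with replacement from `sample`,
    i.e. an i.i.d. sample from the empirical distribution F_n.
    """
    xbar = statistics.fmean(sample)
    draws = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=m)  # with replacement
        draws.append(math.sqrt(m) * (statistics.fmean(resample) - xbar))
    return draws


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 400
# Assumption: simulated Exp(1) data (true variance 1) stands in for real data.
sample = [rng.expovariate(1.0) for _ in range(n)]

draws = bootstrap_root_draws(sample, m=n, n_boot=2000, rng=rng)

boot_var = statistics.pvariance(draws)      # variance of the bootstrap draws
plug_in_var = statistics.pvariance(sample)  # variance of F_n, estimates Sigma
print(boot_var, plug_in_var)
```

The two printed values should be close, since the conditional variance of $\sqrt{m} \left( \bar{x}^*_m - \bar{x}_n \right)$ given the sample equals the variance of $F_n$; a histogram of `draws` would look approximately normal.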

To state the second theorem we need some notation. Let $\left({\bf x}_1, \ldots, {\bf x}_n \right)$ and $\left({\bf x}^*_1, \ldots, {\bf x}^*_m \right)$ be as in the previous theorem. Let

\begin{displaymath}S_n = \sum_{i=1}^n h \left( {\bf x}_i \right) \hspace{.2in} \mbox{and} \hspace{.2in} \mu = E_F \left[ h \left( {\bf x}_1 \right) \right],
\end{displaymath}

where $h: {\mbox{I}\!\mbox{R}}^p \rightarrow {\mbox{I}\!\mbox{R}}^k$. Let

\begin{displaymath}\tilde{S}_n = \sum_{i=1}^nh \left( {\bf x}_i^* \right),
\end{displaymath}

and let $g: {\mbox{I}\!\mbox{R}}^k \rightarrow {\mbox{I}\!\mbox{R}}$ be a real-valued function with finite differential

\begin{displaymath}\dot{g} \left( {\bf v}\right) = \left( \partial g / \partial v_1 \left( {\bf v}\right), \ldots, \partial g / \partial v_k \left( {\bf v}\right) \right)
\end{displaymath}

at $\mu$. The following theorem states that under regularity conditions bootstrapping commutes with smooth functions.

Theorem 3 (Bickel and Freedman)   Let $S_n$, $\tilde{S}_n$, $\mu$ and $g$ be as in the previous paragraphs. If $E \left[ \left\Vert h \left( {\bf x}_1 \right) \right\Vert^2 \right] < \infty$, then

\begin{displaymath}\sqrt{n} \left\{ g \left( \frac{\tilde{S}_n}{n} \right) - g \left( \frac{S_n}{n} \right) \right\} = \sqrt{n} \, \dot{g} \left( \mu \right) \left( \frac{\tilde{S}_n}{n} - \frac{S_n}{n} \right) + o_p \left( 1 \right).
\end{displaymath}
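The linearization behind Theorem 3 can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a reproduction of Bickel and Freedman's argument: it takes $k = 1$, $h$ the identity, $g(v) = v^2$, bootstrap samples of size $n$, and simulated normal data with known mean $\mu = 2$ so that $\dot{g}(\mu)$ can be evaluated. The names `g`, `g_dot`, `lhs` and `rhs` are illustrative only.

```python
import math
import random
import statistics


def g(v):
    """Smooth function of the mean; chosen for illustration."""
    return v * v


def g_dot(v):
    """Derivative of g."""
    return 2.0 * v


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 500
mu = 2.0  # assumption: the true mean is known because the data are simulated
sample = [rng.gauss(mu, 1.0) for _ in range(n)]
mean_n = statistics.fmean(sample)  # S_n / n with h the identity

lhs = []  # sqrt(n) * { g(S~_n / n) - g(S_n / n) }
rhs = []  # sqrt(n) * g_dot(mu) * (S~_n / n - S_n / n)
for _ in range(2000):
    mean_star = statistics.fmean(rng.choices(sample, k=n))  # S~_n / n
    lhs.append(math.sqrt(n) * (g(mean_star) - g(mean_n)))
    rhs.append(math.sqrt(n) * g_dot(mu) * (mean_star - mean_n))

# The remainder lhs - rhs plays the role of the o_p(1) term.
mean_abs_gap = statistics.fmean(abs(a - b) for a, b in zip(lhs, rhs))
sd_ratio = statistics.pstdev(lhs) / statistics.pstdev(rhs)
print(mean_abs_gap, sd_ratio)
```

The spread of `lhs` and `rhs` should be nearly the same, and the average gap between them should be small relative to that spread. Increasing `n` should shrink the gap further.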

Many papers in the literature deal with the accuracy and order of the coverage level of bootstrap confidence intervals (see, for example, Hall, 1986a, 1986b, 1988a and 1990). At this stage of our work we are only interested in the asymptotic validity of our methods. Further refinements will be considered in future work (see item 6 in Section 5.4).

There are results in the literature showing that the bootstrap also works to approximate the asymptotic distribution of robust estimates. See, for example, Shorack (1982), Parr (1985), Yang (1985), Lohse (1987), Shao (1990), Cheng (1991) and Arcones et al. (1992). See also Dumbgen (1993) and Cuevas et al. (1993).

The bootstrap can be used to calculate confidence intervals either by estimating the asymptotic variance or by obtaining approximate percentiles of the limiting distribution (see Hall, 1988a, for a comprehensive discussion). Two serious problems arise in either case. First, since bootstrap samples are drawn with replacement, the proportion of outliers in a bootstrap sample may be higher than in the original sample. Second, the computational complexity of robust estimates imposes an upper bound on the number of recalculations that are feasible. We will call the first problem the ``lack of robustness'' of the classical bootstrap. In agreement with Shao (1990), we found that the bootstrap distribution has heavy tails that produce inflated variance estimators and unduly long confidence intervals. As for the second problem, between 2,000 and 3,000 bootstrap samples are needed to estimate the percentiles for a confidence interval (Efron and Tibshirani, 1993). That many recalculations of a robust regression estimate are infeasible in practice with today's technology.
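The ``lack of robustness'' problem can be quantified directly. If the original sample of size $n$ contains $k$ outliers, the number of outliers in a bootstrap sample of size $n$ follows a Binomial$(n, k/n)$ distribution. The sketch below computes the resulting tail probabilities exactly. The function name `outlier_count_tail` and the values $n = 20$, $k = 2$ are hypothetical choices made for illustration.

```python
from math import comb


def outlier_count_tail(n, k, threshold):
    """P(a bootstrap sample of size n contains at least `threshold` outliers).

    The original sample is assumed to hold k outliers among n points, so each
    bootstrap draw is an outlier with probability k / n and the outlier count
    is Binomial(n, k / n).
    """
    p = k / n
    return sum(
        comb(n, j) * p**j * (1.0 - p) ** (n - j)
        for j in range(threshold, n + 1)
    )


n, k = 20, 2  # hypothetical sample: 10% outliers

# Probability that the bootstrap sample has strictly more outliers
# than the original sample.
prob_more_outliers = outlier_count_tail(n, k, k + 1)

# Probability that at least half of the bootstrap sample is outliers,
# enough to break even a high-breakdown-point estimate.
prob_at_least_half = outlier_count_tail(n, k, (n + 1) // 2)

print(prob_more_outliers, prob_at_least_half)
```

For these values a substantial fraction of bootstrap samples are more contaminated than the original sample, while samples with at least half outliers are very rare but not impossible. Both probabilities grow with the contamination fraction $k/n$.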
Department Web Master
2000-05-29