High Dimensional Bootstrap and Asymptotic Expansion for the $k$ -th Largest Coordinate

Long Feng
School of Statistics and Data Science, LEBPS, KLMDASR,
AAIS and LPMC, Nankai University

Abstract

We study bootstrap inference for the $k$ th largest coordinate of a normalized sum of independent high-dimensional random vectors. Existing second-order theory for maxima does not directly extend to order statistics, because the event $\{T_{n,[k]}\leq t\}$ is not a rectangle and its local structure is governed by exceedance counts rather than by a single boundary. We develop an approach based on factorial moments and weighted inclusion–exclusion that reduces the problem to a collection of rare-orthant probabilities and allows high-dimensional Edgeworth and Cornish–Fisher expansions to be transferred to the order-statistic setting. Under moment, variance, and weak-dependence conditions, we derive a second-order coverage expansion for wild-bootstrap critical values of the $k$ th order statistic. In particular, a third-moment matching wild bootstrap achieves coverage error of order $n^{-1}$ up to logarithmic factors, and the same second-order accuracy is obtained for a prepivoted double wild bootstrap. We also show that the maximal-correlation condition can be replaced by a stationary Gaussian exponential-mixing assumption at the price of an explicit dependence remainder $r_{d}$ , and this remainder can itself be of order $n^{-1}$ when the dimension is sufficiently large relative to the sample size. These results extend recent second-order Gaussian and bootstrap approximation theory from maxima to the $k$ th order statistic in high dimension.

Keywords: bootstrap coverage expansion; high-dimensional Gaussian approximation; $k$ th order statistic; second-order accuracy; wild bootstrap.

1 Introduction

High-dimensional Gaussian approximation for maxima and rectangular probabilities is now a basic tool in modern high-dimensional inference. For the maximum of a sum of independent random vectors, the seminal work of Chernozhukov et al. (2013) established Gaussian approximation and Gaussian multiplier bootstrap validity when the dimension is allowed to be much larger than the sample size. This line of work was sharpened substantially by Chernozhukov et al. (2017), who extended the approximation theory to hyperrectangles and improved the first-order rate. Later, Deng and Zhang (2020) showed that third-moment matching bootstrap procedures enjoy a better logarithmic dependence in the first-order bound, and Koike (2021) proved that the same logarithmic rate is already available for normal approximation. Among the general first-order results under mild moment assumptions, Chernozhukov et al. (2022) further improved the error bound to an $n^{-1/4}$ -type rate up to logarithmic factors. Under additional nondegeneracy or structural assumptions, nearly parametric $n^{-1/2}$ rates up to logarithmic losses are also available; see, for example, (Lopes et al., 2020; Fang and Koike, 2021; Chernozhukov et al., 2023; Fang and Koike, 2024).

A decisive recent development for maxima is the asymptotic expansion theory developed by Koike (2026). That paper developed high-dimensional Edgeworth and Cornish–Fisher expansions for maxima and related rectangular probabilities by combining Stein-kernel arguments, smoothing inequalities, and a careful analysis of Gaussian anti-concentration. As a consequence, Koike (2026) obtained a second-order bootstrap coverage expansion and showed that, in several important regimes, the coverage error can be improved from the first-order scale to $O\!\left(\log^{a}(dn)/n\right)$ for a suitable constant $a>0$ . In particular, for third-moment matching wild bootstrap, the maximum statistic becomes second-order accurate even without studentization under suitable covariance assumptions.

Compared with the theory for maxima, the literature for the $k$ th largest coordinate is still sparse. Classical results on order statistics and extremes, such as (Fisher and Tippett, 1928; Mu, 1966; Watts et al., 1982), do not address high-dimensional Gaussian approximation for sums of random vectors. On the Gaussian side, Kozbur (2021) studied dimension-free anti-concentration inequalities for Gaussian order statistics. In the genuinely high-dimensional setting, Ding et al. (2026) established Gaussian and Gaussian multiplier bootstrap approximations for the $k$ th largest coordinate and for more general functionals of the top- $k$ order statistics. For the $k$ th largest coordinate, their Kolmogorov bounds are of order

k^{2}\Bigl(\frac{B_{n}^{2}\log^{5}(pn)}{n}\Bigr)^{1/4},

up to universal constants, and the bounds for general top- $k$ functionals are of even larger order. Therefore the currently available theory for the $k$ th largest coordinate is still essentially first-order and does not provide a second-order coverage expansion comparable to the one available for maxima.

The purpose of the present paper is to fill this gap. We prove that the $k$ th largest coordinate of a high-dimensional normalized sum also admits a Koike-type second-order bootstrap expansion. Our argument starts from the exact exceedance-count representation of the event $\{T_{n,[k]}\leq t\}$ and combines weighted inclusion–exclusion with a local rare-orthant analysis. This allows us to transfer the second-order expansion machinery from maxima to the $k$ th order statistic. As a result, we show that third-moment matching wild bootstrap retains second-order accuracy for the $k$ th largest coordinate, and we also obtain a second-order result for the prepivoted double wild bootstrap. In this way, the second-order theory that was previously available only for maxima is extended to the $k$ th largest coordinate in high dimension.

We also give a complementary dependence formulation based on a stationary Gaussian reference field with exponentially decaying strong-mixing coefficients. This assumption is structurally different from the maximal-correlation condition used in the baseline theory: it exploits one-dimensional dependence and allows local clusters of highly correlated coordinates. In that setting we rework the Gaussian aggregation argument and obtain the same distributional, quantile, and coverage expansions with an explicit additional remainder $r_{d}$ that isolates the effect of local exceedance clustering. The resulting expression is fully explicit and can again be of order $n^{-1}$ when the dimension grows sufficiently quickly relative to the sample size.

The remainder of the paper is organized as follows. Section 2 presents the main theoretical results, including the exponential-mixing alternative in Section 2.3. Section 3 reports simulation results comparing several bootstrap methods. Section 4 concludes. Proofs are collected in Appendices A and B.

Notation. We write $[d]:=\{1,\dots,d\}$ . For a vector $\bm{x}\in\mathbb{R}^{m}$ , let

\|\bm{x}\|_{2}:=\Bigl(\sum_{j=1}^{m}x_{j}^{2}\Bigr)^{1/2},\qquad\|\bm{x}\|_{\infty}:=\max_{1\leq j\leq m}|x_{j}|.

We denote by $\mathbf{1}_{d}=(1,\dots,1)^{\top}\in\mathbb{R}^{d}$ the all-ones vector. For $r\in\mathbb{N}$ , $(\mathbb{R}^{m})^{\otimes r}$ denotes the set of real-valued $m$ -dimensional $r$ -tensors. If $\mathsf{T}\in(\mathbb{R}^{m})^{\otimes q}$ and $\mathsf{U}\in(\mathbb{R}^{m})^{\otimes r}$ , then $\mathsf{T}\otimes\mathsf{U}\in(\mathbb{R}^{m})^{\otimes(q+r)}$ denotes their tensor product. When $q=r$ , we write

\langle\mathsf{T},\mathsf{U}\rangle:=\sum_{j_{1},\dots,j_{r}=1}^{m}T_{j_{1},\dots,j_{r}}U_{j_{1},\dots,j_{r}},

and

\|\mathsf{T}\|_{1}:=\sum_{j_{1},\dots,j_{r}=1}^{m}|T_{j_{1},\dots,j_{r}}|,\qquad\|\mathsf{T}\|_{\infty}:=\max_{1\leq j_{1},\dots,j_{r}\leq m}|T_{j_{1},\dots,j_{r}}|.

For $\bm{x}\in\mathbb{R}^{m}$ , $\bm{x}^{\otimes r}$ denotes the $r$ th tensor power of $\bm{x}$ . Whenever $\bm{X}_{1},\dots,\bm{X}_{n}$ are under discussion, we set

\bar{\bm{X}}^{\,r}:=\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}^{\otimes r}.

Given an $r$ -times differentiable function $h:\mathbb{R}^{d}\rightarrow\mathbb{R}$ , we set $\nabla^{r}h(x):=\left(\partial^{j_{1},\ldots,j_{r}}h(x)\right)_{1\leq j_{1},\ldots,j_{r}\leq d}\in\left(\mathbb{R}^{d}\right)^{\otimes r}$ for $x\in\mathbb{R}^{d}$ , where $\partial^{j_{1},\ldots,j_{r}}=\frac{\partial^{r}}{\partial x_{j_{1}}\cdots\partial x_{j_{r}}}$ . For $m\in\mathbb{N}\cup\{\infty\}$ , $C_{b}^{m}\left(\mathbb{R}^{d}\right)$ denotes the set of bounded $C^{m}$ functions with bounded derivatives. For a multi-index $\alpha=(\alpha_{1},\dots,\alpha_{m})\in\mathbb{N}_{0}^{m}$ , we write

|\alpha|:=\sum_{j=1}^{m}\alpha_{j},\qquad\partial^{\alpha}:=\partial_{1}^{\alpha_{1}}\cdots\partial_{m}^{\alpha_{m}}.

For a positive definite matrix $V$ , let $\phi_{V}$ denote the density of $N(\bm{0},V)$ . We write $\Phi$ for the standard normal distribution function and $\bar{\Phi}:=1-\Phi$ for its survival function. For a distribution function $F:\mathbb{R}\to[0,1]$ , its generalized inverse is defined by

F^{-1}(p):=\inf\{t\in\mathbb{R}:F(t)\geq p\},\qquad p\in(0,1).

For $\alpha>0$ and a scalar random variable $Y$ , let

\|Y\|_{\psi_{\alpha}}:=\inf\Bigl\{C>0:\ \mathbb{E}\exp(|Y|^{\alpha}/C^{\alpha})\leq 2\Bigr\}.

For a matrix $\mathbf{A}=(a_{j\ell})$ , we set

\|\mathbf{A}\|_{\max}:=\max_{1\leq j,\ell\leq m}|a_{j\ell}|,\qquad R_{j}(\mathbf{A}):=\sum_{\ell\neq j}|a_{j\ell}|.

Also, $\mathbb{P}^{*}$ and $\mathbb{E}^{*}$ denote conditional probability and expectation given the data. We assume $d\geq 3$ whenever an expression containing $\log d$ appears, and similarly for $n$ .

2 Main Results

2.1 Asymptotic expansion of coverage probability

Let $\bm{X}_{1},\dots,\bm{X}_{n}$ be independent centered random vectors in $\mathbb{R}^{d}$ , and define

\bm{S}_{n}:=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\bm{X}_{i},\qquad\bm{Z}\sim N(\bm{0},\mathbf{\Sigma}),\qquad\mathbf{\Sigma}:=\mathrm{Var}(\bm{S}_{n}).

Write

T_{n,[1]}\geq T_{n,[2]}\geq\cdots\geq T_{n,[d]}

for the descending order statistics of the coordinates of $\bm{S}_{n}$ , and define $T_{\bm{Z},[k]}$ analogously from $\bm{Z}$ . Set

G_{k}(t):=\mathbb{P}(T_{\bm{Z},[k]}\leq t),\qquad f_{k}(t):=G_{k}^{\prime}(t)

whenever the derivative exists.

Let $w_{1},\dots,w_{n}$ be i.i.d. multipliers independent of the data. Put

\bar{\bm{X}}:=\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i},\qquad\bm{S}_{n}^{*}:=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}w_{i}(\bm{X}_{i}-\bar{\bm{X}}).

Let $T_{n,[k]}^{*}$ denote the $k$ th largest coordinate of $\bm{S}_{n}^{*}$ , and write

\hat{F}_{n,k}(t):=\mathbb{P}^{*}(T_{n,[k]}^{*}\leq t),\qquad\hat{c}_{p,k}:=\inf\{t\in\mathbb{R}:\hat{F}_{n,k}(t)\geq p\}.
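In practice $\hat{F}_{n,k}$ and $\hat{c}_{p,k}$ are approximated by Monte Carlo over multiplier draws. The following is a minimal sketch under stated assumptions, not the implementation used in this paper: the function names, the default of $B=499$ draws, and the Gaussian default multiplier are illustrative choices, and the empirical distribution of the simulated statistics stands in for the exact conditional law $\mathbb{P}^{*}$ .

```python
import math
import random


def kth_largest(values, k):
    """Return the k-th largest entry (k = 1 gives the maximum)."""
    return sorted(values, reverse=True)[k - 1]


def wild_bootstrap_critical_value(X, k, p, B=499, draw_multiplier=None, rng=None):
    """Monte Carlo approximation of c_hat_{p,k} from B multiplier draws.

    X is a list of n rows, each a list of d floats (hypothetical data layout).
    """
    if rng is None:
        rng = random.Random(0)
    # Gaussian multipliers by default; any draw with mean 0 and variance 1 can be passed.
    draw = draw_multiplier or (lambda: rng.gauss(0.0, 1.0))
    n, d = len(X), len(X[0])
    xbar = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - xbar[j] for j in range(d)] for row in X]

    stats = []
    for _ in range(B):
        w = [draw() for _ in range(n)]
        # S_n^* = n^{-1/2} * sum_i w_i (X_i - X_bar)
        s_star = [sum(w[i] * centered[i][j] for i in range(n)) / math.sqrt(n)
                  for j in range(d)]
        stats.append(kth_largest(s_star, k))

    stats.sort()
    # Generalized inverse of the empirical cdf: smallest t with F_hat(t) >= p.
    index = max(math.ceil(p * B) - 1, 0)
    return stats[index]


# Toy usage on centered exponential data (purely illustrative).
rng = random.Random(1)
X = [[rng.expovariate(1.0) - 1.0 for _ in range(20)] for _ in range(50)]
c_hat = wild_bootstrap_critical_value(X, k=2, p=0.9, B=199, rng=rng)
```

The index `ceil(p * B) - 1` mirrors the infimum in the definition of $\hat{c}_{p,k}$ when $\hat{F}_{n,k}$ is replaced by the empirical distribution function of the $B$ simulated statistics.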

For each coordinate, let $\sigma_{j}^{2}:=\Sigma_{jj}$ , $\underline{\sigma}:=\min_{1\leq j\leq d}\sigma_{j}$ , and $\overline{\sigma}:=\max_{1\leq j\leq d}\sigma_{j}$ . For $p\in(0,1)$ , define the Gaussian quantile $c^{G}_{p,k}:=G_{k}^{-1}(p)$ . Fix $\epsilon\in(0,1/2)$ and define the quantile window

\mathcal{T}_{k,\epsilon}:=\{c^{G}_{p,k}:\ p\in[\epsilon/2,1-\epsilon/2]\}.

For $t\in\mathbb{R}$ , define the exceedance counts

N_{n}(t):=\sum_{j=1}^{d}\mathbf{1}\{S_{n,j}>t\},\qquad N_{n}^{*}(t):=\sum_{j=1}^{d}\mathbf{1}\{S_{n,j}^{*}>t\},\qquad N_{Z}(t):=\sum_{j=1}^{d}\mathbf{1}\{Z_{j}>t\}.

Then

T_{n,[k]}\leq t\iff N_{n}(t)\leq k-1.
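This equivalence holds pathwise, including when coordinates tie with $t$ , because the count uses the strict inequality $S_{n,j}>t$ . A small numerical check, offered only as an illustration with hypothetical helper names, is:

```python
import random


def exceedance_count(values, t):
    """N(t): number of coordinates strictly above t."""
    return sum(1 for v in values if v > t)


def kth_largest(values, k):
    """Return the k-th largest entry (k = 1 gives the maximum)."""
    return sorted(values, reverse=True)[k - 1]


rng = random.Random(0)
checked = 0
for _ in range(1000):
    s = [rng.gauss(0.0, 1.0) for _ in range(10)]
    t = rng.uniform(-2.0, 2.0)
    for k in range(1, 11):
        # {T_[k] <= t} and {N(t) <= k - 1} must agree on every sample path.
        assert (kth_largest(s, k) <= t) == (exceedance_count(s, t) <= k - 1)
        checked += 1

# A tie at the threshold: the maximum equals t, so no coordinate exceeds t.
tie_ok = (kth_largest([1.0, 1.0, 0.0], 1) <= 1.0) == (exceedance_count([1.0, 1.0, 0.0], 1.0) <= 0)
```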

For every integer $s\geq 1$ , define

V_{n,s}(t):=\mathbb{E}\binom{N_{n}(t)}{s},\qquad V^{*}_{n,s}(t):=\mathbb{E}^{*}\binom{N_{n}^{*}(t)}{s},\qquad V_{Z,s}(t):=\mathbb{E}\binom{N_{Z}(t)}{s}.

For each nonempty $I\subset[d]$ , write

B_{I}(t):=\{\bm{z}\in\mathbb{R}^{d}:\ z_{j}>t,\forall j\in I\}

and

\pi_{I}(t):=\mathbb{P}(Z_{j}>t,\forall j\in I).

Let $\phi_{\mathbf{\Sigma}}$ denote the density of $N(\bm{0},\mathbf{\Sigma})$ , and abbreviate $\phi:=\phi_{\mathbf{\Sigma}}$ . The first-order Edgeworth density for $\bm{S}_{n}$ is

p_{n}(\bm{z}):=\phi(\bm{z})-\frac{1}{6\sqrt{n}}\bigl\langle\mathbb{E}[\bar{\bm{X}}^{\,3}],\nabla^{3}\phi(\bm{z})\bigr\rangle.

Let $\gamma:=\mathbb{E}(w_{1}^{3}).$ The bootstrap Edgeworth density is defined by

\hat{p}_{n,\gamma}(\bm{z}):=\phi(\bm{z})+\frac{1}{2}\bigl\langle\bar{\bm{X}}^{\,2}-\mathbf{\Sigma},\nabla^{2}\phi(\bm{z})\bigr\rangle-\frac{\gamma}{6\sqrt{n}}\bigl\langle\bar{\bm{X}}^{\,3},\nabla^{3}\phi(\bm{z})\bigr\rangle.

For each integer $s\geq 1$ , define

M_{n,s}(t):=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}p_{n,I}(\bm{u})\,d\bm{u},\qquad\hat{M}_{n,s,\gamma}(t):=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}\hat{p}_{n,\gamma,I}(\bm{u})\,d\bm{u},

where $p_{n,I}$ and $\hat{p}_{n,\gamma,I}$ denote the corresponding projected densities defined later. Also set

M_{Z,s}(t):=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\pi_{I}(t)=V_{Z,s}(t).

We now present the assumptions underlying our analysis.

Assumption 2.1.

The vectors $\bm{X}_{1},\dots,\bm{X}_{n}$ are independent and centered. Each $\bm{X}_{i}$ admits a Stein kernel $\bm{\tau}_{i}(\bm{X}_{i})$ in the sense that

\mathbb{E}\bigl[\bm{X}_{i}^{\top}f(\bm{X}_{i})\bigr]=\mathbb{E}\bigl[\mathrm{tr}\{\bm{\tau}_{i}(\bm{X}_{i})\nabla f(\bm{X}_{i})^{\top}\}\bigr]

for every smooth vector-valued test function for which both sides are finite. There exist constants $b>0$ and $\sigma_{*}>0$ such that

(i)

$\lambda_{\min}(\mathbf{\Sigma})\geq\sigma_{*}^{2}$ ;
(ii)

$\max_{1\leq i\leq n}\max_{1\leq j\leq d}\|X_{ij}\|_{\psi_{1}}\leq b,\max_{1\leq i\leq n}\max_{1\leq j,\ell\leq d}\|\tau_{i,j\ell}(\bm{X}_{i})-\mathbb{E}\tau_{i,j\ell}(\bm{X}_{i})\|_{\psi_{1}}\leq b^{2};$
(iii)

The quantities $\delta_{n}:=\frac{b^{5}}{\sigma_{*}^{5}}\frac{\log^{3}(dn)}{n}$ and $\varepsilon_{n}:=\sqrt{\delta_{n}\log n}$ satisfy $\varepsilon_{n}\to 0$ .

Remark 2.1.

Assumption 2.1 is the data-side regularity condition. The Stein identity provides the analytic device behind the projected Edgeworth expansion and is a convenient substitute for classical Cramér-type smoothness conditions in high dimension. As emphasized in Koike (2026, Remark 2.4), in one dimension the existence of a Stein kernel implies a nontrivial absolutely continuous component, and hence Cramér’s condition, whereas in higher dimensions Stein kernels remain available even in situations where a multivariate Cramér condition is not appropriate, such as Gaussian laws with singular covariance matrices. In our setting, part (i) requires a uniform lower bound on $\lambda_{\min}(\mathbf{\Sigma})$ , which prevents global degeneracy of the Gaussian comparison law and guarantees that the projected Gaussian densities and their derivatives remain well behaved. Part (ii) imposes sub-exponential control on both the coordinates $X_{ij}$ and the fluctuations of the Stein-kernel entries. Since

\mathbb{E}\{\bm{\tau}_{i}(\bm{X}_{i})\}=\mathbb{E}(\bm{X}_{i}\bm{X}_{i}^{\top}),

the centered quantity

\tau_{i,j\ell}(\bm{X}_{i})-\mathbb{E}\tau_{i,j\ell}(\bm{X}_{i})

measures the random fluctuation of the local covariance proxy around its population counterpart; controlling these fluctuations is exactly what allows Koike’s decomposition to be applied uniformly over the low-dimensional projections that enter our inclusion–exclusion argument. Finally, part (iii) is the high-dimensional scaling condition ensuring that the resulting remainder terms vanish. In particular, it specifies the regime in which the projected Edgeworth approximation is accurate enough to deliver a valid second-order expansion for the coverage probability.

Assumption 2.2.

The multipliers $w_{1},w_{2},\dots$ are i.i.d., independent of the data, satisfy

\mathbb{E}w_{1}=0,\qquad\mathbb{E}w_{1}^{2}=1,\qquad\mathbb{E}|w_{1}|^{m}<\infty\quad\text{for all }m\geq 1,

and, in addition, satisfy one of the following two conditions:

(i)

$w_{1}\sim N(0,1)$ ;
(ii)

$w_{1}$ admits a Stein kernel $\tau^{w}(w_{1})$ and there exists a constant $b_{w}\geq 1$ such that

$|w_{1}|\leq b_{w},\qquad|\tau^{w}(w_{1})|\leq b_{w}^{2}\qquad\text{a.s.}$

The constants in the sequel are allowed to depend on $b_{w}$ .

Remark 2.2.

Assumption 2.2 is the bootstrap analogue of Assumption 2.1. It ensures that, conditional on the data, the multiplier statistic admits the same kind of Stein–Edgeworth expansion as the original statistic. The Gaussian case is separated out because it is the canonical multiplier choice and automatically fits the required framework. The alternative bounded Stein-kernel condition covers smooth non-Gaussian multipliers and is particularly useful for moment matching, which is central to the second-order improvement. As discussed in Koike (2026), this framework does not cover two-point multipliers such as Mammen’s weights, since two-point laws do not admit Stein kernels. Thus, the restriction is a limitation of the present proof strategy rather than of the bootstrap principle itself.

Assumption 2.3.

There exist constants $0<\underline{\sigma}\leq\overline{\sigma}<\infty$ such that

\underline{\sigma}^{2}\leq\Sigma_{jj}\leq\overline{\sigma}^{2},\qquad j=1,\dots,d.

Remark 2.3.

Assumption 2.3 places all coordinates on a common scale. Because our target is the raw order statistic $T_{n,[k]}$ , we are ranking the coordinates of the normalized sum without any coordinatewise rescaling. Uniform upper and lower bounds on the marginal variances therefore rule out the possibility that some coordinates dominate the ranking merely because their variances diverge, or become asymptotically irrelevant because their variances vanish. Without this assumption, the geometry of the $k$ th largest coordinate would depend on heterogeneous marginal scales, and the limiting problem would be substantially more complicated. In that regime one would typically need a different normalization or even a different target statistic.

Assumption 2.4.

Let $\rho_{d}:=\max_{1\leq i\neq j\leq d}|\Sigma_{ij}|.$ We assume $\rho_{d}\log d\to 0.$

Remark 2.4.

Assumption 2.4 is a weak-dependence condition tailored to our proof of the order-statistic expansion. The key step in the argument is to approximate the event $\{T_{n,[k]}>t\}$ by a finite-order inclusion–exclusion expansion and to show that the probability of having many coordinates simultaneously exceeding $t$ is negligible. For this strategy to work, exceedances above a high threshold must behave as rare events with only weak clustering, and the condition $\rho_{d}\log d\to 0$ enforces exactly this feature. When pairwise correlations are too strong, exceedances can occur in large clusters, and then one can no longer guarantee that the probability of having more than $k_{0}$ coordinates above the threshold decays fast enough for the truncation argument to be valid. Handling such strongly dependent regimes would require substantial further study.

Fix a constant $A>0$ and define $k_{0}:=\left\lceil A\log(\varepsilon_{n}^{-1})\right\rceil$ . Throughout, $k\geq 1$ is fixed. Finally define

Q_{n,k}(t):=-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\{M_{n,s}(t)-M_{Z,s}(t)\},

(1)

and define $\hat{Q}_{n,\gamma,k}(t)$ analogously with $\hat{M}_{n,s,\gamma}(t)$ in place of $M_{n,s}(t)$ .
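The weights $(-1)^{s-k}\binom{s-1}{k-1}$ in (1) are those of the classical identity $\mathbb{P}(N\geq k)=\sum_{s\geq k}(-1)^{s-k}\binom{s-1}{k-1}\mathbb{E}\binom{N}{s}$ for a count $N$ of events. The sketch below checks this identity by brute force in a toy setting that is an assumption of the example only: independent exceedance indicators with hypothetical probabilities, and the full sum up to $d$ rather than the truncation at $k_{0}$ .

```python
from itertools import combinations, product
from math import comb, prod


def factorial_moment(probs, s):
    """E[C(N, s)]: sum over |I| = s of P(all coordinates in I exceed), under independence."""
    return sum(prod(probs[j] for j in I)
               for I in combinations(range(len(probs)), s))


def cdf_via_inclusion_exclusion(probs, k):
    """P(N <= k - 1) = 1 - sum_{s >= k} (-1)^(s-k) C(s-1, k-1) E[C(N, s)]."""
    d = len(probs)
    tail = sum((-1) ** (s - k) * comb(s - 1, k - 1) * factorial_moment(probs, s)
               for s in range(k, d + 1))
    return 1.0 - tail


def cdf_brute_force(probs, k):
    """P(N <= k - 1) by enumerating all 2^d exceedance patterns."""
    total = 0.0
    for outcome in product([0, 1], repeat=len(probs)):
        if sum(outcome) <= k - 1:
            total += prod(p if e else 1.0 - p for p, e in zip(probs, outcome))
    return total


probs = [0.05, 0.10, 0.02, 0.08, 0.03, 0.07]  # hypothetical exceedance probabilities
gaps = [abs(cdf_via_inclusion_exclusion(probs, k) - cdf_brute_force(probs, k))
        for k in (1, 2, 3)]
```

Applying the same weights to the differences $M_{n,s}(t)-M_{Z,s}(t)$ and truncating at $k_{0}$ gives the correction $Q_{n,k}(t)$ in (1).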

Theorem 2.1.

Assume Assumptions 2.1–2.4. Then, for $A>0$ large enough,

\sup_{\epsilon<\alpha<1-\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha,k}\bigr)-\left[\alpha-(1-\gamma)Q_{n,k}(c^{G}_{1-\alpha,k})-\mathbb{E}\{R_{n,k}(\alpha)\}\right]\right|\leq C\varepsilon_{n}^{2},

(2)

where

R_{n,k}(\alpha):=\frac{f_{k}^{\prime}(c^{G}_{1-\alpha,k})}{2f_{k}(c^{G}_{1-\alpha,k})^{3}}\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k})^{2}-\frac{\hat{Q}_{n,\gamma,k}^{\prime}(c^{G}_{1-\alpha,k})}{f_{k}(c^{G}_{1-\alpha,k})^{2}}\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k}).

Theorem 2.1 is the main second-order coverage statement for the single wild bootstrap. It shows that the leading coverage distortion is described by the deterministic linear term $(1-\gamma)Q_{n,k}$ together with the quadratic Cornish–Fisher correction $\mathbb{E}\{R_{n,k}(\alpha)\}$ , while the remaining error is of order $\varepsilon_{n}^{2}$ .

Corollary 2.1 (Third-moment matching).

Under the assumptions of Theorem 2.1, if $\gamma=1$ , then

\sup_{\epsilon<\alpha<1-\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha,k}\bigr)-\alpha\right|\leq C\varepsilon_{n}^{2}.

Corollary 2.1 shows that matching the third multiplier moment removes the linear coverage distortion identified in Theorem 2.1. The wild bootstrap then becomes second-order accurate on the $\varepsilon_{n}^{2}$ scale without any further correction.

Corollary 2.2 (Persistence of the first-order term).

Under the assumptions of Theorem 2.1,

\sup_{\epsilon<\alpha<1-\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha,k}\bigr)-\alpha+(1-\gamma)Q_{n,k}(c^{G}_{1-\alpha,k})\right|\leq C\varepsilon_{n}^{2}.

In particular, if for some $\alpha_{0}\in(\epsilon,1-\epsilon)$ one has

|Q_{n,k}(c^{G}_{1-\alpha_{0},k})|\geq c_{0}\varepsilon_{n},

then

\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha_{0},k}\bigr)-\alpha_{0}\right|\geq|1-\gamma|c_{0}\varepsilon_{n}-C\varepsilon_{n}^{2}.

Corollary 2.2 shows that the term $(1-\gamma)Q_{n,k}$ is not an artifact of the proof. Unless the third moment is matched, the single-bootstrap coverage error typically remains of first-order size.

2.2 Double wild bootstrap

Let $v_{1},\dots,v_{n}$ be i.i.d. multipliers, independent of everything else, satisfying

\mathbb{E}v_{1}=0,\qquad\mathbb{E}v_{1}^{2}=1,\qquad\mathbb{E}v_{1}^{3}=1,

and the same regularity condition as in Assumption 2.2. Define

\bm{X}_{i}^{*}:=w_{i}(\bm{X}_{i}-\bar{\bm{X}}),\qquad\bar{\bm{X}}^{*}:=\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}^{*},\qquad\bm{S}_{n}^{**}:=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_{i}(\bm{X}_{i}^{*}-\bar{\bm{X}}^{*}).

Let $T^{**}_{n,[k]}$ be the $k$ th largest coordinate of $\bm{S}_{n}^{**}$ , let

\hat{F}^{*}_{n,k}(t):=\mathbb{P}^{**}(T^{**}_{n,[k]}\leq t),

and define

\hat{\beta}_{\alpha,k}:=\inf\Bigl\{\beta\in(0,1):\ \mathbb{P}^{*}\bigl(\hat{F}^{*}_{n,k}(T_{n,[k]}^{*})\leq\beta\bigr)\geq 1-\alpha\Bigr\}.

The prepivoted double-bootstrap test rejects when

T_{n,[k]}\geq\hat{c}_{\hat{\beta}_{\alpha,k},k}.
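The prepivoting step can be sketched by nested Monte Carlo. The code below is a minimal illustration under stated assumptions rather than the implementation used in this paper: all function names are hypothetical, the first-level multipliers are taken to be Gaussian, the second-level multipliers are the standardized Beta weights described in Section 3 (which have third moment one), and the small replication numbers `B1` and `B2` are chosen only to keep the example fast.

```python
import math
import random


def kth_largest(values, k):
    return sorted(values, reverse=True)[k - 1]


def center(rows):
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows]


def scale_rows(rows, mult):
    return [[m * x for x in r] for r, m in zip(rows, mult)]


def normalized_sum_stat(rows, k):
    """k-th largest coordinate of n^{-1/2} * (sum of rows)."""
    n, d = len(rows), len(rows[0])
    return kth_largest([sum(r[j] for r in rows) / math.sqrt(n) for j in range(d)], k)


def empirical_quantile(sample, p):
    ordered = sorted(sample)
    return ordered[min(max(math.ceil(p * len(ordered)) - 1, 0), len(ordered) - 1)]


def make_beta_multiplier(rng, nu=0.1):
    """Standardized Beta multiplier with unit third moment (Section 3 parametrization)."""
    c = nu * nu + 20 * nu + 20
    a = 0.5 * nu * (1 - (nu + 2) / math.sqrt(c))
    b = 0.5 * nu * (1 + (nu + 2) / math.sqrt(c))
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return lambda: (rng.betavariate(a, b) - mean) / sd


def double_wild_bootstrap_test(X, k, alpha, draw_w, draw_v, B1=99, B2=49):
    n = len(X)
    Xc = center(X)
    t_obs = normalized_sum_stat(X, k)            # T_{n,[k]}
    t_star, u_star = [], []
    for _ in range(B1):
        X_star = scale_rows(Xc, [draw_w() for _ in range(n)])   # X_i^* = w_i (X_i - X_bar)
        t_b = normalized_sum_stat(X_star, k)                    # T^*_{n,[k]}
        Xc_star = center(X_star)                                # X_i^* - X_bar^*
        t_2 = [normalized_sum_stat(scale_rows(Xc_star, [draw_v() for _ in range(n)]), k)
               for _ in range(B2)]                              # draws of T^{**}_{n,[k]}
        t_star.append(t_b)
        u_star.append(sum(1 for t in t_2 if t <= t_b) / B2)     # F_hat^*(T^*)
    beta_hat = empirical_quantile(u_star, 1.0 - alpha)          # prepivoted level
    c_hat = empirical_quantile(t_star, beta_hat)                # first-level quantile at beta_hat
    return t_obs >= c_hat, beta_hat, c_hat


rng = random.Random(2)
X = [[rng.expovariate(1.0) - 1.0 for _ in range(8)] for _ in range(30)]
reject, beta_hat, c_hat = double_wild_bootstrap_test(
    X, k=2, alpha=0.1,
    draw_w=lambda: rng.gauss(0.0, 1.0),
    draw_v=make_beta_multiplier(rng),
    B1=59, B2=39)
```

Here `beta_hat` is the empirical $(1-\alpha)$ quantile of the simulated values of $\hat{F}^{*}_{n,k}(T^{*}_{n,[k]})$ , which is the Monte Carlo counterpart of $\hat{\beta}_{\alpha,k}$ .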

Theorem 2.2.

Assume Assumptions 2.1–2.4 and the second-level multiplier condition above. Then, for every fixed $\epsilon\in(0,1/4)$ ,

\sup_{2\epsilon<\alpha<1-2\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{\hat{\beta}_{\alpha,k},k}\bigr)-\alpha\right|\leq C\varepsilon_{n}^{2}.

(3)

Theorem 2.2 shows that prepivoting removes the leading single-bootstrap distortion and restores second-order accuracy. Thus the double wild bootstrap achieves the same $\varepsilon_{n}^{2}$ coverage scale as the third-moment matching single bootstrap.

2.3 A stationary exponential-mixing alternative

The maximal-correlation condition in Assumption 2.4 can be replaced by a one-dimensional dependence condition when the Gaussian reference field is generated by a stationary Gaussian sequence. The price is an explicit additional remainder that records the contribution of local clusters of exceedances.

Assumption 2.5 (Stationary Gaussian coordinates with exponential strong mixing).

The Gaussian reference vector $\bm{Z}=(Z_{1},\dots,Z_{d})^{\top}$ is the first $d$ coordinates of a centered stationary Gaussian sequence $\{Z_{j}\}_{j\in\mathbb{Z}}$ with covariance function

\Gamma(h):=\mathrm{Cov}(Z_{0},Z_{h}),\qquad\Gamma(0)=\sigma^{2}\in[\underline{\sigma}^{2},\overline{\sigma}^{2}].

Its strong-mixing coefficients satisfy

\alpha(\ell):=\sup\Bigl\{\bigl|\mathbb{P}(A\cap B)-\mathbb{P}(A)\mathbb{P}(B)\bigr|:A\in\sigma(Z_{j}:j\leq 0),\ B\in\sigma(Z_{j}:j\geq\ell)\Bigr\}\leq C_{\alpha}e^{-a_{\alpha}\ell},\qquad\ell\geq 1,

for some constants $C_{\alpha}\geq 1$ and $a_{\alpha}>0$ .

Write

\rho(h):=\mathrm{Corr}(Z_{0},Z_{h})=\Gamma(h)/\sigma^{2},\qquad h\in\mathbb{Z},\qquad p(t):=\mathbb{P}(Z_{1}>t)=\bar{\Phi}(t/\sigma),\qquad\lambda(t):=d\,p(t).

Because every $2\times 2$ principal submatrix of $\mathbf{\Sigma}$ has diagonal entries at most $\overline{\sigma}^{2}$ and smallest eigenvalue at least $\sigma_{*}^{2}$ , we have

\sup_{h\geq 1}|\rho(h)|\leq 1-\frac{\sigma_{*}^{2}}{\overline{\sigma}^{2}}=:\vartheta_{*}<1.

(4)

Set

\beta_{*}:=\frac{1-\vartheta_{*}}{1+\vartheta_{*}}>0.

(5)

Let $\Lambda_{k,\epsilon}>0$ be the unique constant satisfying

h_{k}(\Lambda_{k,\epsilon})=\epsilon/8,\qquad h_{k}(\lambda):=e^{-\lambda}\sum_{m=0}^{k-1}\frac{\lambda^{m}}{m!}.

(6)
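Since $h_{k}(\lambda)$ is the probability that a Poisson variable with mean $\lambda$ is at most $k-1$ , it decreases strictly from $1$ to $0$ , so $\Lambda_{k,\epsilon}$ can be located by bisection. The following sketch uses hypothetical function names and an arbitrary tolerance; for $k=1$ it reduces to the closed form $\Lambda_{1,\epsilon}=\log(8/\epsilon)$ .

```python
import math


def h(k, lam):
    """h_k(lambda) = exp(-lambda) * sum_{m < k} lambda^m / m!."""
    return math.exp(-lam) * sum(lam ** m / math.factorial(m) for m in range(k))


def solve_Lambda(k, eps, tol=1e-12):
    """Bisection for the root of h_k(Lambda) = eps / 8."""
    target = eps / 8.0
    lo, hi = 0.0, 1.0
    while h(k, hi) > target:      # expand the bracket until h_k(hi) <= target
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(k, mid) > target:    # h_k is decreasing, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


Lambda_1 = solve_Lambda(1, 0.1)   # closed form: log(80)
Lambda_5 = solve_Lambda(5, 0.1)
```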

Fix

m_{d}:=\left\lceil d^{\beta_{*}/4}\right\rceil,\qquad\ell_{d}:=\left\lceil\frac{8(k_{0}+2)\log(2d)+8\log n}{a_{\alpha}}\right\rceil,\qquad q_{d}:=\left\lfloor\frac{d}{m_{d}+\ell_{d}}\right\rfloor.

(7)

Define

\eta_{1,d}:=\frac{\ell_{d}}{m_{d}}+\frac{m_{d}+\ell_{d}}{d}+d^{-3\beta_{*}/4}(\log d)^{-1/2},

(8)

and

r_{d}:=\eta_{1,d}+q_{d}^{-1}+d^{k_{0}+1}\alpha(\ell_{d})+\frac{(3\Lambda_{k,\epsilon})^{k_{0}+1}}{(k_{0}+1)!}.

(9)

Since (7) implies

\alpha(\ell_{d})\leq C_{\alpha}n^{-8}(2d)^{-8(k_{0}+2)},

(10)

the sequence $r_{d}$ tends to $0$ .
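To get a feel for the size of $r_{d}$ , definitions (4), (5), and (7)–(9) can be evaluated numerically. The sketch below is illustrative only: the function name and all input values are hypothetical, $\Lambda_{k,\epsilon}$ is passed in as a number, and $\alpha(\ell_{d})$ is replaced by its upper bound $C_{\alpha}e^{-a_{\alpha}\ell_{d}}$ , so the output is an upper bound on $r_{d}$ . Large terms are computed on the log scale to avoid overflow.

```python
import math


def dependence_remainder(d, n, k0, Lambda, sigma_star, sigma_bar, a_alpha, C_alpha=1.0):
    theta = 1.0 - sigma_star ** 2 / sigma_bar ** 2                 # (4)
    beta = (1.0 - theta) / (1.0 + theta)                            # (5)
    m_d = math.ceil(d ** (beta / 4.0))                              # (7)
    l_d = math.ceil((8 * (k0 + 2) * math.log(2 * d) + 8 * math.log(n)) / a_alpha)
    q_d = d // (m_d + l_d)
    if q_d == 0:
        raise ValueError("d is too small relative to m_d + l_d")
    eta_1 = (l_d / m_d + (m_d + l_d) / d
             + d ** (-0.75 * beta) / math.sqrt(math.log(d)))        # (8)
    # d^{k0+1} * C_alpha * exp(-a_alpha * l_d), evaluated via logarithms.
    mixing = math.exp((k0 + 1) * math.log(d) + math.log(C_alpha) - a_alpha * l_d)
    # (3 * Lambda)^{k0+1} / (k0+1)!, evaluated via lgamma.
    poisson_tail = math.exp((k0 + 1) * math.log(3.0 * Lambda) - math.lgamma(k0 + 2))
    r_d = eta_1 + 1.0 / q_d + mixing + poisson_tail                 # (9)
    return {"beta_star": beta, "m_d": m_d, "l_d": l_d, "q_d": q_d,
            "eta_1": eta_1, "r_d": r_d}


# Deliberately extreme, hypothetical inputs: the block-length ratio l_d / m_d
# is small only when d is very large.
out = dependence_remainder(d=10 ** 40, n=400, k0=60, Lambda=5.0,
                           sigma_star=1.0, sigma_bar=1.0, a_alpha=1.0)
```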

Theorem 2.3 (Stationary exponential-mixing alternative).

Assume Assumptions 2.1, 2.2, 2.3, and 2.5. Then, for $A>0$ large enough, there exists a constant $C>0$ such that

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}(T_{n,[k]}\leq t)-\bigl(G_{k}(t)+Q_{n,k}(t)\bigr)\right|\leq C(\varepsilon_{n}^{2}+r_{d}),

(11)

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}^{*}(T_{n,[k]}^{*}\leq t)-\bigl(G_{k}(t)+\hat{Q}_{n,\gamma,k}(t)\bigr)\right|\leq C(\varepsilon_{n}^{2}+r_{d})

(12)

with probability at least $1-C/n$ ,

\sup_{\epsilon<\alpha<1-\epsilon}\left|\hat{c}_{1-\alpha,k}-\left[c^{G}_{1-\alpha,k}-\frac{\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k})}{f_{k}(c^{G}_{1-\alpha,k})}+R_{n,k}(\alpha)\right]\right|\leq C(\varepsilon_{n}^{3}+r_{d}),

(13)

with probability at least $1-C/n$ ,

\sup_{\epsilon<\alpha<1-\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha,k}\bigr)-\left[\alpha-(1-\gamma)Q_{n,k}(c^{G}_{1-\alpha,k})-\mathbb{E}\{R_{n,k}(\alpha)\}\right]\right|\leq C(\varepsilon_{n}^{2}+r_{d}),

(14)

and, if $\gamma=1$ ,

\sup_{\epsilon<\alpha<1-\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha,k}\bigr)-\alpha\right|\leq C(\varepsilon_{n}^{2}+r_{d}).

(15)

The corresponding double wild bootstrap statement also holds with the same remainder:

\sup_{2\epsilon<\alpha<1-2\epsilon}\left|\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{\hat{\beta}_{\alpha,k},k}\bigr)-\alpha\right|\leq C(\varepsilon_{n}^{2}+r_{d}).

(16)

Theorem 2.3 replaces the maximal-correlation condition by a one-dimensional dependence assumption on the Gaussian reference field. The price is the explicit remainder $r_{d}$ , which isolates the effect of local clustering while leaving the structure of the second-order expansion unchanged.

Remark 2.5.

The remainder $r_{d}$ is driven mainly by the block-length ratio $\ell_{d}/m_{d}$ . Since $k_{0}\leq C\log n$ , the definition of $\ell_{d}$ yields

\ell_{d}\leq C\log n\,\log(2d).

Consequently,

r_{d}\leq C\frac{\log n\,\log(2d)}{d^{\beta_{*}/4}}+Cd^{-1+\beta_{*}/4}+Cd^{-3\beta_{*}/4}(\log d)^{-1/2}+Cn^{-8}d^{-7k_{0}-15}+C\frac{(3\Lambda_{k,\epsilon})^{k_{0}+1}}{(k_{0}+1)!}.

In particular, a sufficient condition for $r_{d}=O(n^{-1})$ is

d^{\beta_{*}/4}\geq Cn\log n\,\log(2d).

If $d=n^{c}$ , then

r_{d}\leq Cn^{-c\beta_{*}/4}\log^{2}n+Cn^{-c(1-\beta_{*}/4)}+Cn^{-3c\beta_{*}/4}(\log n)^{-1/2}+Cn^{-8-c(7k_{0}+15)}+C\frac{(3\Lambda_{k,\epsilon})^{k_{0}+1}}{(k_{0}+1)!},

so $r_{d}=O(n^{-1})$ whenever $c>4/\beta_{*}$ .

The proof of Theorem 2.3 is given in Appendix B. Only the Gaussian aggregation part of Appendix A needs to be modified; the projected local Edgeworth expansion remains unchanged once the shift/strip bounds are re-established under Assumption 2.5.

3 Simulation

We investigate the finite-sample size of the bootstrap procedures for the $k$ th largest coordinate $T_{n,[k]}$ , where $T_{n,[1]}\geq T_{n,[2]}\geq\cdots\geq T_{n,[d]}$ are the ordered coordinates of $\bm{S}_{n}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\bm{X}_{i}.$ The simulation design is kept fixed across all experiments, and only the target order statistic is varied. We report results for $k\in\{2,5,10\}.$ The case $k=1$ coincides with the maximum and is therefore omitted here.

Throughout the simulation, the dimension is fixed at $d=400,$ and the sample size is taken from $n\in\{200,400\}.$ For the dependence structure, we consider two correlation designs. In Design I,

\mathbf{R}=\rho\,\mathbf{1}_{d}\mathbf{1}_{d}^{\top}+(1-\rho)\mathbf{I}_{d},

and in Design II,

\mathbf{R}=(\rho^{|j-\ell|}),\qquad 1\leq j,\ell\leq d,

with $\rho\in\{0.2,0.8\}.$

Let $\Phi$ denote the standard normal distribution function. For $\theta>0$ , let $F_{\theta}$ be the distribution function of the gamma distribution with shape parameter $\theta$ and unit scale. For each Monte Carlo repetition, we first generate

\bm{Z}_{i}=(Z_{i1},\dots,Z_{id})^{\top}\sim N(\bm{0},\mathbf{R}),\qquad i=1,\dots,n,

independently, and then define

U_{ij}=F_{\theta}^{-1}\bigl(\Phi(Z_{ij})\bigr),\qquad 1\leq i\leq n,\;1\leq j\leq d.

This yields a Gaussian-copula model with gamma marginals.

We consider two cases.

•

Asymmetric case. We set $\theta=1$ and define

$\bm{X}_{i}=\bm{U}_{i}-\mathbf{1}_{d},\qquad i=1,\dots,n,$

where $\bm{U}_{i}=(U_{i1},\dots,U_{id})^{\top}$ . Since each marginal has mean $1$ , the vector $\bm{X}_{i}$ is centered.
•

Symmetric case. We set $\theta=\tfrac{1}{2}$ . Let $\bm{U}_{i}^{\prime}$ be an independent copy of $\bm{U}_{i}$ , and define

$\bm{X}_{i}=\bm{U}_{i}-\bm{U}_{i}^{\prime},\qquad i=1,\dots,n.$

This symmetrization removes skewness. The choice $\theta=\tfrac{1}{2}$ keeps the marginal kurtosis on the same scale as in the asymmetric setup.
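The data-generating mechanism above can be sketched with the standard library alone, because both gamma quantile functions used here have closed forms: for $\theta=1$ the marginal is standard exponential, and for $\theta=\tfrac{1}{2}$ it is the law of $N^{2}/2$ with $N\sim N(0,1)$ . The function names are hypothetical, the factor representations of the two correlation designs are a convenience of this example, and the numerical guards near the extreme tails are ad hoc.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gaussian_row(d, rho, design, rng):
    """One draw of Z ~ N(0, R): Design I is equicorrelated, Design II has entries rho^|j-l|."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
    if design == "I":
        common = rng.gauss(0.0, 1.0)   # shared factor gives correlation rho
        return [math.sqrt(rho) * common + math.sqrt(1.0 - rho) * e for e in eps]
    if design == "II":
        z = [eps[0]]                   # stationary AR(1) recursion
        for e in eps[1:]:
            z.append(rho * z[-1] + math.sqrt(1.0 - rho * rho) * e)
        return z
    raise ValueError("design must be 'I' or 'II'")


def gamma_quantile_of_normal(z, theta):
    """F_theta^{-1}(Phi(z)) for theta in {1, 1/2}, using closed forms."""
    if theta == 1.0:
        # Exp(1): F^{-1}(u) = -log(1 - u), and 1 - Phi(z) = Phi(-z).
        return -math.log(max(STD_NORMAL.cdf(-z), 1e-300))
    if theta == 0.5:
        # Gamma(1/2, 1) is the law of N^2 / 2, so F^{-1}(u) = (Phi^{-1}((1 + u) / 2))^2 / 2.
        u = STD_NORMAL.cdf(z)
        return 0.5 * STD_NORMAL.inv_cdf(min(0.5 * (1.0 + u), 1.0 - 1e-16)) ** 2
    raise ValueError("only theta = 1 and theta = 1/2 are supported in this sketch")


def simulate_X(n, d, rho, design, case, rng):
    def U_row(theta):
        return [gamma_quantile_of_normal(z, theta)
                for z in gaussian_row(d, rho, design, rng)]
    if case == "asymmetric":           # X_i = U_i - 1_d with theta = 1
        return [[u - 1.0 for u in U_row(1.0)] for _ in range(n)]
    if case == "symmetric":            # X_i = U_i - U_i' with theta = 1/2
        return [[u - v for u, v in zip(U_row(0.5), U_row(0.5))] for _ in range(n)]
    raise ValueError("case must be 'asymmetric' or 'symmetric'")


X_asym = simulate_X(200, 5, 0.2, "II", "asymmetric", random.Random(0))
X_sym = simulate_X(200, 5, 0.8, "I", "symmetric", random.Random(0))
```

With these choices each coordinate has unit variance in both cases, and the excess kurtosis equals $6$ in both cases, which is one way to read the remark that the kurtosis is kept on the same scale.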

We consider the following bootstrap methods:

•

Empirical bootstrap (EB): the classical naive bootstrap.
•

Gaussian wild bootstrap (GB): $w_{i}\sim N(0,1).$

•

Mammen wild bootstrap (MB):

\mathbb{P}\!\left(w_{i}=\frac{1+\sqrt{5}}{2}\right)=\frac{\sqrt{5}-1}{2\sqrt{5}},\qquad\mathbb{P}\!\left(w_{i}=-\frac{\sqrt{5}-1}{2}\right)=\frac{\sqrt{5}+1}{2\sqrt{5}}.

•

Rademacher wild bootstrap (RB): $\mathbb{P}(w_{i}=1)=\mathbb{P}(w_{i}=-1)=\frac{1}{2}.$

•

Beta wild bootstrap (BB): let $\nu=0.1$ and define

c_{\nu}=\nu^{2}+20\nu+20,\alpha_{\nu}=\frac{\nu}{2}\left(1-\frac{\nu+2}{\sqrt{c_{\nu}}}\right),\beta_{\nu}=\frac{\nu}{2}\left(1+\frac{\nu+2}{\sqrt{c_{\nu}}}\right).

Let $\eta_{i}\sim\mathrm{Beta}(\alpha_{\nu},\beta_{\nu})$ i.i.d., and standardize by

w_{i}=\frac{\eta_{i}-\mathbb{E}[\eta_{i}]}{\sqrt{\mathrm{Var}(\eta_{i})}}.

Then

\mathbb{E}[w_{i}]=0,\qquad\mathbb{E}[w_{i}^{2}]=1,\qquad\mathbb{E}[w_{i}^{3}]=1.

•

Double wild bootstrap (DB): the prepivoted procedure described in Section 2.2.
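The moment claims for the Mammen and Beta multipliers in the list above can be checked directly from closed-form moments. The sketch below, with hypothetical function names, computes the first three moments of Mammen's two-point law and the standardized third moment of the Beta multiplier with $\nu=0.1$ , using the raw Beta moments $\mathbb{E}\eta^{r}=\prod_{j=0}^{r-1}(\alpha_{\nu}+j)/(\alpha_{\nu}+\beta_{\nu}+j)$ .

```python
import math


def mammen_moments():
    """First three moments of Mammen's two-point multiplier."""
    s5 = math.sqrt(5.0)
    atoms = [((1 + s5) / 2, (s5 - 1) / (2 * s5)),
             (-(s5 - 1) / 2, (s5 + 1) / (2 * s5))]
    return [sum(p * w ** r for w, p in atoms) for r in (1, 2, 3)]


def beta_multiplier_third_moment(nu=0.1):
    """E[w^3] for the standardized Beta(alpha_nu, beta_nu) multiplier."""
    c = nu ** 2 + 20 * nu + 20
    a = 0.5 * nu * (1 - (nu + 2) / math.sqrt(c))
    b = 0.5 * nu * (1 + (nu + 2) / math.sqrt(c))
    raw = [1.0]
    for j in range(3):                      # raw moments E[eta^r], r = 1, 2, 3
        raw.append(raw[-1] * (a + j) / (a + b + j))
    mean = raw[1]
    var = raw[2] - mean ** 2
    third_central = raw[3] - 3 * mean * raw[2] + 2 * mean ** 3
    # Standardization fixes mean 0 and variance 1; the third moment is the skewness.
    return third_central / var ** 1.5


m1, m2, m3 = mammen_moments()
beta_third = beta_multiplier_third_moment(0.1)
```

Both multipliers therefore satisfy $\gamma=1$ . As noted in Remark 2.2, the Mammen weights nevertheless fall outside Assumption 2.2 because a two-point law does not admit a Stein kernel.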

For the Monte Carlo implementation, we use $B_{1}=499$ first-level bootstrap replications for EB, GB, MB, RB, and BB, and for DB we use $B_{1}=499,B_{2}=99$ at the first and second bootstrap levels, respectively.

Tables 1–3 report the empirical sizes of the different bootstrap methods at the $10\%$ level for $k=2,5,10$ , respectively. Across $k\in\{2,5,10\}$ , the qualitative ordering of the bootstrap procedures is largely unchanged. The dominant source of finite-sample distortion is the underlying design (most notably asymmetry and the more difficult Design II) rather than the value of $k$ itself. EB is almost uniformly conservative, with the under-rejection being especially visible in the asymmetric settings and in symmetric Design II, although the distortion is somewhat mitigated as $n$ increases. GB is more design-sensitive: it is reasonably well calibrated in symmetric Design I, but becomes distinctly liberal under asymmetry, particularly when $n$ is small and $\rho=0.2$ . MB and BB display the most stable behavior overall; both are typically mildly conservative, yet they avoid the substantial over-rejection exhibited by GB and, more markedly, RB, and their performance is comparatively robust across designs and values of $k$ . RB is the least robust method: it is very accurate, and often closest to the nominal level, in the symmetric experiments, but it becomes severely oversized under asymmetry, especially in Design II. DB is frequently numerically closest to the nominal $10\%$ level in the asymmetric designs, although this occurs through a persistent liberal bias; under symmetry it likewise remains slightly oversized. Larger $n$ generally improves calibration, and increasing $k$ attenuates some distortions, but these effects are quantitative rather than qualitative. Overall, the evidence points to MB and BB as the most reliable choices when uniform size control across heterogeneous designs is the primary concern, whereas RB is competitive only when symmetry is a credible approximation.

Table 1: Empirical sizes at the $10\%$ level for the second largest coordinate $T_{n,[2]}$

Panel A: Asymmetric
Design	$n$	$\rho$	EB	GB	MB	RB	BB	DB
I	200	0.2	0.0612	0.1211	0.0757	0.1464	0.0743	0.1114
I	200	0.8	0.0731	0.0869	0.0739	0.0907	0.0719	0.1002
I	400	0.2	0.0732	0.1172	0.0809	0.1301	0.0802	0.1044
I	400	0.8	0.0814	0.0954	0.0809	0.0962	0.0821	0.1034
II	200	0.2	0.0619	0.152	0.0888	0.215	0.0884	0.115
II	200	0.8	0.0686	0.137	0.0865	0.174	0.0857	0.107
II	400	0.2	0.0790	0.153	0.0946	0.182	0.0936	0.109
II	400	0.8	0.0826	0.134	0.0927	0.156	0.0921	0.105
Panel B: Symmetric
I	200	0.2	0.0731	0.0829	0.0892	0.1047	0.0907	0.120
I	200	0.8	0.0970	0.1015	0.1002	0.1058	0.0993	0.114
I	400	0.2	0.0868	0.0914	0.0963	0.1035	0.0934	0.109
I	400	0.8	0.0993	0.1016	0.1029	0.1038	0.1026	0.109
II	200	0.2	0.0584	0.0653	0.0830	0.106	0.0814	0.116
II	200	0.8	0.0677	0.0759	0.0859	0.101	0.0848	0.104
II	400	0.2	0.0763	0.0807	0.0902	0.102	0.0893	0.107
II	400	0.8	0.0877	0.0911	0.0977	0.106	0.0984	0.110

Table 2: Empirical sizes at the $10\%$ level for the 5th largest coordinate $T_{n,[5]}$

Panel A: Asymmetric
Design	$n$	$\rho$	EB	GB	MB	RB	BB	DB
I	200	0.2	0.0645	0.1140	0.0731	0.1298	0.0730	0.107
I	200	0.8	0.0738	0.0866	0.0742	0.0878	0.0735	0.099
I	400	0.2	0.0754	0.1119	0.0823	0.1235	0.0809	0.102
I	400	0.8	0.0835	0.0953	0.0852	0.0955	0.0843	0.103
II	200	0.2	0.0565	0.1510	0.0887	0.2220	0.0854	0.118
II	200	0.8	0.0712	0.1380	0.0883	0.1730	0.0875	0.111
II	400	0.2	0.0731	0.1510	0.0907	0.1880	0.0884	0.111
II	400	0.8	0.0792	0.1290	0.0890	0.1450	0.0886	0.101
Panel B: Symmetric
I	200	0.2	0.0820	0.0891	0.0918	0.1050	0.0914	0.113
I	200	0.8	0.0985	0.1035	0.1006	0.1060	0.0974	0.112
I	400	0.2	0.0933	0.0962	0.0986	0.1050	0.0979	0.109
I	400	0.8	0.1005	0.1009	0.1007	0.1030	0.1017	0.107
II	200	0.2	0.0587	0.0629	0.0826	0.1070	0.0798	0.122
II	200	0.8	0.0711	0.0780	0.0867	0.1010	0.0883	0.109
II	400	0.2	0.0781	0.0802	0.0918	0.1050	0.0917	0.115
II	400	0.8	0.0886	0.0906	0.0960	0.1040	0.0957	0.107

Table 3: Empirical sizes at the $10\%$ level for the 10th largest coordinate $T_{n,[10]}$

Panel A: Asymmetric
Design	$n$	$\rho$	EB	GB	MB	RB	BB	DB
I	200	0.2	0.0695	0.1073	0.0769	0.1173	0.0757	0.1052
I	200	0.8	0.0745	0.0853	0.0749	0.0871	0.0730	0.0974
I	400	0.2	0.0770	0.1046	0.0798	0.1111	0.0796	0.0983
I	400	0.8	0.0843	0.0944	0.0867	0.0948	0.0856	0.1027
II	200	0.2	0.0537	0.142	0.0812	0.223	0.0782	0.117
II	200	0.8	0.0774	0.132	0.0883	0.159	0.0875	0.110
II	400	0.2	0.0732	0.146	0.0905	0.187	0.0895	0.113
II	400	0.8	0.0842	0.129	0.0932	0.146	0.0919	0.104
Panel B: Symmetric
I	200	0.2	0.0851	0.0894	0.0901	0.0997	0.0900	0.110
I	200	0.8	0.1005	0.1033	0.1014	0.1056	0.1002	0.110
I	400	0.2	0.0945	0.0969	0.0974	0.1019	0.0982	0.106
I	400	0.8	0.1004	0.1014	0.1002	0.1026	0.1017	0.105
II	200	0.2	0.0608	0.0636	0.0822	0.108	0.0789	0.124
II	200	0.8	0.0764	0.0786	0.0890	0.103	0.0904	0.110
II	400	0.2	0.0756	0.0774	0.0894	0.104	0.0891	0.114
II	400	0.8	0.0885	0.0902	0.0951	0.102	0.0960	0.107

4 Conclusion

This paper studies Gaussian and bootstrap approximations for the $k$ th largest coordinate statistic $T_{n,[k]}$ in high dimensions. We establish theoretical guarantees that justify bootstrap critical values when the ambient dimension is allowed to grow with the sample size, thereby extending valid inference beyond the maximum to nonmaximal order statistics. The simulation results show that the proposed framework delivers accurate finite-sample inference and clarify the relative robustness of the competing bootstrap procedures across a range of designs.

An important direction for future research is to develop analogous Gaussian approximation results for temporally dependent observations. Doing so would require a theory that accommodates serial dependence, long-run covariance estimation, and resampling schemes that preserve the time-series structure; see, for example, (Shao, 2010; Zhang and Wu, 2017; Zhang and Cheng, 2014, 2018; Chang et al., 2024, 2023, 2025).

Appendix A: Proofs of Theorems

A.1 Combinatorial identities

Lemma A.1 (Finite inclusion–exclusion identity).

For every integer $k\geq 1$ and every nonnegative integer-valued random variable $N$ ,

\mathbf{1}\{N\geq k\}=\sum_{s=k}^{N}(-1)^{s-k}\binom{s-1}{k-1}\binom{N}{s}.

(17)

Consequently,

\mathbb{P}(T_{n,[k]}>t)=\sum_{s=k}^{d}(-1)^{s-k}\binom{s-1}{k-1}V_{n,s}(t),

(18)

and analogously with $N_{n}^{*}(t)$ and $N_{Z}(t)$ .

Proof.

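For deterministic $N=m$ , define

S_{m,k}:=\sum_{s=k}^{m}(-1)^{s-k}\binom{s-1}{k-1}\binom{m}{s},

with the empty sum equal to zero, so that $S_{m,k}=0$ for $m<k$ . For $m\geq k$ , Pascal's rule $\binom{m}{s}-\binom{m-1}{s}=\binom{m-1}{s-1}$ and the identity

\binom{s-1}{k-1}\binom{m-1}{s-1}=\binom{m-1}{k-1}\binom{m-k}{s-k}

give

S_{m,k}-S_{m-1,k}=\binom{m-1}{k-1}\sum_{r=0}^{m-k}(-1)^{r}\binom{m-k}{r}=\binom{m-1}{k-1}(1-1)^{m-k}=\mathbf{1}\{m=k\}.

Summing these increments, starting from $S_{k-1,k}=0$ , gives

S_{m,k}=\begin{cases}0,&m<k,\\ 1,&m\geq k.\end{cases}

This proves (17). Taking expectations with $N=N_{n}(t)$ gives (18). ∎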
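The identity (17) can also be checked by brute force for small values. The following is a minimal sketch using exact integer arithmetic; the function name `inclusion_exclusion_sum` and the ranges tested are illustrative choices.

```python
from math import comb

def inclusion_exclusion_sum(m, k):
    """Right-hand side of (17) for a deterministic count N = m."""
    return sum(
        (-1) ** (s - k) * comb(s - 1, k - 1) * comb(m, s)
        for s in range(k, m + 1)  # empty range (sum zero) when m < k
    )

# Compare with the indicator 1{m >= k} on a small grid.
mismatches = [
    (m, k)
    for k in range(1, 8)
    for m in range(0, 15)
    if inclusion_exclusion_sum(m, k) != (1 if m >= k else 0)
]
print("mismatches:", mismatches)
```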

A.2 Projected quantities

For every nonempty $I=\{i_{1},\dots,i_{s}\}\subset[d]$ , let $\bm{P}_{I}:\mathbb{R}^{d}\to\mathbb{R}^{s}$ denote the coordinate projection. Define

\bm{S}_{n,I}:=\bm{P}_{I}\bm{S}_{n},\qquad\mathbf{\Sigma}_{II}:=\bm{P}_{I}\mathbf{\Sigma}\bm{P}_{I}^{\top},\qquad\phi_{I}:=\phi_{\mathbf{\Sigma}_{II}}.

Also define

\bar{\bm{X}}_{I}^{\,r}:=\frac{1}{n}\sum_{i=1}^{n}(\bm{P}_{I}\bm{X}_{i})^{\otimes r},\qquad\bm{b}_{i,I}:=\bm{P}_{I}(\bm{X}_{i}-\bar{\bm{X}}),\qquad\bar{\bm{b}}_{I}^{\,r}:=\frac{1}{n}\sum_{i=1}^{n}\bm{b}_{i,I}^{\otimes r}.

The projected Edgeworth densities are

p_{n,I}(\bm{u}):=\phi_{I}(\bm{u})-\frac{1}{6\sqrt{n}}\bigl\langle\mathbb{E}[\bar{\bm{X}}_{I}^{\,3}],\nabla^{3}\phi_{I}(\bm{u})\bigr\rangle,

(19)

\hat{p}_{n,\gamma,I}(\bm{u}):=\phi_{I}(\bm{u})+\frac{1}{2}\bigl\langle\bar{\bm{b}}_{I}^{\,2}-\mathbf{\Sigma}_{II},\nabla^{2}\phi_{I}(\bm{u})\bigr\rangle-\frac{\gamma}{6\sqrt{n}}\bigl\langle\bar{\bm{b}}_{I}^{\,3},\nabla^{3}\phi_{I}(\bm{u})\bigr\rangle.

(20)

Lemma A.2 (Projection preserves the data-side assumptions).

Assume Assumption 2.1. For every nonempty $I\subset[d]$ the projected vectors $\bm{P}_{I}\bm{X}_{1},\dots,\bm{P}_{I}\bm{X}_{n}$ satisfy the same Stein identity with covariance matrix $\mathbf{\Sigma}_{II}$ , the same sub-exponential envelope $b$ , and

\lambda_{\min}(\mathbf{\Sigma}_{II})\geq\sigma_{*}^{2}.

Proof.

Let $\bm{Y}_{i}:=\bm{P}_{I}\bm{X}_{i}$ . For any smooth $g:\mathbb{R}^{|I|}\to\mathbb{R}^{|I|}$ define

f(\bm{x}):=\bm{P}_{I}^{\top}g(\bm{P}_{I}\bm{x}).

Then

\nabla f(\bm{x})^{\top}=\bm{P}_{I}^{\top}\nabla g(\bm{P}_{I}\bm{x})^{\top}\bm{P}_{I}.

Applying the Stein identity for $\bm{X}_{i}$ yields

\begin{aligned}\mathbb{E}[\bm{Y}_{i}^{\top}g(\bm{Y}_{i})]&=\mathbb{E}[\bm{X}_{i}^{\top}f(\bm{X}_{i})]\\ &=\mathbb{E}\Bigl[\mathrm{tr}\{\bm{\tau}_{i}(\bm{X}_{i})\bm{P}_{I}^{\top}\nabla g(\bm{P}_{I}\bm{X}_{i})^{\top}\bm{P}_{I}\}\Bigr]\\ &=\mathbb{E}\Bigl[\mathrm{tr}\{\bm{P}_{I}\bm{\tau}_{i}(\bm{X}_{i})\bm{P}_{I}^{\top}\nabla g(\bm{Y}_{i})^{\top}\}\Bigr].\end{aligned}

Hence $\bm{P}_{I}\bm{\tau}_{i}(\bm{X}_{i})\bm{P}_{I}^{\top}$ is a Stein kernel for $\bm{Y}_{i}$ . The $\psi_{1}$ bounds follow by monotonicity under projection. Finally, for every nonzero $\bm{u}\in\mathbb{R}^{|I|}$ ,

\bm{u}^{\top}\mathbf{\Sigma}_{II}\bm{u}=(\bm{P}_{I}^{\top}\bm{u})^{\top}\mathbf{\Sigma}(\bm{P}_{I}^{\top}\bm{u})\geq\sigma_{*}^{2}\|\bm{P}_{I}^{\top}\bm{u}\|_{2}^{2}=\sigma_{*}^{2}\|\bm{u}\|_{2}^{2}.

∎

A.3 External matrix, Gaussian-comparison, and Koike lemmas

Lemma A.3 (Gershgorin interval theorem).

Let $A=(a_{j\ell})\in\mathbb{R}^{s\times s}$ be symmetric, and write $R_{j}(A):=\sum_{\ell\neq j}|a_{j\ell}|$ for the $j$ th off-diagonal absolute row sum. Then

\lambda_{\min}(A)\geq\min_{1\leq j\leq s}\{a_{jj}-R_{j}(A)\},\qquad\lambda_{\max}(A)\leq\max_{1\leq j\leq s}\{a_{jj}+R_{j}(A)\}.

(21)

Proof.

Let $\lambda$ be an eigenvalue of $A$ with eigenvector $x=(x_{1},\dots,x_{s})^{\top}\neq 0$ . Choose

j_{0}\in\arg\max_{1\leq j\leq s}|x_{j}|.

Since $Ax=\lambda x$ ,

(\lambda-a_{j_{0}j_{0}})x_{j_{0}}=\sum_{\ell\neq j_{0}}a_{j_{0}\ell}x_{\ell}.

Hence

|\lambda-a_{j_{0}j_{0}}||x_{j_{0}}|\leq\sum_{\ell\neq j_{0}}|a_{j_{0}\ell}||x_{\ell}|\leq R_{j_{0}}(A)|x_{j_{0}}|,

and therefore

|\lambda-a_{j_{0}j_{0}}|\leq R_{j_{0}}(A).

This proves that every eigenvalue belongs to at least one Gershgorin interval

[a_{jj}-R_{j}(A),a_{jj}+R_{j}(A)],\qquad j=1,\dots,s.

Taking the minimum and maximum over these intervals yields (21). ∎
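The bounds in (21) are easy to check numerically. The sketch below does so for an equicorrelated matrix of the kind that arises when a small set of coordinates is weakly correlated; the helper `gershgorin_bounds`, the dimension $s=5$ , and the correlation level $0.05$ are illustrative assumptions.

```python
import numpy as np

def gershgorin_bounds(A):
    """Lower and upper eigenvalue bounds from the Gershgorin intervals in (21)."""
    A = np.asarray(A, dtype=float)
    diag = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(diag)  # R_j(A): off-diagonal row sums
    return (diag - radii).min(), (diag + radii).max()

# Toy correlation matrix (assumption): unit diagonal, constant off-diagonal 0.05.
s, rho = 5, 0.05
R = np.full((s, s), rho)
np.fill_diagonal(R, 1.0)

lower, upper = gershgorin_bounds(R)
eigenvalues = np.linalg.eigvalsh(R)  # eigenvalues of a symmetric matrix, ascending
print(lower, eigenvalues.min(), eigenvalues.max(), upper)
```

When the off-diagonal row sums are small, as in this toy case, both bounds stay close to one, which is how the lemma is used for weakly correlated coordinate subsets.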

Lemma A.4 (Berman–Li–Shao normal comparison inequality).

Let $\xi=(\xi_{1},\dots,\xi_{s})^{\top}$ and $\eta=(\eta_{1},\dots,\eta_{s})^{\top}$ be centered Gaussian vectors with

\mathrm{Var}(\xi_{j})=\mathrm{Var}(\eta_{j})=1,\qquad 1\leq j\leq s.

Write $r_{j\ell}^{\xi}:=\mathrm{Corr}(\xi_{j},\xi_{\ell})$ and $r_{j\ell}^{\eta}:=\mathrm{Corr}(\eta_{j},\eta_{\ell})$ , and define

\rho_{j\ell}:=\max\{|r_{j\ell}^{\xi}|,|r_{j\ell}^{\eta}|\}.

Then for every $u=(u_{1},\dots,u_{s})^{\top}\in\mathbb{R}^{s}$ ,

\left|\mathbb{P}(\xi_{1}\leq u_{1},\dots,\xi_{s}\leq u_{s})-\mathbb{P}(\eta_{1}\leq u_{1},\dots,\eta_{s}\leq u_{s})\right|\leq\frac{1}{2\pi}\sum_{1\leq j<\ell\leq s}\left|\arcsin(r_{j\ell}^{\xi})-\arcsin(r_{j\ell}^{\eta})\right|\exp\!\left(-\frac{u_{j}^{2}+u_{\ell}^{2}}{2(1+\rho_{j\ell})}\right).

(22)

In particular, if $\eta$ has independent coordinates and

\bar{\rho}:=\max_{1\leq j<\ell\leq s}|r_{j\ell}^{\xi}|<1,

then

\left|\mathbb{P}(\xi_{1}\leq u_{1},\dots,\xi_{s}\leq u_{s})-\prod_{j=1}^{s}\Phi(u_{j})\right|\leq\frac{1}{2\pi\sqrt{1-\bar{\rho}^{2}}}\sum_{1\leq j<\ell\leq s}|r_{j\ell}^{\xi}|\exp\!\left(-\frac{u_{j}^{2}+u_{\ell}^{2}}{2(1+\bar{\rho})}\right).

(23)

Proof.

The inequality (22) is Theorem 1 of Li and Shao (2002), which refines Berman’s original comparison argument (Berman, 1964). If $\eta$ has independent coordinates, then $r_{j\ell}^{\eta}=0$ for all $j\neq\ell$ , and the mean-value theorem gives

\left|\arcsin(r_{j\ell}^{\xi})-\arcsin(0)\right|\leq\sup_{|u|\leq\bar{\rho}}\frac{1}{\sqrt{1-u^{2}}}|r_{j\ell}^{\xi}|\leq\frac{|r_{j\ell}^{\xi}|}{\sqrt{1-\bar{\rho}^{2}}}.

Substituting this estimate into (22) yields (23). ∎
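For $s=2$ the bound (23) can be examined numerically. The sketch below computes the bivariate Gaussian distribution function by a one-dimensional trapezoid rule and compares the gap to the product $\Phi(u_{1})\Phi(u_{2})$ with the right-hand side of (23). The integration routine, its grid size, and the test points are ad hoc assumptions, so the comparison holds only up to quadrature error.

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bivariate_cdf(u1, u2, r, grid=4000, lower=-10.0):
    """P(xi_1 <= u1, xi_2 <= u2) for unit variances and correlation r.

    Uses P = int_{-inf}^{u1} phi(x) Phi((u2 - r x) / sqrt(1 - r^2)) dx,
    truncated at `lower` and approximated by the trapezoid rule.
    """
    h = (u1 - lower) / grid
    c = sqrt(1.0 - r * r)
    total = 0.0
    for i in range(grid + 1):
        x = lower + i * h
        weight = 0.5 if i in (0, grid) else 1.0
        total += weight * phi(x) * Phi((u2 - r * x) / c)
    return total * h

def comparison_bound(u1, u2, r):
    """Right-hand side of (23) when s = 2, so that rho_bar = |r|."""
    return abs(r) / (2.0 * pi * sqrt(1.0 - r * r)) * exp(
        -(u1 * u1 + u2 * u2) / (2.0 * (1.0 + abs(r)))
    )

u1, u2, r = 2.0, 2.0, 0.3
gap = abs(bivariate_cdf(u1, u2, r) - Phi(u1) * Phi(u2))
print(gap, comparison_bound(u1, u2, r))
```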

Lemma A.5 (Koike smoothing identity).

Let $A\subset\mathbb{R}^{s}$ be measurable, let $Z_{I}\sim N(0,\Sigma_{II})$ , and define for $u\in(0,1]$ ,

h_{A,u}(x):=\mathbb{E}\bigl[\mathbf{1}_{A}(\sqrt{1-u}\,x+\sqrt{u}\,Z_{I})\bigr].

Then

h_{A,u}(x)=\int_{A}u^{-s/2}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right)\,dy.

(24)

Moreover, for every multi-index $\alpha\in\mathbb{N}_{0}^{s}$ with $|\alpha|=r\geq 1$ ,

\partial^{\alpha}h_{A,u}(x)=(-1)^{r}\left(\frac{1-u}{u}\right)^{r/2}\int_{A}\partial^{\alpha}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right)u^{-s/2}\,dy.

(25)

Proof.

Formulas (24)–(25) are the projected specialization of equations (4.4)–(4.5) in Koike (2026). From the definition,

h_{A,u}(x)=\int_{\mathbb{R}^{s}}\mathbf{1}_{A}(\sqrt{1-u}\,x+\sqrt{u}\,z)\phi_{I}(z)\,dz.

Set

y:=\sqrt{1-u}\,x+\sqrt{u}\,z,\qquad z=\frac{y-\sqrt{1-u}\,x}{\sqrt{u}},\qquad dz=u^{-s/2}\,dy.

Then (24) follows. Differentiate (24) under the integral sign. For each derivative $\partial_{x_{j}}$ ,

\partial_{x_{j}}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right)=-\sqrt{\frac{1-u}{u}}\,\partial_{j}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right).

Applying this identity $r=|\alpha|$ times gives (25). ∎
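In dimension $s=1$ with unit variance and $A=(-\infty,-t]$ , formula (24) reduces to $h_{A,u}(x)=\Phi\bigl((-t-\sqrt{1-u}\,x)/\sqrt{u}\bigr)$ , and (25) with $r=1$ reduces to $-\sqrt{(1-u)/u}\,\phi\bigl((-t-\sqrt{1-u}\,x)/\sqrt{u}\bigr)$ after integrating $\phi'$ over $A$ . The following sketch compares this closed form with a central finite difference; the one-dimensional specialization, the step size, and the function names are illustrative assumptions.

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def h(x, t, u):
    """Smoothed indicator of A = (-inf, -t] from (24), for s = 1 and unit variance."""
    return Phi((-t - sqrt(1.0 - u) * x) / sqrt(u))

def dh_formula(x, t, u):
    """First derivative predicted by (25) with r = 1 in the same setting."""
    z = (-t - sqrt(1.0 - u) * x) / sqrt(u)
    return -sqrt((1.0 - u) / u) * phi(z)

def dh_numeric(x, t, u, eps=1e-6):
    """Central finite-difference approximation of the derivative in x."""
    return (h(x + eps, t, u) - h(x - eps, t, u)) / (2.0 * eps)

x, t, u = -1.2, 1.5, 0.1
print(dh_formula(x, t, u), dh_numeric(x, t, u))
```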

Lemma A.6 (Koike orthant derivative bound).

Let

A_{t}^{-}:=(-\infty,-t]^{s}.

For every integer $r\geq 1$ there exist constants $c_{r},C_{r}>0$ , depending only on $r$ , such that for every $u\in(0,1/2]$ and every $x\in\mathbb{R}^{s}$ ,

\|\nabla^{r}h_{A_{t}^{-},u}(x)\|_{1}\leq C_{r}u^{-r/2}(1+t)^{r}\mathbb{P}\!\left(\sqrt{1-u}\,x+\sqrt{u}\,Z_{I}\in A_{t-a_{u}}^{-}\right),\qquad a_{u}:=c_{r}\sqrt{u\log(2s)}.

(26)

In particular,

\sup_{x\in\mathbb{R}^{s}}\|\nabla^{r}h_{A_{t}^{-},u}(x)\|_{1}\leq C_{r}\left(\frac{\log(2s)}{u}\right)^{r/2}.

(27)

Proof.

The uniform estimate (27) is exactly Lemma 4.4 in Koike (2026) after replacing $d$ by $s$ and using $\lambda_{\min}(\Sigma_{II})\geq\sigma_{*}^{2}$ from Lemma A.2. The localized bound (26) is obtained by combining (25) with the Anderson–Hall–Titterington bound stated as Lemma D.4 in Koike (2026). Indeed, for each multi-index $\alpha$ with $|\alpha|=r$ ,

|\partial^{\alpha}h_{A_{t}^{-},u}(x)|\leq\left(\frac{1-u}{u}\right)^{r/2}\int_{A_{t}^{-}}\left|\partial^{\alpha}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right)\right|u^{-s/2}\,dy.

If $z=(y-\sqrt{1-u}\,x)/\sqrt{u}$ , then $y\in A_{t}^{-}$ implies

\sqrt{1-u}\,x+\sqrt{u}\,z\in A_{t}^{-}.

Applying Lemma D.4 to the orthant enlarged by a cube of side length $a_{u}=c_{r}\sqrt{u\log(2s)}$ yields

\int_{A_{t}^{-}}\left|\partial^{\alpha}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right)\right|u^{-s/2}\,dy\leq C_{r}(1+t)^{r}\mathbb{P}\!\left(\sqrt{1-u}\,x+\sqrt{u}\,Z_{I}\in A_{t-a_{u}}^{-}\right),

and summing over $|\alpha|=r$ proves (26). ∎

Lemma A.7 (Koike projected decomposition).

Let $\xi_{1,I},\dots,\xi_{n,I}$ be independent centered $\mathbb{R}^{s}$ -valued random vectors with approximate Stein kernels $(\tau_{i,I},\beta_{i,I})$ , and put

W_{I}:=\sum_{i=1}^{n}\xi_{i,I},\qquad\bar{T}_{I}:=\sum_{i=1}^{n}\tau_{i,I}(\xi_{i,I})-\Sigma_{II},\qquad B_{I}:=\sum_{i=1}^{n}\beta_{i,I}(\xi_{i,I}).

For a bounded measurable function $h:\mathbb{R}^{s}\to\mathbb{R}$ and $u\in(0,1]$ , define

h_{u}(x):=\mathbb{E}\bigl[h(\sqrt{1-u}\,x+\sqrt{u}\,Z_{I})\bigr],\qquad Z_{I}\sim N(0,\Sigma_{II}).

Let $p_{W_{I}}$ be the first-order Edgeworth density around $N(0,\Sigma_{II})$ as in equation (4.1) of Koike (2026). Then for every bounded measurable $h:\mathbb{R}^{s}\to\mathbb{R}$ and every $\vartheta\in(0,1]$ ,

\mathbb{E}[h_{\vartheta}(W_{I})]-\int_{\mathbb{R}^{s}}h_{\vartheta}(z)p_{W_{I}}(z)\,dz=\sum_{\nu=1}^{6}R_{\nu,I}(h,\vartheta),

(28)

where each $R_{\nu,I}(h,\vartheta)$ is one of the six terms displayed in equation (4.6) of Koike (2026), specialized to the projected dimension $s$ . In particular, each $R_{\nu,I}(h,\vartheta)$ is a finite linear combination of iterated integrals involving only the tensors

\bar{T}_{I},\qquad\sum_{i=1}^{n}\xi_{i,I}^{\otimes 3},\qquad\sum_{i=1}^{n}\tau_{i,I}(\xi_{i,I})^{\otimes 2},\qquad\sum_{i=1}^{n}\xi_{i,I}^{\otimes 3}\otimes\tau_{i,I}(\xi_{i,I}),\qquad\sum_{i=1}^{n}\xi_{i,I}^{\otimes 4},

and their $\beta$ -counterparts. If $\beta_{i,I}\equiv 0$ , then the last four terms in equation (4.6) of Koike (2026) disappear identically.

Proof.

This is exactly Lemma 4.3 in Koike (2026) after replacing the ambient dimension $d$ by the projected dimension $s=|I|$ . The statement about the coefficient tensors follows by reading term-by-term the six displays in equation (4.6) of Koike (2026). When $\beta_{i,I}\equiv 0$ , every summand involving $\beta_{i,I}$ vanishes. ∎

Lemma A.8 (Koike tensor and concentration bounds).

Under Assumption 2.1, there exists $C>0$ such that, for every $a\geq 1$ ,

\mathbb{P}\!\left(\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}\bm{X}_{i}^{\top}-\mathbf{\Sigma}\right\|_{\max}>Cb^{2}\sqrt{\frac{a\log(dn)}{n}}\right)\leq n^{-a},

(29)

\mathbb{P}\!\left(\|\bar{\bm{X}}\|_{\infty}>Cb\sqrt{\frac{a\log(dn)}{n}}\right)\leq n^{-a},

(30)

\mathbb{P}\!\left(\sup_{\|V\|_{1}\leq 1}\left|\frac{1}{n}\sum_{i=1}^{n}\bigl\langle\bm{X}_{i}^{\otimes 3}-\mathbb{E}[\bm{X}_{i}^{\otimes 3}],V\bigr\rangle\right|>Cb^{3}a\frac{\log n}{\sqrt{n}}\right)\leq n^{-a}.

(31)

Moreover, for every nonempty $I\subset[d]$ with $|I|\leq k_{0}$ , the coefficient tensors appearing in Lemma A.7 satisfy

\mathbb{E}\|\mathsf{A}_{I,\nu}\|_{\infty}\leq C\delta_{n},\qquad\nu=1,\dots,6,

(32)

where $\mathsf{A}_{I,\nu}$ denotes any coefficient tensor multiplying $\nabla^{2}h_{u}$ , $\nabla^{3}h_{u}$ , $\nabla^{4}h_{u}$ , or $\nabla^{5}h_{u}$ in the six terms of (28).

Proof.

The covariance and mean bounds (29)–(30) follow from Lemma D.10 in Koike (2026) applied to $Y_{i}=\mathrm{vec}(\bm{X}_{i}\bm{X}_{i}^{\top})$ and $Y_{i}=\bm{X}_{i}$ , respectively. The third-order tensor bound (31) is Lemma D.11 in Koike (2026) with $r=3$ . Finally, the coefficient-tensor estimate (32) is the projected specialization of the bounds obtained in the proof of Theorem 4.1 in Koike (2026, pp. 22–24). Since projection only removes coordinates, every projected $\ell_{\infty}$ -tensor norm is bounded by the corresponding full-dimensional norm. ∎

Lemma A.9 (Explicit high-probability event for the first bootstrap array).

Under Assumption 2.1, there exists an event $\Omega_{n,X}$ such that

\mathbb{P}(\Omega_{n,X}^{c})\leq\frac{C}{n^{2}}

and, on $\Omega_{n,X}$ ,

\|\bar{\bm{X}}\|_{\infty}\leq Cb\sqrt{\frac{\log(dn)}{n}},

(33)

\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}\bm{X}_{i}^{\top}-\mathbf{\Sigma}\right\|_{\max}\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}},

\max_{1\leq i\leq n}\|\bm{X}_{i}\|_{\infty}\leq Cb\log(dn).

(34)

Consequently, if

\hat{\mathbf{\Sigma}}_{X}:=\frac{1}{n}\sum_{i=1}^{n}(\bm{X}_{i}-\bar{\bm{X}})(\bm{X}_{i}-\bar{\bm{X}})^{\top},\qquad\bm{b}_{i}:=\bm{X}_{i}-\bar{\bm{X}},

then on $\Omega_{n,X}$ ,

\|\hat{\mathbf{\Sigma}}_{X}-\mathbf{\Sigma}\|_{\max}\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}},

(35)

\max_{1\leq i\leq n}\|\bm{b}_{i}\|_{\infty}\leq Cb\log(dn),

(36)

\sup_{\begin{subarray}{c}I\subset[d]\\ 1\leq|I|\leq k_{0}\end{subarray}}\|\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II}\|_{\max}\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}},

(37)

\sup_{\begin{subarray}{c}I\subset[d]\\ 1\leq|I|\leq k_{0}\end{subarray}}\|\bar{\bm{b}}_{I}^{\,3}\|_{\infty}\leq Cb^{3}\log(dn).

(38)

Proof.

Apply (30) and (29) with $a=2$ to obtain

\mathbb{P}\!\left(\|\bar{\bm{X}}\|_{\infty}>Cb\sqrt{\frac{\log(dn)}{n}}\right)+\mathbb{P}\!\left(\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}\bm{X}_{i}^{\top}-\mathbf{\Sigma}\right\|_{\max}>Cb^{2}\sqrt{\frac{\log(dn)}{n}}\right)\leq\frac{C}{n^{2}}.

(39)

Next, since $\|X_{ij}\|_{\psi_{1}}\leq b$ , the tail bound for sub-exponential random variables implies

\mathbb{P}(|X_{ij}|>u)\leq 2\exp\!\left(-\frac{u}{Cb}\right),\qquad u>0.

Set $u=C_{1}b\log(dn)$ with $C_{1}$ large enough. Then

\mathbb{P}\!\left(\max_{1\leq i\leq n}\max_{1\leq j\leq d}|X_{ij}|>C_{1}b\log(dn)\right)\leq 2nd\,(dn)^{-4}\leq\frac{2}{n^{2}}.

(40)

Define $\Omega_{n,X}$ as the intersection of the complements of the events whose probabilities are bounded in (39) and (40). Then $\mathbb{P}(\Omega_{n,X}^{c})\leq C/n^{2}$ , and (33)–(34) hold on $\Omega_{n,X}$ .

Now

\hat{\mathbf{\Sigma}}_{X}=\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}\bm{X}_{i}^{\top}-\bar{\bm{X}}\bar{\bm{X}}^{\top}.

Hence, on $\Omega_{n,X}$ ,

\begin{aligned}\|\hat{\mathbf{\Sigma}}_{X}-\mathbf{\Sigma}\|_{\max}&\leq\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}\bm{X}_{i}^{\top}-\mathbf{\Sigma}\right\|_{\max}+\|\bar{\bm{X}}\bar{\bm{X}}^{\top}\|_{\max}\\ &\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}}+Cb^{2}\frac{\log(dn)}{n}\\ &\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}},\end{aligned}

which proves (35). Also,

\max_{1\leq i\leq n}\|\bm{b}_{i}\|_{\infty}\leq\max_{1\leq i\leq n}\|\bm{X}_{i}\|_{\infty}+\|\bar{\bm{X}}\|_{\infty}\leq Cb\log(dn),

which proves (36). For every $I$ ,

\bar{\bm{b}}_{I}^{\,2}=\bm{P}_{I}\hat{\mathbf{\Sigma}}_{X}\bm{P}_{I}^{\top},

so (37) follows from (35). Finally, for every coordinate triple $(j_{1},j_{2},j_{3})$ belonging to $I$ ,

\begin{aligned}\left|\frac{1}{n}\sum_{i=1}^{n}b_{ij_{1}}b_{ij_{2}}b_{ij_{3}}\right|&\leq\left(\max_{1\leq i\leq n}\|\bm{b}_{i}\|_{\infty}\right)\frac{1}{n}\sum_{i=1}^{n}|b_{ij_{1}}b_{ij_{2}}|\\ &\leq\left(\max_{1\leq i\leq n}\|\bm{b}_{i}\|_{\infty}\right)\left(\frac{1}{n}\sum_{i=1}^{n}b_{ij_{1}}^{2}\right)^{1/2}\left(\frac{1}{n}\sum_{i=1}^{n}b_{ij_{2}}^{2}\right)^{1/2}\\ &\leq Cb\log(dn)\cdot\max_{1\leq j\leq d}\hat{\Sigma}_{X,jj}\\ &\leq Cb^{3}\log(dn),\end{aligned}

because $\hat{\Sigma}_{X,jj}\leq\Sigma_{jj}+Cb^{2}\sqrt{\log(dn)/n}\leq Cb^{2}$ on $\Omega_{n,X}$ . Taking the maximum over all coordinate triples gives (38). ∎
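Two deterministic steps of this proof can be verified on simulated data: the decomposition of $\hat{\mathbf{\Sigma}}_{X}$ into the raw second moment minus $\bar{\bm{X}}\bar{\bm{X}}^{\top}$ , and the Cauchy–Schwarz chain bounding every entry of the empirical third-moment tensor by $\max_{i}\|\bm{b}_{i}\|_{\infty}\cdot\max_{j}\hat{\Sigma}_{X,jj}$ . The sketch below uses a toy skewed sample; the data-generating choice and the sizes $n=300$ , $d=6$ are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 300, 6
# Toy sample (assumption): centered exponential coordinates, hence skewed.
X = rng.exponential(size=(n, d)) - 1.0

xbar = X.mean(axis=0)
b = X - xbar                                  # rows are b_i = X_i - X_bar
sigma_hat = b.T @ b / n                       # centered sample covariance
second_moment = X.T @ X / n

# Identity: Sigma_hat = (1/n) sum X_i X_i^T - X_bar X_bar^T.
identity_gap = np.abs(sigma_hat - (second_moment - np.outer(xbar, xbar))).max()

# Empirical third-moment tensor with entries (1/n) sum_i b_{ij1} b_{ij2} b_{ij3}.
third = np.einsum("ia,ib,ic->abc", b, b, b) / n
lhs = np.abs(third).max()
rhs = np.abs(b).max() * np.diag(sigma_hat).max()
print(identity_gap, lhs, rhs)
```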

Lemma A.10 (Multiplier maximum event).

Under Assumption 2.2, there exists an event $\Omega_{n,w}$ such that

\mathbb{P}(\Omega_{n,w}^{c})\leq\frac{C}{n^{2}}

and

\max_{1\leq i\leq n}|w_{i}|\leq C\log(dn)\qquad\text{on }\Omega_{n,w}.

(41)

Proof.

If Assumption 2.2(ii) holds, then $|w_{i}|\leq b_{w}$ almost surely, so (41) is trivial for $C\geq b_{w}$ . If Assumption 2.2(i) holds, then $w_{i}\sim N(0,1)$ and

\mathbb{P}(|w_{i}|>u)\leq 2e^{-u^{2}/2},\qquad u>0.

Choose $u=C_{0}\log(dn)$ with $C_{0}$ large enough. Then

\mathbb{P}\!\left(\max_{1\leq i\leq n}|w_{i}|>C_{0}\log(dn)\right)\leq 2n\exp\!\left(-\frac{C_{0}^{2}\log^{2}(dn)}{2}\right)\leq\frac{C}{n^{2}}.

This proves the claim. ∎

A.4 The projected local input

Proposition A.1 (Projected local orthant expansion).

Assume Assumptions 2.1, 2.3, and 2.4. Then there exists $C>0$ such that for every nonempty $I\subset[d]$ with $|I|\leq k_{0}$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}\bigl(\bm{S}_{n,I}\in(t,\infty)^{|I|}\bigr)-\int_{(t,\infty)^{|I|}}p_{n,I}(\bm{u})\,d\bm{u}\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t).

(42)

Moreover, with probability at least $1-C/n$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}^{*}\bigl(\bm{S}_{n,I}^{*}\in(t,\infty)^{|I|}\bigr)-\int_{(t,\infty)^{|I|}}\hat{p}_{n,\gamma,I}(\bm{u})\,d\bm{u}\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t)

(43)

holds simultaneously for all such $I$ .

Proof.

Fix a nonempty $I\subset[d]$ and write $s:=|I|$ . Set

\bm{Y}_{n,I}:=-\bm{S}_{n,I},\qquad A_{t}^{-}:=(-\infty,-t]^{s}.

Then

\mathbb{P}\bigl(\bm{S}_{n,I}\in(t,\infty)^{s}\bigr)=\mathbb{P}\bigl(\bm{Y}_{n,I}\in A_{t}^{-}\bigr).

If $\bm{Z}_{I}^{-}\sim N(0,\Sigma_{II})$ , then

\pi_{I}(t)=\mathbb{P}(\bm{Z}_{I}^{-}\in A_{t}^{-}).

For $u\in(0,1]$ , define

h_{t,u}(x):=h_{A_{t}^{-},u}(x)=\mathbb{E}\bigl[\mathbf{1}_{A_{t}^{-}}(\sqrt{1-u}\,x+\sqrt{u}\,\bm{Z}_{I}^{-})\bigr].

By Lemma A.5, for every multi-index $\alpha$ with $|\alpha|=r$ ,

\partial^{\alpha}h_{t,u}(x)=(-1)^{r}\left(\frac{1-u}{u}\right)^{r/2}\int_{A_{t}^{-}}\partial^{\alpha}\phi_{I}\!\left(\frac{y-\sqrt{1-u}\,x}{\sqrt{u}}\right)u^{-s/2}\,dy.

Applying Lemma A.6 with $r\in\{2,4,5\}$ gives

\|\nabla^{r}h_{t,u}(x)\|_{1}\leq C_{r}u^{-r/2}(1+t)^{r}\mathbb{P}\!\left(\sqrt{1-u}\,x+\sqrt{u}\,\bm{Z}_{I}^{-}\in A_{t-a_{u}}^{-}\right),\qquad a_{u}:=c_{r}\sqrt{u\log(2s)}.

(44)

Lemma A.12(i) implies that there exists $c_{0}>0$ such that, uniformly for $t\in\mathcal{T}_{k,\epsilon}$ and $0\leq a\leq c_{0}/t$ ,

\pi_{I}(t-a)\leq C\pi_{I}(t).

(45)

Choose

\vartheta_{n}:=\min\left\{\frac{1}{2},\,\frac{\varepsilon_{n}^{4}}{\log(dn)\log(2k_{0})}\right\}.

Then

\log(\vartheta_{n}^{-1})\leq C\log n,\qquad t\,a_{\vartheta_{n}}\leq C\varepsilon_{n}^{2}\qquad\text{uniformly on }\mathcal{T}_{k,\epsilon},

(46)

because $t^{2}\asymp\log d$ by Lemma A.11. Since $s\leq k_{0}$ , (46) implies

a_{u}\leq\frac{c_{0}}{t}\qquad\text{for every }u\in[\vartheta_{n},1/2].

Therefore, integrating (44) with respect to the law of $\bm{Y}_{n,I}$ and using (45),

\begin{aligned}\int\|\nabla^{r}h_{t,u}(x)\|_{1}\,d\mathbb{P}_{\bm{Y}_{n,I}}(x)&\leq C_{r}u^{-r/2}(1+t)^{r}\mathbb{P}\bigl(\bm{Z}_{I}^{-}\in A_{t-a_{u}}^{-}\bigr)\\ &\leq C_{r}u^{-r/2}(1+t)^{r}\pi_{I}(t),\qquad u\in[\vartheta_{n},1/2].\end{aligned}

(47)

Exactly the same estimate holds with $\mathbb{P}_{\bm{Y}_{n,I}}$ replaced by the Gaussian law.

Let

p_{n,I}^{Y}(y):=p_{n,I}(-y).

Apply the smoothing inequality in Lemma 4.1 of Koike (2026) to the bounded measurable function $\mathbf{1}_{A_{t}^{-}}$ , with

\mu=\mathcal{L}(\bm{Y}_{n,I}),\qquad\nu(dy)=p_{n,I}^{Y}(y)\,dy,\qquad K=N(0,\vartheta_{n}\Sigma_{II}).

Using Lemma A.13 (with $a=a_{\vartheta_{n}}$ ) and Lemma A.12(ii), we obtain

\left|\mathbb{P}(\bm{Y}_{n,I}\in A_{t}^{-})-\mathbb{E}[h_{t,\vartheta_{n}}(\bm{Y}_{n,I})]\right|+\left|\int_{A_{t}^{-}}p_{n,I}^{Y}(y)\,dy-\int h_{t,\vartheta_{n}}(y)p_{n,I}^{Y}(y)\,dy\right|\leq Ca_{\vartheta_{n}}(1+t)^{4}\pi_{I}(t)\leq C\varepsilon_{n}^{2}\pi_{I}(t).

(48)

For each $i$ , define

\xi_{i,I}:=-\frac{1}{\sqrt{n}}\bm{P}_{I}\bm{X}_{i}.

Then

\sum_{i=1}^{n}\xi_{i,I}=\bm{Y}_{n,I},\qquad\sum_{i=1}^{n}\mathbb{E}[\tau_{i,I}(\xi_{i,I})]=\Sigma_{II},\qquad\beta_{i,I}\equiv 0

with projected Stein kernels inherited from Lemma A.2. Therefore Lemma A.7 gives

\mathbb{E}[h_{t,\vartheta_{n}}(\bm{Y}_{n,I})]-\int h_{t,\vartheta_{n}}(y)p_{n,I}^{Y}(y)\,dy=\sum_{\nu=1}^{6}R_{\nu,I}(t,\vartheta_{n}).

Because $\beta_{i,I}\equiv 0$ , only the terms involving $\bar{T}_{I}$ ,

\bar{T}_{I}:=\sum_{i=1}^{n}\tau_{i,I}(\xi_{i,I})-\Sigma_{II},

and $\sum_{i}\xi_{i,I}^{\otimes 3}$ remain. By (32),

\mathbb{E}\|\mathsf{A}_{I,\nu}\|_{\infty}\leq C\delta_{n},\qquad\nu=1,\dots,6.

(49)

Substituting (47) and (49) into the six explicit terms of equation (4.6) in Koike (2026), and integrating the kernels exactly as they appear there, yields

\sum_{\nu=1}^{6}|R_{\nu,I}(t,\vartheta_{n})|\leq C\delta_{n}\,(1+t)^{4}\bigl\{1+\log(\vartheta_{n}^{-1})\bigr\}\pi_{I}(t).

(50)

Since $t^{2}\asymp\log d$ on $\mathcal{T}_{k,\epsilon}$ and $\varepsilon_{n}^{2}=\delta_{n}\log n$ , (46) implies

\delta_{n}\,(1+t)^{4}\bigl\{1+\log(\vartheta_{n}^{-1})\bigr\}\leq C\varepsilon_{n}^{2}.

Hence

\left|\mathbb{E}[h_{t,\vartheta_{n}}(\bm{Y}_{n,I})]-\int h_{t,\vartheta_{n}}(y)p_{n,I}^{Y}(y)\,dy\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t).

(51)

Combining (48) and (51),

\left|\mathbb{P}(\bm{Y}_{n,I}\in A_{t}^{-})-\int_{A_{t}^{-}}p_{n,I}^{Y}(y)\,dy\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t).

Changing variables $y=-u$ gives (42).

Work on the event $\Omega_{n,X}$ from Lemma A.9. Then, for every $I$ with $|I|\leq k_{0}$ ,

\|\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II}\|_{\max}\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}},\qquad\|\bar{\bm{b}}_{I}^{\,3}\|_{\infty}\leq Cb^{3}\log(dn).

(52)

Condition on the data. Define

\xi_{i,I}^{*}:=\frac{1}{\sqrt{n}}w_{i}\bm{b}_{i,I},\qquad\sum_{i=1}^{n}\xi_{i,I}^{*}=\bm{S}_{n,I}^{*}.

If $w_{i}\sim N(0,1)$ , then $\xi_{i,I}^{*}$ has exact Stein kernel

\tau_{i,I}^{*}(\xi_{i,I}^{*})=\frac{1}{n}\bm{b}_{i,I}\bm{b}_{i,I}^{\top}.

If Assumption 2.2(ii) holds, then

\tau_{i,I}^{*}(\xi_{i,I}^{*})=\frac{1}{n}\tau^{w}(w_{i})\bm{b}_{i,I}\bm{b}_{i,I}^{\top}

is again an exact Stein kernel, because for every smooth vector field $g:\mathbb{R}^{s}\to\mathbb{R}^{s}$ ,

\begin{aligned}\mathbb{E}^{*}[(\xi_{i,I}^{*})^{\top}g(\xi_{i,I}^{*})]&=\frac{1}{\sqrt{n}}\mathbb{E}^{*}\bigl[w_{i}\bm{b}_{i,I}^{\top}g(n^{-1/2}w_{i}\bm{b}_{i,I})\bigr]\\ &=\frac{1}{n}\mathbb{E}^{*}\bigl[\tau^{w}(w_{i})\mathrm{tr}\{\bm{b}_{i,I}\bm{b}_{i,I}^{\top}\nabla g(n^{-1/2}w_{i}\bm{b}_{i,I})^{\top}\}\bigr].\end{aligned}

Thus, conditionally on $\Omega_{n,X}$ ,

\sum_{i=1}^{n}\mathbb{E}^{*}[\tau_{i,I}^{*}(\xi_{i,I}^{*})]=\bar{\bm{b}}_{I}^{\,2},\qquad\mathbb{E}^{*}\left[\sum_{i=1}^{n}(\xi_{i,I}^{*})^{\otimes 3}\right]=\frac{\gamma}{\sqrt{n}}\bar{\bm{b}}_{I}^{\,3},\qquad\beta_{i,I}^{*}\equiv 0.

Therefore, conditionally on $\Omega_{n,X}$ , set

W_{I}^{*}:=-\bm{S}_{n,I}^{*},\qquad\bar{T}_{I}^{*}:=\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II}.

Then

\mathbb{P}^{*}(\bm{S}_{n,I}^{*}\in(t,\infty)^{s})=\mathbb{P}^{*}(W_{I}^{*}\in A_{t}^{-}).

Moreover, for $r\in\{2,4,5\}$ ,

\mathbb{E}^{*}\bigl[\|\nabla^{r}h_{t,u}(W_{I}^{*})\|_{1}\bigr]\leq C_{r}u^{-r/2}(1+t)^{r}\pi_{I}(t),

uniformly over $u\in[\vartheta_{n},1/2]$ , because Lemma A.12 depends only on the Gaussian reference law $N(0,\Sigma_{II})$ . The corresponding smoothing error satisfies

\left|\mathbb{P}^{*}(W_{I}^{*}\in A_{t}^{-})-\mathbb{E}^{*}[h_{t,\vartheta_{n}}(W_{I}^{*})]\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t).

Finally, the coefficient tensors in Koike’s decomposition satisfy

\mathbb{E}^{*}[\|\bar{T}_{I}^{*}\|_{\infty}]+\frac{1}{\sqrt{n}}\|\bar{\bm{b}}_{I}^{\,3}\|_{\infty}\leq C\varepsilon_{n}

by (52). Substituting these conditional bounds into the six remainder terms in (50) yields, on $\Omega_{n,X}$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}^{*}(\bm{S}_{n,I}^{*}\in(t,\infty)^{s})-\int_{(t,\infty)^{s}}\hat{p}_{n,\gamma,I}(u)\,du\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t).

Since $\mathbb{P}(\Omega_{n,X}^{c})\leq C/n^{2}$ , this proves (43). ∎

Lemma A.11 (Gaussian threshold scale).

Assume Assumptions 2.3 and 2.4. Then there exist constants $0<c_{1}<C_{1}<\infty$ such that

c_{1}\log d\leq t^{2}\leq C_{1}\log d\qquad\text{for every }t\in\mathcal{T}_{k,\epsilon}

for all sufficiently large $d$ .

Proof.

Set

\lambda(t):=\sum_{j=1}^{d}\bar{\Phi}\!\left(\frac{t}{\sigma_{j}}\right).

Lemma A.14 below yields

G_{k}(t)=h_{k}(\lambda(t))+O(\eta_{d}),\qquad h_{k}(\lambda):=e^{-\lambda}\sum_{m=0}^{k-1}\frac{\lambda^{m}}{m!},

with $\eta_{d}\to 0$ uniformly on $\mathcal{T}_{k,\epsilon}$ . Since $G_{k}(t)\in[\epsilon/2,1-\epsilon/2]$ on that window and $h_{k}$ is continuous and strictly decreasing, there exist constants $0<\lambda_{-}<\lambda_{+}<\infty$ such that

\lambda_{-}\leq\lambda(t)\leq\lambda_{+}\qquad\text{for every }t\in\mathcal{T}_{k,\epsilon}

(53)

for all sufficiently large $d$ . Also,

d\,\bar{\Phi}\!\left(\frac{t}{\underline{\sigma}}\right)\leq\lambda(t)\leq d\,\bar{\Phi}\!\left(\frac{t}{\overline{\sigma}}\right).

(54)

Applying Mills’ ratio to (54) and using (53) gives the claim. ∎
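The scale $t^{2}\asymp\log d$ can be seen numerically in the simplest case $\sigma_{j}\equiv 1$ , where $\lambda(t)=d\,\bar{\Phi}(t)$ . The sketch below solves $d\,\bar{\Phi}(t)=\lambda$ by bisection for a few fixed values of $\lambda$ and reports $t^{2}/\log d$ . The unit-variance simplification, the chosen values of $\lambda$ and $d$ , and the function names are illustrative assumptions.

```python
from math import erfc, log, sqrt

def upper_tail(t):
    """Standard normal upper tail probability."""
    return 0.5 * erfc(t / sqrt(2.0))

def threshold(d, lam, lo=0.0, hi=20.0, iters=200):
    """Solve d * upper_tail(t) = lam for t by bisection (requires lam < d / 2)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if d * upper_tail(mid) > lam:
            lo = mid   # tail mass still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

ratios = {}
for d in (10**3, 10**6, 10**9):
    for lam in (0.5, 5.0):
        t = threshold(d, lam)
        ratios[(d, lam)] = t * t / log(d)

for key, value in sorted(ratios.items()):
    print(key, round(value, 3))
```

In this toy setting the ratio stays within fixed positive constants as $d$ grows, in line with the two-sided bound of the lemma.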

Lemma A.12 (Gaussian shift and strip bounds).

Assume Assumptions 2.3 and 2.4. Then there exists $c_{0}>0$ such that the following hold uniformly over all nonempty $I\subset[d]$ with $|I|\leq k_{0}$ and all $t\in\mathcal{T}_{k,\epsilon}$ :

(i)

if $0\leq a\leq c_{0}/t$ , then

$\pi_{I}(t-a)\leq C\pi_{I}(t);$
(ii)

if $0\leq a\leq c_{0}/t$ , then

$\pi_{I}(t-a)-\pi_{I}(t)\leq Ca(1+t)\pi_{I}(t).$

Proof.

Let $I=\{j_{1},\dots,j_{s}\}$ and standardize $Y_{r}:=Z_{j_{r}}/\sigma_{j_{r}}$ . The covariance matrix of $(Y_{1},\dots,Y_{s})$ has diagonal entries $1$ and off-diagonal entries bounded by $\rho_{d}/\underline{\sigma}^{2}$ . Since $s\rho_{d}\to 0$ , Lemma A.3 applied to the correlation matrix yields

\lambda_{\max}(\mathrm{Corr}(Y_{1},\dots,Y_{s}))\leq 2

for all sufficiently large $d$ . Hence the Gaussian density on $\mathbb{R}^{s}$ is bounded above and below, on the relevant orthant boundary region, by the density of an independent Gaussian vector up to multiplicative constants depending only on $\underline{\sigma},\overline{\sigma}$ . Consequently,

\pi_{I}(t)\asymp\prod_{r=1}^{s}\bar{\Phi}\!\left(\frac{t}{\sigma_{j_{r}}}\right)

(55)

uniformly for $|I|\leq k_{0}$ and $t\in\mathcal{T}_{k,\epsilon}$ . By Mills’ ratio,

\frac{\bar{\Phi}((t-a)/\sigma_{j_{r}})}{\bar{\Phi}(t/\sigma_{j_{r}})}\leq\exp\!\left\{\frac{at}{\underline{\sigma}^{2}}\right\}\leq C\qquad\text{if }0\leq a\leq c_{0}/t,

which proves part (i) after multiplication over $r$ . Also,

\bar{\Phi}\!\left(\frac{t-a}{\sigma}\right)-\bar{\Phi}\!\left(\frac{t}{\sigma}\right)\leq\frac{a}{\sigma}\phi\!\left(\frac{t-a}{\sigma}\right)\leq Ca(1+t)\bar{\Phi}\!\left(\frac{t}{\sigma}\right),

again by Mills’ ratio. Multiplying over coordinates and using (55) proves part (ii). ∎

Lemma A.13 (Gaussian strip bound for the Edgeworth density).

Under Assumptions 2.1–2.4, for every nonempty $I\subset[d]$ with $|I|\leq k_{0}$ , every $t\in\mathcal{T}_{k,\epsilon}$ , and every $0\leq a\leq c_{0}/t$ ,

\int_{(t-a,\infty)^{|I|}\setminus(t,\infty)^{|I|}}\phi_{I}(\bm{u})\,d\bm{u}\leq Ca(1+t)\pi_{I}(t),

(56)

and

\int_{(t-a,\infty)^{|I|}\setminus(t,\infty)^{|I|}}|p_{n,I}(\bm{u})-\phi_{I}(\bm{u})|\,d\bm{u}\leq Ca(1+t)^{4}n^{-1/2}\pi_{I}(t).

(57)

On the event of Lemma A.17, the same proof with $\bar{\bm{c}}_{I}^{\,2}-\Sigma_{II}$ replaced by $\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II}$ and $\bar{\bm{c}}_{I}^{\,3}$ replaced by $(\gamma/\sqrt{n})\bar{\bm{b}}_{I}^{\,3}$ gives

\int_{(t-a,\infty)^{|I|}\setminus(t,\infty)^{|I|}}|\hat{p}_{n,\gamma,I}(\bm{u})-\phi_{I}(\bm{u})|\,d\bm{u}\leq Ca(1+t)^{4}n^{-1/2}\pi_{I}(t).

Proof.

Let $s:=|I|$ . For the Gaussian part, write

\mathcal{S}_{t,a}^{I}:=(t-a,\infty)^{s}\setminus(t,\infty)^{s}.

Since

\mathcal{S}_{t,a}^{I}\subset\bigcup_{r=1}^{s}(t-a,t]\times(t-a,\infty)^{r-1}\times(t,\infty)^{s-r},

we have

	$\displaystyle\int_{\mathcal{S}_{t,a}^{I}}\phi_{I}(u)\,du$	$\displaystyle=\pi_{I}(t-a)-\pi_{I}(t)$
		$\displaystyle\leq Ca(1+t)\pi_{I}(t)$

by Lemma A.12(ii). This proves (56).

For the Edgeworth correction, (19) gives

p_{n,I}(u)-\phi_{I}(u)=-\frac{1}{6\sqrt{n}}\left\langle\mathbb{E}[\bar{\bm{X}}_{I}^{\,3}],\nabla^{3}\phi_{I}(u)\right\rangle.

Because $|I|\leq k_{0}$ and $\|X_{ij}\|_{\psi_{1}}\leq b$ , all components of $\mathbb{E}[\bar{\bm{X}}_{I}^{\,3}]$ are bounded by $Cb^{3}$ . Also,

|\partial^{\alpha}\phi_{I}(u)|\leq C_{\alpha}(1+\|u\|_{\infty}^{3})\phi_{I}(u),\qquad|\alpha|=3.

Hence

	$\displaystyle\int_{\mathcal{S}_{t,a}^{I}}|p_{n,I}(u)-\phi_{I}(u)|\,du$	$\displaystyle\leq\frac{C}{\sqrt{n}}\int_{\mathcal{S}_{t,a}^{I}}(1+\|u\|_{\infty}^{3})\phi_{I}(u)\,du$
		$\displaystyle\leq\frac{C(1+t)^{3}}{\sqrt{n}}\int_{\mathcal{S}_{t,a}^{I}}\phi_{I}(u)\,du$
		$\displaystyle\leq Ca(1+t)^{4}n^{-1/2}\pi_{I}(t),$

which proves (57).

For the bootstrap density, work on the event $\Omega_{n,X}$ from Lemma A.9. Then

\hat{p}_{n,\gamma,I}(u)-\phi_{I}(u)=\frac{1}{2}\left\langle\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II},\nabla^{2}\phi_{I}(u)\right\rangle-\frac{\gamma}{6\sqrt{n}}\left\langle\bar{\bm{b}}_{I}^{\,3},\nabla^{3}\phi_{I}(u)\right\rangle.

By (37)–(38),

\|\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II}\|_{\max}\leq Cb^{2}\sqrt{\frac{\log(dn)}{n}},\qquad\|\bar{\bm{b}}_{I}^{\,3}\|_{\infty}\leq Cb^{3}\log(dn).

Using

|\partial^{\alpha}\phi_{I}(u)|\leq C_{\alpha}(1+\|u\|_{\infty}^{|\alpha|})\phi_{I}(u),\qquad|\alpha|\in\{2,3\},

and (56), we obtain on $\Omega_{n,X}$ ,

	$\displaystyle\int_{\mathcal{S}_{t,a}^{I}}|\hat{p}_{n,\gamma,I}(u)-\phi_{I}(u)|\,du$	$\displaystyle\leq C\left\{\sqrt{\frac{\log(dn)}{n}}(1+t)^{2}+\frac{\log(dn)}{\sqrt{n}}(1+t)^{3}\right\}\int_{\mathcal{S}_{t,a}^{I}}\phi_{I}(u)\,du$
		$\displaystyle\leq Ca(1+t)^{4}\varepsilon_{n}\,\pi_{I}(t),$

because $\varepsilon_{n}\geq Cn^{-1/2}\log(dn)$ under Assumption 2.1. This is the bootstrap analogue of (57). ∎

A.5 Gaussian factorial moments, aggregation, and regularity

Define

p_{j}(t):=\mathbb{P}(Z_{j}>t)=\bar{\Phi}\!\left(\frac{t}{\sigma_{j}}\right),\qquad\lambda(t):=\sum_{j=1}^{d}p_{j}(t).

Also define the elementary symmetric polynomial

e_{s}(t):=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\prod_{j\in I}p_{j}(t).

Lemma A.14 (Gaussian factorial moments).

Assume Assumptions 2.3 and 2.4. Then there exists a sequence $\eta_{d}\downarrow 0$ such that, uniformly over $t\in\mathcal{T}_{k,\epsilon}$ and $1\leq s\leq k_{0}$ ,

\left|V_{Z,s}(t)-\frac{\lambda(t)^{s}}{s!}\right|\leq C\eta_{d},

(58)

where one may take

\eta_{d}:=C\Bigl(d^{-a_{\sigma}}(\log d)^{-1/2}+\rho_{d}\log d\Bigr),\qquad a_{\sigma}:=\frac{\underline{\sigma}^{2}}{\overline{\sigma}^{2}}.

Consequently,

\sup_{t\in\mathcal{T}_{k,\epsilon}}|G_{k}(t)-h_{k}(\lambda(t))|\leq C\eta_{d},\qquad h_{k}(\lambda):=e^{-\lambda}\sum_{m=0}^{k-1}\frac{\lambda^{m}}{m!}.

(59)
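To see what (59) says in the simplest situation, the following Python sketch treats the special case of independent coordinates with unit variances. This is an assumption of the sketch and corresponds to $\rho_{d}=0$ and $\sigma_{j}\equiv 1$. The exceedance count is then Binomial$(d,p)$ with $p=\bar{\Phi}(t)$, so $G_{k}(t)$ is a binomial distribution function and can be compared directly with $h_{k}(\lambda(t))$. The values of $d$, $t$, and $k$ are illustrative, and Le Cam's inequality bounds the gap by $dp^{2}$.

```python
import math

def gauss_tail(x: float) -> float:
    """Standard normal upper tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def h_k(lam: float, k: int) -> float:
    """Poisson(lam) probability of at most k - 1 events."""
    return math.exp(-lam) * sum(lam ** m / math.factorial(m) for m in range(k))

def binomial_cdf(d: int, p: float, k: int) -> float:
    """P(N <= k - 1) for N ~ Binomial(d, p): G_k(t) under independence."""
    return sum(math.comb(d, m) * p ** m * (1.0 - p) ** (d - m) for m in range(k))

D = 5000                 # illustrative dimension
T = 3.5                  # illustrative threshold
P = gauss_tail(T)        # common exceedance probability
LAM = D * P              # lambda(t) in this special case

for k in (1, 2, 3, 4):
    gap = abs(binomial_cdf(D, P, k) - h_k(LAM, k))
    print(k, gap, D * P * P)  # gap versus the Le Cam bound d * p^2
```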

Proof.

Fix $s\in\{1,\dots,k_{0}\}$ . First compare $V_{Z,s}(t)$ with the elementary symmetric polynomial $e_{s}(t)$ . For every $I=\{j_{1},\dots,j_{s}\}$ , Lemma A.4 applied repeatedly to the standardized vector $\bigl(Z_{j_{r}}/\sigma_{j_{r}}\bigr)_{r\leq s}$ yields

\left|\pi_{I}(t)-\prod_{j\in I}p_{j}(t)\right|\leq C\rho_{d}(1+t^{2})\prod_{j\in I}p_{j}(t).

Summing over $|I|=s$ and using $t^{2}\asymp\log d$ from Lemma A.11 gives

|V_{Z,s}(t)-e_{s}(t)|\leq C\rho_{d}\log d\,e_{s}(t).

(60)

Next compare $e_{s}(t)$ with $\lambda(t)^{s}/s!$ . Writing

\lambda(t)^{s}=\sum_{j_{1},\dots,j_{s}=1}^{d}p_{j_{1}}(t)\cdots p_{j_{s}}(t)

and separating the terms with repeated indices, we obtain

\left|\frac{\lambda(t)^{s}}{s!}-e_{s}(t)\right|\leq C_{s}\lambda(t)^{s-2}\sum_{j=1}^{d}p_{j}(t)^{2}.

(61)

Because $\lambda(t)$ stays in a compact interval by Lemma A.11 and

\max_{j}p_{j}(t)\leq\bar{\Phi}\!\left(\frac{t}{\overline{\sigma}}\right)\leq Cd^{-a_{\sigma}}(\log d)^{-1/2},

we obtain

\sum_{j=1}^{d}p_{j}(t)^{2}\leq\lambda(t)\max_{j}p_{j}(t)\leq Cd^{-a_{\sigma}}(\log d)^{-1/2}.

Combining this with (60) and (61) proves (58).
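The comparison (61) can be checked numerically. In the sketch below, the elementary symmetric polynomial is computed by a standard dynamic-programming recursion, and the gap $\lambda^{s}/s!-e_{s}$ is compared with the explicit constant $C_{s}=1/\{2(s-2)!\}$ obtained from a union bound over the $\binom{s}{2}$ pairs of coinciding indices. The function names and this particular choice of $C_{s}$ are assumptions of the sketch, not notation from the paper.

```python
import math

def elementary_symmetric(p, s):
    """e_s(p_1, ..., p_d): coefficient of z^s in prod_j (1 + p_j z)."""
    e = [1.0] + [0.0] * s
    for pj in p:
        # update high degrees first so that each p_j enters a product at most once
        for r in range(s, 0, -1):
            e[r] += pj * e[r - 1]
    return e[s]

def poisson_gap(p, s):
    """lambda^s / s! - e_s; nonnegative because e_s keeps only distinct indices."""
    lam = sum(p)
    return lam ** s / math.factorial(s) - elementary_symmetric(p, s)

def repeated_index_bound(p, s):
    """lambda^(s-2) * sum_j p_j^2 / (2 (s-2)!), valid for s >= 2."""
    lam = sum(p)
    return lam ** (s - 2) * sum(x * x for x in p) / (2 * math.factorial(s - 2))

# illustrative exceedance probabilities, not taken from the paper
p_example = [0.001 * (1 + (j % 5)) for j in range(400)]
for s in range(2, 7):
    print(s, poisson_gap(p_example, s), repeated_index_bound(p_example, s))
```

For $s=2$ the bound holds with equality, since $\lambda^{2}/2-e_{2}=\tfrac{1}{2}\sum_{j}p_{j}^{2}$.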

Now apply Lemma A.1 to $N_{Z}(t)$ :

G_{k}(t)=1-\sum_{s=k}^{\infty}(-1)^{s-k}\binom{s-1}{k-1}V_{Z,s}(t).

The same identity with $V_{Z,s}(t)$ replaced by $\lambda(t)^{s}/s!$ equals $h_{k}(\lambda(t))$ . Using (58) for $s\leq k_{0}$ and the Gaussian tail bound from Lemma A.15(ii) below for $s>k_{0}$ gives (59). ∎
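The claim that the weighted series with $\lambda^{s}/s!$ in place of $V_{Z,s}(t)$ equals $h_{k}(\lambda)$ is easy to confirm numerically. The sketch below truncates the alternating series at a hypothetical cutoff `s_max` and compares it with the Poisson distribution function; the names are illustrative.

```python
import math

def h_k(lam: float, k: int) -> float:
    """Poisson(lam) probability of at most k - 1 events."""
    return math.exp(-lam) * sum(lam ** m / math.factorial(m) for m in range(k))

def poisson_inclusion_exclusion(lam: float, k: int, s_max: int = 80) -> float:
    """1 - sum_{s=k}^{s_max} (-1)^(s-k) binom(s-1, k-1) lam^s / s!."""
    total = 0.0
    for s in range(k, s_max + 1):
        sign = -1.0 if (s - k) % 2 else 1.0
        total += sign * math.comb(s - 1, k - 1) * lam ** s / math.factorial(s)
    return 1.0 - total

for lam in (0.5, 1.0, 2.0, 3.0):
    for k in (1, 2, 3):
        print(lam, k, poisson_inclusion_exclusion(lam, k), h_k(lam, k))
```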

Lemma A.15 (Weighted aggregation and Gaussian regularity).

Assume Assumptions 2.3 and 2.4. Then the following hold.

(i)

There exist constants $0<\lambda_{-}<\lambda_{+}<\infty$ such that

$\lambda_{-}\leq\lambda(t)\leq\lambda_{+}\qquad\text{for every }t\in\mathcal{T}_{k,\epsilon}$

for all sufficiently large $d$ .
(ii)

If $A>0$ is large enough in the definition of $k_{0}$ , then

$\sup_{t\in\mathcal{T}_{k,\epsilon}}\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}M_{Z,s}(t)\leq C,$ (62)

and

$\sup_{t\in\mathcal{T}_{k,\epsilon}}\sum_{s=k_{0}+1}^{d}\binom{s-1}{k-1}M_{Z,s}(t)\leq C\varepsilon_{n}^{2}.$ (63)

(iii)

$G_{k}$ is $C^{2}$ on a neighborhood of $\mathcal{T}_{k,\epsilon}$ , and there exist constants $m_{k,\epsilon},B_{k,\epsilon}>0$ such that

f_{k}(c^{G}_{p,k})\geq m_{k,\epsilon},\qquad\bigl|(G_{k}^{-1})^{\prime\prime}(p)\bigr|\leq B_{k,\epsilon}\qquad\text{for }p\in[\epsilon/2,1-\epsilon/2].

(64)

Proof.

Part (i) was already proved in the proof of Lemma A.11. For part (ii), use Lemma A.14:

M_{Z,s}(t)=V_{Z,s}(t)=\frac{\lambda(t)^{s}}{s!}+O(\eta_{d})\qquad(1\leq s\leq k_{0}).

Since $\lambda(t)\leq\lambda_{+}$ and $k$ is fixed,

\sum_{s=k}^{\infty}\binom{s-1}{k-1}\frac{\lambda_{+}^{s}}{s!}<\infty.

Therefore (62) follows. For the tail, use

M_{Z,s}(t)=V_{Z,s}(t)\leq C\frac{\lambda_{+}^{s}}{s!}

for every $s\geq 1$ , and then Stirling’s formula gives

\sum_{s=k_{0}+1}^{\infty}\binom{s-1}{k-1}\frac{\lambda_{+}^{s}}{s!}\leq C\exp(-ck_{0}\log k_{0}).

Choosing $A$ large enough yields (63).
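The super-exponential decay of the weighted tail is visible already for moderate cutoffs. The following sketch evaluates the tail sum for a hypothetical $\lambda_{+}=2$ and $k=2$; the cutoffs used are illustrative and are not the paper's $k_{0}$.

```python
import math

def weighted_tail(lam_plus: float, k: int, k0: int, n_terms: int = 150) -> float:
    """sum_{s > k0} binom(s-1, k-1) lam_plus^s / s!, truncated after n_terms terms."""
    term = lam_plus ** k0 / math.factorial(k0)  # running value of lam_plus^s / s!
    total = 0.0
    for s in range(k0 + 1, k0 + 1 + n_terms):
        term *= lam_plus / s                    # update without forming huge factorials
        total += math.comb(s - 1, k - 1) * term
    return total

for k0 in (5, 10, 20):
    print(k0, weighted_tail(2.0, 2, k0))
```

The ratio update avoids converting very large factorials to floating point, which would overflow for large $s$.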

For part (iii), write

H_{k}(t):=h_{k}(\lambda(t)).

On the compact interval $[\lambda_{-},\lambda_{+}]$ one has

h_{k}^{\prime}(\lambda)=-e^{-\lambda}\frac{\lambda^{k-1}}{(k-1)!},\qquad\inf_{\lambda\in[\lambda_{-},\lambda_{+}]}|h_{k}^{\prime}(\lambda)|>0.

Moreover,

\lambda^{\prime}(t)=-\sum_{j=1}^{d}\frac{1}{\sigma_{j}}\phi\!\left(\frac{t}{\sigma_{j}}\right),\qquad\lambda^{\prime\prime}(t)=\sum_{j=1}^{d}\frac{t}{\sigma_{j}^{3}}\phi\!\left(\frac{t}{\sigma_{j}}\right).

By Mills’ ratio and Lemma A.11,

|\lambda^{\prime}(t)|\asymp t,\qquad|\lambda^{\prime\prime}(t)|\leq C(1+t^{2})\qquad\text{on }\mathcal{T}_{k,\epsilon}.

Hence

H_{k}^{\prime}(t)=h_{k}^{\prime}(\lambda(t))\lambda^{\prime}(t),\qquad H_{k}^{\prime\prime}(t)=h_{k}^{\prime\prime}(\lambda(t))(\lambda^{\prime}(t))^{2}+h_{k}^{\prime}(\lambda(t))\lambda^{\prime\prime}(t)

with

|H_{k}^{\prime}(t)|\asymp t,\qquad|H_{k}^{\prime\prime}(t)|\leq C(1+t^{2}).

(65)

Differentiating the factorial expansion termwise and using the same argument as in Lemma A.14,

|G_{k}^{\prime}(t)-H_{k}^{\prime}(t)|+|G_{k}^{\prime\prime}(t)-H_{k}^{\prime\prime}(t)|\leq C\eta_{d}(1+t^{2})\qquad\text{uniformly for }t\in\mathcal{T}_{k,\epsilon}.

(66)

Since $\eta_{d}\to 0$ and $t\asymp\sqrt{\log d}$ , (65) and (66) imply

f_{k}(t)=G_{k}^{\prime}(t)\geq ct\geq m_{k,\epsilon}>0\qquad\text{for }t\in\mathcal{T}_{k,\epsilon}

for all sufficiently large $d$ . Finally,

(G_{k}^{-1})^{\prime\prime}(p)=-\frac{f_{k}^{\prime}(G_{k}^{-1}(p))}{f_{k}(G_{k}^{-1}(p))^{3}}

and (65)–(66) imply the asserted boundedness. ∎
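The inverse-function formula for $(G_{k}^{-1})^{\prime\prime}$ used in the last step can be checked on any smooth distribution function with an explicit inverse. The sketch below uses the logistic distribution function as a stand-in for $G_{k}$. This is purely an assumption made for illustration, since $G_{k}$ itself has no closed form. It compares the formula with a central finite difference.

```python
import math

def G(t: float) -> float:
    """Logistic distribution function (stand-in for G_k)."""
    return 1.0 / (1.0 + math.exp(-t))

def f(t: float) -> float:
    """Density G'."""
    g = G(t)
    return g * (1.0 - g)

def f_prime(t: float) -> float:
    """Second derivative G''."""
    g = G(t)
    return g * (1.0 - g) * (1.0 - 2.0 * g)

def G_inv(p: float) -> float:
    """Explicit quantile function of the logistic law."""
    return math.log(p / (1.0 - p))

def inverse_second_derivative(p: float) -> float:
    """-f'(G^{-1}(p)) / f(G^{-1}(p))^3."""
    c = G_inv(p)
    return -f_prime(c) / f(c) ** 3

def inverse_second_derivative_fd(p: float, h: float = 1e-4) -> float:
    """Central second difference of the quantile function."""
    return (G_inv(p + h) - 2.0 * G_inv(p) + G_inv(p - h)) / h ** 2

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, inverse_second_derivative(p), inverse_second_derivative_fd(p))
```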

A.6 Factorial-moment and distribution expansions

Theorem A.1 (Factorial-moment expansion).

Assume Assumptions 2.1–2.4. Then

\sup_{t\in\mathcal{T}_{k,\epsilon}}\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}|V_{n,s}(t)-M_{n,s}(t)|\leq C\varepsilon_{n}^{2},

(67)

and, with probability at least $1-C/n$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}|V^{*}_{n,s}(t)-\hat{M}_{n,s,\gamma}(t)|\leq C\varepsilon_{n}^{2}.

(68)

Comment. Theorem A.1 converts the local projected Edgeworth expansions into a weighted approximation for the factorial moments of the exceedance count. This is the combinatorial bridge from rare orthant probabilities to the law of the $k$ th largest coordinate.

Proof.

For every integer $s\geq 1$ ,

V_{n,s}(t)=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\mathbb{P}\bigl(\bm{S}_{n,I}\in(t,\infty)^{s}\bigr),\qquad M_{n,s}(t)=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}p_{n,I}(u)\,du.

Hence

	$\displaystyle|V_{n,s}(t)-M_{n,s}(t)|$	$\displaystyle\leq\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\left|\mathbb{P}\bigl(\bm{S}_{n,I}\in(t,\infty)^{s}\bigr)-\int_{(t,\infty)^{s}}p_{n,I}(u)\,du\right|$
		$\displaystyle\leq C\varepsilon_{n}^{2}\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\pi_{I}(t)=C\varepsilon_{n}^{2}M_{Z,s}(t)$

by Proposition A.1. Summing with the weights $\binom{s-1}{k-1}$ and using (62) gives (67).

For the bootstrap statement, work on the event from (43). Then

V_{n,s}^{*}(t)=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\mathbb{P}^{*}\bigl(\bm{S}_{n,I}^{*}\in(t,\infty)^{s}\bigr),\qquad\hat{M}_{n,s,\gamma}(t)=\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}\hat{p}_{n,\gamma,I}(u)\,du,

and therefore

	$\displaystyle|V_{n,s}^{*}(t)-\hat{M}_{n,s,\gamma}(t)|$	$\displaystyle\leq\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\left|\mathbb{P}^{*}\bigl(\bm{S}_{n,I}^{*}\in(t,\infty)^{s}\bigr)-\int_{(t,\infty)^{s}}\hat{p}_{n,\gamma,I}(u)\,du\right|$
		$\displaystyle\leq C\varepsilon_{n}^{2}\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\pi_{I}(t)=C\varepsilon_{n}^{2}M_{Z,s}(t).$

Summing again with the weights $\binom{s-1}{k-1}$ and using (62) proves (68). ∎

Theorem A.2 (Distribution expansion).

Assume Assumptions 2.1–2.4. Then

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}(T_{n,[k]}\leq t)-\bigl(G_{k}(t)+Q_{n,k}(t)\bigr)\right|\leq C\varepsilon_{n}^{2},

(69)

and, with probability at least $1-C/n$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}^{*}(T_{n,[k]}^{*}\leq t)-\bigl(G_{k}(t)+\hat{Q}_{n,\gamma,k}(t)\bigr)\right|\leq C\varepsilon_{n}^{2}.

(70)

Moreover,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\Bigl(|Q_{n,k}(t)|+|Q_{n,k}^{\prime}(t)|+|Q_{n,k}^{\prime\prime}(t)|\Bigr)\leq C\varepsilon_{n},

(71)

and, with probability at least $1-C/n$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\Bigl(|\hat{Q}_{n,\gamma,k}(t)|+|\hat{Q}_{n,\gamma,k}^{\prime}(t)|+|\hat{Q}_{n,\gamma,k}^{\prime\prime}(t)|\Bigr)\leq C\varepsilon_{n}.

(72)

Comment. Theorem A.2 upgrades the factorial-moment approximation to a distributional expansion for $T_{n,[k]}$ and its bootstrap analogue. It also shows that the correction term $Q_{n,k}$ is smooth enough for the quantile inversion carried out later.

Proof.

By Lemma A.1,

\mathbb{P}(T_{n,[k]}>t)=\sum_{s=k}^{d}(-1)^{s-k}\binom{s-1}{k-1}V_{n,s}(t).

(73)
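The weighted inclusion–exclusion identity (73) is a purely combinatorial statement about the exceedance count $N$, since $\{T_{n,[k]}>t\}=\{N\geq k\}$ and $V_{n,s}(t)=\mathbb{E}\binom{N}{s}$. The sketch below verifies it in exact rational arithmetic for a small, arbitrary count distribution; the distribution and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def factorial_moment(pmf, s):
    """V_s = E[binom(N, s)] for a count N with probability mass function pmf."""
    return sum(prob * comb(n, s) for n, prob in pmf.items())

def prob_at_least_k(pmf, k):
    """P(N >= k) via sum_{s >= k} (-1)^(s-k) binom(s-1, k-1) V_s."""
    d = max(pmf)  # binom(N, s) vanishes for s > max N, so the sum is finite
    return sum((-1) ** (s - k) * comb(s - 1, k - 1) * factorial_moment(pmf, s)
               for s in range(k, d + 1))

# arbitrary toy distribution of the exceedance count (not from the paper)
toy_pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 4: Fraction(1, 8)}

for k in (1, 2, 3, 4):
    direct = sum(prob for n, prob in toy_pmf.items() if n >= k)
    print(k, prob_at_least_k(toy_pmf, k), direct)
```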

Split the right-hand side at $k_{0}$ . For $k\leq s\leq k_{0}$ , Theorem A.1 gives

V_{n,s}(t)=M_{n,s}(t)+r_{n,s}(t),\qquad|r_{n,s}(t)|\leq C\varepsilon_{n}^{2}M_{Z,s}(t).

(74)

Substituting (74) into (73), and using

\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}M_{Z,s}(t)=\mathbb{P}(T_{\bm{Z},[k]}>t)+O(\varepsilon_{n}^{2})

from (63), we obtain

\mathbb{P}(T_{n,[k]}\leq t)=G_{k}(t)-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\{M_{n,s}(t)-M_{Z,s}(t)\}+O(\varepsilon_{n}^{2}).

The sum equals $Q_{n,k}(t)$ by (1), so (69) follows.

For the bootstrap expansion, work on the event (68). On that event,

	$\displaystyle\mathbb{P}^{*}(T_{n,[k]}^{*}\leq t)$	$\displaystyle=G_{k}(t)-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\{V_{n,s}^{*}(t)-V_{Z,s}(t)\}+O(\varepsilon_{n}^{2})$
		$\displaystyle=G_{k}(t)-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\{\hat{M}_{n,s,\gamma}(t)-M_{Z,s}(t)\}+O(\varepsilon_{n}^{2}),$

uniformly on $\mathcal{T}_{k,\epsilon}$ , which is exactly (70).

It remains to prove the derivative bounds. Fix $s\in\{k,\dots,k_{0}\}$ . By (19),

M_{n,s}(t)-M_{Z,s}(t)=-\frac{1}{6\sqrt{n}}\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}\left\langle\mathbb{E}[\bar{\bm{X}}_{I}^{\,3}],\nabla^{3}\phi_{I}(u)\right\rangle\,du.

(75)

Since $\|\mathbb{E}[\bar{\bm{X}}_{I}^{\,3}]\|_{\infty}\leq C$ uniformly for $|I|\leq k_{0}$ ,

|M_{n,s}(t)-M_{Z,s}(t)|\leq\frac{C(1+t)^{3}}{\sqrt{n}}M_{Z,s}(t)\leq C\varepsilon_{n}M_{Z,s}(t),

(76)

where the first inequality follows from the Gaussian derivative bound

|\partial^{\alpha}\phi_{I}(u)|\leq C_{\alpha}(1+\|u\|_{\infty}^{3})\phi_{I}(u),\qquad|\alpha|=3,

and the second uses $t^{2}\asymp\log d$ .

Differentiate (75) with respect to $t$ . By the fundamental theorem of calculus, each derivative creates a finite sum of boundary integrals over $(s-1)$ -dimensional faces. Therefore

\displaystyle\frac{d}{dt}\{M_{n,s}(t)-M_{Z,s}(t)\}

\displaystyle=\frac{1}{6\sqrt{n}}\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\sum_{r=1}^{s}\int_{(t,\infty)^{s-1}}\left\langle\mathbb{E}[\bar{\bm{X}}_{I}^{\,3}],\nabla^{3}\phi_{I}(u^{(r,t)})\right\rangle\,du_{-r},

hence

\left|\frac{d}{dt}\{M_{n,s}(t)-M_{Z,s}(t)\}\right|\leq C\varepsilon_{n}M_{Z,s}(t).

Differentiating once more produces second-face integrals and diagonal boundary terms. The same Gaussian derivative estimate and the strip estimate of Lemma A.13 imply

\left|\frac{d^{2}}{dt^{2}}\{M_{n,s}(t)-M_{Z,s}(t)\}\right|\leq C\varepsilon_{n}M_{Z,s}(t).

(77)

Summing (76)–(77) with the weights in (1) and using (62) proves (71).

For the bootstrap derivative bounds, work on $\Omega_{n,X}$ from Lemma A.9. By (20),

	$\displaystyle\hat{M}_{n,s,\gamma}(t)-M_{Z,s}(t)$	$\displaystyle=\frac{1}{2}\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}\left\langle\bar{\bm{b}}_{I}^{\,2}-\Sigma_{II},\nabla^{2}\phi_{I}(u)\right\rangle\,du$
		$\displaystyle\qquad-\frac{\gamma}{6\sqrt{n}}\sum_{\begin{subarray}{c}I\subset[d]\\ |I|=s\end{subarray}}\int_{(t,\infty)^{s}}\left\langle\bar{\bm{b}}_{I}^{\,3},\nabla^{3}\phi_{I}(u)\right\rangle\,du.$

Using (37)–(38), together with the Gaussian derivative estimates for $|\alpha|=2,3$ and the same boundary differentiation as above, yields

|\hat{M}_{n,s,\gamma}(t)-M_{Z,s}(t)|+\left|\frac{d}{dt}\{\hat{M}_{n,s,\gamma}(t)-M_{Z,s}(t)\}\right|+\left|\frac{d^{2}}{dt^{2}}\{\hat{M}_{n,s,\gamma}(t)-M_{Z,s}(t)\}\right|\leq C\varepsilon_{n}M_{Z,s}(t)

uniformly on $\Omega_{n,X}$ . Summing with the weights in (1) proves (72). ∎

A.7 Bootstrap centering and Cornish–Fisher inversion

Lemma A.16 (Bootstrap centering).

Assume Assumptions 2.1–2.4. Then

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{E}\bigl[\hat{Q}_{n,\gamma,k}(t)\bigr]-\gamma Q_{n,k}(t)\right|\leq C\varepsilon_{n}^{2}.

(78)

Proof.

Fix $s\in\{k,\dots,k_{0}\}$ . For every $I$ with $|I|=s$ ,

	$\displaystyle\mathbb{E}\bigl[\bar{\bm{b}}_{I}^{\,2}\bigr]-\mathbf{\Sigma}_{II}$	$\displaystyle=-\frac{1}{n}\mathbf{\Sigma}_{II},$		(79)
	$\displaystyle\left\|\mathbb{E}\bigl[\bar{\bm{b}}_{I}^{\,3}\bigr]-\mathbb{E}\bigl[\bar{\bm{X}}_{I}^{\,3}\bigr]\right\|_{\max}$	$\displaystyle\leq\frac{C}{n}.$		(80)

Indeed, (79) is the usual bias of the sample covariance, and (80) follows by expanding $(\bm{P}_{I}\bm{X}_{i}-\bm{P}_{I}\bar{\bm{X}})^{\otimes 3}$ and observing that every difference term contains at least one factor $\bar{\bm{X}}$ .
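The bias identity (79) can be confirmed exactly in one dimension by enumerating all samples from a small discrete law. The sketch below assumes, consistently with (79), that the sample variance is normalized by $1/n$ and centered at the sample mean; the two-point law and the function name are illustrative.

```python
from fractions import Fraction
from itertools import product

def expected_sample_variance(support, probs, n):
    """Exact E[(1/n) sum_i (X_i - Xbar)^2] for i.i.d. draws from a finite law."""
    total = Fraction(0)
    for idx in product(range(len(support)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]              # probability of this sample
        xs = [support[i] for i in idx]
        mean = sum(xs, Fraction(0)) / n
        total += weight * sum((x - mean) ** 2 for x in xs) / n
    return total

# symmetric two-point law with mean 0 and variance 1 (illustrative)
support = [Fraction(-1), Fraction(1)]
probs = [Fraction(1, 2), Fraction(1, 2)]
for n in (2, 3, 4):
    print(n, expected_sample_variance(support, probs, n))  # equals 1 - 1/n
```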

Now integrate (20) over $(t,\infty)^{s}$ , take expectations, subtract $\gamma\{M_{n,s}(t)-M_{Z,s}(t)\}$ , and use (79)–(80). By Lemma A.13,

\left|\mathbb{E}\bigl[\hat{M}_{n,s,\gamma}(t)-M_{Z,s}(t)\bigr]-\gamma\{M_{n,s}(t)-M_{Z,s}(t)\}\right|\leq C\left(\frac{(1+t)^{2}}{n}+\frac{(1+t)^{3}}{n^{3/2}}\right)M_{Z,s}(t).

Since $t\asymp\sqrt{\log d}$ on $\mathcal{T}_{k,\epsilon}$ , the right-hand side is $O(\varepsilon_{n}^{2}M_{Z,s}(t))$ . Summing over $s$ with the weights in (1) and using (62) proves (78). ∎

Theorem A.3 (Cornish–Fisher expansion).

Assume Assumptions 2.1–2.4. Then, with probability at least $1-C/n$ ,

\sup_{\epsilon<\alpha<1-\epsilon}\left|\hat{c}_{1-\alpha,k}-\left[c^{G}_{1-\alpha,k}-\frac{\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k})}{f_{k}(c^{G}_{1-\alpha,k})}+R_{n,k}(\alpha)\right]\right|\leq C\varepsilon_{n}^{3}.

(81)

Comment. Theorem A.3 identifies the bootstrap critical value as a Gaussian quantile perturbed by an explicit linear term and a quadratic correction. This is the quantile-level expansion needed to turn the distributional approximation into a coverage expansion.

Proof.

Fix $\alpha\in(\epsilon,1-\epsilon)$ and abbreviate

c_{k}:=c^{G}_{1-\alpha,k},\qquad\hat{Q}_{k}:=\hat{Q}_{n,\gamma,k}(c_{k}),\qquad\hat{Q}_{k}^{\prime}:=\hat{Q}_{n,\gamma,k}^{\prime}(c_{k}).

On the event of (70) and (72),

\hat{F}_{n,k}(t)=G_{k}(t)+\hat{Q}_{n,\gamma,k}(t)+r_{n}(t),\qquad\sup_{t\in\mathcal{T}_{k,\epsilon}}|r_{n}(t)|\leq C\varepsilon_{n}^{2}.

(82)

Because $G_{k}(c_{k})=1-\alpha$ and $f_{k}(c_{k})\geq m_{k,\epsilon}>0$ , the implicit function theorem yields a unique root $\hat{c}_{1-\alpha,k}=c_{k}+\Delta_{k}$ with $|\Delta_{k}|\leq C\varepsilon_{n}$ . Substituting $t=c_{k}+\Delta_{k}$ into (82) and using Taylor’s formula up to order $2$ gives

	$\displaystyle 0$	$\displaystyle=\hat{F}_{n,k}(c_{k}+\Delta_{k})-(1-\alpha)$
		$\displaystyle=f_{k}(c_{k})\Delta_{k}+\frac{1}{2}f_{k}^{\prime}(c_{k})\Delta_{k}^{2}+\hat{Q}_{k}+\hat{Q}_{k}^{\prime}\Delta_{k}+\frac{1}{2}\hat{Q}_{n,\gamma,k}^{\prime\prime}(\xi_{k})\Delta_{k}^{2}+r_{n}(c_{k}+\Delta_{k})$		(83)

for some $\xi_{k}$ between $c_{k}$ and $c_{k}+\Delta_{k}$ . Since $\hat{Q}_{n,\gamma,k}^{\prime\prime}(\xi_{k})=O(\varepsilon_{n})$ by (72), the last quadratic term in (83) is $O(\varepsilon_{n}^{3})$ . Solving (83) iteratively,

\Delta_{k}=-\frac{\hat{Q}_{k}}{f_{k}(c_{k})}-\frac{f_{k}^{\prime}(c_{k})}{2f_{k}(c_{k})^{3}}\hat{Q}_{k}^{2}+\frac{\hat{Q}_{k}^{\prime}}{f_{k}(c_{k})^{2}}\hat{Q}_{k}+O(\varepsilon_{n}^{3}).

This is exactly (81); compare also the classical Cornish–Fisher inversion formulas in Hall (1992, Chapter 2). ∎
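The iterative solution of (83) can be illustrated on a toy problem. In the sketch below the logistic distribution function plays the role of $G_{k}$ and $Q(t)=\varepsilon\,G(t)\{1-G(t)\}$ plays the role of $\hat{Q}_{n,\gamma,k}$. Both are stand-ins chosen only because everything is explicit, and the $\hat{Q}^{\prime\prime}$ and remainder terms of (83) are dropped as higher order. The exact root is found by bisection and compared with one and two steps of the iteration.

```python
import math

def G(t): return 1.0 / (1.0 + math.exp(-t))

def f(t):
    g = G(t)
    return g * (1.0 - g)

def f_prime(t):
    g = G(t)
    return g * (1.0 - g) * (1.0 - 2.0 * g)

def Q(t, eps): return eps * f(t)              # hypothetical correction term
def Q_prime(t, eps): return eps * f_prime(t)

def exact_shift(p, eps):
    """Root of G + Q = p by bisection, minus the unperturbed quantile."""
    c = math.log(p / (1.0 - p))
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if G(mid) + Q(mid, eps) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - c

def approximate_shifts(p, eps):
    """First- and second-order solutions of the truncated version of (83)."""
    c = math.log(p / (1.0 - p))
    first = -Q(c, eps) / f(c)
    second = -(Q(c, eps) + Q_prime(c, eps) * first
               + 0.5 * f_prime(c) * first ** 2) / f(c)
    return first, second

first, second = approximate_shifts(0.9, 0.01)
print(exact_shift(0.9, 0.01), first, second)
```

With $\varepsilon=0.01$ the first-order term is accurate to order $\varepsilon^{2}$, and the second iterate reduces the error by roughly two further orders of magnitude in this example.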

A.8 Coverage expansion

Proof of Theorem 2.1.

Fix $\alpha\in(\epsilon,1-\epsilon)$ and write

c_{k}:=c^{G}_{1-\alpha,k},\qquad F_{n,k}(t):=\mathbb{P}(T_{n,[k]}\leq t).

By (69) and (71),

F_{n,k}(t)=G_{k}(t)+Q_{n,k}(t)+r_{n}(t),\qquad\sup_{t\in\mathcal{T}_{k,\epsilon}}|r_{n}(t)|\leq C\varepsilon_{n}^{2}.

(84)

Let $E_{n}$ denote the event on which the Cornish–Fisher expansion (81) holds and $\hat{c}_{1-\alpha,k}\in\mathcal{T}_{k,\epsilon/2}$ . Then

\mathbb{P}(E_{n}^{c})\leq\frac{C}{n}\leq C\varepsilon_{n}^{2}.

On $E_{n}$ define

\Delta_{k}:=\hat{c}_{1-\alpha,k}-c_{k}.

By Theorem A.3,

\Delta_{k}=-\frac{\hat{Q}_{n,\gamma,k}(c_{k})}{f_{k}(c_{k})}+R_{n,k}(\alpha)+\zeta_{n,k}(\alpha),\qquad|\zeta_{n,k}(\alpha)|\leq C\varepsilon_{n}^{3}.

(85)

Also $|\Delta_{k}|\leq C\varepsilon_{n}$ on $E_{n}$ . Since $F_{n,k}$ is deterministic, Taylor’s formula on $E_{n}$ gives

\displaystyle F_{n,k}(\hat{c}_{1-\alpha,k})

\displaystyle=F_{n,k}(c_{k})+F_{n,k}^{\prime}(c_{k})\Delta_{k}+\frac{1}{2}F_{n,k}^{\prime\prime}(\xi_{n,k})\Delta_{k}^{2}

(86)

for some $\xi_{n,k}$ between $c_{k}$ and $\hat{c}_{1-\alpha,k}$ . From (84), (71), and (64),

F_{n,k}(c_{k})=(1-\alpha)+Q_{n,k}(c_{k})+O(\varepsilon_{n}^{2}),

F_{n,k}^{\prime}(c_{k})=f_{k}(c_{k})+Q_{n,k}^{\prime}(c_{k})+O(\varepsilon_{n}^{2}),

and

F_{n,k}^{\prime\prime}(\xi_{n,k})=f_{k}^{\prime}(c_{k})+O(1).

Substituting these bounds and (85) into (86), using $Q_{n,k}^{\prime}(c_{k})=O(\varepsilon_{n})$ and $\Delta_{k}=O(\varepsilon_{n})$ , yields on $E_{n}$ ,

\displaystyle F_{n,k}(\hat{c}_{1-\alpha,k})

\displaystyle=(1-\alpha)+Q_{n,k}(c_{k})-\hat{Q}_{n,\gamma,k}(c_{k})+R_{n,k}(\alpha)+O(\varepsilon_{n}^{2}).

(87)

Now take expectations. Since $0\leq F_{n,k}(\hat{c}_{1-\alpha,k})\leq 1$ ,

\left|\mathbb{E}\bigl[F_{n,k}(\hat{c}_{1-\alpha,k})\mathbf{1}_{E_{n}^{c}}\bigr]\right|\leq\mathbb{P}(E_{n}^{c})\leq C\varepsilon_{n}^{2}.

Therefore

\mathbb{P}(T_{n,[k]}\leq\hat{c}_{1-\alpha,k})=\mathbb{E}\bigl[F_{n,k}(\hat{c}_{1-\alpha,k})\mathbf{1}_{E_{n}}\bigr]+O(\varepsilon_{n}^{2}).

(88)

Taking expectations in (87) and using Lemma A.16,

\mathbb{E}\bigl[Q_{n,k}(c_{k})-\hat{Q}_{n,\gamma,k}(c_{k})\bigr]=(1-\gamma)Q_{n,k}(c_{k})+O(\varepsilon_{n}^{2}).

Combining this with (88) gives

\mathbb{P}(T_{n,[k]}\leq\hat{c}_{1-\alpha,k})=(1-\alpha)+(1-\gamma)Q_{n,k}(c_{k})+\mathbb{E}\{R_{n,k}(\alpha)\}+O(\varepsilon_{n}^{2}).

Taking complements proves (2). ∎

Proof of Corollary 2.1.

If $\gamma=1$ , the first-order term disappears in Theorem 2.1. Also,

|R_{n,k}(\alpha)|\leq C\bigl(|\hat{Q}_{n,\gamma,k}(c_{k})|^{2}+|\hat{Q}_{n,\gamma,k}^{\prime}(c_{k})|\,|\hat{Q}_{n,\gamma,k}(c_{k})|\bigr)\leq C\varepsilon_{n}^{2}

by (72), hence $\mathbb{E}|R_{n,k}(\alpha)|\leq C\varepsilon_{n}^{2}$ uniformly in $\alpha$ . ∎
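For concreteness, a third-moment matching multiplier law and the resulting bootstrap critical value can be sketched as follows. The sketch assumes that $\gamma$ is the third moment of the multipliers and that the bootstrap statistic is the $k$ th largest coordinate of $n^{-1/2}\sum_{i}w_{i}(\bm{X}_{i}-\bar{\bm{X}})$; the precise definitions are those of Section 2 and are not reproduced here. Mammen's two-point law is used as one standard example with $\mathbb{E}w=0$ and $\mathbb{E}w^{2}=\mathbb{E}w^{3}=1$, and all function names are hypothetical.

```python
import math
import random

SQRT5 = math.sqrt(5.0)
MAMMEN_VALUES = ((1.0 - SQRT5) / 2.0, (1.0 + SQRT5) / 2.0)
MAMMEN_PROBS = ((SQRT5 + 1.0) / (2.0 * SQRT5), (SQRT5 - 1.0) / (2.0 * SQRT5))

def mammen_moment(r: int) -> float:
    """r-th moment of Mammen's two-point multiplier law."""
    return sum(p * v ** r for v, p in zip(MAMMEN_VALUES, MAMMEN_PROBS))

def kth_largest(vec, k: int) -> float:
    """k-th largest coordinate (k = 1 is the maximum)."""
    return sorted(vec, reverse=True)[k - 1]

def wild_bootstrap_critical_value(X, k, alpha, n_boot, rng):
    """Empirical (1 - alpha)-quantile of the bootstrap k-th largest coordinate."""
    n, d = len(X), len(X[0])
    xbar = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - xbar[j] for j in range(d)] for row in X]
    stats = []
    for _ in range(n_boot):
        w = rng.choices(MAMMEN_VALUES, weights=MAMMEN_PROBS, k=n)
        t_star = [sum(w[i] * centered[i][j] for i in range(n)) / math.sqrt(n)
                  for j in range(d)]
        stats.append(kth_largest(t_star, k))
    stats.sort()
    idx = min(n_boot - 1, math.ceil((1.0 - alpha) * n_boot) - 1)
    return stats[idx]
```

A call such as `wild_bootstrap_critical_value(X, 2, 0.1, 200, random.Random(1))` returns the critical value for the second largest coordinate at nominal level $0.1$. The number of bootstrap draws is a tuning choice of the sketch.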

Proof of Corollary 2.2.

The claim follows immediately from Theorem 2.1 and the uniform bound $\mathbb{E}|R_{n,k}(\alpha)|\leq C\varepsilon_{n}^{2}$ . ∎

A.9 Deterministic conditional theorem and double bootstrap

Theorem A.4 (Deterministic-array conditional theorem).

Let $\bm{a}_{1},\dots,\bm{a}_{n}\in\mathbb{R}^{d}$ be deterministic and define

\bm{T}_{n}(\bm{a}):=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_{i}\bm{a}_{i}.

Assume that for some constants $L_{n}$ and $r_{n}$ ,

\max_{1\leq i\leq n}\|\bm{a}_{i}\|_{\infty}\leq L_{n},

(89)

and for every $I\subset[d]$ with $1\leq|I|\leq k_{0}$ ,

\lambda_{\min}\!\left(\frac{1}{n}\sum_{i=1}^{n}\bm{P}_{I}\bm{a}_{i}(\bm{P}_{I}\bm{a}_{i})^{\top}\right)\geq\frac{1}{2}\sigma_{*}^{2},

(90)

\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{P}_{I}\bm{a}_{i}(\bm{P}_{I}\bm{a}_{i})^{\top}-\mathbf{\Sigma}_{II}\right\|_{\max}\leq r_{n}.

(91)

Then the conclusions of Theorems A.2, A.3, and 2.1 hold for the conditional law $\mathbb{P}_{v}(\cdot)$ of the $k$ th order statistic of $\bm{T}_{n}(\bm{a})$ , with constants uniform over all deterministic arrays satisfying (89)–(91) and with the same second-order rate $C\varepsilon_{n}^{2}$ .

Comment. Theorem A.4 isolates the deterministic conditions needed for the second bootstrap level. Once the first-level resample satisfies these array conditions, the same second-order expansion follows conditionally.

Proof.

Fix a deterministic array $\bm{a}_{1},\dots,\bm{a}_{n}$ satisfying (89)–(91). For every nonempty $I\subset[d]$ with $|I|=s\leq k_{0}$ , define

\bm{a}_{i,I}:=\bm{P}_{I}\bm{a}_{i},\qquad\bar{\bm{a}}_{I}^{\,2}:=\frac{1}{n}\sum_{i=1}^{n}\bm{a}_{i,I}\bm{a}_{i,I}^{\top},\qquad\bar{\bm{a}}_{I}^{\,3}:=\frac{1}{n}\sum_{i=1}^{n}\bm{a}_{i,I}^{\otimes 3}.

Let

\bm{T}_{n,I}(\bm{a}):=\bm{P}_{I}\bm{T}_{n}(\bm{a})=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}v_{i}\bm{a}_{i,I}.

Because $v_{i}$ satisfies the same regularity condition as Assumption 2.2, the projected summand

\xi_{i,I}^{\bm{a}}:=\frac{1}{\sqrt{n}}v_{i}\bm{a}_{i,I}

has exact Stein kernel

\tau_{i,I}^{\bm{a}}(\xi_{i,I}^{\bm{a}})=\frac{1}{n}\tau^{v}(v_{i})\bm{a}_{i,I}\bm{a}_{i,I}^{\top},

where $\tau^{v}$ denotes the scalar Stein kernel of $v_{i}$ (or $\tau^{v}\equiv 1$ in the Gaussian case). Hence

\sum_{i=1}^{n}\mathbb{E}_{v}[\tau_{i,I}^{\bm{a}}(\xi_{i,I}^{\bm{a}})]=\bar{\bm{a}}_{I}^{\,2},\qquad\mathbb{E}_{v}\left[\sum_{i=1}^{n}(\xi_{i,I}^{\bm{a}})^{\otimes 3}\right]=\frac{1}{\sqrt{n}}\bar{\bm{a}}_{I}^{\,3},\qquad\beta_{i,I}^{\bm{a}}\equiv 0.

Now define the projected deterministic-array Edgeworth density

p_{n,\bm{a},I}(u):=\phi_{I}(u)+\frac{1}{2}\left\langle\bar{\bm{a}}_{I}^{\,2}-\Sigma_{II},\nabla^{2}\phi_{I}(u)\right\rangle-\frac{1}{6\sqrt{n}}\left\langle\bar{\bm{a}}_{I}^{\,3},\nabla^{3}\phi_{I}(u)\right\rangle.

Fix $I\subset[d]$ with $1\leq|I|\leq k_{0}$ and write $s:=|I|$ . Set

\bm{T}_{n,I}(\bm{a}):=n^{-1/2}\sum_{i=1}^{n}v_{i}\bm{a}_{i,I},\qquad A_{t}^{-}:=(-\infty,-t]^{s},\qquad h_{t,u}(x):=\mathbb{E}\bigl[\mathbf{1}_{A_{t}^{-}}(\sqrt{1-u}\,x+\sqrt{u}\,Z_{I}^{-})\bigr].

Then

\mathbb{P}_{v}\bigl(\bm{T}_{n,I}(\bm{a})\in(t,\infty)^{s}\bigr)=\mathbb{P}_{v}\bigl(-\bm{T}_{n,I}(\bm{a})\in A_{t}^{-}\bigr).

For the deterministic array $\bm{a}$ , the proof of Proposition A.1 uses only the following inputs:

\max_{1\leq i\leq n}\|\bm{a}_{i}\|_{\infty}\leq L_{n},\qquad\left\|\bar{\bm{a}}_{I}^{\,2}-\Sigma_{II}\right\|_{\max}\leq r_{n},\qquad\lambda_{\min}(\bar{\bm{a}}_{I}^{\,2})\geq\frac{1}{2}\sigma_{*}^{2},

which are exactly (89)–(91). Therefore,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}_{v}\bigl(\bm{T}_{n,I}(\bm{a})\in(t,\infty)^{s}\bigr)-\int_{(t,\infty)^{s}}p_{n,\bm{a},I}(u)\,du\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t)

(92)

uniformly over all admissible deterministic arrays.

Starting from (92), the factorial-moment argument gives

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\left\{V_{n,s}^{(\bm{a})}(t)-M_{n,s}^{(\bm{a})}(t)\right\}\right|\leq C\varepsilon_{n}^{2},

where $V_{n,s}^{(\bm{a})}$ and $M_{n,s}^{(\bm{a})}$ are the conditional factorial moment and its first-order approximation built from $\bm{T}_{n}(\bm{a})$ . Substituting this bound into the weighted inclusion–exclusion formula gives the deterministic-array analogue of Theorem A.2. The Cornish–Fisher and coverage expansions then follow from the same algebraic steps as in Sections A.7–A.8 after replacing $Q_{n,k}$ by the corresponding deterministic-array first-order term. All constants remain uniform under (89)–(91). This proves the theorem. ∎

Lemma A.17 (The first-level bootstrap array satisfies the deterministic conditions).

Define

\bm{a}_{i}:=w_{i}(\bm{X}_{i}-\bar{\bm{X}})-\bar{\bm{X}}^{*},\qquad\bar{\bm{X}}^{*}:=\frac{1}{n}\sum_{r=1}^{n}w_{r}(\bm{X}_{r}-\bar{\bm{X}}).

Then there exists an event $\Omega_{n}$ such that

\mathbb{P}(\Omega_{n}^{c})\leq\frac{C}{n}

and, on $\Omega_{n}$ , the deterministic array $\bm{a}_{1},\dots,\bm{a}_{n}$ satisfies (89)–(91) with

L_{n}:=C\log^{2}(dn),\qquad r_{n}:=C\log^{2}(dn)\sqrt{\frac{\log(dn)}{n}}.

Proof.

Let $\Omega_{n,X}$ be the event from Lemma A.9, and let $\Omega_{n,w}$ be the event from Lemma A.10. Define

\Omega_{n,1}:=\Omega_{n,X}\cap\Omega_{n,w}.

Then

\mathbb{P}(\Omega_{n,1}^{c})\leq\frac{C}{n^{2}}.

(93)

On $\Omega_{n,1}$ ,

\max_{1\leq i\leq n}\|w_{i}(\bm{X}_{i}-\bar{\bm{X}})\|_{\infty}\leq\left(\max_{1\leq i\leq n}|w_{i}|\right)\left(\max_{1\leq i\leq n}\|\bm{X}_{i}-\bar{\bm{X}}\|_{\infty}\right)\leq C\log^{2}(dn).

(94)

Set

\bm{X}_{i}^{*}:=w_{i}(\bm{X}_{i}-\bar{\bm{X}}),\qquad\bar{\bm{X}}^{*}:=\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}^{*}.

We first bound $\bar{\bm{X}}^{*}$ . Conditional on the original data, the vectors $\bm{X}_{i}^{*}$ are independent and centered. On $\Omega_{n,1}$ , every coordinate satisfies

\left\|\frac{1}{n}X_{ij}^{*}\right\|_{\psi_{1}}\leq\frac{C\log(dn)}{n},

because either $w_{i}$ is bounded or $w_{i}$ is Gaussian, hence sub-Gaussian, and (36) holds on $\Omega_{n,X}$ . Apply Lemma D.10 of Koike (2026) conditionally with

Y_{i}:=\frac{1}{n}\bm{X}_{i}^{*},\qquad K=\frac{C\log(dn)}{n},\qquad\alpha=1,\qquad a=2.

Then, on $\Omega_{n,1}$ ,

\mathbb{P}\!\left(\|\bar{\bm{X}}^{*}\|_{\infty}>C\log(dn)\sqrt{\frac{\log(dn)}{n}}\ \middle|\ \bm{X}_{1},\dots,\bm{X}_{n}\right)\leq\frac{1}{n^{2}}.

(95)

Next, write

\hat{\mathbf{\Sigma}}_{w}:=\frac{1}{n}\sum_{i=1}^{n}\bm{X}_{i}^{*}\bm{X}_{i}^{*\top}=\frac{1}{n}\sum_{i=1}^{n}w_{i}^{2}(\bm{X}_{i}-\bar{\bm{X}})(\bm{X}_{i}-\bar{\bm{X}})^{\top}.

Then

\hat{\mathbf{\Sigma}}_{w}-\hat{\mathbf{\Sigma}}_{X}=\frac{1}{n}\sum_{i=1}^{n}(w_{i}^{2}-1)(\bm{X}_{i}-\bar{\bm{X}})(\bm{X}_{i}-\bar{\bm{X}})^{\top}.

(96)

Conditional on the original data, the summands in (96) are independent and centered. On $\Omega_{n,1}$ , each entry of the matrix

\frac{1}{n}(w_{i}^{2}-1)(\bm{X}_{i}-\bar{\bm{X}})(\bm{X}_{i}-\bar{\bm{X}})^{\top}

has conditional $\psi_{1}$ -norm at most

\frac{C\log^{2}(dn)}{n}.

Apply Lemma D.10 of Koike (2026) conditionally with

Y_{i}:=\frac{1}{n}(w_{i}^{2}-1)\mathrm{vec}\!\left((\bm{X}_{i}-\bar{\bm{X}})(\bm{X}_{i}-\bar{\bm{X}})^{\top}\right),\qquad K=\frac{C\log^{2}(dn)}{n},\qquad\alpha=1,\qquad a=2.

Then, on $\Omega_{n,1}$ ,

\mathbb{P}\!\left(\|\hat{\mathbf{\Sigma}}_{w}-\hat{\mathbf{\Sigma}}_{X}\|_{\max}>C\log^{2}(dn)\sqrt{\frac{\log(dn)}{n}}\ \middle|\ \bm{X}_{1},\dots,\bm{X}_{n}\right)\leq\frac{1}{n^{2}}.

(97)

Define the conditional events

\Omega_{n,2}:=\left\{\|\bar{\bm{X}}^{*}\|_{\infty}\leq C\log(dn)\sqrt{\frac{\log(dn)}{n}}\right\},

and

\Omega_{n,3}:=\left\{\|\hat{\mathbf{\Sigma}}_{w}-\hat{\mathbf{\Sigma}}_{X}\|_{\max}\leq C\log^{2}(dn)\sqrt{\frac{\log(dn)}{n}}\right\}.

Finally, set

\Omega_{n}:=\Omega_{n,1}\cap\Omega_{n,2}\cap\Omega_{n,3}.

Using (93), (95), and (97),

	$\displaystyle\mathbb{P}(\Omega_{n}^{c})$	$\displaystyle\leq\mathbb{P}(\Omega_{n,1}^{c})+\mathbb{E}\bigl[\mathbb{P}(\Omega_{n,2}^{c}\cup\Omega_{n,3}^{c}\mid\bm{X}_{1},\dots,\bm{X}_{n})\mathbf{1}_{\Omega_{n,1}}\bigr]$
		$\displaystyle\leq\frac{C}{n^{2}}+\frac{C}{n^{2}}\leq\frac{C}{n}.$

Now work on $\Omega_{n}$ . Recall

\bm{a}_{i}=\bm{X}_{i}^{*}-\bar{\bm{X}}^{*}.

By (94),

\|\bm{a}_{i}\|_{\infty}\leq\|\bm{X}_{i}^{*}\|_{\infty}+\|\bar{\bm{X}}^{*}\|_{\infty}\leq C\log^{2}(dn)=L_{n}.

Also,

\frac{1}{n}\sum_{i=1}^{n}\bm{a}_{i}\bm{a}_{i}^{\top}=\hat{\mathbf{\Sigma}}_{w}-\bar{\bm{X}}^{*}\bar{\bm{X}}^{*\top}.

Hence

	$\displaystyle\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{a}_{i}\bm{a}_{i}^{\top}-\mathbf{\Sigma}\right\|_{\max}$	$\displaystyle\leq\|\hat{\mathbf{\Sigma}}_{w}-\hat{\mathbf{\Sigma}}_{X}\|_{\max}+\|\hat{\mathbf{\Sigma}}_{X}-\mathbf{\Sigma}\|_{\max}+\|\bar{\bm{X}}^{*}\bar{\bm{X}}^{*\top}\|_{\max}$
		$\displaystyle\leq C\log^{2}(dn)\sqrt{\frac{\log(dn)}{n}}+Cb^{2}\sqrt{\frac{\log(dn)}{n}}+C\log^{2}(dn)\frac{\log(dn)}{n}$
		$\displaystyle\leq Cr_{n}.$

Therefore, for every $I$ with $1\leq|I|\leq k_{0}$ ,

\left\|\frac{1}{n}\sum_{i=1}^{n}\bm{P}_{I}\bm{a}_{i}(\bm{P}_{I}\bm{a}_{i})^{\top}-\mathbf{\Sigma}_{II}\right\|_{\max}\leq Cr_{n},

which proves (91) after enlarging the constant in $r_{n}$ .
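The centering identity $\frac{1}{n}\sum_{i}\bm{a}_{i}\bm{a}_{i}^{\top}=\hat{\mathbf{\Sigma}}_{w}-\bar{\bm{X}}^{*}\bar{\bm{X}}^{*\top}$ used in this step is elementary, and the following sketch confirms it numerically on simulated data. The sample size, dimension, Gaussian data, and Gaussian multipliers are arbitrary choices made for illustration.

```python
import random

def outer_mean(vectors):
    """(1/n) sum_i v_i v_i^T as a nested list."""
    n, d = len(vectors), len(vectors[0])
    return [[sum(v[j] * v[l] for v in vectors) / n for l in range(d)]
            for j in range(d)]

rng = random.Random(0)
n, d = 50, 3
X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
xbar = [sum(row[j] for row in X) / n for j in range(d)]
w = [rng.gauss(0.0, 1.0) for _ in range(n)]   # first-level multipliers

# X_i^* = w_i (X_i - Xbar), its mean, and the recentered array a_i
x_star = [[w[i] * (X[i][j] - xbar[j]) for j in range(d)] for i in range(n)]
xbar_star = [sum(row[j] for row in x_star) / n for j in range(d)]
a = [[x_star[i][j] - xbar_star[j] for j in range(d)] for i in range(n)]

lhs = outer_mean(a)
sigma_w = outer_mean(x_star)
rhs = [[sigma_w[j][l] - xbar_star[j] * xbar_star[l] for l in range(d)]
       for j in range(d)]
max_gap = max(abs(lhs[j][l] - rhs[j][l]) for j in range(d) for l in range(d))
print(max_gap)  # zero up to floating-point rounding
```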

It remains to prove (90). Let

\mathbf{\Sigma}_{I}(\bm{a}):=\frac{1}{n}\sum_{i=1}^{n}\bm{P}_{I}\bm{a}_{i}(\bm{P}_{I}\bm{a}_{i})^{\top}.

Since $\lambda_{\min}(\mathbf{\Sigma}_{II})\geq\sigma_{*}^{2}$ by Lemma A.2,

$\displaystyle\lambda_{\min}(\mathbf{\Sigma}_{I}(\bm{a}))$	$\displaystyle\geq\lambda_{\min}(\mathbf{\Sigma}_{II})-\|\mathbf{\Sigma}_{I}(\bm{a})-\mathbf{\Sigma}_{II}\|_{\mathrm{op}}$
	$\displaystyle\geq\sigma_{*}^{2}-|I|\,\|\mathbf{\Sigma}_{I}(\bm{a})-\mathbf{\Sigma}_{II}\|_{\max}$
	$\displaystyle\geq\sigma_{*}^{2}-k_{0}Cr_{n}$	(98)

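by Weyl’s inequality (see, e.g., Horn and Johnson, 2012, Corollary 4.3.2) together with the elementary bound $\|A\|_{\mathrm{op}}\leq|I|\,\|A\|_{\max}$ for $|I|\times|I|$ matrices. Since $k_{0}r_{n}\to 0$ , (98) implies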

\lambda_{\min}(\mathbf{\Sigma}_{I}(\bm{a}))\geq\frac{1}{2}\sigma_{*}^{2}

for all sufficiently large $n$ . This proves (90). Together with the bound for $\max_{i}\|\bm{a}_{i}\|_{\infty}$ , the proof is complete. ∎

Proof of Theorem 2.2.

Let $\Omega_{n}$ be the event from Lemma A.17. Since $n^{-1}=O(\varepsilon_{n}^{2})$ , it is enough to work on $\Omega_{n}$ . On that event, the first-level bootstrap array satisfies the deterministic conditions of Theorem A.4. Because the second-level multipliers satisfy $\mathbb{E}v_{1}^{3}=1$ , the conditional version of Corollary 2.1 gives

\sup_{\epsilon<\alpha<1-\epsilon}\left|\mathbb{P}^{*}\bigl(T_{n,[k]}^{*}\geq\hat{c}_{1-\alpha,k}^{**}\bigr)-\alpha\right|\leq C\varepsilon_{n}^{2}\qquad\text{on }\Omega_{n}.

(99)

Set $\delta_{n}:=C\varepsilon_{n}^{2}$ , with $C$ chosen large enough that both (99) and the first-level second-order accuracy bound hold with the same constant.

Fix $\alpha\in(2\epsilon,1-2\epsilon)$ . On $\Omega_{n}$ , (99) with nominal level $\alpha-\delta_{n}$ implies

\mathbb{P}^{*}\bigl(T_{n,[k]}^{*}>\hat{c}_{1-\alpha+\delta_{n},k}^{**}\bigr)\leq\alpha.

Equivalently,

\mathbb{P}^{*}\bigl(\hat{F}_{n,k}^{*}(T_{n,[k]}^{*})>1-\alpha+\delta_{n}\bigr)\leq\alpha.

By the definition of $\hat{\beta}_{\alpha,k}$ ,

\hat{\beta}_{\alpha,k}\leq 1-\alpha+\delta_{n}\qquad\text{on }\Omega_{n}.

Since $p\mapsto\hat{c}_{p,k}$ is nondecreasing,

\hat{c}_{\hat{\beta}_{\alpha,k},k}\leq\hat{c}_{1-\alpha+\delta_{n},k}\qquad\text{on }\Omega_{n}.

Therefore,

$\displaystyle\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{\hat{\beta}_{\alpha,k},k}\bigr)$	$\displaystyle\geq\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha+\delta_{n},k},\Omega_{n}\bigr)$
	$\displaystyle\geq\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha+\delta_{n},k}\bigr)-\mathbb{P}(\Omega_{n}^{c})$
	$\displaystyle\geq(\alpha-\delta_{n})-C\varepsilon_{n}^{2}-\mathbb{P}(\Omega_{n}^{c})$
	$\displaystyle\geq\alpha-C\varepsilon_{n}^{2}.$	(100)

Similarly, applying (99) with level $\alpha+\delta_{n}$ yields

\mathbb{P}^{*}\bigl(\hat{F}_{n,k}^{*}(T_{n,[k]}^{*})\leq 1-\alpha-\delta_{n}\bigr)<1-\alpha,

which implies

\hat{\beta}_{\alpha,k}>1-\alpha-\delta_{n}\qquad\text{on }\Omega_{n}.

Hence

\hat{c}_{\hat{\beta}_{\alpha,k},k}\geq\hat{c}_{1-\alpha-\delta_{n},k}\qquad\text{on }\Omega_{n},

and therefore

$\displaystyle\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{\hat{\beta}_{\alpha,k},k}\bigr)$	$\displaystyle\leq\mathbb{P}\bigl(T_{n,[k]}\geq\hat{c}_{1-\alpha-\delta_{n},k}\bigr)+\mathbb{P}(\Omega_{n}^{c})$
	$\displaystyle\leq(\alpha+\delta_{n})+C\varepsilon_{n}^{2}+\mathbb{P}(\Omega_{n}^{c})$
	$\displaystyle\leq\alpha+C\varepsilon_{n}^{2}.$	(101)

Combining (100) and (101) proves (3). ∎
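The calibration step in this proof can be illustrated numerically. The following is a minimal Python sketch, not the paper's implementation. It assumes that $\hat{\beta}_{\alpha,k}$ is the smallest level $\beta$ with $\mathbb{P}^{*}\{\hat{F}_{n,k}^{*}(T_{n,[k]}^{*})>\beta\}\leq\alpha$, which is how its definition is used above, and it replaces both bootstrap laws by Monte Carlo samples. The names `ecdf_at` and `prepivoted_level` are hypothetical.

```python
def ecdf_at(sample, x):
    """Empirical CDF of `sample` evaluated at x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def prepivoted_level(first_level, second_level, alpha):
    """Smallest beta with (fraction of u_b > beta) <= alpha.

    first_level[b]  : first-level bootstrap statistic T*_b
    second_level[b] : list of second-level statistics drawn from T*_b's data
    u_b             : Monte Carlo stand-in for F*-hat evaluated at T*_b
    """
    u = sorted(ecdf_at(second_level[b], first_level[b])
               for b in range(len(first_level)))
    n_draws = len(u)
    # beta -> fraction of u_b above beta is a step function that only drops
    # at the realized u_b, so the smallest admissible beta is one of them.
    for beta in u:
        if sum(1 for v in u if v > beta) / n_draws <= alpha:
            return beta
    return u[-1]


if __name__ == "__main__":
    first = [1.0, 2.0, 3.0, 4.0]
    second = [[1.0, 2.0, 3.0, 4.0] for _ in first]
    print(prepivoted_level(first, second, alpha=0.25))
```

Under these assumptions the returned level would be plugged into the first-level quantile function to obtain the critical value $\hat{c}_{\hat{\beta}_{\alpha,k},k}$.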

Appendix B: Proofs for the stationary exponential-mixing alternative

This appendix proves Theorem 2.3. Throughout Appendix B we work under Assumptions 2.1, 2.2, 2.3, and 2.5, and we use the notation introduced in Section 2.3. Only the Gaussian aggregation part of Appendix A needs to be modified; the projected local Edgeworth expansion is unchanged except for the shift/strip estimates established below.

B.1. Correlation decay and Gaussian cluster tails

Lemma B.1.

Under Assumption 2.5, for every $h\geq 1$ ,

|\rho(h)|\leq\sin\{2\pi\alpha(h)\}\leq 2\pi C_{\alpha}e^{-a_{\alpha}h}.

(102)

Consequently,

\sum_{h=1}^{\infty}|\rho(h)|\leq\frac{2\pi C_{\alpha}e^{-a_{\alpha}}}{1-e^{-a_{\alpha}}}<\infty.

(103)

Moreover, for every integer $m\geq 2$ , every index set $I\subset[d]$ with $|I|=m$ , and every $t>0$ ,

\mathbb{P}(Z_{j}>t,\ \forall j\in I)\leq\bar{\Phi}\!\left(\sqrt{\frac{m}{1+(m-1)\vartheta_{*}}}\,\frac{t}{\sigma}\right)\leq\frac{\sigma\sqrt{1+(m-1)\vartheta_{*}}}{\sqrt{2\pi m}\,t}\exp\!\left\{-\frac{mt^{2}}{2\sigma^{2}\{1+(m-1)\vartheta_{*}\}}\right\}.

(104)

In particular, when $m=2$ ,

\mathbb{P}(Z_{0}>t,Z_{h}>t)\leq\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right)\leq\frac{\sigma\sqrt{1+\vartheta_{*}}}{\sqrt{4\pi}\,t}\exp\!\left\{-\frac{t^{2}}{\sigma^{2}(1+\vartheta_{*})}\right\}.

(105)

Proof.

For standard Gaussian variables $U,V$ with correlation $r$ , one has

\mathbb{P}(U>0,V>0)-\frac{1}{4}=\frac{1}{2\pi}\arcsin(r).

Therefore, with

\mathcal{F}_{0}:=\sigma(Z_{j}:j\leq 0),\qquad\mathcal{G}_{h}:=\sigma(Z_{j}:j\geq h),

we obtain

\alpha(h)\geq\left|\mathbb{P}(Z_{0}>0,Z_{h}>0)-\mathbb{P}(Z_{0}>0)\mathbb{P}(Z_{h}>0)\right|=\frac{1}{2\pi}|\arcsin\rho(h)|.

Hence

|\rho(h)|\leq\sin\{2\pi\alpha(h)\}\leq 2\pi\alpha(h)\leq 2\pi C_{\alpha}e^{-a_{\alpha}h},

which proves (102). Summing the geometric series yields (103).

Now fix $I=\{i_{1},\dots,i_{m}\}\subset[d]$ with $|I|=m$ and write

U_{r}:=Z_{i_{r}}/\sigma,\qquad 1\leq r\leq m.

By (4), every off-diagonal correlation of $(U_{1},\dots,U_{m})^{\top}$ is bounded above by $\vartheta_{*}$ . Therefore

\mathrm{Var}\!\left(\sum_{r=1}^{m}U_{r}\right)\leq m+m(m-1)\vartheta_{*}=m\{1+(m-1)\vartheta_{*}\}.

Since

\{U_{r}>u,\ \forall r\leq m\}\subset\left\{\sum_{r=1}^{m}U_{r}>mu\right\},

we obtain

\mathbb{P}(U_{r}>u,\ \forall r\leq m)\leq\bar{\Phi}\!\left(\sqrt{\frac{m}{1+(m-1)\vartheta_{*}}}\,u\right).

Applying Mills’ ratio proves (104), and (105) is the case $m=2$ . ∎
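As a numerical sanity check of the chain in (104), the sketch below evaluates both right-hand sides for given $(m,t,\sigma,\vartheta_{*})$. It is only an illustration: the function names are hypothetical, and the parameter values in the demo are arbitrary rather than taken from the paper.

```python
import math


def sf(x):
    """Standard normal survival function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def cluster_tail_bounds(m, t, sigma, theta):
    """Return the two right-hand sides of (104) for a cluster of size m.

    The first value is the Gaussian tail at the rescaled threshold x, the
    second is the Mills-ratio bound phi(x) / x, which requires t > 0.
    """
    x = math.sqrt(m / (1.0 + (m - 1) * theta)) * t / sigma
    gaussian_tail = sf(x)
    mills_bound = math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * x)
    return gaussian_tail, mills_bound


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, cluster_tail_bounds(m, t=3.0, sigma=1.0, theta=0.3))
```

For independent coordinates ($\vartheta_{*}=0$, $m=2$) the exact pair probability is $\bar{\Phi}(t/\sigma)^{2}$, which can be compared directly with the first returned value.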

B.2. Bonferroni remainder for the $k$ th exceedance event

Lemma B.2.

For every integer $k\geq 1$ , every integer $m\geq k$ , and every nonnegative integer-valued random variable $N$ ,

\left|\mathbf{1}\{N\geq k\}-\sum_{s=k}^{m}(-1)^{s-k}\binom{s-1}{k-1}\binom{N}{s}\right|\leq\binom{m}{k-1}\binom{N}{m+1}.

(106)

Consequently,

\left|\mathbb{P}(N\geq k)-\sum_{s=k}^{m}(-1)^{s-k}\binom{s-1}{k-1}\mathbb{E}\binom{N}{s}\right|\leq\binom{m}{k-1}\mathbb{E}\binom{N}{m+1}.

(107)

Proof.

The generalized Bonferroni inequalities for the event $\{N\geq k\}$ imply

\sum_{s=k}^{m}(-1)^{s-k}\binom{s-1}{k-1}\binom{N}{s}\leq\mathbf{1}\{N\geq k\}\leq\sum_{s=k}^{m+1}(-1)^{s-k}\binom{s-1}{k-1}\binom{N}{s}

when $m-k$ is even, and the inequalities are reversed when $m-k$ is odd. In either case, the difference between the two adjacent truncations equals

\binom{m}{k-1}\binom{N}{m+1},

which proves (106). Taking expectations yields (107). ∎
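The pointwise bound (106) involves only integers, so it can be checked exhaustively on a small grid. The sketch below does this with exact integer arithmetic; the helper names are hypothetical, and `math.comb` (Python 3.8 or later) returns 0 when the lower index exceeds the upper one, matching the usual convention for $\binom{N}{s}$.

```python
from math import comb


def bonferroni_partial_sum(n_exc, k, m):
    """Truncated inclusion-exclusion sum in (106), with N = n_exc."""
    return sum((-1) ** (s - k) * comb(s - 1, k - 1) * comb(n_exc, s)
               for s in range(k, m + 1))


def bonferroni_gap(n_exc, k, m):
    """Left-hand side of (106): |1{N >= k} - truncated sum|."""
    indicator = 1 if n_exc >= k else 0
    return abs(indicator - bonferroni_partial_sum(n_exc, k, m))


def bonferroni_bound(n_exc, k, m):
    """Right-hand side of (106): the first omitted term."""
    return comb(m, k - 1) * comb(n_exc, m + 1)


if __name__ == "__main__":
    # Example with N = 7, k = 3: the gap never exceeds the first omitted term.
    for m in range(3, 8):
        print(m, bonferroni_gap(7, 3, m), bonferroni_bound(7, 3, m))
```

The gap vanishes as soon as $m\geq N$, since the full inclusion-exclusion sum reproduces the indicator exactly.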

B.3. Block construction and reduction to block exceedances

Let

s_{d}:=d-q_{d}(m_{d}+\ell_{d}),\qquad 0\leq s_{d}<m_{d}+\ell_{d}.

Define the main blocks and gaps by

I_{r}:=\{(r-1)(m_{d}+\ell_{d})+1,\dots,(r-1)(m_{d}+\ell_{d})+m_{d}\},\qquad r=1,\dots,q_{d},

J_{r}:=\{(r-1)(m_{d}+\ell_{d})+m_{d}+1,\dots,r(m_{d}+\ell_{d})\},\qquad r=1,\dots,q_{d},

and define the remainder interval

R_{d}:=\{q_{d}(m_{d}+\ell_{d})+1,\dots,d\}

when $s_{d}\geq 1$ . For $t\in\mathbb{R}$ , set

B_{r}(t):=\left\{\max_{j\in I_{r}}Z_{j}>t\right\},\qquad Y_{r}(t):=\mathbf{1}\{B_{r}(t)\},\qquad S_{d}(t):=\sum_{r=1}^{q_{d}}Y_{r}(t),\qquad N_{d}(t):=\sum_{j=1}^{d}\mathbf{1}\{Z_{j}>t\}.

Also define

q(t):=\mathbb{P}(B_{1}(t)),\qquad\mu_{d}(t):=q_{d}q(t).
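The block construction above can be made concrete with a short Python sketch. It assumes $q_{d}=\lfloor d/(m_{d}+\ell_{d})\rfloor$, which is what the constraint $0\leq s_{d}<m_{d}+\ell_{d}$ forces; the function names and the toy input are hypothetical.

```python
def build_blocks(d, m_d, l_d):
    """Return (q_d, main blocks I_r, gaps J_r, remainder R_d) as 1-based ranges."""
    period = m_d + l_d
    q_d = d // period  # assumed definition, implied by 0 <= s_d < m_d + l_d
    main = [range((r - 1) * period + 1, (r - 1) * period + m_d + 1)
            for r in range(1, q_d + 1)]
    gaps = [range((r - 1) * period + m_d + 1, r * period + 1)
            for r in range(1, q_d + 1)]
    remainder = range(q_d * period + 1, d + 1)  # empty when s_d = 0
    return q_d, main, gaps, remainder


def exceedance_counts(z, t, main):
    """Return (N_d(t), S_d(t)) for a vector z, where z[j - 1] plays the role of Z_j."""
    n_total = sum(1 for v in z if v > t)
    s_blocks = sum(1 for block in main if max(z[j - 1] for j in block) > t)
    return n_total, s_blocks


if __name__ == "__main__":
    q_d, main, gaps, remainder = build_blocks(d=23, m_d=5, l_d=2)
    z = [0.0] * 23
    z[0] = z[1] = 2.0   # two exceedances inside the first main block
    z[5] = 2.0          # one exceedance inside the first gap
    print(q_d, exceedance_counts(z, 1.0, main))
```

In the toy input, $N_{d}(t)$ and $S_{d}(t)$ differ exactly because of the two mechanisms controlled in Lemma B.3: an exceedance in a gap and a double exceedance within one main block.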

Lemma B.3.

For every $t\in\mathbb{R}$ ,

0\leq m_{d}p(t)-q(t)\leq\binom{m_{d}}{2}\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right).

(108)

Moreover,

	$\displaystyle\mathbb{P}\{N_{d}(t)\neq S_{d}(t)\}$	$\displaystyle\leq(q_{d}\ell_{d}+s_{d})p(t)+q_{d}\binom{m_{d}}{2}\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right),$		(109)
	$\displaystyle|\mu_{d}(t)-\lambda(t)|$	$\displaystyle\leq(q_{d}\ell_{d}+m_{d}+\ell_{d})p(t)+q_{d}\binom{m_{d}}{2}\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right).$		(110)

Proof.

The first Bonferroni inequality gives

q(t)=\mathbb{P}\Bigl(\bigcup_{j\in I_{1}}\{Z_{j}>t\}\Bigr)\leq m_{d}p(t),

and the second Bonferroni inequality yields

q(t)\geq m_{d}p(t)-\sum_{1\leq a<b\leq m_{d}}\mathbb{P}(Z_{a}>t,Z_{b}>t).

Using (105) proves (108).

If $N_{d}(t)\neq S_{d}(t)$ , then either at least one exceedance occurs in a gap or in $R_{d}$ , or some main block contains at least two exceedances. Therefore

\mathbb{P}\{N_{d}(t)\neq S_{d}(t)\}\leq\sum_{r=1}^{q_{d}}\mathbb{P}\left\{\max_{j\in J_{r}}Z_{j}>t\right\}+\mathbb{P}\left\{\max_{j\in R_{d}}Z_{j}>t\right\}+\sum_{r=1}^{q_{d}}\mathbb{P}\left\{\sum_{j\in I_{r}}\mathbf{1}\{Z_{j}>t\}\geq 2\right\}.

Now

\mathbb{P}\left\{\max_{j\in J_{r}}Z_{j}>t\right\}\leq\ell_{d}p(t),\qquad\mathbb{P}\left\{\max_{j\in R_{d}}Z_{j}>t\right\}\leq s_{d}p(t),

and, by the union bound and (105),

\mathbb{P}\left\{\sum_{j\in I_{r}}\mathbf{1}\{Z_{j}>t\}\geq 2\right\}\leq\sum_{1\leq a<b\leq m_{d}}\mathbb{P}(Z_{a}>t,Z_{b}>t)\leq\binom{m_{d}}{2}\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right).

This proves (109). Finally,

|\mu_{d}(t)-\lambda(t)|\leq|q_{d}q(t)-q_{d}m_{d}p(t)|+|q_{d}m_{d}-d|\,p(t),

and

|q_{d}m_{d}-d|\leq q_{d}\ell_{d}+m_{d}+\ell_{d}.

Combining these displays with (108) proves (110). ∎

Lemma B.4.

Let $s\in\{1,\dots,k_{0}+1\}$ . Then, for every $t\in\mathbb{R}$ ,

\left|\sum_{1\leq r_{1}<\cdots<r_{s}\leq q_{d}}\mathbb{P}\Bigl(\bigcap_{j=1}^{s}B_{r_{j}}(t)\Bigr)-\binom{q_{d}}{s}q(t)^{s}\right|\leq s2^{s-1}\binom{q_{d}}{s}\alpha(\ell_{d}).

(111)

Consequently,

\left|\mathbb{E}\binom{S_{d}(t)}{s}-\frac{\mu_{d}(t)^{s}}{s!}\right|\leq C_{s}\left\{q_{d}^{-1}\mu_{d}(t)^{s}+d^{s}\alpha(\ell_{d})\right\},

(112)

where $C_{s}=s2^{s-1}+s!$ is deterministic.

Proof.

Fix $1\leq r_{1}<\cdots<r_{s}\leq q_{d}$ . Put

A_{j}(t):=B_{r_{j}}(t)^{c}=\left\{\max_{u\in I_{r_{j}}}Z_{u}\leq t\right\},\qquad j=1,\dots,s.

Since the selected main blocks are separated by at least $\ell_{d}$ , repeated application of Lemma 3.2.2 of Leadbetter, Lindgren, and Rootzén yields

\left|\mathbb{P}\Bigl(\bigcap_{j\in L}A_{j}(t)\Bigr)-\prod_{j\in L}\mathbb{P}\{A_{j}(t)\}\right|\leq(|L|-1)\alpha(\ell_{d})

(113)

for every nonempty $L\subset[s]$ . The inclusion–exclusion identity gives

\mathbb{P}\Bigl(\bigcap_{j=1}^{s}B_{r_{j}}(t)\Bigr)=\sum_{L\subset[s]}(-1)^{|L|}\mathbb{P}\Bigl(\bigcap_{j\in L}A_{j}(t)\Bigr),

and the same identity with each probability replaced by the corresponding product equals $q(t)^{s}$ , since $\mathbb{P}\{A_{j}(t)\}=1-q(t)$ . Therefore

	$\displaystyle\left|\mathbb{P}\Bigl(\bigcap_{j=1}^{s}B_{r_{j}}(t)\Bigr)-q(t)^{s}\right|$
	$\displaystyle\leq\sum_{L\subset[s]}\left|\mathbb{P}\Bigl(\bigcap_{j\in L}A_{j}(t)\Bigr)-\prod_{j\in L}\mathbb{P}\{A_{j}(t)\}\right|$
	$\displaystyle\leq\sum_{m=2}^{s}\binom{s}{m}(m-1)\alpha(\ell_{d})\leq s2^{s-1}\alpha(\ell_{d}).$

Summing over the $\binom{q_{d}}{s}$ choices of $(r_{1},\dots,r_{s})$ gives (111).

Now

\mathbb{E}\binom{S_{d}(t)}{s}=\sum_{1\leq r_{1}<\cdots<r_{s}\leq q_{d}}\mathbb{P}\Bigl(\bigcap_{j=1}^{s}B_{r_{j}}(t)\Bigr).

Therefore

\mathbb{E}\binom{S_{d}(t)}{s}=\binom{q_{d}}{s}q(t)^{s}+R_{d,s}(t),\qquad|R_{d,s}(t)|\leq s2^{s-1}\binom{q_{d}}{s}\alpha(\ell_{d}).

Also,

\left|\binom{q_{d}}{s}-\frac{q_{d}^{s}}{s!}\right|\leq s!\,q_{d}^{s-1},

\left|\binom{q_{d}}{s}q(t)^{s}-\frac{\mu_{d}(t)^{s}}{s!}\right|\leq s!\,q_{d}^{-1}\mu_{d}(t)^{s}.

Finally,

\binom{q_{d}}{s}\alpha(\ell_{d})\leq q_{d}^{s}\alpha(\ell_{d})\leq d^{s}\alpha(\ell_{d}).

Combining the last three displays proves (112). ∎
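The two purely combinatorial inequalities used in this proof can be verified in exact integer arithmetic. The sketch below is illustrative only and the helper names are hypothetical; the second inequality is multiplied through by $s!$ so that no fractions are needed.

```python
from math import comb, factorial


def subset_sum(s):
    """sum_{m=2}^{s} C(s, m) (m - 1): the total weight on alpha(l_d) before (111)."""
    return sum(comb(s, m) * (m - 1) for m in range(2, s + 1))


def binomial_vs_power_gap(q, s):
    """s! * |C(q, s) - q^s / s!|, computed without leaving the integers."""
    return abs(factorial(s) * comb(q, s) - q ** s)


if __name__ == "__main__":
    for s in range(1, 7):
        print(s, subset_sum(s), s * 2 ** (s - 1))
```

Evaluating the sum in closed form gives `subset_sum(s)` $=(s-2)2^{s-1}+1$, so the bound $s2^{s-1}$ is loose by roughly $2^{s}$; this slack is harmless because $s\leq k_{0}+1$.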

B.4. Direct Poisson approximation on the quantile window

Lemma B.5.

If $\lambda(t)\leq 2\Lambda_{k,\epsilon}$ , then

\frac{t^{2}}{\sigma^{2}}\geq 2\log d-3\log\log d-C_{k,\epsilon},

(114)

and hence

\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right)\leq C_{k,\epsilon}(\log d)^{-1/2}d^{-(1+\beta_{*})}.

(115)

Consequently,

	$\displaystyle\mathbb{P}\{N_{d}(t)\neq S_{d}(t)\}$	$\displaystyle\leq C_{k,\epsilon}\eta_{1,d},$		(116)
	$\displaystyle|\mu_{d}(t)-\lambda(t)|$	$\displaystyle\leq C_{k,\epsilon}\eta_{1,d}.$		(117)

Proof.

If $\lambda(t)\leq 2\Lambda_{k,\epsilon}$ , then

d\,\bar{\Phi}(t/\sigma)\leq 2\Lambda_{k,\epsilon}.

Mills’ ratio implies

\bar{\Phi}(u)\geq\frac{1}{\sqrt{2\pi}(1+u)}e^{-u^{2}/2},\qquad u>0.

Applying this with $u=t/\sigma$ yields

\frac{1}{\sqrt{2\pi}(1+t/\sigma)}e^{-t^{2}/(2\sigma^{2})}\leq\frac{2\Lambda_{k,\epsilon}}{d}.

Taking logarithms and using $\log(1+t/\sigma)\leq\log(2+t^{2}/\sigma^{2})\leq\log(2+2\log d+C_{k,\epsilon})$ yields (114). Substituting (114) into (105) proves (115).

Now use (109) and (110). Since

p(t)=\frac{\lambda(t)}{d}\leq\frac{2\Lambda_{k,\epsilon}}{d},

we obtain

(q_{d}\ell_{d}+s_{d})p(t)\leq 2\Lambda_{k,\epsilon}\left\{\frac{q_{d}\ell_{d}}{d}+\frac{s_{d}}{d}\right\}\leq 2\Lambda_{k,\epsilon}\left\{\frac{\ell_{d}}{m_{d}}+\frac{m_{d}+\ell_{d}}{d}\right\},

because $q_{d}\leq d/m_{d}$ and $s_{d}<m_{d}+\ell_{d}$ . Also,

q_{d}\binom{m_{d}}{2}\bar{\Phi}\!\left(\sqrt{\frac{2}{1+\vartheta_{*}}}\,\frac{t}{\sigma}\right)\leq C_{k,\epsilon}\frac{d}{m_{d}}m_{d}^{2}(\log d)^{-1/2}d^{-(1+\beta_{*})}=C_{k,\epsilon}d^{-3\beta_{*}/4}(\log d)^{-1/2}.

Combining these bounds with (109) and (110) proves (116) and (117). ∎
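The step from the Mills lower bound to (114) can be checked numerically. In the sketch below the dimension is chosen so that $d\,\bar{\Phi}(u)=2\Lambda$ holds exactly for $u=t/\sigma$; the quantity returned by `threshold_slack` is then $u^{2}$ minus the explicit lower bound $2\log d-2\log(1+u)-2\log(2\Lambda\sqrt{2\pi})$ obtained by taking logarithms. The function names and the value $\Lambda=1$ in the demo are hypothetical.

```python
import math


def sf(x):
    """Standard normal survival function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mills_lower(u):
    """Lower bound exp(-u^2 / 2) / (sqrt(2 pi) (1 + u)) on sf(u), for u > 0."""
    return math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * (1.0 + u))


def threshold_slack(u, big_lambda):
    """u^2 minus the logarithmic lower bound, with d defined by d * sf(u) = 2 * big_lambda."""
    d = 2.0 * big_lambda / sf(u)
    lower = (2.0 * math.log(d) - 2.0 * math.log1p(u)
             - 2.0 * math.log(2.0 * big_lambda * math.sqrt(2.0 * math.pi)))
    return u * u - lower


if __name__ == "__main__":
    for u in (1.0, 3.0, 5.0):
        print(u, mills_lower(u) <= sf(u), threshold_slack(u, 1.0))
```

Because $\log(1+u)$ grows like $\tfrac{1}{2}\log\log d$ on this scale, the explicit lower bound is consistent with the $3\log\log d$ allowance in (114).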

Lemma B.6.

For every $t$ such that $\lambda(t)\leq 2\Lambda_{k,\epsilon}$ ,

\left|\mathbb{P}\{S_{d}(t)\leq k-1\}-h_{k}\bigl(\mu_{d}(t)\bigr)\right|\leq C_{k,\epsilon}\left\{q_{d}^{-1}+d^{k_{0}+1}\alpha(\ell_{d})+\frac{(3\Lambda_{k,\epsilon})^{k_{0}+1}}{(k_{0}+1)!}\right\}.

(118)

Consequently,

\left|G_{k}(t)-h_{k}\bigl(\lambda(t)\bigr)\right|\leq C_{k,\epsilon}r_{d}.

(119)

Proof.

Set

V_{S,s}(t):=\mathbb{E}\binom{S_{d}(t)}{s}.

By Lemma B.2 with $N=S_{d}(t)$ and $m=k_{0}$ ,

	$\displaystyle\left|\mathbb{P}\{S_{d}(t)\geq k\}-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}V_{S,s}(t)\right|$
	$\displaystyle\leq\binom{k_{0}}{k-1}V_{S,k_{0}+1}(t).$		(120)

By Lemma B.4, for each $s\in\{k,\dots,k_{0}+1\}$ ,

\left|V_{S,s}(t)-\frac{\mu_{d}(t)^{s}}{s!}\right|\leq C_{k,\epsilon}\left\{q_{d}^{-1}+d^{k_{0}+1}\alpha(\ell_{d})\right\},

because $\mu_{d}(t)\leq\lambda(t)+C_{k,\epsilon}\eta_{1,d}\leq 3\Lambda_{k,\epsilon}$ for all sufficiently large $d$ by (117). Therefore

	$\displaystyle\left|\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}V_{S,s}(t)-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\frac{\mu_{d}(t)^{s}}{s!}\right|$
	$\displaystyle\leq C_{k,\epsilon}\left\{q_{d}^{-1}+d^{k_{0}+1}\alpha(\ell_{d})\right\}.$

Applying Lemma B.2 to a Poisson random variable $\Pi_{\mu_{d}(t)}\sim\mathrm{Poi}(\mu_{d}(t))$ gives

\left|\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}\frac{\mu_{d}(t)^{s}}{s!}-\mathbb{P}\{\Pi_{\mu_{d}(t)}\geq k\}\right|\leq\binom{k_{0}}{k-1}\frac{\mu_{d}(t)^{k_{0}+1}}{(k_{0}+1)!}\leq C_{k,\epsilon}\frac{(3\Lambda_{k,\epsilon})^{k_{0}+1}}{(k_{0}+1)!}.

Combining the last three displays with (120) proves (118).

Finally,

\left|G_{k}(t)-\mathbb{P}\{S_{d}(t)\leq k-1\}\right|=\left|\mathbb{P}\{N_{d}(t)\leq k-1\}-\mathbb{P}\{S_{d}(t)\leq k-1\}\right|\leq\mathbb{P}\{N_{d}(t)\neq S_{d}(t)\}\leq C_{k,\epsilon}\eta_{1,d}

by (116), and

\left|h_{k}\bigl(\mu_{d}(t)\bigr)-h_{k}\bigl(\lambda(t)\bigr)\right|\leq\sup_{0\leq u\leq 3\Lambda_{k,\epsilon}}|h_{k}^{\prime}(u)|\,|\mu_{d}(t)-\lambda(t)|\leq C_{k,\epsilon}\eta_{1,d}

by (117). Combining these bounds with (118) proves (119). ∎
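The Poisson truncation step in this proof is easy to reproduce numerically. The sketch below assumes that $h_{k}(\mu)=\mathbb{P}\{\mathrm{Poi}(\mu)\leq k-1\}$, which is how $h_{k}$ is used here, and compares the truncated factorial-moment series with the exact upper tail. The function names and the parameter grid are hypothetical.

```python
import math


def poisson_upper_tail(mu, k):
    """P(Poi(mu) >= k), i.e. 1 - h_k(mu) under the assumed definition of h_k."""
    return 1.0 - sum(math.exp(-mu) * mu ** j / math.factorial(j)
                     for j in range(k))


def truncated_series(mu, k, k0):
    """sum_{s=k}^{k0} (-1)^(s-k) C(s-1, k-1) mu^s / s!."""
    return sum((-1) ** (s - k) * math.comb(s - 1, k - 1)
               * mu ** s / math.factorial(s)
               for s in range(k, k0 + 1))


def truncation_bound(mu, k, k0):
    """C(k0, k-1) mu^(k0+1) / (k0+1)!, the remainder given by Lemma B.2."""
    return math.comb(k0, k - 1) * mu ** (k0 + 1) / math.factorial(k0 + 1)


if __name__ == "__main__":
    mu, k = 2.5, 2
    for k0 in (2, 4, 8):
        gap = abs(truncated_series(mu, k, k0) - poisson_upper_tail(mu, k))
        print(k0, gap, truncation_bound(mu, k, k0))
```

The factorial in the denominator makes the remainder decay superexponentially in $k_{0}$ for bounded $\mu$, which is why the last term in (118) is negligible once $k_{0}$ is moderately large.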

B.5. Threshold scale, shift/strip bounds, and weighted Gaussian bounds

Lemma B.7.

There exist constants $0<c_{1}<C_{1}<\infty$ and an integer $d_{0}$ such that

c_{1}\log d\leq t^{2}\leq C_{1}\log d\qquad\text{for every }t\in\mathcal{T}_{k,\epsilon}\text{ and every }d\geq d_{0}.

(121)

Proof.

Choose $0<\lambda_{-}<\lambda_{+}<\infty$ such that

h_{k}(\lambda_{-})=1-\epsilon/4,\qquad h_{k}(\lambda_{+})=\epsilon/4.

Since $r_{d}\to 0$ , there exists $d_{0}$ such that

C_{k,\epsilon}r_{d}\leq\epsilon/4\qquad\text{for every }d\geq d_{0}.

If $t\in\mathcal{T}_{k,\epsilon}$ and $\lambda(t)\leq\lambda_{-}$ , then (119) gives

G_{k}(t)\geq h_{k}(\lambda(t))-C_{k,\epsilon}r_{d}\geq 1-\epsilon/4-\epsilon/4=1-\epsilon/2,

which contradicts the definition of $\mathcal{T}_{k,\epsilon}$ . Similarly, if $\lambda(t)\geq\lambda_{+}$ , then

G_{k}(t)\leq h_{k}(\lambda(t))+C_{k,\epsilon}r_{d}\leq\epsilon/4+\epsilon/4=\epsilon/2,

again contradicting the definition of $\mathcal{T}_{k,\epsilon}$ . Therefore

\lambda_{-}\leq\lambda(t)\leq\lambda_{+}\qquad(t\in\mathcal{T}_{k,\epsilon},\ d\geq d_{0}).

Since $\lambda(t)=d\bar{\Phi}(t/\sigma)$ and $\sigma\in[\underline{\sigma},\overline{\sigma}]$ , Mills’ ratio yields constants $c_{1},C_{1}$ depending only on $(k,\epsilon,\underline{\sigma},\overline{\sigma})$ such that (121) holds. ∎
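The $\sqrt{\log d}$ scale of the threshold can be seen by solving $d\,\bar{\Phi}(t/\sigma)=\lambda_{0}$ for a few values of $d$. The sketch below uses plain bisection; the function names, the bracketing interval, and the grid of $(d,\lambda_{0})$ values are hypothetical choices made only for illustration.

```python
import math


def sf(x):
    """Standard normal survival function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def threshold(d, lam, sigma=1.0):
    """Solve d * sf(t / sigma) = lam for t by bisection (needs lam < d / 2)."""
    lo, hi = 0.0, 40.0 * sigma  # d * sf(0) = d / 2 > lam and d * sf(40) is about 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if d * sf(mid / sigma) > lam:
            lo = mid   # too many expected exceedances: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for d in (1e3, 1e6, 1e9):
        t = threshold(d, 1.0)
        print(int(d), t, t * t / math.log(d))
```

For fixed $\lambda_{0}$ the printed ratio $t^{2}/\log d$ stays between fixed constants, in line with (121).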

Lemma B.8 (Shift and strip bounds).

Under Assumption 2.5, the conclusions of Lemmas A.12 and A.13 remain valid. More precisely, there exist constants $c_{0},C_{0}>0$ such that for every nonempty $I\subset[d]$ with $|I|\leq k_{0}+1$ , every $t\in\mathcal{T}_{k,\epsilon}$ , and every $0\leq a\leq c_{0}/t$ ,

\pi_{I}(t-a)\leq C_{0}\pi_{I}(t),

(122)

and

\pi_{I}(t-a)-\pi_{I}(t)\leq C_{0}a(1+t)\pi_{I}(t).

(123)

Proof.

Fix a nonempty $I\subset[d]$ with $|I|\leq k_{0}+1$ . Since $\mathbf{\Sigma}_{II}$ is a principal submatrix of $\mathbf{\Sigma}$ ,

\lambda_{\min}(\mathbf{\Sigma}_{II})\geq\lambda_{\min}(\mathbf{\Sigma})\geq\sigma_{*}^{2}.

On the other hand, by stationarity and (103),

\lambda_{\max}(\mathbf{\Sigma}_{II})\leq\overline{\sigma}^{2}\left(1+2\sum_{h=1}^{\infty}|\rho(h)|\right)\leq\overline{\sigma}^{2}\left(1+\frac{4\pi C_{\alpha}e^{-a_{\alpha}}}{1-e^{-a_{\alpha}}}\right)=:C_{\Sigma}.

Hence every principal covariance matrix of dimension at most $k_{0}+1$ is uniformly well conditioned and has operator norm bounded by $C_{\Sigma}$ . Repeating the proof of Lemmas A.12 and A.13 with these two spectral bounds gives (122) and (123). ∎

Lemma B.9.

There exists a constant $C_{k,\epsilon}>0$ such that, for every $d\geq d_{0}$ and every $t\in\mathcal{T}_{k,\epsilon}$ ,

\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}M_{Z,s}(t)\leq C_{k,\epsilon},

(124)

and

\binom{k_{0}}{k-1}M_{Z,k_{0}+1}(t)\leq C_{k,\epsilon}r_{d}.

(125)

Proof.

For $s\in\{1,\dots,k_{0}+1\}$ , decompose $M_{Z,s}(t)$ according to the block partition used in Lemmas B.3 and B.4. The contribution of configurations that use only main blocks and place at most one exceedance in each selected block is $\mathbb{E}\binom{S_{d}(t)}{s}$ . Every remaining configuration necessarily contains either an exceedance in a gap or in the remainder interval, or at least two exceedances inside one main block. Therefore the same counting argument used in the proof of Lemma B.3, together with the cluster bound (104), yields

\left|M_{Z,s}(t)-\mathbb{E}\binom{S_{d}(t)}{s}\right|\leq C_{s,k,\epsilon}\eta_{1,d}.

(126)

Combining (126) with (112) gives

\left|M_{Z,s}(t)-\frac{\lambda(t)^{s}}{s!}\right|\leq C_{s,k,\epsilon}\left\{\eta_{1,d}+q_{d}^{-1}+d^{s}\alpha(\ell_{d})\right\},\qquad 1\leq s\leq k_{0}+1.

(127)

Since $t\in\mathcal{T}_{k,\epsilon}$ implies $\lambda(t)\in[\lambda_{-},\lambda_{+}]$ by Lemma B.7, summing (127) over $s=k,\dots,k_{0}$ yields

\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}M_{Z,s}(t)\leq\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}\frac{\lambda_{+}^{s}}{s!}+\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}C_{s,k,\epsilon}\left\{\eta_{1,d}+q_{d}^{-1}+d^{s}\alpha(\ell_{d})\right\}.

The first sum is bounded by a constant depending only on $(k,\epsilon)$ because it is dominated by the convergent series

\sum_{s=k}^{\infty}\binom{s-1}{k-1}\frac{\lambda_{+}^{s}}{s!}.

The second sum is also bounded because $k_{0}$ is finite for every $n$ , $\eta_{1,d}\leq 1$ for large $d$ , $q_{d}^{-1}\leq 1$ , and (10) implies

\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}d^{s}\alpha(\ell_{d})\leq C_{k,\epsilon}d^{k_{0}}\alpha(\ell_{d})\leq C_{k,\epsilon}d^{-7k_{0}-16}n^{-8}.

This proves (124).

For $s=k_{0}+1$ , (127) yields

M_{Z,k_{0}+1}(t)\leq\frac{\lambda_{+}^{k_{0}+1}}{(k_{0}+1)!}+C_{k,\epsilon}\left\{\eta_{1,d}+q_{d}^{-1}+d^{k_{0}+1}\alpha(\ell_{d})\right\}.

Multiplying by $\binom{k_{0}}{k-1}$ and enlarging the constant proves (125). ∎

B.6. Regularity of $G_{k}$

Lemma B.10.

There exist constants $m_{k,\epsilon}>0$ , $B_{k,\epsilon}>0$ , and an integer $d_{1}\geq d_{0}$ such that

f_{k}(t)=G_{k}^{\prime}(t)\geq m_{k,\epsilon}\qquad\text{for every }t\in\mathcal{T}_{k,\epsilon}\text{ and every }d\geq d_{1},

(128)

and

\left|(G_{k}^{-1})^{\prime\prime}(p)\right|\leq B_{k,\epsilon}\qquad\text{for every }p\in[\epsilon/2,1-\epsilon/2]\text{ and every }d\geq d_{1}.

(129)

Proof.

Set

H_{k}(t):=h_{k}(\lambda(t)).

By Lemma B.7, there exist constants $c_{\lambda},C_{\lambda}>0$ such that

c_{\lambda}t\leq|\lambda^{\prime}(t)|\leq C_{\lambda}t,\qquad|\lambda^{\prime\prime}(t)|\leq C_{\lambda}(1+t^{2}),\qquad t\in\mathcal{T}_{k,\epsilon},\ d\geq d_{0}.

(130)

Since $\lambda(t)\in[\lambda_{-},\lambda_{+}]$ on $\mathcal{T}_{k,\epsilon}$ , the derivatives of $h_{k}$ are bounded on this compact interval. Hence there exist constants $c_{H},C_{H}>0$ such that

|H_{k}^{\prime}(t)|\geq c_{H}t,\qquad|H_{k}^{\prime\prime}(t)|\leq C_{H}(1+t^{2}),\qquad t\in\mathcal{T}_{k,\epsilon},\ d\geq d_{0}.

(131)

Define

\delta_{d}:=r_{d}^{1/4}.

Since $r_{d}\to 0$ , there exists $d_{1}\geq d_{0}$ such that

\delta_{d}(1+C_{1}\log d)\leq\frac{c_{H}}{4}\qquad\text{for every }d\geq d_{1},

(132)

where $C_{1}$ is the constant from Lemma B.7. For $t\in\mathcal{T}_{k,\epsilon}$ and $d\geq d_{1}$ , Taylor’s theorem gives

	$\displaystyle\left|\frac{H_{k}(t+\delta_{d})-H_{k}(t-\delta_{d})}{2\delta_{d}}-H_{k}^{\prime}(t)\right|$	$\displaystyle\leq C_{H}\delta_{d}(1+t^{2}),$		(133)
	$\displaystyle\left|\frac{H_{k}(t+\delta_{d})-2H_{k}(t)+H_{k}(t-\delta_{d})}{\delta_{d}^{2}}-H_{k}^{\prime\prime}(t)\right|$	$\displaystyle\leq C_{H}\delta_{d}(1+t^{2}).$		(134)

By (119),

	$\displaystyle\left|\frac{G_{k}(t+\delta_{d})-G_{k}(t-\delta_{d})}{2\delta_{d}}-\frac{H_{k}(t+\delta_{d})-H_{k}(t-\delta_{d})}{2\delta_{d}}\right|$	$\displaystyle\leq C_{k,\epsilon}\frac{r_{d}}{\delta_{d}}=C_{k,\epsilon}r_{d}^{3/4},$		(135)
	$\displaystyle\left|\frac{G_{k}(t+\delta_{d})-2G_{k}(t)+G_{k}(t-\delta_{d})}{\delta_{d}^{2}}-\frac{H_{k}(t+\delta_{d})-2H_{k}(t)+H_{k}(t-\delta_{d})}{\delta_{d}^{2}}\right|$	$\displaystyle\leq C_{k,\epsilon}\frac{r_{d}}{\delta_{d}^{2}}=C_{k,\epsilon}r_{d}^{1/2}.$		(136)

Combining (131)–(135), Lemma B.7, and (132) shows that

G_{k}^{\prime}(t)\geq\frac{c_{H}}{2}t\qquad(t\in\mathcal{T}_{k,\epsilon},\ d\geq d_{1}).

Since $\mathcal{T}_{k,\epsilon}$ is separated away from $0$ by Lemma B.7, this proves (128).

Likewise, (134) and (136) imply

|G_{k}^{\prime\prime}(t)|\leq C_{k,\epsilon}(1+t^{2})\qquad(t\in\mathcal{T}_{k,\epsilon},\ d\geq d_{1}).

Finally,

(G_{k}^{-1})^{\prime\prime}(p)=-\frac{G_{k}^{\prime\prime}(G_{k}^{-1}(p))}{G_{k}^{\prime}(G_{k}^{-1}(p))^{3}},\qquad p\in[\epsilon/2,1-\epsilon/2],

and (128) together with the bound on $G_{k}^{\prime\prime}$ proves (129). ∎
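The inverse-function identity used in the last step can be checked on a toy distribution function with a closed-form inverse. The sketch below uses $G(t)=1-e^{-t}$, which is a hypothetical stand-in and not the $G_{k}$ of this appendix; for it, $G^{-1}(p)=-\log(1-p)$ and $(G^{-1})^{\prime\prime}(p)=(1-p)^{-2}$.

```python
import math


def inverse_second_derivative(g1, g2):
    """(G^{-1})''(p) at p = G(t), given g1 = G'(t) and g2 = G''(t)."""
    return -g2 / g1 ** 3


def toy_check(p):
    """Compare the formula with the closed form for G(t) = 1 - exp(-t)."""
    t = -math.log(1.0 - p)        # G^{-1}(p)
    g1 = math.exp(-t)             # G'(t)
    g2 = -math.exp(-t)            # G''(t)
    return inverse_second_derivative(g1, g2), 1.0 / (1.0 - p) ** 2


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, toy_check(p))
```

The same formula shows how (129) follows from (128): a lower bound $m$ on $G^{\prime}$ and an upper bound on $|G^{\prime\prime}|$ give $|(G^{-1})^{\prime\prime}|\leq\sup|G^{\prime\prime}|/m^{3}$ on the quantile window.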

B.7. Completion of the proof of Theorem 2.3

The local projected Edgeworth expansion in Proposition A.1 and its bootstrap version depend on the Gaussian law only through the spectral bounds for principal submatrices and the shift/strip inequalities. By Lemma B.8, the proof of Proposition A.1 remains valid under Assumption 2.5; moreover, with the same argument one may enlarge the range from $|I|\leq k_{0}$ to $|I|\leq k_{0}+1$ . Thus, for every nonempty $I\subset[d]$ with $|I|\leq k_{0}+1$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}\bigl(\bm{S}_{n,I}\in(t,\infty)^{|I|}\bigr)-\int_{(t,\infty)^{|I|}}p_{n,I}(\bm{u})\,d\bm{u}\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t),

(137)

and, with probability at least $1-C/n$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\left|\mathbb{P}^{*}\bigl(\bm{S}_{n,I}^{*}\in(t,\infty)^{|I|}\bigr)-\int_{(t,\infty)^{|I|}}\hat{p}_{n,\gamma,I}(\bm{u})\,d\bm{u}\right|\leq C\varepsilon_{n}^{2}\pi_{I}(t)

(138)

holds simultaneously for all such $I$ .

Summing (137) and (138) over $|I|=s$ gives, for every $s\in\{k,\dots,k_{0}+1\}$ ,

	$\displaystyle|V_{n,s}(t)-M_{n,s}(t)|$	$\displaystyle\leq C\varepsilon_{n}^{2}M_{Z,s}(t),$		(139)
	$\displaystyle|V_{n,s}^{*}(t)-\hat{M}_{n,s,\gamma}(t)|$	$\displaystyle\leq C\varepsilon_{n}^{2}M_{Z,s}(t)$		(140)

uniformly over $t\in\mathcal{T}_{k,\epsilon}$ , with the bootstrap bound holding on an event of probability at least $1-C/n$ .

To prove (11), apply Lemma B.2 with $N=N_{n}(t)$ and $m=k_{0}$ to obtain

\left|\mathbb{P}(T_{n,[k]}>t)-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}V_{n,s}(t)\right|\leq\binom{k_{0}}{k-1}V_{n,k_{0}+1}(t).

(141)

Using (139) with $s=k_{0}+1$ and the bound

|M_{n,k_{0}+1}(t)-M_{Z,k_{0}+1}(t)|\leq C\varepsilon_{n}M_{Z,k_{0}+1}(t)

from the proof of Theorem A.2, we obtain

V_{n,k_{0}+1}(t)\leq(1+C\varepsilon_{n}+C\varepsilon_{n}^{2})M_{Z,k_{0}+1}(t)\leq 2M_{Z,k_{0}+1}(t)

for all sufficiently large $n$ . Hence (125) yields

\binom{k_{0}}{k-1}V_{n,k_{0}+1}(t)\leq C_{k,\epsilon}r_{d}.

(142)

Also, (139) and (124) imply

\sum_{s=k}^{k_{0}}\binom{s-1}{k-1}|V_{n,s}(t)-M_{n,s}(t)|\leq C\varepsilon_{n}^{2}.

Applying Lemma B.2 with $N=N_{Z}(t)$ gives

\left|\mathbb{P}(T_{\bm{Z},[k]}>t)-\sum_{s=k}^{k_{0}}(-1)^{s-k}\binom{s-1}{k-1}M_{Z,s}(t)\right|\leq\binom{k_{0}}{k-1}M_{Z,k_{0}+1}(t)\leq C_{k,\epsilon}r_{d}.

Combining the last three displays with the definition of $Q_{n,k}(t)$ proves (11). The bootstrap expansion (12) follows in the same way from (140), and the probability of the exceptional event remains bounded by $C/n$ .

The derivative bounds in the proof of Theorem A.2 use only the derivative estimates for the projected Gaussian densities and the uniform weighted bound (124). Since both inputs are available here, the same argument yields

\sup_{t\in\mathcal{T}_{k,\epsilon}}\Bigl(|Q_{n,k}(t)|+|Q_{n,k}^{\prime}(t)|+|Q_{n,k}^{\prime\prime}(t)|\Bigr)\leq C\varepsilon_{n},

(143)

and, with probability at least $1-C/n$ ,

\sup_{t\in\mathcal{T}_{k,\epsilon}}\Bigl(|\hat{Q}_{n,\gamma,k}(t)|+|\hat{Q}_{n,\gamma,k}^{\prime}(t)|+|\hat{Q}_{n,\gamma,k}^{\prime\prime}(t)|\Bigr)\leq C\varepsilon_{n},

(144)

exactly as in Theorem A.2.

Next, let

\hat{F}_{n,k}(t)=G_{k}(t)+\hat{Q}_{n,\gamma,k}(t)+\hat{r}_{n}(t),\qquad\sup_{t\in\mathcal{T}_{k,\epsilon}}|\hat{r}_{n}(t)|\leq C(\varepsilon_{n}^{2}+r_{d}),

which follows from (12). Since $G_{k}^{\prime}(t)\geq m_{k,\epsilon}>0$ on $\mathcal{T}_{k,\epsilon}$ by Lemma B.10, the same implicit-function argument as in the proof of Theorem A.3 yields a unique solution

\hat{c}_{1-\alpha,k}=c^{G}_{1-\alpha,k}+\Delta_{n,k}(\alpha),\qquad|\Delta_{n,k}(\alpha)|\leq C(\varepsilon_{n}+r_{d}).

Substituting $t=c^{G}_{1-\alpha,k}+\Delta_{n,k}(\alpha)$ into the identity $\hat{F}_{n,k}(t)=1-\alpha$ and expanding as in (83) gives

\left|\Delta_{n,k}(\alpha)+\frac{\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k})}{f_{k}(c^{G}_{1-\alpha,k})}-R_{n,k}(\alpha)\right|\leq C(\varepsilon_{n}^{3}+r_{d}),

uniformly in $\alpha\in(\epsilon,1-\epsilon)$ , which proves (13).

For the coverage expansion, write

F_{n,k}(t)=G_{k}(t)+Q_{n,k}(t)+r_{n}(t),\qquad\sup_{t\in\mathcal{T}_{k,\epsilon}}|r_{n}(t)|\leq C(\varepsilon_{n}^{2}+r_{d}),

which follows from (11). Insert (13) into the Taylor formula

F_{n,k}(\hat{c}_{1-\alpha,k})=F_{n,k}(c^{G}_{1-\alpha,k})+F_{n,k}^{\prime}(c^{G}_{1-\alpha,k})\Delta_{n,k}(\alpha)+\frac{1}{2}F_{n,k}^{\prime\prime}(\xi_{n,k,\alpha})\Delta_{n,k}(\alpha)^{2}.

Using (143), Lemma B.10, and Lemma A.16, the same algebra as in the proof of Theorem 2.1 yields

\left|\mathbb{P}(T_{n,[k]}\leq\hat{c}_{1-\alpha,k})-\left[(1-\alpha)+(1-\gamma)Q_{n,k}(c^{G}_{1-\alpha,k})+\mathbb{E}\{R_{n,k}(\alpha)\}\right]\right|\leq C(\varepsilon_{n}^{2}+r_{d}).

Taking complements proves (14).

If $\gamma=1$ , then the linear term disappears and

|R_{n,k}(\alpha)|\leq C\left(|\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k})|^{2}+|\hat{Q}_{n,\gamma,k}^{\prime}(c^{G}_{1-\alpha,k})|\,|\hat{Q}_{n,\gamma,k}(c^{G}_{1-\alpha,k})|\right)\leq C\varepsilon_{n}^{2}

by (144). Therefore (15) follows from (14).

Finally, the deterministic-array conditional theorem in Section A.8 is proved from the conditional versions of Theorems A.2, A.3, and 2.1. Repeating that argument with (12), (13), and (14) gives the same deterministic-array statement with $C(\varepsilon_{n}^{2}+r_{d})$ in place of $C\varepsilon_{n}^{2}$ . Inserting that conditional bound into the proof of Theorem 2.2 yields (16). This completes the proof of Theorem 2.3.

References

S. M. Berman (1964) Limit theorems for the maximum term in stationary sequences. The Annals of Mathematical Statistics 35 (2), pp. 502–516.
J. Chang, X. Chen, and M. Wu (2024) Central limit theorems for high dimensional dependent data. Bernoulli 30 (1), pp. 712–742.
J. Chang, Q. Jiang, T. S. McElroy, and X. Shao (2025) Statistical inference for high-dimensional spectral density matrix. Journal of the American Statistical Association 120 (551), pp. 1960–1974.
J. Chang, Q. Jiang, and X. Shao (2023) Testing the martingale difference hypothesis in high dimension. Journal of Econometrics 235 (2), pp. 972–1000.
V. Chernozhukov, D. Chetverikov, K. Kato, and Y. Koike (2022) Improved central limit theorem and bootstrap approximation in high dimensions. The Annals of Statistics 50 (5), pp. 2562–2586.
V. Chernozhukov, D. Chetverikov, and K. Kato (2013) Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. The Annals of Statistics 41 (6), pp. 2786–2819.
V. Chernozhukov, D. Chetverikov, and K. Kato (2017) Central limit theorems and bootstrap in high dimensions. The Annals of Probability 45 (4), pp. 2309–2353.
V. Chernozhukov, D. Chetverikov, and Y. Koike (2023) Nearly optimal central limit theorem and bootstrap approximations in high dimensions. The Annals of Applied Probability 33 (3), pp. 2374–2425.
H. Deng and C. Zhang (2020) Beyond Gaussian approximation: bootstrap for maxima of sums of independent random vectors. The Annals of Statistics 48 (6), pp. 3643–3671.
Y. Ding, Q. Li, Y. Shi, L. Sun, and L. Zhang (2026) Gaussian multiplier bootstrap procedure for the $k$ th largest coordinate of high-dimensional statistics. arXiv:2508.14400v2 [math.ST].
X. Fang and Y. Koike (2021) High-dimensional central limit theorems by Stein’s method. The Annals of Applied Probability 31 (4), pp. 1660–1686.
X. Fang and Y. Koike (2024) Sharp high-dimensional central limit theorems for log-concave distributions. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 60 (3), pp. 2129–2156.
R. A. Fisher and L. H. C. Tippett (1928) Limiting forms of the frequency distribution of the largest or smallest member of a sample. Mathematical Proceedings of the Cambridge Philosophical Society 24 (2), pp. 180–190.
P. Hall (1992) The bootstrap and Edgeworth expansion. Springer Series in Statistics, Springer, New York.
R. A. Horn and C. R. Johnson (2012) Matrix analysis. 2nd edition, Cambridge University Press, Cambridge.
Y. Koike (2021) Notes on the dimension dependence in high-dimensional central limit theorems for hyperrectangles. Japanese Journal of Statistics and Data Science 4 (1), pp. 257–297.
Y. Koike (2026) High-dimensional bootstrap and asymptotic expansion. Probability Theory and Related Fields. Published online first.
D. Kozbur (2021) Dimension-free anticoncentration bounds for Gaussian order statistics with discussion of applications to multiple testing. arXiv:2107.10766.
W. V. Li and Q. Shao (2002) A normal comparison inequality and its applications. Probability Theory and Related Fields 122 (4), pp. 494–508.
M. E. Lopes, Z. Lin, and H.-G. Müller (2020) Bootstrapping max statistics in high dimensions: near-parametric rates under weak variance decay and application to functional and multinomial data. The Annals of Statistics 48 (2), pp. 1214–1229.
C. Y. Mu (1966) The types of limit distributions for some terms of variational series. Scientia Sinica 15, pp. 749–762.
X. Shao (2010) The dependent wild bootstrap. Journal of the American Statistical Association 105 (489), pp. 218–235.
V. Watts, H. Rootzén, and M. R. Leadbetter (1982) On limiting distributions of intermediate order statistics from stationary sequences. The Annals of Probability 10, pp. 653–662.
D. Zhang and W. B. Wu (2017) Gaussian approximation for high dimensional time series. The Annals of Statistics 45 (5), pp. 1895–1919.
X. Zhang and G. Cheng (2014) Bootstrapping high dimensional time series. arXiv:1406.1037.
X. Zhang and G. Cheng (2018) Gaussian approximation for high dimensional vector under physical dependence. Bernoulli 24 (4A), pp. 2640–2675.

	$\displaystyle\left|\frac{1}{n}\sum_{i=1}^{n}b_{ij_{1}}b_{ij_{2}}b_{ij_{3}}\right|$	$\displaystyle\leq\left(\max_{1\leq i\leq n}\|\bm{b}_{i}\|_{\infty}\right)\frac{1}{n}\sum_{i=1}^{n}|b_{ij_{1}}b_{ij_{2}}|$
		$\displaystyle\leq\left(\max_{1\leq i\leq n}\|\bm{b}_{i}\|_{\infty}\right)\left(\frac{1}{n}\sum_{i=1}^{n}b_{ij_{1}}^{2}\right)^{1/2}\left(\frac{1}{n}\sum_{i=1}^{n}b_{ij_{2}}^{2}\right)^{1/2}$
		$\displaystyle\leq Cb\log(dn)\cdot\max_{1\leq j\leq d}\hat{\Sigma}_{X,jj}$
		$\displaystyle\leq Cb^{3}\log(dn),$
