High Dimensional Bootstrap and Asymptotic Expansion for the -th Largest Coordinate
Abstract
We study bootstrap inference for the th largest coordinate of a normalized sum of independent high-dimensional random vectors. Existing second-order theory for maxima does not directly extend to order statistics, because the event is not a rectangle and its local structure is governed by exceedance counts rather than by a single boundary. We develop an approach based on factorial moments and weighted inclusion–exclusion that reduces the problem to a collection of rare-orthant probabilities and allows high-dimensional Edgeworth and Cornish–Fisher expansions to be transferred to the order-statistic setting. Under moment, variance, and weak-dependence conditions, we derive a second-order coverage expansion for wild-bootstrap critical values of the th order statistic. In particular, a third-moment matching wild bootstrap achieves coverage error of order up to logarithmic factors, and the same second-order accuracy is obtained for a prepivoted double wild bootstrap. We also show that the maximal-correlation condition can be replaced by a stationary Gaussian exponential-mixing assumption at the price of an explicit dependence remainder , and this remainder can itself be of order when the dimension is sufficiently large relative to the sample size. These results extend recent second-order Gaussian and bootstrap approximation theory from maxima to the th order statistic in high dimension.
Keywords: bootstrap coverage expansion; high-dimensional Gaussian approximation; th order statistic; second-order accuracy; wild bootstrap.
1 Introduction
High-dimensional Gaussian approximation for maxima and rectangular probabilities is now a basic tool in modern high-dimensional inference. For the maximum of a sum of independent random vectors, the seminal work of Chernozhukov et al. (2013) established Gaussian approximation and Gaussian multiplier bootstrap validity when the dimension is allowed to be much larger than the sample size. This line of work was sharpened substantially by Chernozhukov et al. (2017), who extended the approximation theory to hyperrectangles and improved the first-order rate. Later, Deng and Zhang (2020) showed that third-moment matching bootstrap procedures enjoy a better logarithmic dependence in the first-order bound, and Koike (2021) proved that the same logarithmic rate is already available for normal approximation. Among the general first-order results under mild moment assumptions, Chernozhukov et al. (2022) further improved the error bound to an -type rate up to logarithmic factors. Under additional nondegeneracy or structural assumptions, nearly parametric rates up to logarithmic losses are also available; see, for example, (Lopes et al., 2020; Fang and Koike, 2021; Chernozhukov et al., 2023; Fang and Koike, 2024).
A decisive recent development for maxima is the asymptotic expansion theory developed by Koike (2026). That paper developed high-dimensional Edgeworth and Cornish–Fisher expansions for maxima and related rectangular probabilities by combining Stein-kernel arguments, smoothing inequalities, and a careful analysis of Gaussian anti-concentration. As a consequence, Koike (2026) obtained a second-order bootstrap coverage expansion and showed that, in several important regimes, the coverage error can be improved from the first-order scale to for a suitable constant . In particular, for third-moment matching wild bootstrap, the maximum statistic becomes second-order accurate even without studentization under suitable covariance assumptions.
Compared with the theory for maxima, the literature for the th largest coordinate is still sparse. Classical results on order statistics and extremes, such as (Fisher and Tippett, 1928; Mu, 1966; Watts et al., 1982), do not address high-dimensional Gaussian approximation for sums of random vectors. On the Gaussian side, Kozbur (2021) studied dimension-free anti-concentration inequalities for Gaussian order statistics. In the genuinely high-dimensional setting, Ding et al. (2026) established Gaussian and Gaussian multiplier bootstrap approximations for the th largest coordinate and for more general functionals of the top- order statistics. For the th largest coordinate, their Kolmogorov bounds are of order
up to universal constants, and the bounds for general top- functionals are of even larger order. Therefore the currently available theory for the th largest coordinate is still essentially first-order and does not provide a second-order coverage expansion comparable to the one available for maxima.
The purpose of the present paper is to fill this gap. We prove that the th largest coordinate of a high-dimensional normalized sum also admits a Koike-type second-order bootstrap expansion. Our argument starts from the exact exceedance-count representation of the event and combines weighted inclusion–exclusion with a local rare-orthant analysis. This allows us to transfer the second-order expansion machinery from maxima to the th order statistic. As a result, we show that third-moment matching wild bootstrap retains second-order accuracy for the th largest coordinate, and we also obtain a second-order result for the prepivoted double wild bootstrap. In this way, the second-order theory that was previously available only for maxima is extended to the th largest coordinate in high dimension.
We also give a complementary dependence formulation based on a stationary Gaussian reference field with exponentially decaying strong-mixing coefficients. This assumption is structurally different from the maximal-correlation condition used in the baseline theory: it exploits one-dimensional dependence and allows local clusters of highly correlated coordinates. In that setting we rework the Gaussian aggregation argument and obtain the same distributional, quantile, and coverage expansions with an explicit additional remainder that isolates the effect of local exceedance clustering. The resulting expression is fully explicit and can again be of order when the dimension grows sufficiently quickly relative to the sample size.
The remainder of the paper is organized as follows. Section 2 presents the main theoretical results, including the exponential-mixing alternative in Section 2.3. Section 3 reports simulation results comparing several bootstrap methods. Section 4 concludes. Proofs are collected in Appendices A and B.
Notation. We write . For a vector , let
We denote by the all-ones vector. For , denotes the set of real-valued -dimensional -tensors. If and , then denotes their tensor product. When , we write
and
For , denotes the th tensor power of . Whenever are under discussion, we set
Given an -times differentiable function , we set for , where . For denotes the set of bounded functions with bounded derivatives. For a multi-index , we write
For a positive definite matrix , let denote the density of . We write for the standard normal distribution function and for its survival function. For a distribution function , its generalized inverse is defined by
For and a scalar random variable , let
For a matrix , we set
Also, and denote conditional probability and expectation given the data. We assume whenever an expression containing appears, and similarly for .
2 Main Results
2.1 Asymptotic expansion of coverage probability
Let be independent centered random vectors in , and define
Write
for the descending order statistics of the coordinates of , and define analogously from . Set
whenever the derivative exists.
Let be i.i.d. multipliers independent of the data. Put
Let denote the th largest coordinate of , and write
For each coordinate, For , define the Gaussian quantile Fix and define the quantile window
For , define the exceedance counts
Then
For every integer , define
For each nonempty , write
and
Let denote the density of , and abbreviate . The first-order Edgeworth density for is
Let The bootstrap Edgeworth density is defined by
For each integer , define
where and denote the corresponding projected densities defined later. Also set
We now present the assumptions underlying our analysis.
Assumption 2.1.
The vectors are independent and centered. Each admits a Stein kernel in the sense that
for every smooth vector-valued test function for which both sides are finite. There exist constants and such that
(i) ;
(ii)
(iii) satisfies .
Remark 2.1.
Assumption 2.1 is the data-side regularity condition. The Stein identity provides the analytic device behind the projected Edgeworth expansion and is a convenient substitute for classical Cramér-type smoothness conditions in high dimension. As emphasized in Koike (2026, Remark 2.4), in one dimension the existence of a Stein kernel implies a nontrivial absolutely continuous component, and hence Cramér’s condition, whereas in higher dimensions Stein kernels remain available even in situations where a multivariate Cramér condition is not appropriate, such as Gaussian laws with singular covariance matrices. In our setting, part (i) requires a uniform lower bound on , which prevents global degeneracy of the Gaussian comparison law and guarantees that the projected Gaussian densities and their derivatives remain well behaved. Part (ii) imposes sub-exponential control on both the coordinates and the fluctuations of the Stein-kernel entries. Since
the centered quantity
measures the random fluctuation of the local covariance proxy around its population counterpart; controlling these fluctuations is exactly what allows Koike’s decomposition to be applied uniformly over the low-dimensional projections that enter our inclusion–exclusion argument. Finally, part (iii) is the high-dimensional scaling condition ensuring that the resulting remainder terms vanish. In particular, it specifies the regime in which the projected Edgeworth approximation is accurate enough to deliver a valid second-order expansion for the coverage probability.
Assumption 2.2.
The multipliers are i.i.d., independent of the data, satisfy
and, in addition, satisfy one of the following two conditions:
(i) ;
(ii) admits a Stein kernel and there exists a constant such that
The constants in the sequel are allowed to depend on .
Remark 2.2.
Assumption 2.2 is the bootstrap analogue of Assumption 2.1. It ensures that, conditional on the data, the multiplier statistic admits the same kind of Stein–Edgeworth expansion as the original statistic. The Gaussian case is separated out because it is the canonical multiplier choice and automatically fits the required framework. The alternative bounded Stein-kernel condition covers smooth non-Gaussian multipliers and is particularly useful for moment matching, which is central to the second-order improvement. As discussed in Koike (2026), this framework does not cover two-point multipliers such as Mammen’s weights, since two-point laws do not admit Stein kernels. Thus, the restriction is a limitation of the present proof strategy rather than of the bootstrap principle itself.
Assumption 2.3.
There exist constants such that
Remark 2.3.
Assumption 2.3 places all coordinates on a common scale. Because our target is the raw order statistic , we are ranking the coordinates of the normalized sum without any coordinatewise rescaling. Uniform upper and lower bounds on the marginal variances therefore rule out the possibility that some coordinates dominate the ranking merely because their variances diverge, or become asymptotically irrelevant because their variances vanish. Without this assumption, the geometry of the th largest coordinate would depend on heterogeneous marginal scales, and the limiting problem would be substantially more complicated. In that regime one would typically need a different normalization or even a different target statistic.
Assumption 2.4.
Let We assume
Remark 2.4.
Assumption 2.4 is a weak-dependence condition tailored to our proof of the order-statistic expansion. The key step in the argument is to approximate the event by a finite-order inclusion–exclusion expansion and to show that the probability of having many coordinates simultaneously exceeding is negligible. For this strategy to work, exceedances above a high threshold must behave as rare events with only weak clustering, and the condition enforces exactly this feature. When pairwise correlations are too strong, exceedances can occur in large clusters, and one can then no longer guarantee that the probability of having more than coordinates above the threshold decays fast enough for the truncation argument to be valid. Handling such strongly dependent regimes would require substantial further study.
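The exceedance-count reformulation underlying this strategy can be checked numerically. The following is a minimal sketch, assuming NumPy; the rank is written `k` here purely as a label for the sketch, and the helper names `kth_largest` and `exceedance_count` are illustrative rather than taken from the paper. It verifies on random inputs that the `k`-th largest coordinate exceeds a threshold exactly when at least `k` coordinates exceed it.

```python
import numpy as np

def kth_largest(x: np.ndarray, k: int) -> float:
    """Return the k-th largest entry of a 1-D array (k = 1 is the maximum)."""
    return float(np.sort(x)[-k])

def exceedance_count(x: np.ndarray, t: float) -> int:
    """Number of coordinates strictly above the threshold t."""
    return int(np.sum(x > t))

rng = np.random.default_rng(0)
checks = []
for _ in range(1000):
    x = rng.standard_normal(50)          # illustrative vector, not the paper's design
    k = int(rng.integers(1, 6))          # rank between 1 and 5
    t = float(rng.normal(1.0, 0.5))      # random threshold
    # Event equivalence: {k-th largest > t} holds iff {count of exceedances >= k}.
    checks.append((kth_largest(x, k) > t) == (exceedance_count(x, t) >= k))

all_agree = all(checks)
```

The equivalence is exact, so the event for the order statistic can be studied entirely through the law of the exceedance count.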
Fix a constant and define . Throughout, is fixed. Finally define
| (1) |
and define analogously with in place of .
Theorem 2.1 is the main second-order coverage statement for the single wild bootstrap. It shows that the leading coverage distortion is described by the deterministic linear term together with the quadratic Cornish–Fisher correction , while the remaining error is of order .
Corollary 2.1 (Third-moment matching).
Under the assumptions of Theorem 2.1, if , then
Corollary 2.1 shows that matching the third multiplier moment removes the linear coverage distortion identified in Theorem 2.1. The wild bootstrap then becomes second-order accurate on the scale without any further correction.
Corollary 2.2 (Persistence of the first-order term).
Corollary 2.2 shows that the term is not an artifact of the proof. Unless the third moment is matched, the single-bootstrap coverage error typically remains of first-order size.
2.2 Double wild bootstrap
Let be i.i.d. multipliers, independent of everything else, satisfying
and the same regularity condition as in Assumption 2.2. Define
Let be the th largest coordinate of , let
and define
The prepivoted double-bootstrap test rejects when
Theorem 2.2.
Theorem 2.2 shows that prepivoting removes the leading single-bootstrap distortion and restores second-order accuracy. Thus the double wild bootstrap achieves the same coverage scale as the third-moment matching single bootstrap.
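For orientation, a prepivoted double wild bootstrap can be sketched as follows. This is only an illustration under stated assumptions: the rank is written `k`, both bootstrap levels use Gaussian multipliers, the data are centered at the sample mean, and the second level applies fresh multipliers to the first-level array without recentering. The function names are hypothetical, and the exact definitions used in the paper may differ in these details.

```python
import numpy as np

def kth_largest_rows(a: np.ndarray, k: int) -> np.ndarray:
    """k-th largest entry of each row of a 2-D array."""
    return np.sort(a, axis=1)[:, -k]

def boot_stats(array: np.ndarray, k: int, n_boot: int, rng) -> np.ndarray:
    """Wild-bootstrap draws of the k-th largest coordinate for a fixed (n, p) array."""
    n = array.shape[0]
    xi = rng.standard_normal((n_boot, n))            # assumption: Gaussian multipliers
    return kth_largest_rows(xi @ array / np.sqrt(n), k)

def prepivoted_double_bootstrap(x, k, alpha, b1, b2, rng):
    """Return (first-level p-value, prepivoted level, rejection decision)."""
    n = x.shape[0]
    centered = x - x.mean(axis=0)                    # assumption: sample-mean centering
    stat = np.sort(x.sum(axis=0) / np.sqrt(n))[-k]   # observed k-th largest coordinate
    xi = rng.standard_normal((b1, n))
    first = kth_largest_rows(xi @ centered / np.sqrt(n), k)
    p_value = float(np.mean(first >= stat))
    inner_p = np.empty(b1)
    for b in range(b1):
        star = xi[b][:, None] * centered             # first-level resampled array
        second = boot_stats(star, k, b2, rng)        # second-level draws given that array
        inner_p[b] = np.mean(second >= first[b])     # p-value of the b-th first-level draw
    # Prepivoting: compare the p-value with the alpha-quantile of bootstrap p-values.
    adjusted_level = float(np.quantile(inner_p, alpha))
    return p_value, adjusted_level, bool(p_value < adjusted_level)

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 10))                   # toy data, not the paper's design
p_val, level, reject = prepivoted_double_bootstrap(x, k=2, alpha=0.1, b1=100, b2=100, rng=rng)
```

The nominal level is replaced by a data-driven level obtained from the second bootstrap layer, which is the mechanism by which prepivoting corrects the first-level distortion.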
2.3 A stationary exponential-mixing alternative
The maximal-correlation condition in Assumption 2.4 can be replaced by a one-dimensional dependence condition when the Gaussian reference field is generated by a stationary Gaussian sequence. The price is an explicit additional remainder that records the contribution of local clusters of exceedances.
Assumption 2.5 (Stationary Gaussian coordinates with exponential strong mixing).
The Gaussian reference vector is the first coordinates of a centered stationary Gaussian sequence with covariance function
Its strong-mixing coefficients satisfy
for some constants and .
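A standard example of a stationary Gaussian sequence with exponentially decaying dependence is the Gaussian AR(1) process. The sketch below, assuming NumPy, builds its covariance matrix and simulates a path; the autoregressive coefficient `phi` and the helper names are hypothetical and serve only to illustrate the kind of reference field the assumption has in mind.

```python
import numpy as np

def ar1_covariance(p: int, phi: float) -> np.ndarray:
    """Covariance matrix with entries phi**|i - j| (unit variances)."""
    idx = np.arange(p)
    return phi ** np.abs(idx[:, None] - idx[None, :])

def ar1_gaussian_path(p: int, phi: float, rng) -> np.ndarray:
    """Simulate p consecutive coordinates of a stationary Gaussian AR(1) sequence."""
    g = np.empty(p)
    g[0] = rng.standard_normal()                  # stationary start, variance 1
    innov_sd = np.sqrt(1.0 - phi ** 2)            # keeps the marginal variance at 1
    for j in range(1, p):
        g[j] = phi * g[j - 1] + innov_sd * rng.standard_normal()
    return g

phi = 0.6                                         # hypothetical dependence strength
sigma = ar1_covariance(50, phi)
min_eig = float(np.linalg.eigvalsh(sigma)[0])     # smallest eigenvalue of the covariance

rng = np.random.default_rng(0)
path = ar1_gaussian_path(20000, phi, rng)
lag1 = float(np.corrcoef(path[:-1], path[1:])[0, 1])   # empirical lag-1 correlation
```

In this example neighbouring coordinates are strongly correlated when `phi` is close to one, yet the covariance decays geometrically in the lag and the smallest eigenvalue stays above `(1 - phi) / (1 + phi)`.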
Write
Because every principal submatrix of has diagonal entries at most and smallest eigenvalue at least , we have
| (4) |
Set
| (5) |
Let be the unique constant satisfying
| (6) |
Theorem 2.3 (Stationary exponential-mixing alternative).
Theorem 2.3 replaces the maximal-correlation condition by a one-dimensional dependence assumption on the Gaussian reference field. The price is the explicit remainder , which isolates the effect of local clustering while leaving the structure of the second-order expansion unchanged.
Remark 2.5.
The remainder is driven mainly by the block-length ratio . Since , the definition of yields
Consequently,
In particular, a sufficient condition for is
If , then
so whenever .
3 Simulation
We investigate the finite-sample size of the bootstrap procedures for the th largest coordinate of The simulation design is kept fixed across all experiments, and only the target order statistic is varied. We report results for The case coincides with the maximum and is therefore omitted here.
Throughout the simulation, the dimension is fixed at and the sample size is taken from For the dependence structure, we consider two correlation designs. In Design I,
and in Design II,
with
Let denote the standard normal distribution function. For , let be the distribution function of the gamma distribution with shape parameter and unit scale. For each Monte Carlo repetition, we first generate
independently, and then define
This yields a Gaussian-copula model with gamma marginals.
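The construction can be sketched in a few lines, assuming NumPy. Everything specific below is an assumption made for illustration: the Toeplitz correlation matrix with parameter `rho`, the sizes, and the use of shape parameter 1, which is chosen only because the gamma quantile function then has the closed form `-log(1 - u)`. For other shapes a numerical quantile function such as `scipy.stats.gamma(shape).ppf` could be passed instead.

```python
import numpy as np
from math import erf

def std_normal_cdf(z: np.ndarray) -> np.ndarray:
    """Standard normal distribution function, evaluated elementwise."""
    return 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))

def gaussian_copula_sample(n, corr, quantile, rng):
    """Draw n rows with Gaussian copula `corr` and marginals given by `quantile`."""
    chol = np.linalg.cholesky(corr)
    z = rng.standard_normal((n, corr.shape[0])) @ chol.T   # rows ~ N(0, corr)
    u = np.clip(std_normal_cdf(z), 1e-12, 1.0 - 1e-12)     # uniform marginals
    return quantile(u)                                     # target marginals

p, rho = 20, 0.5                                           # hypothetical design values
corr = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))

def exp_quantile(u):
    """Quantile function of the gamma law with shape 1 and unit scale."""
    return -np.log1p(-u)

rng = np.random.default_rng(0)
y = gaussian_copula_sample(5000, corr, exp_quantile, rng)
x = y - 1.0          # center: a gamma law with shape a and unit scale has mean a
```

Subtracting the shape parameter centers each coordinate, since a gamma distribution with unit scale has mean equal to its shape.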
We consider two cases.
- Asymmetric case. We set and define
where . Since each marginal has mean , the vector is centered.
- Symmetric case. We set . Let be an independent copy of , and define
This symmetrization removes skewness. The choice keeps the marginal kurtosis on the same scale as in the asymmetric setup.
We consider the following bootstrap methods:
- Empirical bootstrap (EB): the classical nonparametric bootstrap, which resamples the observations with replacement.
- Gaussian wild bootstrap (GB):
- Mammen wild bootstrap (MB):
- Rademacher wild bootstrap (RB):
- Beta wild bootstrap (BB): let and define
Let i.i.d., and standardize by
Then
- Double wild bootstrap (DB): the prepivoted procedure proposed in Section 2.2.
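The role of the third multiplier moment can be made concrete by computing the first three moments of the multiplier laws. The sketch below uses only the Python standard library. The two-point values and probabilities are the usual definition of Mammen's weights; the Beta parameters `(1/2, 3/2)` are an illustrative choice for which the standardized variable has third moment exactly one, and they are not claimed to be the parameters used in the experiments.

```python
from fractions import Fraction
from math import isclose, sqrt

def discrete_moments(values, probs):
    """First three raw moments of a finite discrete law."""
    return tuple(sum(p * v ** r for v, p in zip(values, probs)) for r in (1, 2, 3))

s5 = sqrt(5.0)
# Rademacher multipliers: +1 and -1 with probability 1/2 each.
rademacher = discrete_moments([-1.0, 1.0], [0.5, 0.5])
# Mammen's two-point multipliers.
mammen = discrete_moments(
    [-(s5 - 1) / 2, (s5 + 1) / 2],
    [(s5 + 1) / (2 * s5), (s5 - 1) / (2 * s5)],
)

def beta_raw_moment(a: Fraction, b: Fraction, r: int) -> Fraction:
    """E[B^r] for B ~ Beta(a, b), using prod_{j<r} (a + j) / (a + b + j)."""
    out = Fraction(1)
    for j in range(r):
        out *= (a + j) / (a + b + j)
    return out

def standardized_beta_third_moment(a: Fraction, b: Fraction) -> float:
    """Third moment of (B - E[B]) / sd(B) for B ~ Beta(a, b)."""
    m1, m2, m3 = (beta_raw_moment(a, b, r) for r in (1, 2, 3))
    var = m2 - m1 ** 2
    central3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    return float(central3) / float(var) ** 1.5

# Assumption: Beta(1/2, 3/2) is an illustrative choice, not the paper's stated one.
beta_third = standardized_beta_third_moment(Fraction(1, 2), Fraction(3, 2))
```

Rademacher multipliers (like Gaussian ones) have mean zero, unit variance, and vanishing third moment, whereas Mammen's weights and the standardized Beta(1/2, 3/2) variable have third moment one.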
For the Monte Carlo implementation, we use first-level bootstrap replications for EB, GB, MB, RB, and BB, and for DB we use at the first and second bootstrap levels, respectively.
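A single wild-bootstrap test of this kind can be sketched as follows, assuming NumPy. The rank is written `k`, the multipliers are Gaussian, and the data are centered at the sample mean before being multiplied; these choices, the function names, and the toy sizes are assumptions made for illustration and are not a reproduction of the implementation behind the reported results.

```python
import numpy as np

def kth_largest_rows(a: np.ndarray, k: int) -> np.ndarray:
    """k-th largest entry of each row of a 2-D array."""
    return np.sort(a, axis=1)[:, -k]

def wild_bootstrap_test(x, k, alpha, n_boot, rng):
    """Return (statistic, bootstrap critical value, rejection decision)."""
    n = x.shape[0]
    # Observed statistic: k-th largest coordinate of the normalized sum.
    stat = float(kth_largest_rows((x.sum(axis=0) / np.sqrt(n))[None, :], k)[0])
    centered = x - x.mean(axis=0)                 # assumption: sample-mean centering
    xi = rng.standard_normal((n_boot, n))         # assumption: Gaussian multipliers
    # Each row of xi yields one bootstrap copy of the normalized sum.
    boot = kth_largest_rows(xi @ centered / np.sqrt(n), k)
    crit = float(np.quantile(boot, 1.0 - alpha))  # bootstrap critical value
    return stat, crit, bool(stat > crit)

rng = np.random.default_rng(1)
x = rng.standard_normal((200, 50))                # toy data under the null
stat, crit, reject = wild_bootstrap_test(x, k=2, alpha=0.1, n_boot=500, rng=rng)
```

Repeating this over many Monte Carlo samples and averaging the rejection indicator gives an empirical size that can be compared with the nominal level; other multiplier laws only change the line that draws `xi`.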
Tables 1–3 report the empirical sizes of different bootstrap methods at the level for , respectively. Across , the qualitative ordering of the bootstrap procedures is largely unchanged. The dominant source of finite-sample distortion is the underlying design (most notably asymmetry and the more difficult Design II) rather than the value of itself. EB is uniformly conservative, with the under-rejection being especially visible in the asymmetric settings and in symmetric Design II, although the distortion is somewhat mitigated as increases. GB is more design-sensitive: it is reasonably well calibrated in symmetric Design I, but becomes distinctly liberal under asymmetry, particularly when is small and . MB and BB display the most stable behavior overall; both are typically mildly conservative, yet they avoid the substantial over-rejection exhibited by GB and, more markedly, RB, and their performance is comparatively robust across designs and values of . RB is the least robust method: it is very accurate, and often closest to the nominal level, in the symmetric experiments, but it becomes severely oversized under asymmetry, especially in Design II. DB is frequently numerically closest to the nominal level in the asymmetric designs, although this occurs through a persistent liberal bias; under symmetry it likewise remains slightly oversized. Larger generally improves calibration, and increasing attenuates some distortions, but these effects are quantitative rather than qualitative. Overall, the evidence points to MB and BB as the most reliable choices when uniform size control across heterogeneous designs is the primary concern, whereas RB is competitive only when symmetry is a credible approximation.
Table 1: Empirical sizes of the bootstrap methods.

| Design | Sample size | Correlation parameter | EB | GB | MB | RB | BB | DB |
|---|---|---|---|---|---|---|---|---|
| Panel A: Asymmetric | ||||||||
| I | 200 | 0.2 | 0.0612 | 0.1211 | 0.0757 | 0.1464 | 0.0743 | 0.1114 |
| I | 200 | 0.8 | 0.0731 | 0.0869 | 0.0739 | 0.0907 | 0.0719 | 0.1002 |
| I | 400 | 0.2 | 0.0732 | 0.1172 | 0.0809 | 0.1301 | 0.0802 | 0.1044 |
| I | 400 | 0.8 | 0.0814 | 0.0954 | 0.0809 | 0.0962 | 0.0821 | 0.1034 |
| II | 200 | 0.2 | 0.0619 | 0.152 | 0.0888 | 0.215 | 0.0884 | 0.115 |
| II | 200 | 0.8 | 0.0686 | 0.137 | 0.0865 | 0.174 | 0.0857 | 0.107 |
| II | 400 | 0.2 | 0.0790 | 0.153 | 0.0946 | 0.182 | 0.0936 | 0.109 |
| II | 400 | 0.8 | 0.0826 | 0.134 | 0.0927 | 0.156 | 0.0921 | 0.105 |
| Panel B: Symmetric | ||||||||
| I | 200 | 0.2 | 0.0731 | 0.0829 | 0.0892 | 0.1047 | 0.0907 | 0.120 |
| I | 200 | 0.8 | 0.0970 | 0.1015 | 0.1002 | 0.1058 | 0.0993 | 0.114 |
| I | 400 | 0.2 | 0.0868 | 0.0914 | 0.0963 | 0.1035 | 0.0934 | 0.109 |
| I | 400 | 0.8 | 0.0993 | 0.1016 | 0.1029 | 0.1038 | 0.1026 | 0.109 |
| II | 200 | 0.2 | 0.0584 | 0.0653 | 0.0830 | 0.106 | 0.0814 | 0.116 |
| II | 200 | 0.8 | 0.0677 | 0.0759 | 0.0859 | 0.101 | 0.0848 | 0.104 |
| II | 400 | 0.2 | 0.0763 | 0.0807 | 0.0902 | 0.102 | 0.0893 | 0.107 |
| II | 400 | 0.8 | 0.0877 | 0.0911 | 0.0977 | 0.106 | 0.0984 | 0.110 |
Table 2: Empirical sizes of the bootstrap methods.

| Design | Sample size | Correlation parameter | EB | GB | MB | RB | BB | DB |
|---|---|---|---|---|---|---|---|---|
| Panel A: Asymmetric | ||||||||
| I | 200 | 0.2 | 0.0645 | 0.1140 | 0.0731 | 0.1298 | 0.0730 | 0.107 |
| I | 200 | 0.8 | 0.0738 | 0.0866 | 0.0742 | 0.0878 | 0.0735 | 0.099 |
| I | 400 | 0.2 | 0.0754 | 0.1119 | 0.0823 | 0.1235 | 0.0809 | 0.102 |
| I | 400 | 0.8 | 0.0835 | 0.0953 | 0.0852 | 0.0955 | 0.0843 | 0.103 |
| II | 200 | 0.2 | 0.0565 | 0.1510 | 0.0887 | 0.2220 | 0.0854 | 0.118 |
| II | 200 | 0.8 | 0.0712 | 0.1380 | 0.0883 | 0.1730 | 0.0875 | 0.111 |
| II | 400 | 0.2 | 0.0731 | 0.1510 | 0.0907 | 0.1880 | 0.0884 | 0.111 |
| II | 400 | 0.8 | 0.0792 | 0.1290 | 0.0890 | 0.1450 | 0.0886 | 0.101 |
| Panel B: Symmetric | ||||||||
| I | 200 | 0.2 | 0.0820 | 0.0891 | 0.0918 | 0.1050 | 0.0914 | 0.113 |
| I | 200 | 0.8 | 0.0985 | 0.1035 | 0.1006 | 0.1060 | 0.0974 | 0.112 |
| I | 400 | 0.2 | 0.0933 | 0.0962 | 0.0986 | 0.1050 | 0.0979 | 0.109 |
| I | 400 | 0.8 | 0.1005 | 0.1009 | 0.1007 | 0.1030 | 0.1017 | 0.107 |
| II | 200 | 0.2 | 0.0587 | 0.0629 | 0.0826 | 0.1070 | 0.0798 | 0.122 |
| II | 200 | 0.8 | 0.0711 | 0.0780 | 0.0867 | 0.1010 | 0.0883 | 0.109 |
| II | 400 | 0.2 | 0.0781 | 0.0802 | 0.0918 | 0.1050 | 0.0917 | 0.115 |
| II | 400 | 0.8 | 0.0886 | 0.0906 | 0.0960 | 0.1040 | 0.0957 | 0.107 |
Table 3: Empirical sizes of the bootstrap methods.

| Design | Sample size | Correlation parameter | EB | GB | MB | RB | BB | DB |
|---|---|---|---|---|---|---|---|---|
| Panel A: Asymmetric | ||||||||
| I | 200 | 0.2 | 0.0695 | 0.1073 | 0.0769 | 0.1173 | 0.0757 | 0.1052 |
| I | 200 | 0.8 | 0.0745 | 0.0853 | 0.0749 | 0.0871 | 0.0730 | 0.0974 |
| I | 400 | 0.2 | 0.0770 | 0.1046 | 0.0798 | 0.1111 | 0.0796 | 0.0983 |
| I | 400 | 0.8 | 0.0843 | 0.0944 | 0.0867 | 0.0948 | 0.0856 | 0.1027 |
| II | 200 | 0.2 | 0.0537 | 0.142 | 0.0812 | 0.223 | 0.0782 | 0.117 |
| II | 200 | 0.8 | 0.0774 | 0.132 | 0.0883 | 0.159 | 0.0875 | 0.110 |
| II | 400 | 0.2 | 0.0732 | 0.146 | 0.0905 | 0.187 | 0.0895 | 0.113 |
| II | 400 | 0.8 | 0.0842 | 0.129 | 0.0932 | 0.146 | 0.0919 | 0.104 |
| Panel B: Symmetric | ||||||||
| I | 200 | 0.2 | 0.0851 | 0.0894 | 0.0901 | 0.0997 | 0.0900 | 0.110 |
| I | 200 | 0.8 | 0.1005 | 0.1033 | 0.1014 | 0.1056 | 0.1002 | 0.110 |
| I | 400 | 0.2 | 0.0945 | 0.0969 | 0.0974 | 0.1019 | 0.0982 | 0.106 |
| I | 400 | 0.8 | 0.1004 | 0.1014 | 0.1002 | 0.1026 | 0.1017 | 0.105 |
| II | 200 | 0.2 | 0.0608 | 0.0636 | 0.0822 | 0.108 | 0.0789 | 0.124 |
| II | 200 | 0.8 | 0.0764 | 0.0786 | 0.0890 | 0.103 | 0.0904 | 0.110 |
| II | 400 | 0.2 | 0.0756 | 0.0774 | 0.0894 | 0.104 | 0.0891 | 0.114 |
| II | 400 | 0.8 | 0.0885 | 0.0902 | 0.0951 | 0.102 | 0.0960 | 0.107 |
4 Conclusion
This paper studies Gaussian and bootstrap approximations for the th largest coordinate statistic in high dimensions. We establish theoretical guarantees that justify bootstrap critical values when the ambient dimension is allowed to grow with the sample size, thereby extending valid inference beyond the maximum to nonmaximal order statistics. The simulation results show that the proposed framework delivers accurate finite-sample inference and clarify the relative robustness of the competing bootstrap procedures across a range of designs.
An important direction for future research is to develop analogous Gaussian approximation results for temporally dependent observations. Doing so would require a theory that accommodates serial dependence, long-run covariance estimation, and resampling schemes that preserve the time-series structure; see, for example, (Shao, 2010; Zhang and Wu, 2017; Zhang and Cheng, 2014, 2018; Chang et al., 2024, 2023, 2025).
Appendix A: Proofs of Theorems
A.1 Combinatorial identities
Lemma A.1 (Finite inclusion–exclusion identity).
For every integer and every nonnegative integer-valued random variable ,
| (17) |
Consequently,
| (18) |
and analogously with and .
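A pointwise identity of this type can be verified by brute force. The sketch below uses only the standard library and checks the classical form: for a nonnegative integer `N` and a rank `k`, the indicator of `N >= k` equals the alternating sum over `m >= k` of `(-1)^(m-k) * C(m-1, k-1) * C(N, m)`. That this classical form coincides with the displayed identity (17) is an assumption of the sketch, as are the names `k`, `N`, and `indicator_via_binomials`.

```python
from math import comb

def indicator_via_binomials(n_exceed: int, k: int, m_max: int) -> int:
    """Alternating binomial sum that should equal 1 if n_exceed >= k and 0 otherwise.

    Terms with m > n_exceed vanish because comb(n_exceed, m) == 0 there,
    so any m_max >= n_exceed gives the full sum.
    """
    return sum(
        (-1) ** (m - k) * comb(m - 1, k - 1) * comb(n_exceed, m)
        for m in range(k, m_max + 1)
    )

# Check the identity for all ranks 1..7 and counts 0..29.
identity_holds = all(
    indicator_via_binomials(N, k, m_max=N + 3) == int(N >= k)
    for k in range(1, 8)
    for N in range(0, 30)
)
```

Taking expectations term by term turns the indicator into a probability and each binomial coefficient `C(N, m)` into a factorial moment divided by `m!`, which is why factorial moments of the exceedance count are the natural objects to expand.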
A.2 Projected quantities
For every nonempty , let denote the coordinate projection. Define
Also define
The projected Edgeworth densities are
| (19) | ||||
| (20) |
Lemma A.2 (Projection preserves the data-side assumptions).
Assume Assumption 2.1. For every nonempty the projected vectors satisfy the same Stein identity with covariance matrix , the same sub-exponential envelope , and
Proof.
Let . For any smooth define
Then
Applying the Stein identity for yields
Hence is a Stein kernel for . The bounds follow by monotonicity under projection. Finally, for every nonzero ,
∎
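The eigenvalue part of this argument, namely that restricting a covariance matrix to a subset of coordinates cannot decrease its smallest eigenvalue, is easy to confirm numerically. The following sketch assumes NumPy; the matrix, the subset sizes, and the helper name `min_eig_submatrix` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 12
a = rng.standard_normal((p, 3 * p))
sigma = a @ a.T / (3 * p)                 # a generic positive definite covariance matrix
lam_min_full = float(np.linalg.eigvalsh(sigma)[0])

def min_eig_submatrix(sigma: np.ndarray, idx: list) -> float:
    """Smallest eigenvalue of the principal submatrix indexed by idx."""
    sub = sigma[np.ix_(idx, idx)]
    return float(np.linalg.eigvalsh(sub)[0])

# Random coordinate subsets of several sizes.
subsets = [
    sorted(rng.choice(p, size=m, replace=False).tolist())
    for m in (1, 2, 3, 5)
    for _ in range(20)
]
# The quadratic form over vectors supported on idx is a restriction of the full
# quadratic form, so its minimum cannot fall below the full-matrix minimum.
preserved = all(min_eig_submatrix(sigma, s) >= lam_min_full - 1e-12 for s in subsets)
```

The same restriction argument is what makes the nondegeneracy bound uniform over all coordinate subsets entering the inclusion–exclusion sum.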
A.3 External matrix, Gaussian-comparison, and Koike lemmas
Lemma A.3 (Gershgorin interval theorem).
Let be symmetric. Then
| (21) |
Proof.
Let be an eigenvalue of with eigenvector . Choose
Since ,
Hence
and therefore
This proves that every eigenvalue belongs to at least one Gershgorin interval
Taking the minimum and maximum over these intervals yields (21). ∎
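As a numerical illustration of the lemma, the sketch below (assuming NumPy) computes the Gershgorin bounds for a symmetric matrix with unit diagonal and small off-diagonal entries and compares them with the actual extreme eigenvalues. The matrix and the helper name `gershgorin_bounds` are illustrative choices.

```python
import numpy as np

def gershgorin_bounds(a: np.ndarray) -> tuple:
    """Lower and upper eigenvalue bounds from the Gershgorin intervals of a."""
    centers = np.diag(a)
    radii = np.sum(np.abs(a), axis=1) - np.abs(centers)   # off-diagonal row sums
    return float(np.min(centers - radii)), float(np.max(centers + radii))

rng = np.random.default_rng(0)
b = rng.uniform(-0.1, 0.1, size=(8, 8))
a = (b + b.T) / 2                  # symmetric with small off-diagonal entries
np.fill_diagonal(a, 1.0)           # unit diagonal, as for a correlation-type matrix

lo, hi = gershgorin_bounds(a)
eig = np.linalg.eigvalsh(a)        # eigenvalues in ascending order
```

For a correlation matrix of size `m` whose off-diagonal entries are bounded in absolute value by `rho`, the lower bound is at least `1 - (m - 1) * rho`, which is how the lemma yields uniform nondegeneracy for small projected blocks.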
Lemma A.4 (Berman–Li–Shao normal comparison inequality).
Let and be centered Gaussian vectors with
Write and , and define
Then for every ,
| (22) |
In particular, if has independent coordinates and
then
| (23) |
Proof.
Lemma A.5 (Koike smoothing identity).
Let be measurable, let , and define for ,
Then
| (24) |
Moreover, for every multi-index with ,
| (25) |
Proof.
Lemma A.6 (Koike orthant derivative bound).
Let
There exist constants , depending only on , such that for every , every , and every integer ,
| (26) |
In particular,
| (27) |
Proof.
The uniform estimate (27) is exactly Lemma 4.4 in Koike (2026) after replacing by and using from Lemma A.2. The localized bound (26) is obtained by combining (25) with the Anderson–Hall–Titterington bound stated as Lemma D.4 in Koike (2026). Indeed, for each multi-index with ,
If , then implies
Applying Lemma D.4 to the orthant enlarged by a cube of side length yields
and summing over proves (26). ∎
Lemma A.7 (Koike projected decomposition).
Let be independent centered -valued random vectors with approximate Stein kernels , and put
For a bounded measurable function and , define
Let be the first-order Edgeworth density around as in equation (4.1) of Koike (2026). Then for every bounded measurable and every ,
| (28) |
where each is one of the six terms displayed in equation (4.6) of Koike (2026), specialized to the projected dimension . In particular, each is a finite linear combination of iterated integrals involving only the tensors
and their -counterparts. If , then the last four terms in equation (4.6) of Koike (2026) disappear identically.
Proof.
Lemma A.8 (Koike tensor and concentration bounds).
Proof.
The mean and covariance bounds (29)–(30) follow from Lemma D.10 in Koike (2026) applied to and , respectively. The third-order tensor bound (31) is Lemma D.11 in Koike (2026) with . Finally, the coefficient-tensor estimate (32) is the projected specialization of the bounds obtained in the proof of Theorem 4.1 in Koike (2026, pp. 22–24). Since projection only removes coordinates, every projected -tensor norm is bounded by the corresponding full-dimensional norm. ∎
Lemma A.9 (Explicit high-probability event for the first bootstrap array).
Under Assumption 2.1, there exists an event such that
and, on ,
| (33) | ||||
| (34) |
Consequently, if
then on ,
| (35) | ||||
| (36) | ||||
| (37) | ||||
| (38) |
Proof.
Lemma A.10 (Multiplier maximum event).
A.4 The projected local input
Proposition A.1 (Projected local orthant expansion).
Proof.
Fix a nonempty and write . Set
Then
If , then
For , define
Lemma A.12(i) implies that there exists such that, uniformly for and ,
| (45) |
Choose
Then
| (46) |
because by Lemma A.11. Since , (46) implies
Therefore, integrating (44) with respect to the law of and using (45),
| (47) |
Exactly the same estimate holds with replaced by the Gaussian law.
Let
Apply the smoothing inequality in Lemma 4.1 of Koike (2026) to the bounded measurable function , with
Using Lemma A.13 (with ) and Lemma A.12(ii), we obtain
| (48) |
For each , define
Then
with projected Stein kernels inherited from Lemma A.2. Therefore Lemma A.7 gives
Because , only the terms involving ,
and remain. By (32),
| (49) |
Substituting (47) and (49) into the six explicit terms of equation (4.6) in Koike (2026), and integrating the kernels exactly as they appear there, yields
| (50) |
Since on and , (46) implies
Hence
| (51) |
Changing variables gives (42).
Work on the event from Lemma A.9. Then, for every with ,
| (52) |
Condition on the data. Define
If , then has exact Stein kernel
If Assumption 2.2(ii) holds, then
is again an exact Stein kernel, because for every smooth vector field ,
Thus, conditionally on ,
Therefore, conditionally on , set
Then
Moreover, for ,
uniformly over , because Lemma A.12 depends only on the Gaussian reference law . Here the smoothing step gives the error
Finally, the coefficient tensors in Koike’s decomposition satisfy
by (52). Substituting these conditional bounds into the six remainder terms in (50) yields, on ,
Since , this proves (43). ∎
Lemma A.11 (Gaussian threshold scale).
Proof.
Lemma A.12 (Gaussian shift and strip bounds).
Proof.
Let and standardize . The covariance matrix of has diagonal entries and off-diagonal entries bounded by . Since , Lemma A.3 applied to the correlation matrix yields
for all sufficiently large . Hence the Gaussian density on is bounded above and below, on the relevant orthant boundary region, by the density of an independent Gaussian vector up to multiplicative constants depending only on . Consequently,
| (55) |
uniformly for and . By Mills’ ratio,
which proves part (i) after multiplication over . Also,
again by Mills’ ratio. Multiplying over coordinates and using (55) proves part (ii). ∎
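The Mills' ratio bounds invoked here are the classical two-sided inequalities `t / (1 + t^2) * phi(t) <= 1 - Phi(t) <= phi(t) / t` for `t > 0`, where `phi` and `Phi` are the standard normal density and distribution function. The sketch below checks them at a few thresholds using only the standard library; the helper names and the grid of thresholds are illustrative.

```python
from math import erfc, exp, pi, sqrt

def normal_sf(t: float) -> float:
    """Standard normal survival function 1 - Phi(t)."""
    return 0.5 * erfc(t / sqrt(2.0))

def normal_pdf(t: float) -> float:
    """Standard normal density phi(t)."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)

def mills_bounds(t: float) -> tuple:
    """Classical lower and upper bounds on 1 - Phi(t), valid for t > 0."""
    phi = normal_pdf(t)
    return t / (1.0 + t * t) * phi, phi / t

# Each row holds (threshold, lower bound, upper bound, exact survival probability).
rows = [(t, *mills_bounds(t), normal_sf(t)) for t in (1.0, 2.0, 3.0, 4.0, 6.0)]
bounds_hold = all(lower <= sf <= upper for _, lower, upper, sf in rows)
```

The ratio of upper to lower bound is `1 + 1 / t^2`, so at high thresholds the survival probability is determined by `phi(t) / t` up to a factor close to one.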
Lemma A.13 (Gaussian strip bound for the Edgeworth density).
A.5 Gaussian factorial moments, aggregation, and regularity
Define
Also define the elementary symmetric polynomial
Lemma A.14 (Gaussian factorial moments).
Proof.
Fix . First compare with the elementary symmetric polynomial . For every , Lemma A.4 applied repeatedly to the standardized vector yields
Summing over and using from Lemma A.11 gives
| (60) |
Next compare with . Writing
and separating the terms with repeated indices, we obtain
| (61) |
Because stays in a compact interval by Lemma A.11 and
we obtain
Lemma A.15 (Weighted aggregation and Gaussian regularity).
Proof.
Part (i) was already proved in the proof of Lemma A.11. For part (ii), use Lemma A.14:
Since and is fixed,
Therefore (62) follows. For the tail, use
for every , and then Stirling’s formula gives
Choosing large enough yields (63).
A.6 Factorial-moment and distribution expansions
Theorem A.1 (Factorial-moment expansion).
Comment. Theorem A.1 converts the local projected Edgeworth expansions into a weighted approximation for the factorial moments of the exceedance count. This is the combinatorial bridge from rare orthant probabilities to the law of the th largest coordinate.
Proof.
Theorem A.2 (Distribution expansion).
Comment. Theorem A.2 upgrades the factorial-moment approximation to a distributional expansion for and its bootstrap analogue. It also shows that the correction term is smooth enough for the quantile inversion carried out later.
Proof.
By Lemma A.1,
| (73) |
Split the right-hand side at . For , Theorem A.1 gives
| (74) |
Substituting (74) into (73), and using
from (63), we obtain
For the bootstrap expansion, work on the event (68). On that event,
uniformly on , which is exactly (70).
It remains to prove the derivative bounds. Fix . By (19),
| (75) |
Since uniformly for ,
| (76) |
where the first inequality follows from the Gaussian derivative bound
and the second uses .
Differentiate (75) with respect to . By the fundamental theorem of calculus, each derivative creates a finite sum of boundary integrals over -dimensional faces. Therefore
hence
Differentiating once more produces second-face integrals and diagonal boundary terms. The same Gaussian derivative estimate and the strip estimate of Lemma A.13 imply
| (77) |
Summing (76)–(77) with the weights in (1) and using (62) proves (71).
A.7 Bootstrap centering and Cornish–Fisher inversion
Proof.
Theorem A.3 (Cornish–Fisher expansion).
Comment. Theorem A.3 identifies the bootstrap critical value as a Gaussian quantile perturbed by an explicit linear term and a quadratic correction. This is the quantile-level expansion needed to turn the distributional approximation into a coverage expansion.
Proof.
Fix and abbreviate
On the event of (70) and (72),
| (82) |
Because and , the implicit function theorem yields a unique root with . Substituting into (82) and using Taylor’s formula up to order gives
| (83) |
for some between and . Since by (72), the last quadratic term in (83) is . Solving (83) iteratively,
This is exactly (81); compare also the classical Cornish–Fisher inversion formulas in Hall (1992, Chapter 2). ∎
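The iterative inversion step can be illustrated numerically in a generic setting. The sketch below is an assumption-laden toy, not the expansion (81): it takes the standard normal distribution function as the leading term and a hypothetical correction R(x) = (1 - x^2) phi(x) of size `eps`, solves for the perturbed quantile by bisection, and compares it with the linearized root z - eps R(z)/phi(z). In the paper the leading term is the first-order law of the order statistic rather than the standard normal.

```python
from statistics import NormalDist

STD = NormalDist()

def perturbed_cdf(x, eps):
    """Hypothetical one-term expansion: Phi(x) + eps * (1 - x^2) * phi(x)."""
    return STD.cdf(x) + eps * (1.0 - x * x) * STD.pdf(x)

def exact_quantile(level, eps, tol=1e-13):
    """Solve perturbed_cdf(c, eps) = level by bisection near the Gaussian quantile."""
    z = STD.inv_cdf(level)
    lo, hi = z - 1.0, z + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if perturbed_cdf(mid, eps) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_order_quantile(level, eps):
    """Linearized root z - eps * R(z) / phi(z) with R(x) = (1 - x^2) * phi(x)."""
    z = STD.inv_cdf(level)
    return z - eps * (1.0 - z * z)

level, eps = 0.95, 0.01
c_exact = exact_quantile(level, eps)
c_lin = first_order_quantile(level, eps)
print(c_exact, c_lin, abs(c_exact - c_lin))
```

In this toy case the gap between the exact and linearized quantiles is of order `eps**2`, which is the behavior the quadratic remainder in a Cornish–Fisher inversion is meant to capture.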
A.8 Coverage expansion
Proof of Theorem 2.1.
Fix and write
| (84) |
Let denote the event on which the Cornish–Fisher expansion (81) holds and . Then
On define
By Theorem A.3,
| (85) |
Also on . Since is deterministic, Taylor’s formula on gives
| (86) |
for some between and . From (84), (71), and (64),
and
Substituting these bounds and (85) into (86), using and , yields on ,
| (87) |
Now take expectations. Since ,
Therefore
| (88) |
Taking expectations in (87) and using Lemma A.16,
Combining this with (88) gives
Taking complements proves (2). ∎
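For orientation, the bootstrap critical value whose coverage is expanded above might be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the statistic is taken to be the k-th largest coordinate of the multiplier sum centered at the sample mean without studentization, and the multipliers follow Mammen's two-point law, which has mean 0, variance 1, and third moment 1. That law is used only to illustrate third-moment matching; it is bounded but need not satisfy the regularity condition imposed on the multipliers in the paper.

```python
import random
from math import sqrt

SQRT5 = sqrt(5.0)
# Mammen's two-point multiplier: mean 0, variance 1, third moment 1.
W_LOW, W_HIGH = (1.0 - SQRT5) / 2.0, (1.0 + SQRT5) / 2.0
P_LOW = (SQRT5 + 1.0) / (2.0 * SQRT5)  # probability of drawing W_LOW

def draw_multiplier(rng):
    """Draw one multiplier from the two-point law."""
    return W_LOW if rng.random() < P_LOW else W_HIGH

def kth_largest(v, k):
    """k-th largest coordinate of v (k = 1 is the maximum)."""
    return sorted(v, reverse=True)[k - 1]

def wild_bootstrap_critical_value(data, k, alpha, n_boot, rng):
    """Empirical (1 - alpha) quantile of the k-th largest coordinate of
    n^{-1/2} * sum_i w_i (X_i - X_bar) over n_boot multiplier draws."""
    n, p = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - mean[j] for j in range(p)] for row in data]
    stats = []
    for _ in range(n_boot):
        w = [draw_multiplier(rng) for _ in range(n)]
        s = [sum(w[i] * centered[i][j] for i in range(n)) / sqrt(n) for j in range(p)]
        stats.append(kth_largest(s, k))
    stats.sort()
    index = min(n_boot - 1, int((1.0 - alpha) * n_boot))
    return stats[index]

# Hypothetical data: 50 observations of a centered exponential vector in dimension 20.
rng = random.Random(0)
data = [[rng.expovariate(1.0) - 1.0 for _ in range(20)] for _ in range(50)]
print(wild_bootstrap_critical_value(data, k=3, alpha=0.1, n_boot=200, rng=rng))
```

The quantile convention (an order statistic of the sorted bootstrap draws) is one of several reasonable choices and is also an assumption of the sketch.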
Proof of Corollary 2.1.
A.9 Deterministic conditional theorem and double bootstrap
Theorem A.4 (Deterministic-array conditional theorem).
Let be deterministic and define
Assume that for some constants and ,
| (89) |
and for every with ,
| (90) |
| (91) |
Then the conclusions of Theorems A.2, A.3, and 2.1 hold for the conditional law of the th order statistic of , with constants uniform over all deterministic arrays satisfying (89)–(91) and with the same second-order rate .
Comment. Theorem A.4 isolates the deterministic conditions needed for the second bootstrap level. Once the first-level resample satisfies these array conditions, the same second-order expansion follows conditionally.
Proof.
Fix a deterministic array satisfying (89)–(91). For every nonempty with , define
Let
Because satisfies the same regularity condition as Assumption 2.2, the projected summand
has exact Stein kernel
where denotes the scalar Stein kernel of (or in the Gaussian case). Hence
Now define the projected deterministic-array Edgeworth density
Fix with and write . Set
Then
For the deterministic array , the proof of Proposition A.1 uses only the following inputs:
which are exactly (89)–(91). Therefore,
| (92) |
uniformly over all admissible deterministic arrays.
Starting from (92), the factorial-moment argument gives
where and are the conditional factorial moment and its first-order approximation built from . Substituting this identity into the weighted inclusion–exclusion formula gives the deterministic-array analogue of Theorem A.2. The Cornish–Fisher and coverage expansions then follow from the same algebraic steps as in Sections A.7–A.8 after replacing by the corresponding deterministic-array first-order term. All constants remain uniform under (89)–(91). This proves the theorem. ∎
Lemma A.17 (The first-level bootstrap array satisfies the deterministic conditions).
Proof.
Set
We first bound . Conditional on the original data, the vectors are independent and centered. On , every coordinate satisfies
because either is bounded or is Gaussian, hence sub-Gaussian, and (36) holds on . Apply Lemma D.10 of Koike (2026) conditionally with
Then, on ,
| (95) |
Next, write
Then
| (96) |
Conditional on the original data, the summands in (96) are independent and centered. On , each entry of the matrix
has conditional -norm at most
Apply Lemma D.10 of Koike (2026) conditionally with
Then, on ,
| (97) |
Proof of Theorem 2.2.
Let be the event from Lemma A.17. Since , it is enough to work on . On that event, the first-level bootstrap array satisfies the deterministic conditions of Theorem A.4. Because the second-level multipliers satisfy , the conditional version of Corollary 2.1 gives
| (99) |
Set , with chosen large enough that both (99) and the first-level second-order accuracy bound hold with the same constant.
Appendix B: Proofs for the stationary exponential-mixing alternative
This appendix proves Theorem 2.3. Throughout Appendix B we work under Assumptions 2.1, 2.2, 2.3, and 2.5, and we use the notation introduced in Section 2.3. Only the Gaussian aggregation part of Appendix A needs to be modified; the projected local Edgeworth expansion is unchanged except for the shift/strip estimates established below.
B.1. Correlation decay and Gaussian cluster tails
Lemma B.1.
Under Assumption 2.5, for every ,
| (102) |
Consequently,
| (103) |
Moreover, for every integer , every index set with , and every ,
| (104) |
In particular, when ,
| (105) |
B.2. Bonferroni remainder for the th exceedance event
Lemma B.2.
For every integer , every integer , and every nonnegative integer-valued random variable ,
| (106) |
Consequently,
| (107) |
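The classical identity underlying Bonferroni-type bounds of this kind expresses the tail probability of a nonnegative integer-valued count through its binomial moments, with weights C(j-1, k-1) and alternating signs. The sketch below is illustrative only and is not claimed to reproduce the exact form of (106) or (107); the function names and the binomial example are hypothetical. It evaluates the truncated weighted sums for a count with a known law and compares them with the exact tail probability.

```python
from math import comb

def binomial_moment(pmf, j):
    """E[C(N, j)] when pmf[n] = P(N = n)."""
    return sum(prob * comb(n, j) for n, prob in enumerate(pmf))

def truncated_tail(pmf, k, m):
    """sum_{j=k}^{m} (-1)^(j-k) * C(j-1, k-1) * E[C(N, j)]."""
    return sum(
        (-1) ** (j - k) * comb(j - 1, k - 1) * binomial_moment(pmf, j)
        for j in range(k, m + 1)
    )

# Hypothetical example: N ~ Binomial(5, 0.3) and k = 2.
n_trials, q, k = 5, 0.3, 2
pmf = [comb(n_trials, n) * q ** n * (1 - q) ** (n_trials - n) for n in range(n_trials + 1)]
exact = sum(pmf[k:])  # P(N >= k)

for m in range(k, n_trials + 1):
    print(m, truncated_tail(pmf, k, m), exact)
```

In this example the partial sums alternate above and below the exact tail probability and coincide with it once m reaches the largest possible value of the count, so the truncation error is controlled by the first omitted term.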
B.3. Block construction and reduction to block exceedances
Let
Define the main blocks and gaps by
and define the remainder interval
when . For , set
Also define
Lemma B.3.
For every ,
| (108) |
Moreover,
| (109) |
| (110) |
Proof.
The first Bonferroni inequality gives
and the second Bonferroni inequality yields
Lemma B.4.
Let . Then, for every ,
| (111) |
Consequently,
| (112) |
where is deterministic.
Proof.
Fix . Put
Since the selected main blocks are separated by at least , repeated application of Lemma 3.2.2 of Leadbetter, Lindgren, and Rootzén yields
| (113) |
for every nonempty . The inclusion–exclusion identity gives
and the same identity with each probability replaced by the corresponding product equals , since . Therefore
Summing over the choices of gives (111).
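The block-and-gap layout used in this section can be pictured with a short sketch. It assumes, hypothetically, that each main block is immediately followed by one gap, that all main blocks share one length and all gaps another, and that leftover indices form the remainder interval; the function name and the particular lengths are illustrative and are not the choices made in Section B.3.

```python
def block_partition(p, block_len, gap_len):
    """Split {0, ..., p-1} into alternating main blocks and gaps, plus a remainder.

    Each main block has block_len indices and is followed by a gap of gap_len
    indices; indices left over at the end form the remainder interval.
    """
    period = block_len + gap_len
    n_blocks = p // period
    blocks = [range(i * period, i * period + block_len) for i in range(n_blocks)]
    gaps = [range(i * period + block_len, (i + 1) * period) for i in range(n_blocks)]
    remainder = range(n_blocks * period, p)
    return blocks, gaps, remainder

# Hypothetical sizes: dimension 23, main blocks of length 4, gaps of length 2.
blocks, gaps, remainder = block_partition(p=23, block_len=4, gap_len=2)
print([list(b) for b in blocks])
print([list(g) for g in gaps])
print(list(remainder))
```

Any two distinct main blocks are then separated by at least one full gap, which is the separation a mixing or normal-comparison argument needs in order to treat exceedances in different blocks as nearly independent.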
B.4. Direct Poisson approximation on the quantile window
Lemma B.5.
If , then
| (114) |
and hence
| (115) |
Consequently,
| (116) |
| (117) |
Proof.
Lemma B.6.
For every such that ,
| (118) |
Consequently,
| (119) |
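As a point of reference for the Poisson approximation, the independent case can be computed exactly. The sketch below is a simplified illustration and does not cover the stationary dependent setting treated in this appendix: it assumes independent standard Gaussian coordinates, so that the exceedance count is binomial, and the dimension and threshold are hypothetical. It compares the exact probability that the k-th largest coordinate is at most t, which equals the probability that the count is at most k - 1, with its Poisson counterpart, alongside Le Cam's bound on the approximation error.

```python
from math import comb, exp, factorial
from statistics import NormalDist

def binomial_cdf(n, q, r):
    """P(Bin(n, q) <= r)."""
    return sum(comb(n, j) * q ** j * (1 - q) ** (n - j) for j in range(r + 1))

def poisson_cdf(lam, r):
    """P(Poisson(lam) <= r)."""
    return exp(-lam) * sum(lam ** j / factorial(j) for j in range(r + 1))

# Hypothetical dimension and threshold.
p, t = 2000, 3.0
q = 1.0 - NormalDist().cdf(t)  # exceedance probability of one coordinate
lam = p * q                    # expected number of exceedances

for k in (1, 2, 3):
    exact = binomial_cdf(p, q, k - 1)   # P(k-th largest <= t), independent case
    approx = poisson_cdf(lam, k - 1)
    print(k, exact, approx, abs(exact - approx), lam ** 2 / p)
```

The last printed column is Le Cam's bound p q^2 for the binomial case; under dependence an additional remainder of the kind controlled in Lemmas B.3 and B.4 is needed.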
B.5. Threshold scale, shift/strip bounds, and weighted Gaussian bounds
Lemma B.7.
There exist constants and an integer such that
| (121) |
Proof.
Lemma B.8 (Shift and strip bounds).
Proof.
Fix a nonempty with . Since is a principal submatrix of ,
On the other hand, by stationarity and (103),
Hence every principal covariance matrix of dimension at most is uniformly well conditioned and has operator norm bounded by . Repeating the proof of Lemmas A.12 and A.13 with these two spectral bounds gives (122) and (123). ∎
Lemma B.9.
There exists a constant such that, for every and every ,
| (124) |
and
| (125) |
Proof.
For , decompose according to the block partition used in Lemmas B.3 and B.4. The contribution of configurations that use only main blocks and place at most one exceedance in each selected block is . Every remaining configuration necessarily contains either an exceedance in a gap or in the remainder interval, or at least two exceedances inside one main block. Therefore the same counting argument used in the proof of Lemma B.3, together with the cluster bound (104), yields
| (126) |
Combining (126) with (112) gives
| (127) |
Since implies by Lemma B.7, summing (127) over yields
The first sum is bounded by a constant depending only on because it is dominated by the convergent series
The second sum is also bounded because is finite for every , for large , , and (10) implies
This proves (124).
B.6. Regularity of
Lemma B.10.
There exist constants , , and an integer such that
| (128) |
and
| (129) |
Proof.
Set
By Lemma B.7, there exist constants such that
| (130) |
Since on , the derivatives of are bounded on this compact interval. Hence there exist constants such that
| (131) |
B.7. Completion of the proof of Theorem 2.3
The local projected Edgeworth expansion in Proposition A.1 and its bootstrap version depend on the Gaussian law only through the spectral bounds for principal submatrices and the shift/strip inequalities. By Lemma B.8, the proof of Proposition A.1 remains valid under Assumption 2.5; moreover, with the same argument one may enlarge the range from to . Thus, for every nonempty with ,
| (137) |
and, with probability at least ,
| (138) |
holds simultaneously for all such .
Summing (137) and (138) over gives, for every ,
| (139) |
| (140) |
uniformly over , with the bootstrap bound holding on an event of probability at least .
To prove (11), apply Lemma B.2 with and to obtain
| (141) |
Using (139) with and the bound
from the proof of Theorem A.2, we obtain
for all sufficiently large . Hence (125) yields
| (142) |
Applying Lemma B.2 with gives
Combining the last three displays with the definition of proves (11). The bootstrap expansion (12) follows in the same way from (140), and the probability of the exceptional event remains bounded by .
The derivative bounds in the proof of Theorem A.2 use only the derivative estimates for the projected Gaussian densities and the uniform weighted bound (124). Since both inputs are available here, the same argument yields
| (143) |
and, with probability at least ,
| (144) |
exactly as in Theorem A.2.
Next, let
which follows from (12). Since on by Lemma B.10, the same implicit-function argument as in the proof of Theorem A.3 yields a unique solution
Substituting into the identity and expanding as in (83) gives
uniformly in , which proves (13).
For the coverage expansion, write
which follows from (11). Insert (13) into the Taylor formula
Using (143), Lemma B.10, and Lemma A.16, the same algebra as in the proof of Theorem 2.1 yields
Taking complements proves (14).
Finally, the deterministic-array conditional theorem in Section A.9 is proved from the conditional versions of Theorems A.2, A.3, and 2.1. Repeating that argument with (12), (13), and (14) gives the same deterministic-array statement with in place of . Inserting that conditional bound into the proof of Theorem 2.2 yields (16). This completes the proof of Theorem 2.3.
References
- Limit theorems for the maximum term in stationary sequences. The Annals of Mathematical Statistics 35 (2), pp. 502–516.
- Central limit theorems for high dimensional dependent data. Bernoulli 30 (1), pp. 712–742.
- Statistical inference for high-dimensional spectral density matrix. Journal of the American Statistical Association 120 (551), pp. 1960–1974.
- Testing the martingale difference hypothesis in high dimension. Journal of Econometrics 235 (2), pp. 972–1000.
- Improved central limit theorem and bootstrap approximation in high dimensions. The Annals of Statistics 50 (5), pp. 2562–2586.
- Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. The Annals of Statistics 41 (6), pp. 2786–2819.
- Central limit theorems and bootstrap in high dimensions. The Annals of Probability 45 (4), pp. 2309–2353.
- Nearly optimal central limit theorem and bootstrap approximations in high dimensions. The Annals of Applied Probability 33 (3), pp. 2374–2425.
- Beyond Gaussian approximation: bootstrap for maxima of sums of independent random vectors. The Annals of Statistics 48 (6), pp. 3643–3671.
- Gaussian multiplier bootstrap procedure for the th largest coordinate of high-dimensional statistics. arXiv:2508.14400v2 [math.ST].
- High-dimensional central limit theorems by Stein’s method. The Annals of Applied Probability 31 (4), pp. 1660–1686.
- Sharp high-dimensional central limit theorems for log-concave distributions. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 60 (3), pp. 2129–2156.
- Limiting forms of the frequency distribution of the largest or smallest member of a sample. Mathematical Proceedings of the Cambridge Philosophical Society 24 (2), pp. 180–190.
- The bootstrap and Edgeworth expansion. Springer Series in Statistics, Springer, New York.
- Matrix analysis. 2nd edition, Cambridge University Press, Cambridge.
- Notes on the dimension dependence in high-dimensional central limit theorems for hyperrectangles. Japanese Journal of Statistics and Data Science 4 (1), pp. 257–297.
- High-dimensional bootstrap and asymptotic expansion. Probability Theory and Related Fields. Published online first.
- Dimension-free anticoncentration bounds for Gaussian order statistics with discussion of applications to multiple testing. arXiv:2107.10766.
- A normal comparison inequality and its applications. Probability Theory and Related Fields 122 (4), pp. 494–508.
- Bootstrapping max statistics in high dimensions: near-parametric rates under weak variance decay and application to functional and multinomial data. The Annals of Statistics 48 (2), pp. 1214–1229.
- The types of limit distributions for some terms of variational series. Scientia Sinica 15, pp. 749–762.
- The dependent wild bootstrap. Journal of the American Statistical Association 105 (489), pp. 218–235.
- On limiting distributions of intermediate order statistics from stationary sequences. The Annals of Probability 10, pp. 653–662.
- Gaussian approximation for high dimensional time series. The Annals of Statistics 45 (5), pp. 1895–1919.
- Bootstrapping high dimensional time series. arXiv:1406.1037.
- Gaussian approximation for high dimensional vector under physical dependence. Bernoulli 24 (4A), pp. 2640–2675.