Conversation
Walkthrough

Introduces a bounded "slow plan cache" for preserving expensive, slow-to-plan GraphQL execution plans across reloads, adds router engine configuration options to tune its threshold and size, updates cache-warming behavior and examples, and clarifies metrics/observability to indicate slow plan cache usage.

Changes
Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
🧹 Nitpick comments (1)
docs/concepts/cache-warmer.mdx (1)
164-173: Add direct links from observability to setup/querying docs.

This section lists the new signals well, but it would be easier to navigate if readers could jump straight to the config and metrics reference pages from here.
↗️ Suggested doc cross-links

```diff
 ### Observability

 When the expensive query cache is enabled, the following observability signals are available:
@@
 These attributes are only emitted when `in_memory_fallback` is enabled. Existing users who do not enable this feature will see no change in their telemetry.
+
+For setup details, see [Router Configuration](/router/configuration#router-engine-configuration). For querying and interpreting the emitted metrics, see [Metrics & Monitoring](/router/metrics-and-monitoring).
```

As per coding guidelines, "Link between related documentation pages".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/concepts/cache-warmer.mdx` around lines 164 - 173, Update the "### Observability" section to add direct cross-links to the relevant setup and querying docs: link the config key pages for in_memory_fallback and engine.debug.enable_cache_response_headers, link the metrics reference pages that document router.graphql.operation.planning_time and the Prometheus label wg_engine_expensive_plan_cache_hit, and link the tracing docs that show the Operation - Plan span and wg.engine.expensive_plan_cache_hit attribute so readers can jump from the observability signals to configuration and metrics/tracing usage.
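For orientation, the two configuration keys named in this prompt could be sketched together as below. The surrounding YAML structure is guessed for illustration only; just the `in_memory_fallback` and `engine.debug.enable_cache_response_headers` names come from the review itself, so consult the router configuration reference for the authoritative key paths.

```yaml
engine:
  debug:
    # Emit cache-status response headers for debugging (key named in the review).
    enable_cache_response_headers: true

# Placement of in_memory_fallback is an assumption; see the configuration docs.
in_memory_fallback:
  enabled: true
```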
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4abafec7-aa91-4306-8602-dd3ad3d211c4
📒 Files selected for processing (3)
- docs/concepts/cache-warmer.mdx
- docs/router/configuration.mdx
- docs/router/metrics-and-monitoring.mdx
Resolve conflict in metrics-and-monitoring.mdx: keep plan fallback cache description update and new Cost Control metrics section from upstream. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
docs/concepts/cache-warmer.mdx (1)
176-178: Add a direct link to the metrics reference from this observability note.

This section introduces new observability semantics, but it doesn't point readers to the metrics reference page. A direct link here would make the new behavior easier to discover.
🔗 Suggested addition

```diff
 Plan fallback cache hits are counted as regular plan cache hits — the `wg.engine.plan_cache_hit` attribute is set to `true` for hits from either the main cache or the fallback cache. There is no separate observability signal for fallback cache hits.
+
+For the full GraphQL metrics reference, see [Metrics and Monitoring](/router/metrics-and-monitoring).
```

As per coding guidelines, "Link between related documentation pages".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/concepts/cache-warmer.mdx` around lines 176 - 178, In the "Observability" section update the paragraph that mentions wg.engine.plan_cache_hit to include a direct link to the metrics reference page; edit the text around "There is no separate observability signal for fallback cache hits." to append a short inline link like "(see metrics reference)" pointing to the metrics reference doc so readers can jump directly from the Observability note to the full metrics documentation for wg.engine.plan_cache_hit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/concepts/cache-warmer.mdx`:
- Around line 122-123: The bullets overstate guarantees of preserving queries
above plan_fallback_threshold; clarify that retention is only
eligibility-limited by the bounded plan_fallback_cache_size so some eligible
entries may be evicted once the cache fills. Update the two affected bullets
(the instances referencing plan_fallback_threshold and the claim about
eliminating reload spikes) to mention plan_fallback_cache_size explicitly and
use language like "eligible for retention" or "retained while cache capacity
allows" rather than absolute preservation or elimination of spikes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8c75bde8-f7b0-4534-be30-57288ceeb543
📒 Files selected for processing (3)
- docs/concepts/cache-warmer.mdx
- docs/router/configuration.mdx
- docs/router/metrics-and-monitoring.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/router/metrics-and-monitoring.mdx
- docs/router/configuration.mdx
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/concepts/cache-warmer.mdx (1)
115-116: ⚠️ Potential issue | 🟡 Minor

Qualify the "all queries" / "no spikes" guarantees.

These bullets still read as if lowering `slow_plan_cache_threshold` guarantees full preservation. In practice it only makes more queries eligible; actual retention is still capped by `slow_plan_cache_size`, so `1ns` does not mean "all queries are cached" unless the slow cache can hold the whole working set. The "Eliminates reload spikes" wording should also be softened to the retained subset.

✏️ Suggested wording
```diff
-- **Coverage of expensive queries**: By default, queries with planning times above the threshold (5s) are preserved and warmed on reload, protecting slow-to-plan queries from cold-start latency. Users can lower the threshold to any positive duration (e.g., `slow_plan_cache_threshold: 100ms`) to capture all queries. Users can also set the duration to 1 nanosecond (`slow_plan_cache_threshold: 1ns`), this would ensure that all queries are cached in the fallback, and thus would be available to rewarm the cache upon reloads.
-- **Eliminates reload spikes for expensive queries**: You won't experience query planning spikes for queries above the threshold after configuration or schema reloads. Users can tune the threshold to cover more or fewer queries.
+- **Coverage of expensive queries**: By default, queries with planning times above the threshold (5s) become eligible for the slow plan cache and can be rewarmed on reload while `slow_plan_cache_size` has capacity. Lowering the threshold widens eligibility and usually requires a larger `slow_plan_cache_size`.
+- **Reduces reload spikes for retained expensive queries**: Queries retained in the slow plan cache avoid a re-plan after configuration or schema reloads. Users can tune the threshold and cache size to cover more or fewer queries.
-- **Lower threshold → more queries protected**: Setting `slow_plan_cache_threshold: 1ns` captures all queries regardless of planning time. This gives you full "carry forward everything" behaviour similar to preserving the entire plan cache.
+- **Lower threshold → more queries eligible**: Setting `slow_plan_cache_threshold: 1ns` makes every query eligible for the slow plan cache, subject to `slow_plan_cache_size`. You only get "carry forward everything" behaviour when the slow plan cache is large enough to hold the full working set.
```

Also applies to: 202-203
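The two knobs this suggestion keeps pairing could be set together roughly as follows. The nesting under `engine` is an assumption made for illustration; only the `slow_plan_cache_threshold` and `slow_plan_cache_size` key names appear in the review text, so check the configuration reference for the real paths and defaults.

```yaml
engine:
  # Hypothetical placement of the two slow plan cache knobs discussed above.
  slow_plan_cache_threshold: 5s  # plans slower than this become eligible
  slow_plan_cache_size: 1024     # hard cap on how many eligible plans are retained
```

The pairing matters: lowering the threshold without raising the size only widens eligibility, not guaranteed retention.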
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/concepts/cache-warmer.mdx` around lines 115 - 116, Update the two bullets that claim lowering slow_plan_cache_threshold (and the "Eliminates reload spikes" statement) to qualify that eligibility does not equal full retention: explicitly state that setting slow_plan_cache_threshold to a low value (e.g., 1ns) only makes more queries eligible for caching but actual preservation is limited by slow_plan_cache_size and available working set, and change "Eliminates reload spikes" to language like "reduces reload spikes for the subset retained in the slow plan cache" (referencing slow_plan_cache_threshold and slow_plan_cache_size and apply the same softened wording to the other occurrence in the file).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/concepts/cache-warmer.mdx`:
- Around line 91-94: The second bullet currently describes a cache hit after
main-cache eviction rather than a population path; update the phrasing for the
slow plan cache (referencing slow_plan_cache_threshold and TinyLFU) so it
describes retention/update on eviction (e.g., "On eviction: when TinyLFU evicts
a plan that exists in the slow plan cache, the slow cache retains/updates the
plan to avoid recomputation") or move the cache-hit behavior into the following
paragraph to keep lifecycle description consistent with Lines 166-167.
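The retention-on-eviction lifecycle this comment asks for can be sketched as a toy model. This is illustrative only, not the router's implementation: the class names and the `on_evict` hook are invented for this sketch, and a plain FIFO eviction stands in for the real TinyLFU policy.

```python
from collections import OrderedDict


class BoundedCache:
    """Tiny FIFO-evicting cache standing in for the TinyLFU main plan cache."""

    def __init__(self, capacity, on_evict=None):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.on_evict = on_evict  # hook invoked with each evicted entry

    def put(self, key, value):
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            evicted_key, evicted_value = self.entries.popitem(last=False)
            if self.on_evict:
                self.on_evict(evicted_key, evicted_value)

    def get(self, key):
        return self.entries.get(key)


class SlowPlanCache:
    """Retains slow-to-plan entries, bounded by its own capacity."""

    def __init__(self, threshold_ms, capacity):
        self.threshold_ms = threshold_ms
        self.cache = BoundedCache(capacity)

    def maybe_retain(self, key, plan):
        # Only plans that were slow to build are eligible; actual retention
        # is still limited by the slow cache's own capacity.
        if plan["planning_ms"] >= self.threshold_ms:
            self.cache.put(key, plan)


slow = SlowPlanCache(threshold_ms=100, capacity=2)
# When the main cache evicts a plan, offer it to the slow plan cache.
main = BoundedCache(capacity=1, on_evict=slow.maybe_retain)

main.put("fast", {"planning_ms": 5})
main.put("slow", {"planning_ms": 250})   # evicts "fast"; too fast to retain
main.put("other", {"planning_ms": 900})  # evicts "slow"; retained in slow cache

print(slow.cache.get("fast"))  # None: below threshold, never retained
print(slow.cache.get("slow"))  # {'planning_ms': 250}: eligible and retained
```

The point of the sketch is the lifecycle wording the comment requests: the slow cache is populated on eviction from the main cache, not on a cache hit.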
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2e441373-0160-4791-96a2-b0ef610a496f
📒 Files selected for processing (2)
- docs/concepts/cache-warmer.mdx
- docs/router/configuration.mdx
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/router/configuration.mdx
Description
This PR adds documentation for the feature in wundergraph/cosmo#2611.
Summary by CodeRabbit
Checklist
`mintlify dev`