# HTML Cache — Improvement & Growth Plan

> Package: capell-app/html-cache · Kind: package · Tier: free · Product group: Capell Foundation · Bundle: foundation · Status: Draft

## 1. Snapshot

Full-page HTML caching for Capell public pages. Caching is **filesystem-based**: a `frontend.cache` middleware (`src/Http/Middleware/HtmlCacheMiddleware.php`) writes/serves static `.html` (and `.404.html`) files on the `page_cache` disk, keyed by `{scheme}.{host}/{url-segments}/{filename}` (`PageCache::getFileFromRequest`). Two persistence tables back it: `cached_model_urls` (model→URL dependency index) and `stale_cached_urls` (scheduled-invalidation queue). Invalidation has two modes — `instant` (delete files on model events) and `scheduled` (mark rows stale, then `capell:html-cache:process-stale` re-renders through the kernel and atomically replaces files). Variance is by host+path only; query-string requests are never cached (`PageCache::shouldCachePage` bails on `query->count() > 0`), authenticated/session/`Authorization` requests bypass, access-gate areas bypass, and any extension declaring `cacheable:false`/`sensitiveOutput` blocks caching site-wide via `ExtensionCacheSafetyResolver`. Surfaces: `admin`, `frontend`. Requires `admin`, `core`, `frontend`; supports `site-discovery`. Dependents/integrators: access-gate, frontend-optimizer, frontend-authoring, layout-builder, publishing-studio all touch its invalidation actions. Shipped marketplace summary: **"Serve Capell pages as static HTML for sub-millisecond responses — with automatic, dependency-aware invalidation that keeps cached pages fresh and never leaks gated or authoring content to anonymous visitors."** Manifest now advertises the extension card plus all **7 required** committed captures from `docs/screenshots.json`.

## 2. Improvements (existing functionality)

1. **Shipped 2026-06-05: Replace blanket translation create/delete flushes with targeted invalidation** — route-structure creates/deletes (`Page`, `PageUrl`) still clear broadly, but `Translation` create/update/delete now falls through to dependency-indexed invalidation so adding a new translation no longer deletes unrelated cached pages. — `src/Providers/HtmlCacheServiceProvider.php` (`registerModelInvalidationHooks`) — M

2. **Make `cache_vary_headers` actually vary by the things the manifest claims** — `config/capell-html-cache.php` sets `cache_vary_headers => ['Accept-Encoding']` only, but the manifest declares `variesBy: ["site","locale"]`. Site/locale variance is currently _implicit_ (each locale is a distinct `SiteDomain` host/path). That holds only while every locale maps to a distinct host or path prefix; a locale served via cookie/header on a shared URL would collide. Document the host/path assumption explicitly and add `Vary` coverage (or an explicit eligibility bail) for any locale negotiation that is not host/path-based. — `config/capell-html-cache.php`, `src/Http/Middleware/HtmlCacheMiddleware.php` (`applyCacheHeaders`) — M

3. **Shipped 2026-06-06: served cache headers are configurable and documented.** The on-disk store remains TTL-less by design, while public HTTP/CDN headers are controlled by `capell-html-cache.http_cache.shared_max_age`, `browser_max_age`, and `stale_while_revalidate`. `shared_max_age` still falls back to `cache_ttl / 6` for compatibility when unset, and the config file documents that these values only affect response headers. — `config/capell-html-cache.php`, `src/Http/Middleware/HtmlCacheMiddleware.php` (`applyCacheHeaders`, `sharedMaxAge`, `browserMaxAge`, `staleWhileRevalidateSeconds`)

4. **Resolved: implemented real `HtmlCacheHealthCheck` diagnostics** — `HtmlCacheHealthCheck` now reports writable `page_cache` storage, `frontend.cache` middleware wiring, required table presence, and scheduled stale-processing command registration. The disk probe cleans up temporary files in a `finally` block, and manifest/health coverage locks the Diagnostics contract. — `src/Health/HtmlCacheHealthCheck.php`, `tests/Feature/HtmlCacheHealthCheckTest.php`, `tests/Unit/HtmlCacheManifestCopyTest.php` — Closed.

5. **De-duplicate the two cookie-stripping code paths** — `PreventSessionCookieOnCacheableRequests::stripSessionCookies()` and `HtmlCacheMiddleware::stripConfiguredCookies()` hold identical `[session.cookie, XSRF-TOKEN, PHPDEBUGBAR_STACK_DATA]` lists. Drift between them is a latent safety bug (a cookie stripped in one path but not the other). Extract a shared helper / config list. — `src/Http/Middleware/PreventSessionCookieOnCacheableRequests.php`, `src/Http/Middleware/HtmlCacheMiddleware.php` — S

6. **Done/Shipped: Reduce per-model-class closure registration overhead** — generic model invalidation now uses three wildcard Eloquent listeners (`created`, `updated`, `deleted`) routed through `HtmlCacheModelInvalidationObserver` instead of registering up to three closures for every model returned by `CapellCore::getModels()`. Route-structure models (`Page`, `PageUrl`, `SiteDomain`) keep their explicit invalidation paths, while ordinary Capell models use the single generic observer path. — `src/Providers/HtmlCacheServiceProvider.php`, `src/Observers/HtmlCacheModelInvalidationObserver.php` — M

7. **Document/guard the `static_generation.internal_requests` re-entrancy** — `StaticSiteCommand --internal` and scheduled refresh render _through the live kernel_; combined with the `retrieved` hook this re-enters `RetrievedModelStore` and re-dispatches `RegisterCachedModelUrlsJob` during generation. Confirm (and test) that generation-time renders don't recursively queue dependency jobs or strip cookies on the synthetic request. — `src/Console/Commands/StaticSiteCommand.php`, `src/Support/StaticSite/StaticSiteGenerator.php`, `src/Actions/RefreshCachedUrlAtomicallyAction.php` — M

## 3. Missing Features (gaps)

The single declared capability is `cache-blocking` — which describes the _extension veto_ mechanism, not the cache itself. For a foundation full-page-cache package the capability set and feature surface are thin against table-stakes:

- **Done/Shipped: tag-based purge API / CDN integration.** `Surrogate-Key` headers are now backed by a package-owned `CachePurger` contract. The default `NullCachePurger` keeps local invalidation only, while `HttpSurrogateKeyCachePurger` posts normalized surrogate keys to a configured edge/CDN endpoint with optional bearer token, method, header, and timeout settings. `ClearCachedUrlAction` derives `site-*`, `lang-*`, and `page-*` keys from cached URL rows and purges them after local file deletion. — `src/Contracts/CachePurger.php`, `src/Support/Cache/Purgers/*`, `src/Actions/ClearCachedUrlAction.php`, `config/capell-html-cache.php`
- **Per-locale / per-currency / per-segment variance hooks.** `variesBy` is host/path only. There is no extension point to add a vary dimension (e.g. currency cookie, A/B bucket) — meaning experiments/campaign-studio personalization must _disable_ caching entirely (via `cacheable:false`) rather than vary it. A registrable "cache variant key" contract would let those packages cache per-segment instead of opting out.
- **ESI / partial holes for dynamic fragments.** All-or-nothing: a single dynamic region forces the whole page uncacheable. No edge-side-include or placeholder-hole mechanism to cache the shell and hydrate dynamic bits. Differentiator for high-traffic sites with a personalized header/cart.
- **Done/Shipped: Stale-while-revalidate at the origin.** When a stale row exists but the old cache file is still present, `HtmlCacheMiddleware` serves the stale file immediately and dispatches `RefreshOriginStaleCachedUrlAction` after the response. The refresh action claims the stale row, refreshes atomically through the same `RefreshCachedUrlAtomicallyAction` path, respects retry budgets, and marks rows processed/failed/exhausted. — `src/Http/Middleware/HtmlCacheMiddleware.php`, `src/Actions/RefreshOriginStaleCachedUrlAction.php`
- **Cache warming as a first-class, queued, incremental job.** Warming exists only as the synchronous `capell:static-site` command. No queued warm-after-deploy, no warm-top-N-URLs-by-traffic, no warm-on-invalidate. The stale queue is regeneration, not proactive warming.
- **Shipped 2026-06-06: explicit bypass-rule configuration is available.** `capell-html-cache.bypass.paths`, `cookies`, and `headers` are consumed by `ConfiguredHtmlCacheBypassRules`, and the shared resolver is checked by the middleware, eligibility report, and `PageCache` read/write paths. Existing coverage exercises path, cookie, and locale/header bypass scenarios. — `config/capell-html-cache.php`, `src/Support/Cache/ConfiguredHtmlCacheBypassRules.php`, `src/Http/Middleware/HtmlCacheMiddleware.php`, `src/Support/Cache/PageCache.php`
- **Done/Shipped: Cache hit-rate / coverage telemetry.** Cache hits now increment `cached_model_urls.hit_count`, `bytes_served`, and `last_hit_at` for the request URL hash. Cached coverage dashboard rows show tracked hit/last-hit data instead of "not tracked", and the manifest advertises `cache-hit-telemetry`. — `database/migrations/2026_06_07_000001_add_telemetry_to_cached_model_urls_table.php`, `src/Actions/RecordHtmlCacheHitAction.php`, `src/Http/Middleware/HtmlCacheMiddleware.php`, `src/Actions/Dashboard/BuildHtmlCacheUrlRowsAction.php`

## 4. Issues / Risks

**Public-output safety (the crux) — assessed strong, with caveats.** Gated/personalized content is excluded on multiple independent layers and these are well tested (141 cases across 15 files):

- Authenticated/session bypass: `cache_skip_authenticated` + `hasIncomingSessionCookie` + `Authorization` header + `?signature` short-circuit both reads and writes. Tested ("bypasses cached html for requests with a session cookie", "...authenticated requests without a session cookie", "can serve cached html ... when configured").
- Access-gate bypass: `shouldBypassForAccessGate` checks request attribute, browser-token cookie, and an active `access_gate_areas` row. Tested ("bypasses cached html for access gated protected requests", "...browser token requests even when authenticated cache reads are enabled").
- Authoring-surface leak prevention: dual inspection via `PublicHtmlSafetyInspector::containsAuthoringSurface` in both `HtmlCacheMiddleware` and `PageCache::cache`, with an `xxh128`-hash optimization to skip re-inspection. Writes are skipped if markers present; responses are marked `BYPASS` + `private, no-store`. Tested ("blocks unsafe public html during stale refresh and keeps the old cache file").
- Extension veto: `ExtensionCacheSafetyResolver::isPublicCacheSafe()` blocks the whole response if any recorded extension contribution is non-cacheable/sensitive.

Caveats / risks to verify or close:

1. **`access_gate_areas` existence query runs on the hot path.** `hasActiveAccessGateArea()` issues `DB::table('access_gate_areas')->where('key',...)->where('status','active')->exists()` on cache-read decisions when no request attribute/cookie is present. This is a per-request DB query against the `frontendRenderBudgetMs: 20` budget and couples html-cache to access-gate's schema by string. Cache this lookup (request-scoped or short TTL) and guard the `Schema::hasTable` branch. — `src/Http/Middleware/HtmlCacheMiddleware.php`

2. **Manifest ↔ behaviour mismatch on `variesBy`.** Manifest says `variesBy: ["site","locale"]`; the on-disk key and `Vary` header encode neither directly (host/path only; `Vary: Accept-Encoding`). Safe under the "distinct host/path per locale" assumption, but undocumented and unenforced — a future shared-URL locale scheme would silently poison the cache. Add a test that asserts two locales on the same host+path are not served the same file (or are explicitly bypassed). (§2.2)

3. **`cacheSafety.cacheable: false` in the manifest, `capabilities: ["cache-blocking"]`.** For the package that _provides_ the cache this reads as self-contradictory and will confuse marketplace/diagnostics consumers. Clarify: the package's own _output_ is infrastructure (not user content), but the capability naming ("cache-blocking") describes the veto resolver, not caching. Consider a `full-page-cache` / `cache-provider` capability. (§5)

4. **Shipped 2026-06-05: Translation create/delete no longer triggers full-cache flushes.** `Translation` create/update/delete now uses dependency-indexed invalidation; broad flushes remain only for route/structure changes (`Page`, `PageUrl`, `SiteDomain`) where existing URL maps can genuinely change. Origin SWR/locking remains future depth work (§3).

5. **Path-traversal defence is string-replace, not canonicalization.** `HtmlCacheStore` strips `../`/`..\\` and `PageCache::safeRequestSegments` rejects `..` segments — reasonable, and there is an `__invalid__` fallback, but the defence is ad-hoc. A test fixture of hostile paths (encoded dots, null bytes, overlong) would harden the guarantee. Tested partially ("caches ... invalid request paths").

6. **Redis-cluster safety: not applicable but worth a note.** The cache layer is filesystem-based and the queues are DB tables, so none of the banned `Redis::scan/keys/flushdb` commands appear here. No action needed; call this out so reviewers don't assume a Redis store.

7. **Performance budgets are declared but unverified.** `frontendRenderBudgetMs: 20`, `adminQueryBudget: 40`. No test asserts the middleware decision path stays within budget, and items §4.1/§2.6 add per-request queries/closures that erode it. Add a budget assertion or benchmark.

8. **`composer.json` has no `test`/`analyse`/`lint` scripts.** Verification relies on monorepo-root tooling; the package can't be checked in isolation. Add package-local script aliases for portability.

## 5. Marketplace & Positioning

Foundation/bundled and free — correct: this is infrastructure every public Capell site needs, and it anchors the platform's performance story. The package copy now leads with the speed and public-output-safety outcomes.

- **Shipped `marketplace.summary`:** _"Serve Capell pages as static HTML for sub-millisecond responses — with automatic, dependency-aware invalidation that keeps cached pages fresh and never leaks gated or authoring content to anonymous visitors."_
- **Shipped composer/manifest `description`:** _"Full-page static HTML cache for Capell with dependency-indexed invalidation, scheduled stale-regeneration, and public-output safety guarantees."_
- **Tiering:** Keep the core free/foundation — it must ship by default for the platform to feel fast. The §3 differentiators (CDN/surrogate-key purge driver, per-segment variance, ESI holes, traffic-ranked warming, hit-rate telemetry) are credible **premium add-on** material ("HTML Cache Pro" / fold into frontend-optimizer's commercial tier). The safety mechanism and local cache stay free; edge integration and personalization-aware caching are the upsell.
- **Screenshot/media status:** `capell.json` lists the extension card plus all **7 required** committed captures from `docs/screenshots.json` (maintenance-cache-page, cached-model-urls, dashboard-widgets, site-health-cache-map, page-table-extension, public-cache-hit, maintenance-page). The "anonymous public cache hit with no authoring markers/cookies" shot is the single most persuasive safety+performance asset and should lead the gallery.
- **Performance anchor:** none of the marketing surfaces show a number. Pair the launch with a hit-rate/coverage widget (§3) so the marketplace can cite a real before/after TTFB or hit-ratio.
- **Keywords/tags:** `full-page-cache`, `html-cache`, `static-site-generation`, `cache-invalidation`, `surrogate-key`, `stale-while-revalidate`, `cdn-purge`, `performance`, `ttfb`, `public-output-safety`, `multi-site-cache`, `edge-cache`.

## 6. Prioritized Roadmap

| Item                                                                                      | Bucket | Effort | Impact | Section ref                                                                                                                                                                                                    |
| ----------------------------------------------------------------------------------------- | ------ | ------ | ------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Shipped 2026-06-05: Targeted invalidation on Translation create/update/delete             | Done   | M      | High   | §2.1, §4.4                                                                                                                                                                                                     |
| Cache the `access_gate_areas` hot-path existence query                                    | Done   | S      | High   | §4.1 — Done 2026-06-04: memoized lookup plus refresh/disabled-cache tests.                                                                                                                                     |
| Implement real `HtmlCacheHealthCheck` (writable disk, middleware wired, tables, schedule) | Done   | M      | High   | §2.4 — closed by real diagnostics and focused health/manifest coverage                                                                                                                                         |
| Reconcile manifest screenshots (1) with screenshots.json (7); capture all 7               | Done   | M      | High   | §5 — closed 2026-06-06: seven Capell runner PNG captures are committed and promoted into marketplace media.                                                                                                    |
| Rewrite marketplace summary + composer description (outcome-led)                          | Done   | S      | Med    | §5 — Shipped 2026-06-05: `capell.json` marketplace summary, manifest/composer descriptions, README, and docs index now use outcome-led speed + public-output-safety copy.                                      |
| Cookie-strip list de-duped 2026-06-04; both strip paths tested                            | Done   | S      | Med    | §2.5                                                                                                                                                                                                           |
| Shipped 2026-06-04: add test/guard so same-host/path locale variants are not cross-served | Done   | S      | High   | §4.2                                                                                                                                                                                                           |
| Done/Shipped: `CachePurger` contract + null driver + one CDN surrogate-key purge driver   | Done   | L      | High   | §3 — local invalidation now has a pluggable edge purge path via `CachePurger`, default null driver, and configurable HTTP surrogate-key driver.                                                                |
| Shipped 2026-06-06: Config-driven bypass allow/deny path, cookie, and header matcher      | Done   | M      | Med    | §3                                                                                                                                                                                                             |
| Done/Shipped: Origin stale-while-revalidate (lock + serve-stale-then-regenerate)          | Done   | M      | High   | §3, §4.4 — stale cache files are served immediately while `RefreshOriginStaleCachedUrlAction` claims and refreshes the stale row after response using the existing atomic refresh path.                        |
| Shipped 2026-06-06: Lift `s-maxage`/`max-age`/SWR magic numbers into config               | Done   | S      | Low    | §2.3                                                                                                                                                                                                           |
| Done/Shipped: Hit-rate / coverage telemetry widget + per-URL counters                     | Done   | M      | Med    | §3, §5 — cache reads now record per-URL hit counts, bytes served, and last-hit timestamps on `cached_model_urls`; coverage rows show tracked hit data and manifest capabilities include `cache-hit-telemetry`. |
| Done/Shipped: Single global model observer instead of N per-class closures                | Done   | M      | Med    | §2.6 — generic invalidation now uses three wildcard Eloquent listeners and `HtmlCacheModelInvalidationObserver`; route-structure models keep explicit handlers for broad/page-url clears.                      |
| Registrable per-segment cache-variant key contract (currency/AB)                          | Later  | L      | Med    | §3                                                                                                                                                                                                             |
| ESI / partial-hole hydration for dynamic fragments                                        | Later  | L      | Med    | §3                                                                                                                                                                                                             |
| Queued, traffic-ranked incremental cache warming (warm-on-deploy / warm-on-invalidate)    | Later  | L      | Med    | §3                                                                                                                                                                                                             |
| Hostile-path test fixtures + budget assertion for middleware decision path                | Later  | S      | Med    | §4.5, §4.7                                                                                                                                                                                                     |