Migrate auto-purge to Action Scheduler with batched deletion #1882

Open
PatelUtkarsh wants to merge 26 commits into develop from feature/XWPENG-28-auto-purge-action-scheduler

PatelUtkarsh (Member) commented May 19, 2026

Summary

The old auto-purge runs on WP-Cron and issues one unbatched DELETE … LEFT JOIN against every eligible row. On low-traffic sites and sites with DISABLE_WP_CRON, the cron never fires. On bloated tables, the DELETE hits max_execution_time or an InnoDB lock-wait timeout and aborts, leaving orphan stream_meta rows behind.

This PR moves the work to Action Scheduler, which the manual reset path already uses:

  • purge_schedule_setup() registers a recurring AS action (stream_auto_purge_action, 12h interval) and clears the legacy wp_stream_auto_purge WP-Cron event on upgrade.
  • The recurring callback fires the wp_stream_auto_purge action for back-compat, snapshots a UTC cutoff from the TTL, applies an overlap guard, and branches on wp_stream_is_large_records_table:
    • Small table (default: <= 1M rows): one inline multi-table DELETE, then enqueues the reaper as a one-shot action so the heal step stays visible in Tools > Scheduled Actions.
    • Large table: enqueues the batched chain.
  • The batch worker (stream_auto_purge_batch_action) deletes one window of up to wp_stream_batch_size rows (default 250,000, shared with the manual reset), then chains the next batch. last_entry threads through the chain to guarantee forward progress, mirroring Admin::erase_large_records().
  • The reaper (stream_auto_purge_reaper_action) runs delete_orphaned_meta() once at the end of every chain, so installs with historical orphans heal over time.
  • Settings > Advanced gains a Clean Orphaned Meta link that schedules the reaper on demand. Admin::is_running_auto_purge() hides the link while a chain is active and swaps the description to explain why.
Cleanup flow

flowchart TD
    %% Triggers
    T1["⏰ Recurring AS tick<br/>(every ~12h)"]
    T2["⚙️ Settings save:<br/>records_ttl shortened"]
    T3["🔗 Settings → Advanced:<br/>'Clean Orphaned Meta' link"]
    T4["🗑️ Settings → Reset:<br/>'Reset Stream Records' button"]

    %% Routing
    T2 -->|"do_action('wp_stream_auto_purge') (BC)<br/>+ enqueue AUTO_PURGE_ACTION async"| RP
    T1 --> RP["AUTO_PURGE_ACTION<br/>Admin::purge_scheduled_action()"]

    RP --> B{{"Bail-out checks:<br/>• network-admin scope<br/>• keep_records_indefinitely<br/>• records_ttl < 1<br/>• is_running_auto_purge()"}}
    B -->|"any check fails"| X1[/"return (no work, no BC action)"/]
    B -->|"all checks pass"| BC["do_action('wp_stream_auto_purge')<br/>(BC hook fires here)"]

    BC --> CNT["SELECT COUNT(ID) FROM stream<br/>(scoped by blog_id if per-site)"]
    CNT --> D{"is_large_records_table()<br/>default: count > 1,000,000<br/>filter: wp_stream_is_large_records_table"}

    D -->|"NO (small)"| FAST["Inline DELETE stream + meta<br/>WHERE created < cutoff"]
    FAST --> R

    D -->|"YES (large)"| BATCH["AUTO_PURGE_BATCH_ACTION<br/>Admin::auto_purge_batch(cutoff, blog_id, last_entry)"]

    BATCH --> SEL["SELECT max ID < cutoff<br/>(and < last_entry if set)"]
    SEL --> E{"rows remaining?"}
    E -->|"YES"| DEL["DELETE window<br/>(parent + meta, by ID range + cutoff)<br/>default batch: 250,000<br/>filter: wp_stream_batch_size"]
    DEL -->|"enqueue next batch<br/>last_entry = window_low"| BATCH
    E -->|"NO (terminal)"| R

    %% Reaper
    T3 -->|"admin-ajax + nonce<br/>idempotent enqueue"| R["AUTO_PURGE_REAPER_ACTION<br/>Admin::auto_purge_reaper()"]
    R --> RM["delete_orphaned_meta()<br/>DELETE meta WHERE parent stream row missing"]

    %% Manual reset
    T4 --> RST{"is_multisite_not_network_activated?"}
    RST -->|"NO (network / single-site)"| TRUNC["TRUNCATE stream + streammeta<br/>then delete_orphaned_meta() inline"]
    RST -->|"YES (per-site)"| RSTSZ{"is_large_records_table(blog count)?"}
    RSTSZ -->|"NO"| RSTSMALL["Inline DELETE WHERE blog_id = X"]
    RSTSZ -->|"YES"| RSTBIG["ASYNC_DELETION_ACTION<br/>Admin::erase_large_records() self-chains<br/>⚠ does NOT enqueue reaper at end (asymmetry)"]

    %% Styling
    classDef trigger fill:#e3f2fd,stroke:#1976d2,color:#0d47a1
    classDef action  fill:#fff3e0,stroke:#f57c00,color:#e65100
    classDef decision fill:#f3e5f5,stroke:#7b1fa2,color:#4a148c
    classDef bailout fill:#fafafa,stroke:#9e9e9e,color:#424242,stroke-dasharray: 4 4
    classDef warn    fill:#ffebee,stroke:#c62828,color:#b71c1c

    class T1,T2,T3,T4 trigger
    class RP,BATCH,R,FAST,DEL,RM,TRUNC,RSTSMALL,RSTBIG,BC action
    class B,D,E,RST,RSTSZ decision
    class X1 bailout
    class RM,RSTBIG warn

Checklist

  • Project documentation has been updated to reflect the changes in this pull request, if applicable.
  • I have tested the changes in the local development environment (see contributing.md).
  • I have added phpunit tests.

Release Changelog

  • Fix: Auto-purge runs through Action Scheduler with batched deletion. Sites where the old WP-Cron purge silently fails or times out on large stream / stream_meta tables now drain cleanly.
  • Fix: Orphan stream_meta rows are removed at the end of every auto-purge cycle, so installs that accumulated orphans from earlier interrupted purges heal over time.
  • New: Clean Orphaned Meta link under Settings > Advanced for one-shot cleanup. The link hides while a chain is running and returns once it drains.
  • New: Auto-purge honors wp_stream_is_large_records_table. Small tables get one inline DELETE; bloated tables go through the batched chain. This is the same knob ops already tune for the manual reset.

Testing Instructions

  1. Activate Stream (single-site or network-activated multisite).
  2. Settings > Advanced: set Keep Records for to 1 day. Confirm Keep Records Indefinitely is off.
  3. Seed aged records:
    npm run cli -- wp eval 'global \$wpdb; \$wpdb->query("UPDATE {\$wpdb->stream} SET created = DATE_SUB(NOW(), INTERVAL 2 DAY)");'
    
  4. Trigger the recurring action:
    npm run cli -- wp eval 'wp_stream_get_instance()->admin->purge_scheduled_action();'
    
  5. Open Tools > Scheduled Actions and filter by group stream-auto-purge. On a small table only the reaper is queued (the inline DELETE has already run). To force the batched chain on a small dataset: add_filter( 'wp_stream_is_large_records_table', '__return_true' );.
  6. Let AS drain on subsequent admin requests, or run npm run cli -- wp action-scheduler run --group=stream-auto-purge.
  7. Verify:
    • Aged rows gone: SELECT COUNT(*) FROM wp_stream WHERE created < DATE_SUB(NOW(), INTERVAL 1 DAY); returns 0.
    • No orphan meta: SELECT COUNT(*) FROM wp_stream_meta m LEFT JOIN wp_stream s ON s.ID = m.record_id WHERE s.ID IS NULL; returns 0.
  8. Multisite scoping:
    • Network-activated: the purge spans all blogs.
    • Per-site activated: the purge touches only the current blog.
  9. Manual cleanup, idle state: Settings > Advanced, click Clean Orphaned Meta. The page redirects with ?wp_stream_message=orphan_meta_cleanup_scheduled and the reaper appears in AS.
  10. Manual cleanup, active state: while a chain is queued, reload Settings > Advanced. The link is hidden and the description reads "Auto-purge is currently running...". Once the chain drains, the link returns.

Fixes: #1056 #1236

Adds stub method bodies so the action registration resolves; the real
implementations land in subsequent commits.

Refs XWPENG-28

Replaces the legacy twicedaily WP-Cron event with a recurring AS action
scheduled at 12h intervals. Clears any pre-existing legacy event on
upgrade so the two cannot double-fire. Idempotent: re-running the
setup while a recurring action is pending is a no-op.

Refs XWPENG-28
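
The idempotent setup described in the commit above can be modelled outside WordPress. This is a minimal JavaScript sketch, not the plugin's PHP: the `scheduler` object with its `legacyCron` set and `actions` array is a hypothetical stand-in for WP-Cron and Action Scheduler state. Only the hook names and the 12h interval come from the PR description.

```javascript
// Hypothetical in-memory model; the real code is PHP calling Action Scheduler.
const AUTO_PURGE_ACTION = 'stream_auto_purge_action'; // hook name from the PR summary
const LEGACY_CRON_HOOK = 'wp_stream_auto_purge';      // legacy WP-Cron event name
const INTERVAL_SECONDS = 12 * 60 * 60;                // 12h interval

function setupPurgeSchedule(scheduler) {
  // Always clear the legacy event so the two schedulers cannot double-fire.
  scheduler.legacyCron.delete(LEGACY_CRON_HOOK);

  // Idempotent: do nothing if a recurring action is already pending.
  const alreadyPending = scheduler.actions.some(
    (a) => a.hook === AUTO_PURGE_ACTION && a.status === 'pending'
  );
  if (!alreadyPending) {
    scheduler.actions.push({
      hook: AUTO_PURGE_ACTION,
      status: 'pending',
      intervalSeconds: INTERVAL_SECONDS,
    });
  }
  return scheduler;
}
```

Calling `setupPurgeSchedule()` twice on the same state leaves exactly one pending recurring action, which is the no-op behaviour the commit describes.
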
The recurring action now snapshots the TTL cutoff in UTC, fires
wp_stream_auto_purge for back-compat, applies an overlap guard, and
enqueues the first batch into the auto-purge chain. Multisite scoping
(per-site activation) is encoded as blog_id; 0 means 'all blogs'.
Defaults are merged into the options array via wp_parse_args so a
partially-saved option still gets the missing keys.

Refs XWPENG-28
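
As a rough illustration of the cutoff snapshot in the commit above, the sketch below computes a UTC cutoff once and packs it into the arguments for the first batch. It is a JavaScript model under assumptions: the `'YYYY-MM-DD HH:MM:SS'` string format and the argument object shape are hypothetical. Only the UTC snapshot and the `blog_id == 0` convention come from the commit message.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns a 'YYYY-MM-DD HH:MM:SS' string in UTC (the format is an assumption).
function snapshotCutoff(ttlDays, nowMs = Date.now()) {
  const cutoff = new Date(nowMs - ttlDays * DAY_MS);
  // toISOString() is always UTC, e.g. '2026-04-19T12:00:00.000Z'.
  return cutoff.toISOString().slice(0, 19).replace('T', ' ');
}

// Hypothetical argument shape for the first batch; blogId 0 means "all blogs".
function firstBatchArgs(ttlDays, blogId, nowMs = Date.now()) {
  return { cutoff: snapshotCutoff(ttlDays, nowMs), blogId, lastEntry: null };
}
```

Because the cutoff is computed once and passed along, every batch in a chain deletes against the same boundary even if the chain takes hours to drain.
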
Window-based deletion (ID range ≤ wp_stream_batch_size) joined against
stream_meta in a single statement, mirroring erase_large_records().
Snapshotted UTC cutoff is threaded through each batch in the chain.
blog_id == 0 means 'all blogs' for network-activated installs; non-zero
scopes to that blog. Schedules the orphan reaper as the terminal step
when no more rows are eligible.

Refs XWPENG-28

Runs delete_orphaned_meta() once at the end of every chain so installs
that already had orphan meta from historical timed-out purges heal
over time without operator intervention. Lifts delete_orphaned_meta()
visibility from private to protected so the reaper can call it.

Refs XWPENG-28
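
The reaper step in the commit above amounts to an anti-join: keep only meta rows whose parent record still exists. The following JavaScript sketch models that on plain arrays. The `{ id }` and `{ metaId, recordId }` row shapes are hypothetical; the real step is a SQL DELETE inside `delete_orphaned_meta()`.

```javascript
// Model of the orphan reaper: drop meta rows whose parent stream row is missing.
function deleteOrphanedMeta(streamRows, metaRows) {
  const liveIds = new Set(streamRows.map((r) => r.id));
  const kept = metaRows.filter((m) => liveIds.has(m.recordId));
  return { kept, deleted: metaRows.length - kept.length };
}
```

Running it a second time on its own output deletes nothing, which is why scheduling the reaper at the end of every chain is safe.
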
… Advanced

Settings UI link is nonced for users with WP_STREAM_SETTINGS_CAPABILITY;
the ajax handler schedules a one-shot reaper via Action Scheduler.
Idempotent: re-clicking while a reaper is pending is a no-op. Bails out
early under WP_STREAM_TESTS so PHPUnit doesn't exit the worker.

Refs XWPENG-28

Asserts the Clean Orphaned Meta link renders on Settings → Advanced,
points at admin-ajax.php with the expected action + nonce, and that
following the link redirects to the settings page with a confirmation
marker in the URL. Also fixes the redirect target in the handler to
use network_settings_page_slug on network-activated installs.

Refs XWPENG-28

The previous auto_purge_batch SELECT used 'WHERE created < cutoff ORDER
BY ID DESC LIMIT 1' to find the next window's top, but on hosts that
are actively logging during a chain the ID space is sparse (eligible
rows interleaved with fresh rows whose created > cutoff). That caused
each subsequent batch to find a top only ~30 IDs below the previous,
stalling progress.

Match Admin::erase_large_records()'s pattern instead: pass last_entry
(the lower bound of the previous window) through the chain and use
'WHERE ID < last_entry' to guarantee the next batch starts strictly
below the previous window. Stride is now exactly wp_stream_batch_size
IDs per batch.

Verified end-to-end on a multisite install seeded to ~320k aged
records: chain drains in ~35 batches at batch_size=10000 and
terminates with zero orphans.

Refs XWPENG-28
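
To make the `last_entry` threading in the commit above concrete, here is a small JavaScript simulation of the chain on an in-memory table. It is a sketch under assumptions, not the plugin's SQL: the window is taken as the inclusive ID range `[top - batchSize + 1, top]`, `created` and `cutoff` are plain numbers, and the loop stands in for one Action Scheduler action enqueueing the next.

```javascript
// rows: [{ id, created }]; eligible rows have created < cutoff.
function drainChain(rows, cutoff, batchSize) {
  let remaining = rows.slice();
  let lastEntry = null; // null on the first batch
  let batches = 0;

  for (;;) {
    // Highest eligible ID strictly below the previous window.
    let top = null;
    for (const r of remaining) {
      const eligible = r.created < cutoff && (lastEntry === null || r.id < lastEntry);
      if (eligible && (top === null || r.id > top)) top = r.id;
    }
    if (top === null) break; // terminal: the reaper would be scheduled here

    // Delete eligible rows inside the window; fresh rows are left alone.
    const windowLow = Math.max(top - batchSize + 1, 0);
    remaining = remaining.filter(
      (r) => !(r.id >= windowLow && r.id <= top && r.created < cutoff)
    );
    lastEntry = windowLow; // the next batch must start strictly below this
    batches += 1;
  }
  return { remaining, batches };
}
```

With aged and fresh rows interleaved, each batch still advances by a full window of IDs, so the number of batches is bounded by the ID range divided by the batch size rather than by how sparse the eligible rows are.
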
- Replace interpolated-SQL pattern in auto_purge_batch with explicit
  prepared statements per (blog_id, last_entry) combination.
- Correct the @return docblock type on wp_ajax_clean_orphan_meta to bool|void.
- Settings UI array alignment.
- Add @param to set_records_ttl test helper.

Refs XWPENG-28

@PatelUtkarsh PatelUtkarsh marked this pull request as draft May 19, 2026 06:52

Settings::get_defaults() runs every field through wp_stream_settings_option_fields,
which Network::get_network_admin_fields() uses to strip 'records_ttl' from the
per-site option's defaults set. Outside any admin context (Action Scheduler,
WP-CLI, system cron) the per-site option_key is in effect, so the filtered
defaults never contained general_records_ttl. Without this fallback the
auto-purge silently no-ops on every install where the option is missing,
defeating the whole point of fixing this on bloated sites.

Hardcoded fallback to 30 days (the documented default on the settings field
itself) when general_records_ttl is absent after the merge.

Caught by section 9 of the e2e plan.

Refs XWPENG-28

… fast path

The acceptance criteria require the auto-purge to honor the same
wp_stream_is_large_records_table filter the manual reset uses, so
ops only have one knob to tune table-size semantics.

The recurring callback now counts eligible rows once, passes the count
through Plugin::is_large_records_table(), and:

- Small table (filter returns false): runs a single inline multi-table
  DELETE for the eligible rows, then enqueues the orphan reaper as a
  terminal AS action so the heal step stays observable in
  Tools → Scheduled Actions.
- Large table (filter returns true, default for record_count > 1M):
  enqueues the batched chain as before.

Tests:
- New test_purge_scheduled_action_small_table_fast_path covering the
  inline-DELETE branch.
- New test_purge_scheduled_action_large_table_uses_batched_chain
  exercising the batched branch via the filter.
- Existing batched-path tests now opt into the chain explicitly via
  add_filter('wp_stream_is_large_records_table','__return_true').

Refs XWPENG-28
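
The branching described in the commit above can be sketched as a pure function. This is a JavaScript model with hypothetical names (`planAutoPurge`, the step labels, and the filter callback signature). Only the default threshold of more than 1,000,000 rows and the two branches come from the PR.

```javascript
const DEFAULT_LARGE_THRESHOLD = 1000000;

// `filter` models the wp_stream_is_large_records_table filter: it receives the
// default verdict and the count, and may override the verdict.
function isLargeRecordsTable(recordCount, filter = (isLarge) => isLarge) {
  return Boolean(filter(recordCount > DEFAULT_LARGE_THRESHOLD, recordCount));
}

// Returns the steps the recurring callback would take for a given table size.
function planAutoPurge(recordCount, filter) {
  return isLargeRecordsTable(recordCount, filter)
    ? ['enqueue_batch_chain']
    : ['inline_delete', 'enqueue_reaper'];
}
```

Passing a filter that always returns `true` forces the batched branch on a small dataset, which mirrors what the PHPUnit tests and the testing instructions do with `__return_true`.
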
Mirrors is_running_async_deletion() but checks the auto-purge group
(batch + reaper). The recurring scheduler is intentionally excluded
from the probe so it doesn't always report 'running' under normal
operation. Settings UI uses this in the next commit to render an
'Auto-purge currently running' notice on Settings → Advanced.

Refs XWPENG-28

When the auto-purge chain is active (batch worker or reaper pending),
the manual Settings → Advanced cleanup link is hidden and the field
description is swapped to explain that the reaper will run as part of
the active cycle. Mirrors how Reset Stream Database hides itself
during async deletion.

Refs XWPENG-28

Two new specs exercise the Settings → Advanced field behaviour driven
by Admin::is_running_auto_purge():

- Active state: seed a pending reaper action via wp-cli, assert the
  Clean Orphaned Meta link is removed from the DOM and the swapped
  description ('Auto-purge is currently running') is visible.
- Idle state: drain the seeded action and assert the link is restored.

Both specs use a small wp-cli helper (execSync into the wordpress
container) to seed/clear AS state, keeping the test free of any
browser-side timing on the AS worker.

Refs XWPENG-28

- wpEval helper now swallows non-zero exits from the wordpress
  container instead of throwing into the test runner. State seeding
  is best-effort; the test's own assertions are the source of truth.
- beforeAll waits for the post-activation navigation and explicitly
  confirms 'Network Deactivate Stream' is visible before any test
  runs, failing fast instead of letting every test silently hit the
  'Sorry, you are not allowed to access this page' redirect when a
  prior suite leaves Stream deactivated.

In-isolation: spec is stable across 3 consecutive runs. Pre-existing
cross-spec activation races (editor-new-post, admin-ui-smoke) remain
out of scope here.

Refs XWPENG-28

CI runs ESLint with no-unused-vars; the wpEval helper's catch block
declared 'err' but never referenced it. Use the optional binding form
'catch {}' which is supported on the runner's Node version (22+).

Refs XWPENG-28
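
For readers unfamiliar with the optional catch binding mentioned in the commit above, a minimal sketch follows. `bestEffort` is a hypothetical helper, not the spec's `wpEval`; it only shows the syntax that avoids declaring an unused error variable.

```javascript
// Runs a task and returns null instead of throwing when it fails.
function bestEffort(task) {
  try {
    return task();
  } catch {
    // Optional catch binding: there is no unused `err` for no-unused-vars to flag.
    return null;
  }
}
```

The trade-off is that the error is discarded entirely, so this form only suits call sites where failure is genuinely ignorable.
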
- test_auto_purge_action_constants_exist was a tautology: it asserted
  that four constants equal the string values they are declared with.
  The test catches nothing that static analysis or a typo in the
  consumer wouldn't catch.

- test_auto_purge_batch_respects_wp_stream_batch_size_filter only
  verified the filter was *called* (invocation counter), not that the
  returned value affected behaviour.
  test_auto_purge_batch_deletes_window_and_chains_next_batch already
  drives the chain with a custom batch_size and asserts on the
  resulting chain, which is the functional contract that matters.

No coverage lost.

Refs XWPENG-28

@PatelUtkarsh PatelUtkarsh marked this pull request as ready for review May 19, 2026 09:16
@PatelUtkarsh PatelUtkarsh requested a review from Copilot May 19, 2026 09:16

Copilot AI left a comment


Pull request overview

This PR migrates Stream’s TTL-based auto-purge from a legacy WP-Cron event to Action Scheduler, adding a batched deletion path for large tables and a terminal orphan-meta cleanup step, plus UI/test coverage for the new behavior.

Changes:

  • Replace WP-Cron–driven auto-purge with a recurring Action Scheduler action, including an overlap guard and a batched worker chain for large datasets.
  • Add an orphan-meta “reaper” action (runs at the end of each purge chain) and expose a manual “Clean Orphaned Meta” link in Settings → Advanced.
  • Expand PHPUnit coverage for scheduling/branching/batching and add a Playwright E2E spec for the new admin UX.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| classes/class-admin.php | Adds AS recurring auto-purge scheduling, batched purge worker, terminal reaper, and an AJAX endpoint to schedule the reaper. |
| classes/class-settings.php | Adds “Clean Orphaned Meta” link/description with running-state UX driven by Admin::is_running_auto_purge(). |
| tests/phpunit/test-class-admin.php | Adds PHPUnit coverage for AS scheduling, back-compat hook firing, small/large table branches, batching, reaper behavior, and running-state probe. |
| tests/e2e/admin-orphan-cleanup.spec.js | Adds Playwright coverage for the manual orphan cleanup link and its hidden “running” state UX. |
| changelog.md | Documents the migration to Action Scheduler, batched deletion, and the new orphan-meta cleanup capability. |

Comment thread classes/class-admin.php Outdated
Comment thread classes/class-admin.php Outdated
Comment thread classes/class-settings.php Outdated
1. auto_purge_batch() now throws InvalidArgumentException on empty
   cutoff instead of silently returning. A bare 'return' caused AS to
   mark the action complete; throwing makes AS log it as failed and
   surface it in Tools > Scheduled Actions. New PHPUnit test covers
   the throw path. In practice this branch is unreachable because
   purge_scheduled_action() always populates the cutoff, but the
   guard exists for third-party code that may enqueue with bad input.

2. wp_ajax_clean_orphan_meta() now uses $this->settings_cap to match
   the rest of the file (wp_ajax_reset, ajax_filters). The bare
   WP_STREAM_SETTINGS_CAPABILITY constant could break installs that
   override the capability via the property after construction.

3. The Settings 'Clean Orphaned Meta' field now calls
   Admin::is_running_auto_purge() once per render instead of twice.
   Extracted the field-building logic into a small helper so the
   state probe runs once and both the 'type' and 'desc' branches
   reuse the result.

Refs XWPENG-28
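
Point 1 of the commit above turns on how a job runner classifies a callback that returns versus one that throws. The JavaScript sketch below models that distinction. `runAction` is a hypothetical stand-in for the scheduler's runner, and `TypeError` stands in for PHP's InvalidArgumentException.

```javascript
// Worker guard: reject an empty cutoff loudly instead of returning quietly.
function autoPurgeBatch(cutoff) {
  if (!cutoff) {
    throw new TypeError('cutoff must be a non-empty string');
  }
  return 'deleted one window';
}

// Hypothetical runner: a normal return counts as success, a throw as failure.
function runAction(callback, args) {
  try {
    callback(...args);
    return 'complete';
  } catch {
    return 'failed';
  }
}
```

A bare `return` on bad input would be indistinguishable from a successful run in this model, which is the visibility problem the commit fixes.
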

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread classes/class-admin.php
Comment thread classes/class-admin.php Outdated
Comment thread classes/class-admin.php Outdated
1. Settings::updated_option_ttl_remove_records() now triggers an
   immediate purge directly. Previously it relied on the legacy
   wp_stream_auto_purge action being hooked to purge_scheduled_action,
   which this PR severed. Without this fix, shortening the TTL did not
   take effect until the next 12h recurring tick.

2. TTL fallback uses isset() instead of empty(), so an explicit '0'
   set via CLI/SQL is no longer silently overridden to 30. Added an
   explicit short-circuit: a non-positive TTL bails out of the cycle
   entirely. The UI enforces min=1; the only paths to 0 are operator
   error, and bailing out (records stop being purged) is a less
   destructive failure mode than honoring it (records get wiped
   repeatedly every 12h).

3. Overlap guard now reuses Admin::is_running_auto_purge(), so a
   pending reaper also blocks a new chain. Previously only a pending
   batch action blocked the guard, leaving a small window between
   chain completion and reaper completion where a new chain could
   stack.

Three new PHPUnit tests cover each behaviour.

Refs XWPENG-28
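
The TTL handling in point 2 above can be sketched as a small resolver. This is a JavaScript model, not the PHP: `resolveTtlDays` is a hypothetical name, `hasOwnProperty` approximates `isset()` (which in PHP also treats null as unset), and treating a non-numeric value as a bail-out is an assumption. The 30-day fallback and the non-positive short-circuit come from the commit messages.

```javascript
const DEFAULT_TTL_DAYS = 30;

// Returns the TTL in days, or null when the purge cycle should bail out.
function resolveTtlDays(options) {
  const hasKey = Object.prototype.hasOwnProperty.call(options, 'general_records_ttl');
  if (!hasKey) return DEFAULT_TTL_DAYS; // missing key: fall back to the default

  const ttl = Number.parseInt(options.general_records_ttl, 10);
  // An explicit 0 (or any non-positive value) is honoured as "do not purge".
  return Number.isNaN(ttl) || ttl < 1 ? null : ttl;
}
```

An `empty()`-style check would have treated the explicit `'0'` the same as a missing key and silently substituted 30.
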

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread classes/class-admin.php Outdated
Comment on lines +776 to +779
return (
    as_has_scheduled_action( self::AUTO_PURGE_BATCH_ACTION )
    || as_has_scheduled_action( self::AUTO_PURGE_REAPER_ACTION )
);
Comment thread tests/e2e/admin-orphan-cleanup.spec.js Outdated
Comment on lines +62 to +65
*/
function seedRunningAutoPurge() {
wpEval(
`as_enqueue_async_action("stream_auto_purge_reaper_action", array(), "stream-auto-purge");`,
- Move wp_stream_auto_purge BC action to after all bail-out checks so it
  fires only when a purge actually runs (was firing on every recurring
  tick regardless of whether work happened).
- Extend is_running_auto_purge() to also check IN-PROGRESS actions via
  as_get_scheduled_actions() so the overlap guard cannot let a second
  chain stack against rows the running batch worker is still touching.
- Settings TTL-shortened path now enqueues AUTO_PURGE_ACTION via
  as_enqueue_async_action() so the immediate purge serializes through
  Action Scheduler instead of bypassing the overlap guard with an
  inline call. Inline fallback retained for when AS is unavailable.
- Render an admin notice for wp_stream_message=orphan_meta_cleanup_scheduled
  on both admin_notices and network_admin_notices so the post-redirect
  UX is actually visible (was a half-built feature: redirect happened,
  no notice rendered).
- E2E wpEval(): switch from 'docker compose run --rm' to 'docker
  compose exec -T' to attach to the long-lived container (~3-5s saved
  per call) and surface failures via console.warn instead of silently
  swallowing errors.
- PHPUnit: add coverage for both the BC-action-suppressed-on-bailout
  path and the in-progress-action overlap guard. Update the
  TTL-shortened test to assert the AS enqueue path.
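
The extended running-state probe from the list above can be modelled as a predicate over scheduled actions. This JavaScript sketch uses a hypothetical `{ hook, status }` action shape and status labels. The two guarded hook names come from the PR summary, and the recurring scheduler hook is deliberately left out, as an earlier commit describes.

```javascript
const GUARDED_HOOKS = [
  'stream_auto_purge_batch_action',
  'stream_auto_purge_reaper_action',
];
const ACTIVE_STATUSES = ['pending', 'in-progress'];

// True while a batch worker or reaper is queued or currently executing.
function isRunningAutoPurge(actions) {
  return actions.some(
    (a) => GUARDED_HOOKS.includes(a.hook) && ACTIVE_STATUSES.includes(a.status)
  );
}
```

Checking in-progress as well as pending closes the window where a worker has been claimed by the runner but has not yet enqueued its successor.
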
The previous quick-win switch from 'docker compose run --rm --user
$(id -u)' to 'docker compose exec -T' dropped the user mapping. On
local dev that happened to work because the wordpress container's
default exec user is what the runtime image inherits; on CI it defaults
to root and wp-cli refuses to run with:

  YIKES! It looks like you're running this as root.

$(id -u) only worked on the host because UID 1000 mapped to www-data
inside the container — that's not portable to the GitHub Actions runner
(UID 1001). Pin the exec user explicitly to www-data, which owns the
WordPress files inside the container.

Failing run: actions/runs/26144103763
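
A sketch of what pinning the exec user might look like in the spec helper follows. It only builds the argument list and does not run Docker. `wpEvalCommand` and the `wordpress` service name are assumptions based on the commit text, while `-T` and `--user` are standard `docker compose exec` options.

```javascript
// Builds argv for running a wp-cli eval inside the long-lived container.
function wpEvalCommand(phpCode, { service = 'wordpress', user = 'www-data' } = {}) {
  return [
    'docker', 'compose', 'exec',
    '-T',           // no TTY, suitable for CI
    '--user', user, // explicit user instead of the host's $(id -u)
    service,
    'wp', 'eval', phpCode,
  ];
}
```

The array form can be handed to `execFileSync(argv[0], argv.slice(1))` from `node:child_process`, which avoids shell quoting of the PHP snippet.
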
…auto-purge-action-scheduler

# Conflicts:
#	classes/class-settings.php
Set baseURL in playwright.config.js and replace absolute
https://stream.wpenv.net references in every spec with relative
paths. Also drops the docker-exec state seeding from
admin-orphan-cleanup.spec.js so it matches the browser-only
convention of the rest of the suite; the running-state UX it
seeded is covered by PHPUnit.
Development

Successfully merging this pull request may close these issues.

Old records are not being removed
