-
Notifications
You must be signed in to change notification settings - Fork 14
Speed up the capture set list view #1301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
mihow
wants to merge
10
commits into
main
Choose a base branch
from
perf/sourceimagecollection-cached-counts
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
83450a9
perf(api): denormalize SourceImageCollection counts as cached columns
mihow 457e090
refactor(models): introduce CachedCountField marker, apply to all den…
mihow 2725599
chore: drop tautological count-default test, clarify reactive comments
mihow 1a242aa
refactor(counts): async signals + cached_counts integrity check
mihow 451c8be
docs: cached-counts update_cached_counts method design
mihow 9ac2ce1
refactor(counts): unify cached-count refresh as BaseModel/BaseQuerySe…
mihow 1120396
fix(serializers): mark SourceImageCollection cached counts read-only
mihow 5113ff2
refactor(migrations): split SourceImageCollection counts schema and b…
mihow f0cc055
refactor(counts): defer reconcile tooling to follow-up
mihow 2533857
docs(planning): cached-counts reconcile follow-up + design space
mihow File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| """Helpers for scheduling cached-count recomputes after a transaction commits. | ||
|
|
||
| Wraps the per-connection dedup + ``transaction.on_commit`` plumbing that | ||
| ``BaseModel.update_cached_counts`` and ``BaseQuerySet.update_cached_counts`` | ||
| build on. The actual recompute body lives in each model's | ||
| ``update_calculated_fields(save=True)`` implementation; this module only | ||
| handles scheduling. | ||
|
|
||
| Per-(model_label, pk) dedup means N writes affecting the same target row | ||
| collapse to one task, regardless of how many signal handlers fire in the | ||
| transaction. The dedup set lives on the active DB connection (thread-local | ||
| in Django's default setup) and is drained by a single ``on_commit`` hook. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| from typing import Any | ||
|
|
||
| from django.db import connection, transaction | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| _PENDING_ATTR = "_pending_cached_count_recomputes" | ||
|
|
||
|
|
||
| def schedule_recompute(model_label: str, pk: Any) -> None: | ||
| """Queue a ``(model_label, pk)`` for recompute at the next commit. | ||
|
|
||
| The pending set lives on the active DB connection; ``transaction.on_commit`` | ||
| fires the flush. ``_flush_pending_recomputes`` is idempotent — the first | ||
| call drains the set; subsequent ones no-op — so we register on_commit on | ||
| every call. That keeps us correct across transaction rollbacks (which | ||
| discard registered on_commits but leave attributes on ``connection`` | ||
| untouched, e.g. between a rolled-back ``TestCase`` and a fresh | ||
| ``TransactionTestCase``). | ||
|
|
||
| Outside an atomic block, ``on_commit`` fires synchronously at | ||
| registration time — so the ``add`` below must precede the | ||
| ``transaction.on_commit`` call or the flush sees an empty set. | ||
| """ | ||
| pending: set[tuple[str, Any]] | None = getattr(connection, _PENDING_ATTR, None) | ||
| if pending is None: | ||
| pending = set() | ||
| setattr(connection, _PENDING_ATTR, pending) | ||
| pending.add((model_label, pk)) | ||
| transaction.on_commit(_flush_pending_recomputes) | ||
|
|
||
|
|
||
| def _flush_pending_recomputes() -> None: | ||
| """Drain the per-connection dedup set; dispatch one task per ``(model, pk)``.""" | ||
| from ami.main.tasks import recompute_cached_counts_task | ||
|
|
||
| pending: set[tuple[str, Any]] = getattr(connection, _PENDING_ATTR, set()) | ||
| try: | ||
| delattr(connection, _PENDING_ATTR) | ||
| except AttributeError: | ||
| pass | ||
| for model_label, pk in pending: | ||
| recompute_cached_counts_task.delay(model_label, pk) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
39 changes: 39 additions & 0 deletions
39
ami/main/migrations/0085_denormalize_sourceimagecollection_counts.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| """ | ||
| Add three denormalized count columns to ``SourceImageCollection`` so the list | ||
| endpoint reads them in O(1) instead of running 3 correlated count subqueries | ||
| per row. | ||
|
|
||
| Schema only. The backfill runs in the separate, non-atomic, re-runnable | ||
| migration ``0086_backfill_sourceimagecollection_counts`` so an interrupted | ||
| backfill on production-sized data cannot leave the schema half-applied | ||
| (columns added but migration unrecorded -> retry fails on duplicate column). | ||
|
|
||
| ``AddField`` with a constant ``default`` is a metadata-only operation on | ||
| PostgreSQL 11+, so this is safe to run atomically even on large tables. | ||
| """ | ||
|
|
||
| from django.db import migrations, models | ||
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
| dependencies = [ | ||
| ("main", "0084_revoke_delete_job_from_roles"), | ||
| ] | ||
|
|
||
| operations = [ | ||
| migrations.AddField( | ||
| model_name="sourceimagecollection", | ||
| name="source_images_count", | ||
| field=models.IntegerField(default=0), | ||
| ), | ||
| migrations.AddField( | ||
| model_name="sourceimagecollection", | ||
| name="source_images_with_detections_count", | ||
| field=models.IntegerField(default=0), | ||
| ), | ||
| migrations.AddField( | ||
| model_name="sourceimagecollection", | ||
| name="source_images_processed_count", | ||
| field=models.IntegerField(default=0), | ||
| ), | ||
|
Comment on lines
+24
to
+38
|
||
| ] | ||
65 changes: 65 additions & 0 deletions
65
ami/main/migrations/0086_backfill_sourceimagecollection_counts.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| """ | ||
| Backfill the denormalized ``SourceImageCollection`` count columns added in | ||
| 0085. | ||
|
|
||
| Split from the schema migration on purpose: this is the slow step on | ||
| production-sized M2M tables. ``atomic = False`` lets the UPDATE run outside a | ||
| single transaction, and the UPDATE writes absolute computed values (not | ||
| deltas) so it is idempotent — safe to re-run if interrupted. Collections with | ||
| no images keep the column ``default=0`` from 0085 (the GROUP BY only emits | ||
| rows for collections that have images). | ||
|
|
||
| ``with_det`` checks for a valid (non-null / non-empty) detection bbox to match | ||
| the runtime ``NULL_DETECTIONS_FILTER`` semantics in ``ami/main/models.py``. | ||
| """ | ||
|
|
||
| from django.db import migrations | ||
|
|
||
|
|
||
| def backfill_counts(apps, schema_editor): | ||
| schema_editor.execute( | ||
| """ | ||
| UPDATE main_sourceimagecollection sc | ||
| SET source_images_count = c.total, | ||
| source_images_processed_count = c.processed, | ||
| source_images_with_detections_count = c.with_det | ||
| FROM ( | ||
| SELECT msci.sourceimagecollection_id AS coll_id, | ||
| COUNT(*) AS total, | ||
| COUNT(*) FILTER ( | ||
| WHERE EXISTS ( | ||
| SELECT 1 FROM main_detection d | ||
| WHERE d.source_image_id = si.id | ||
| ) | ||
| ) AS processed, | ||
| COUNT(*) FILTER ( | ||
| WHERE EXISTS ( | ||
| SELECT 1 FROM main_detection d | ||
| WHERE d.source_image_id = si.id | ||
| AND d.bbox IS NOT NULL | ||
| AND d.bbox::text <> '[]' | ||
| ) | ||
| ) AS with_det | ||
| FROM main_sourceimagecollection_images msci | ||
| INNER JOIN main_sourceimage si ON si.id = msci.sourceimage_id | ||
| GROUP BY msci.sourceimagecollection_id | ||
| ) c | ||
| WHERE sc.id = c.coll_id; | ||
| """ | ||
| ) | ||
|
|
||
|
|
||
| def reverse_noop(apps, schema_editor): | ||
| pass | ||
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
| atomic = False | ||
|
|
||
| dependencies = [ | ||
| ("main", "0085_denormalize_sourceimagecollection_counts"), | ||
| ] | ||
|
|
||
| operations = [ | ||
| migrations.RunPython(backfill_counts, reverse_noop), | ||
| ] |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| # Generated by Django 4.2.10 on 2026-05-13 20:05 | ||
|
|
||
| import ami.base.models | ||
| from django.db import migrations | ||
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
| dependencies = [ | ||
| ("main", "0086_backfill_sourceimagecollection_counts"), | ||
| ] | ||
|
|
||
| operations = [ | ||
| migrations.AlterField( | ||
| model_name="deployment", | ||
| name="captures_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="deployment", | ||
| name="detections_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="deployment", | ||
| name="events_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="deployment", | ||
| name="occurrences_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="deployment", | ||
| name="taxa_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="event", | ||
| name="captures_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="event", | ||
| name="detections_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="event", | ||
| name="occurrences_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="sourceimage", | ||
| name="detections_count", | ||
| field=ami.base.models.CachedCountField(blank=True, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="sourceimagecollection", | ||
| name="source_images_count", | ||
| field=ami.base.models.CachedCountField(default=0), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="sourceimagecollection", | ||
| name="source_images_processed_count", | ||
| field=ami.base.models.CachedCountField(default=0), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="sourceimagecollection", | ||
| name="source_images_with_detections_count", | ||
| field=ami.base.models.CachedCountField(default=0), | ||
| ), | ||
| ] |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mohamedelabbas1996 I think we discussed a field type for cached counts last year. it finally happened!