2 changes: 2 additions & 0 deletions tests/waku_archive/test_driver_postgres.nim
@@ -23,6 +23,8 @@ suite "Postgres driver":

    driver = PostgresDriver(driverRes.get())

    (await driver.waitForPartition()).expect("Test has no DB partition")

  asyncTeardown:
    let resetRes = await driver.reset()
    if resetRes.isErr():
2 changes: 2 additions & 0 deletions tests/waku_archive/test_driver_postgres_query.nim
@@ -38,6 +38,8 @@ suite "Postgres driver - queries":

    driver = PostgresDriver(driverRes.get())

    (await driver.waitForPartition()).expect("Test has no DB partition")
Ivansete-status marked this conversation as resolved.

  asyncTeardown:
    let resetRes = await driver.reset()

15 changes: 15 additions & 0 deletions waku/waku_archive/driver/postgres_driver/postgres_driver.nim
@@ -1422,6 +1422,21 @@ proc removeOldestPartition(
proc containsAnyPartition*(self: PostgresDriver): bool =
  return not self.partitionMngr.isEmpty()

proc waitForPartition*(
    self: PostgresDriver, timeout = chronos.seconds(5)
): Future[ArchiveDriverResult[void]] {.async.} =
Collaborator

Shall we mention that this is meant to avoid flaky testing only?

Contributor Author

It's just a generic timed wrapper around containsAnyPartition, so I think we would have to put the same warning there as well. I'm not sure we should worry about people using a time-bounded "a partition exists" check outside of tests, or as part of helping an actual test exercise something useful.
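
For illustration only, a minimal sketch of what a fully generic timed wrapper around any boolean check could look like. The names `waitUntil`, `timeoutMs` and `pollMs` are hypothetical and not part of this PR, and the sketch uses Nim's standard `std/asyncdispatch` instead of chronos so that it is self-contained.

```nim
import std/asyncdispatch

# Hypothetical helper (not from this PR): polls `cond` every `pollMs`
# milliseconds until it returns true or roughly `timeoutMs` have elapsed.
proc waitUntil(
    cond: proc(): bool, timeoutMs = 5000, pollMs = 100
): Future[bool] {.async.} =
  var elapsedMs = 0
  while elapsedMs < timeoutMs:
    if cond():
      return true
    await sleepAsync(pollMs)
    # Like the PR's loop, this counts only the sleep time, not wall-clock time.
    elapsedMs += pollMs
  return false

when isMainModule:
  var ready = false

  # Flip the flag after 250 ms to simulate a partition that shows up late.
  proc makeReady() {.async.} =
    await sleepAsync(250)
    ready = true

  asyncCheck makeReady()
  doAssert waitFor(waitUntil(proc(): bool = ready))
```

Under that assumption, waitForPartition would amount to calling such a helper with containsAnyPartition as the condition and mapping a `false` result to the timeout error.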

  let pollInterval = chronos.milliseconds(100)
  var elapsed = chronos.milliseconds(0)

  while elapsed < timeout:
    if self.containsAnyPartition():
Collaborator

I wonder if we should confirm that there is a valid partition for "now".
Each partition contains data for a range of Unix time aligned to o'clock hours.
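
A rough sketch of such a "partition for now" check, under stated assumptions: `HourPartition`, `covers` and `hasPartitionFor` are hypothetical stand-ins rather than the partition manager's real API, and each partition is assumed to span exactly one hour of Unix time (start inclusive, end exclusive).

```nim
import std/times

type
  # Hypothetical stand-in for a partition; the real type lives in the
  # partition manager and may look different.
  HourPartition = object
    startTime: int64 # inclusive, Unix seconds, aligned to the hour
    endTime: int64 # exclusive, Unix seconds

proc covers(p: HourPartition, unixTime: int64): bool =
  ## True when `unixTime` falls inside the partition's time range.
  p.startTime <= unixTime and unixTime < p.endTime

proc hasPartitionFor(
    partitions: openArray[HourPartition], unixTime: int64
): bool =
  ## True when at least one partition covers `unixTime`.
  for p in partitions:
    if p.covers(unixTime):
      return true
  false

when isMainModule:
  let now = getTime().toUnix()
  let hourStart = now - (now mod 3600)
  let current = HourPartition(startTime: hourStart, endTime: hourStart + 3600)
  let stale =
    HourPartition(startTime: hourStart - 7200, endTime: hourStart - 3600)
  doAssert hasPartitionFor([current], now)
  doAssert not hasPartitionFor([stale], now)
```

With a check like this, a stale partition left over from an earlier hour would not satisfy the wait, whereas a plain "any partition exists" test would accept it.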

Contributor Author

Yes, although if we go beyond "there is a partition" as a basic check, we may end up second-guessing what the partition manager is doing with its time calculations.

Maybe that's the actual solution to this. I should probably look more closely at what the partition manager actually does. Now I'm not sure I want to merge this fix as it is :-)

I think this PR will linger here a bit, since this test race condition is so difficult to reproduce. It's the opposite of urgent. It certainly isn't triggering on the CI machines, which are slower than our own machines (where it is already very difficult to trigger). So it is worth waiting and doing this properly.

      return ok()

    await sleepAsync(pollInterval)
    elapsed += pollInterval

  return err("PostgresDriver.waitForPartition() timed out after " & $timeout)

method decreaseDatabaseSize*(
    driver: PostgresDriver, targetSizeInBytes: int64, forceRemoval: bool = false
): Future[ArchiveDriverResult[void]] {.async.} =