From 6c882cb177561e60a3459e1200f52d941654e8ed Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Thu, 26 Mar 2026 09:44:57 -0500
Subject: [PATCH 1/9] updated to 0.0.7

---
 CHANGELOG.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4b9bcad..f2984c7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.7] - 2026-03-26
+
+### Added
+
+- Pod scheduling support via `--node-selector` and `--tolerations-file` flags in `scripts/master.sh`, `scripts/setup/ingress.sh`, and `scripts/setup/gateway.sh`
+- Multi-node Kind topology (`--kind-topology scheduling-e2e`) that creates a control-plane node plus a labeled/tainted worker node for scheduling validation
+- Pod placement verification in `master.sh` that checks server pods are scheduled on expected nodes when `--node-selector` is provided
+- CI scheduling test variant that validates pod placement on multi-node Kind clusters
+- Helm chart scheduling render checks in CI workflow
+
+### Changed
+
+- `modules/kind/run/run.sh` now accepts `--topology` flag to select cluster shape (`default` or `scheduling-e2e`)
+- `scripts/install/nginx.sh` patches the ingress-nginx manifest to add control-plane tolerations to admission Jobs, fixing scheduling on multi-node Kind clusters
+
 ## [0.0.6] - 2026-03-23
 
 ### Changed

From 6382dcd61dc4d3ec377b65593535ede271fb85f4 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Mon, 30 Mar 2026 11:29:36 -0400
Subject: [PATCH 2/9] updated v0.0.8

---
 CHANGELOG.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f2984c7..40faefc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.8] - 2026-03-30
+
+### Fixed
+
+- Fixed OOM errors in `modules/vegeta/run/run.sh` at high RPS by streaming attack output to a temp file before processing, reducing memory usage from O(n) to O(buffer_size) (e.g., 20k RPS for 3 minutes = 3.6M requests no longer causes OOMKilled)
+- Added `sleep 2` delay after streaming vegeta results to jaggr to ensure the final time bucket is flushed before the pipe closes
+
 ## [0.0.7] - 2026-03-26
 
 ### Added

From 600700bc8d9cb3eb3511057448c7dd08780964d6 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Mon, 30 Mar 2026 12:02:20 -0400
Subject: [PATCH 3/9] changed from a fix to a change

---
 CHANGELOG.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 40faefc..7f453b0 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,9 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [0.0.8] - 2026-03-30
 
-### Fixed
+### Changed
 
-- Fixed OOM errors in `modules/vegeta/run/run.sh` at high RPS by streaming attack output to a temp file before processing, reducing memory usage from O(n) to O(buffer_size) (e.g., 20k RPS for 3 minutes = 3.6M requests no longer causes OOMKilled)
+- `modules/vegeta/run/run.sh` now streams attack output to a temp file before processing, reducing memory usage from O(n) to O(buffer_size) for high-RPS workloads
 - Added `sleep 2` delay after streaming vegeta results to jaggr to ensure the final time bucket is flushed before the pipe closes
 
 ## [0.0.7] - 2026-03-26

From 2b228f783a12ac41c8abb06b331cccbd11988803 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Wed, 1 Apr 2026 14:39:23 -0400
Subject: [PATCH 4/9] update readme

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 4791f6c..924f365 100644
--- a/README.md
+++ b/README.md
@@ -117,6 +117,9 @@ docker run module/vegeta/install
 docker run module/vegeta/run --target-url http://localhost:8080 --rate 50 --duration 30s
 docker run module/kind/output host_port
 
+# Merge multiple vegeta .bin files into per-second JSON
+docker run merge --output-file merged.json pod0.bin pod1.bin pod2.bin
+
 # Run the server
 docker run -p 3333:3333 server
 ```
@@ -138,6 +141,7 @@ This repository is a collection of modules that follow consistent patterns to cr
 - /install contains a `install.sh` script that installs the required tool
 - /run contains `run.sh` that run the tool. These can be functions and modules can contain many different functions for running
 - /output collects the output of the run into a standardized json file
+- /merge (vegeta only) merges multiple raw `.bin` files from parallel pods into unified per-second JSON output
 - /test contains a `test.sh` script that tests and validates the module is working correctly
 
 Note: all modules expect to be **run from the root directory of this project**.

From 7bf3e38067e9e651bc690aef2d34bf9188dfa1ea Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Wed, 1 Apr 2026 14:39:34 -0400
Subject: [PATCH 5/9] update claude

---
 CLAUDE.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 41a3248..b6ffcc6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -76,6 +76,7 @@ Each module under `modules/` follows the same interface:
 - `install/install.sh` — installs the tool
 - `run/run.sh` — runs the tool (can be sourced as a library or executed directly)
 - `output/output.sh` — reads from `../statefile.json` and emits results
+- `merge/merge.sh` — (vegeta only) merges multiple `.bin` files into per-second JSON output using timestamp-based bucketing
 - `test/test.sh` — integration test
 
 Modules communicate via **statefile.json** (one per module, gitignored in practice). The kind module writes `{"cluster_name": "...", "host_port": "..."}`. The vegeta module writes streaming jaggr JSON (one line per second of the test).
@@ -88,6 +89,7 @@ The Dockerfile entrypoint routes commands to scripts via prefix matching:
 - `install/ [args...]` → `scripts/install/.sh`
 - `setup/ [args...]` → `scripts/setup/.sh`
 - `module// [args...]` → `modules///.sh`
+- `merge [args...]` → `modules/vegeta/merge/merge.sh`
 - `server` → starts the Go HTTP server
 
 ### Scenario interface
@@ -96,7 +98,9 @@ Scenarios accept CLI arguments (`--ingress-url`, `--rate`, `--duration`, `--work
 
 ### Data flow for results
 
-Vegeta attack → `vegeta encode` → `jaggr` (aggregates per-second) → `tee` to `modules/vegeta/statefile.json`. The scenario script then copies statefile content to the `--output-file`. The result file contains **one JSON line per second** of the test. The CI validation sums `code.hist["200"]` across **all lines** and fails only if the total is zero (i.e., no connection was ever successfully routed).
+The vegeta run pipeline streams results in real time: `vegeta attack | tee statefile.bin | vegeta encode | jaggr | tee statefile.json`. This saves raw binary results to `.bin` alongside the per-second jaggr JSON. The scenario script then copies statefile.json content to the `--output-file`. The result file contains **one JSON line per second** of the test. The CI validation sums `code.hist["200"]` across **all lines** and fails only if the total is zero (i.e., no connection was ever successfully routed).
+
+For multi-pod tests, `modules/vegeta/merge/merge.sh` combines multiple `.bin` files into unified per-second JSON. It uses actual request timestamps for bucketing (not wall-clock time), so it correctly interleaves results from pods that started at slightly different times. The first and last second-buckets may be partial.
 
 ### Helm chart (charts/server)
 
@@ -113,11 +117,11 @@ The server image is `ghcr.io/azure/aks-traffic-ingress-competitive-testing` (a G
 
 ### CI (validate.yaml)
 
-The validation workflow runs on PRs to main. The `test-scenarios` job uses a matrix over `traffic: [ingress, gateway]` × `scenario: [basic-rps, restarting-backend-rps]` × `variant: [default]` — each combination gets its own runner and Kind cluster. An additional `scheduling-e2e` variant tests pod placement with node selectors and tolerations on a multi-node Kind cluster. Other jobs: module tests (matrix over discovered modules), chart validation (including scheduling render checks), and project structure validation.
+The validation workflow runs on PRs to main. The `test-scenarios` job uses a matrix over `traffic: [ingress, gateway]` × `scenario: [basic-rps, restarting-backend-rps]` × `variant: [default]` — each combination gets its own runner and Kind cluster. An additional `scheduling-e2e` variant tests pod placement with node selectors and tolerations on a multi-node Kind cluster. The `test-merge` job validates multi-pod merge by running 4 simultaneous vegeta attacks against a Docker server and verifying the merged output (RPS totals, code histograms, JSON structure). Other jobs: module tests (matrix over discovered modules), chart validation (including scheduling render checks), and project structure validation.
 
 ## Conventions
 
 - All scripts expect to be **run from the repository root**.
-- All shell scripts use `set -ex` (or `set -e` for library-style scripts).
+- All shell scripts use `set -ex` (or `set -e` for library-style scripts). Scripts with data-processing pipelines also use `set -o pipefail` so that failures in any pipeline stage propagate correctly.
 - The `--traffic` flag accepts `ingress` or `gateway`. The `--scenario` flag accepts `basic-rps` or `restarting-backend-rps`.
 - Releases are driven by CHANGELOG.md — the Release workflow reads it to create GitHub releases. Follow [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format.
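
Reviewer note on PATCH 5/9: the timestamp-based bucketing described for `merge.sh` can be illustrated with a minimal sketch. This is not the repository's implementation. It assumes a simplified, hypothetical text input (one request per line as `<epoch_seconds.fraction> <status_code>`) instead of vegeta's binary `.bin` format, and the function name `bucket_per_second` is made up for illustration. The point it shows is that the bucket key comes from each request's own timestamp, so files from pods that started at slightly different times interleave correctly.

```shell
#!/usr/bin/env bash
set -e
set -o pipefail

# bucket_per_second FILE...
# Hypothetical input format (an assumption, not vegeta's .bin encoding):
#   <epoch_seconds.fraction> <status_code>
# Output: one line per second bucket, sorted by time:
#   <epoch_second> <total_requests> <requests_with_status_200>
bucket_per_second() {
  awk '
    {
      sec = int($1)              # bucket key taken from the request timestamp
      total[sec]++
      if ($2 == 200) ok[sec]++
    }
    END {
      for (sec in total) printf "%d %d %d\n", sec, total[sec], ok[sec] + 0
    }
  ' "$@" | sort -n
}
```

Called as `bucket_per_second pod0.txt pod1.txt`, it prints one line per second across both inputs. As the patch notes for the real script, the first and last buckets can be partial because pods do not start or stop on second boundaries.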
From 196a5121d145bf27ca737d71e528b34ae3513f58 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Wed, 1 Apr 2026 14:39:58 -0400
Subject: [PATCH 6/9] update changelog to v0.0.9

---
 CHANGELOG.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7f453b0..90f520f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.9] - 2026-04-01
+
+### Added
+
+- Multi-pod merge script (`modules/vegeta/merge/merge.sh`) that combines multiple vegeta `.bin` files into per-second JSON output using timestamp-based bucketing
+- CI `test-merge` job that validates multi-pod merge by running 4 simultaneous vegeta attacks and verifying merged output (RPS, code histograms, JSON structure)
+- `gawk` added to Dockerfile base image for merge script support
+- `merge` command added to Docker entrypoint routing
+
+### Changed
+
+- `modules/vegeta/run/run.sh` now uses a streaming tee pipeline (`vegeta attack | tee binfile | vegeta encode | jaggr`) instead of writing to a temp file then replaying, saving raw `.bin` results alongside jaggr output
+- Added `set -o pipefail` to `modules/vegeta/run/run.sh` and `modules/vegeta/merge/merge.sh` so pipeline failures propagate correctly
+- Expanded vegeta module tests (`.bin` file production, per-second bucketing, single-file merge, multi-file merge, synthetic percentile verification)
+
 ## [0.0.8] - 2026-03-30
 
 ### Changed

From f5ee8853f9701e1993dee67bd9e57720f8832ce8 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Tue, 7 Apr 2026 16:05:19 -0400
Subject: [PATCH 7/9] update log to v0.0.10

---
 CHANGELOG.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 90f520f..49ceaa1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.10] - 2026-04-07
+
+### Changed
+
+- Increased wait timeouts in `scripts/setup/gateway.sh` and `scripts/setup/ingress.sh` to 30 minutes for large-scale deployments (e.g., 50,000 replicas), including resource creation checks, readiness polling, and Helm install retries
+- Removed `--timeout` from `kubectl rollout status` in both setup scripts, relying on Kubernetes' built-in `progressDeadlineSeconds` (default 600s) to detect stalls while allowing arbitrarily long rollouts that are making progress
+- Gateway and HTTPRoute readiness checks now fail hard instead of silently continuing when not ready
+
+### Fixed
+
+- Added `kubectl wait` for Istio gateway proxy pod in `scripts/setup/gateway.sh` to prevent port-forward failures when the proxy pod is still Pending
+
 ## [0.0.9] - 2026-04-01
 
 ### Added

From 6a1be3ff8fa918ba7387ab47b2963fd1fc3e6a21 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Thu, 23 Apr 2026 09:35:22 -0400
Subject: [PATCH 8/9] updated log to v0.0.11

---
 CHANGELOG.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 49ceaa1..0f3d6cf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,14 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.11] - 2026-04-23
+
+### Added
+
+- DNS test scripts (`scripts/setup/dns-ingresses.sh`, `scripts/setup/dns-httproutes.sh`, `scripts/cleanup/dns-ingresses.sh`, `scripts/cleanup/dns-httproutes.sh`) that bulk-create or delete N `Ingress` / `HTTPRoute` resources with unique `test-{i}.{domain}` hostnames (all labeled `dns-test=true`) for external-DNS reconciliation testing in downstream telescope pipelines
+- CI `test-dns-resources` job with matrix over `[ingresses, httproutes]` covering positive (count=5) and negative paths (missing service, invalid domain, count=0)
+- Docker entrypoint coverage and README/CLAUDE.md documentation for the new DNS subcommands
+
 ## [0.0.10] - 2026-04-07
 
 ### Changed

From c9fc29937ee105da84e66cdf679c256aecf4d87b Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Thu, 7 May 2026 12:51:37 -0400
Subject: [PATCH 9/9] Update changelog to v0.0.12

Co-Authored-By: Claude Opus 4.7
---
 CHANGELOG.md | 10 ++++++++++
 CLAUDE.md    |  2 +-
 README.md    |  4 +++-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0f3d6cf..f58926d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,16 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.12] - 2026-05-07
+
+### Added
+
+- `--existing-n ` index offset flag in `scripts/setup/dns-ingresses.sh` and `scripts/setup/dns-httproutes.sh`, so additional batches can be appended without colliding with existing `test-{i}.{domain}` hostnames (objects are numbered `(existing-n+1)..(existing-n+count)`)
+
+### Changed
+
+- Setup DNS scripts now stream the generated multi-document YAML to a unique `mktemp -t dns-{ingresses,httproutes}.XXXXXX.yaml` file under `$TMPDIR` and apply it with a single `kubectl apply --server-side -f`, instead of building the manifest in memory and piping it into `kubectl`
+
 ## [0.0.11] - 2026-04-23
 
 ### Added
diff --git a/CLAUDE.md b/CLAUDE.md
index eff20b5..580b41d 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -121,7 +121,7 @@ The validation workflow runs on PRs to main. The `test-scenarios` job uses a mat
 
 ### DNS test scripts
 
-`scripts/setup/dns-ingresses.sh`, `scripts/setup/dns-httproutes.sh`, `scripts/cleanup/dns-ingresses.sh`, and `scripts/cleanup/dns-httproutes.sh` bulk-create or delete N `Ingress` / `HTTPRoute` resources (each with a unique `test-{i}.{domain}` hostname, all labeled `dns-test=true`) for external-DNS reconciliation testing in downstream telescope pipelines (app-routing-nginx, app-routing-istio). They are intentionally **not** wired into `master.sh` — that script remains RPS-focused. The httproutes setup script requires an existing parent `Gateway`; the ingresses setup script requires an existing backend `Service`. Defaults: namespace `default`, service/gateway `server`, port `8080`. Cleanup uses label selector `dns-test=true` with `--wait=false --ignore-not-found` and leaves the parent Gateway intact.
+`scripts/setup/dns-ingresses.sh`, `scripts/setup/dns-httproutes.sh`, `scripts/cleanup/dns-ingresses.sh`, and `scripts/cleanup/dns-httproutes.sh` bulk-create or delete N `Ingress` / `HTTPRoute` resources (each with a unique `test-{i}.{domain}` hostname, all labeled `dns-test=true`) for external-DNS reconciliation testing in downstream telescope pipelines (app-routing-nginx, app-routing-istio). They are intentionally **not** wired into `master.sh` — that script remains RPS-focused. The httproutes setup script requires an existing parent `Gateway`; the ingresses setup script requires an existing backend `Service`. Defaults: namespace `default`, service/gateway `server`, port `8080`. Setup scripts accept `--existing-n ` to offset the index range (objects are numbered `(existing-n+1)..(existing-n+count)`) so additional batches can be appended without colliding with existing hostnames. Setup scripts stream the generated multi-document YAML to a unique `mktemp` file under `$TMPDIR` (path printed at the top of the run) and apply it with a single `kubectl apply --server-side -f`. Cleanup uses label selector `dns-test=true` with `--wait=false --ignore-not-found` and leaves the parent Gateway intact.
 
 ## Conventions
 
diff --git a/README.md b/README.md
index 70c4032..a281475 100644
--- a/README.md
+++ b/README.md
@@ -115,6 +115,8 @@ docker run setup/ingress --ingress-class nginx --replica-count 3
 # Bulk-create dns-test Ingresses or HTTPRoutes (for external-DNS testing)
 docker run setup/dns-ingresses --count 100 --domain extdns.telescope.test
 docker run setup/dns-httproutes --count 100 --domain extdns.telescope.test
+# Append a second batch of 100 starting at index 101 (no hostname collisions)
+docker run setup/dns-ingresses --count 100 --existing-n 100 --domain extdns.telescope.test
 docker run cleanup/dns-ingresses
 docker run cleanup/dns-httproutes
 
@@ -159,7 +161,7 @@ Note: all modules expect to be **run from the root directory of this project**.
 - `/install` — traffic controller install scripts (`nginx.sh`, `istio.sh`)
 - `/setup` — server deployment scripts with readiness checks (`ingress.sh`, `gateway.sh`)
 - `/scenarios` — load test scenario scripts. These assume the cluster, traffic controller, and server are already running. Their output is JSON so that consumers can decide on the final display format themselves.
-- `/setup/dns-ingresses.sh`, `/setup/dns-httproutes.sh` — bulk-create N `Ingress` or Gateway API `HTTPRoute` resources (each with a unique `test-{i}.{domain}` hostname, all labeled `dns-test=true`) for external-DNS reconciliation testing. Paired with `/cleanup/dns-ingresses.sh` and `/cleanup/dns-httproutes.sh`, which delete by label.
+- `/setup/dns-ingresses.sh`, `/setup/dns-httproutes.sh` — bulk-create N `Ingress` or Gateway API `HTTPRoute` resources (each with a unique `test-{i}.{domain}` hostname, all labeled `dns-test=true`) for external-DNS reconciliation testing. Use `--existing-n ` to offset the index range so additional batches can be appended without colliding with existing hostnames. The generated manifest is written to a unique `mktemp` file under `$TMPDIR` and applied with a single `kubectl apply --server-side -f`. Paired with `/cleanup/dns-ingresses.sh` and `/cleanup/dns-httproutes.sh`, which delete by label.
 
 ### /server
 
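
Reviewer note on PATCH 9/9: the index offset and the temp-file manifest described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `scripts/setup/dns-ingresses.sh`. The function names, the `dns-sketch.XXXXXX` temp-file template, the use of the hostname's first label as the object name, and the trimmed `Ingress` fields (no backend or paths) are all hypothetical. Only the numbering rule `(existing-n+1)..(existing-n+count)`, the `test-{i}.{domain}` hostname pattern, and the `dns-test=true` label come from the patch.

```shell
#!/usr/bin/env bash
set -e

# hostnames_for_batch COUNT EXISTING_N DOMAIN
# Prints one hostname per line for indices (EXISTING_N+1)..(EXISTING_N+COUNT).
hostnames_for_batch() {
  local count=$1 existing_n=$2 domain=$3 i
  for ((i = existing_n + 1; i <= existing_n + count; i++)); do
    printf 'test-%d.%s\n' "$i" "$domain"
  done
}

# write_manifest COUNT EXISTING_N DOMAIN
# Streams a multi-document YAML manifest to a unique temp file, one document
# at a time, and prints the file path. Fields are illustrative only.
write_manifest() {
  local count=$1 existing_n=$2 domain=$3 manifest host
  manifest=$(mktemp "${TMPDIR:-/tmp}/dns-sketch.XXXXXX")
  hostnames_for_batch "$count" "$existing_n" "$domain" | while read -r host; do
    printf '%s\n' \
      '---' \
      'apiVersion: networking.k8s.io/v1' \
      'kind: Ingress' \
      'metadata:' \
      "  name: ${host%%.*}" \
      '  labels:' \
      '    dns-test: "true"' \
      'spec:' \
      '  rules:' \
      "    - host: ${host}"
  done > "$manifest"
  printf '%s\n' "$manifest"
}
```

For example, `manifest=$(write_manifest 100 100 extdns.telescope.test)` would produce documents for `test-101` through `test-200`, after which the patch's single `kubectl apply --server-side -f "$manifest"` step would apply them. Writing each document straight to the file keeps memory use flat as the count grows, which appears to be the motivation for the change.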