From 6c882cb177561e60a3459e1200f52d941654e8ed Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Thu, 26 Mar 2026 09:44:57 -0500
Subject: [PATCH 1/8] updated to 0.0.7

---
 CHANGELOG.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4b9bcad..f2984c7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.7] - 2026-03-26
+
+### Added
+
+- Pod scheduling support via `--node-selector` and `--tolerations-file` flags in `scripts/master.sh`, `scripts/setup/ingress.sh`, and `scripts/setup/gateway.sh`
+- Multi-node Kind topology (`--kind-topology scheduling-e2e`) that creates a control-plane node plus a labeled/tainted worker node for scheduling validation
+- Pod placement verification in `master.sh` that checks server pods are scheduled on expected nodes when `--node-selector` is provided
+- CI scheduling test variant that validates pod placement on multi-node Kind clusters
+- Helm chart scheduling render checks in CI workflow
+
+### Changed
+
+- `modules/kind/run/run.sh` now accepts `--topology` flag to select cluster shape (`default` or `scheduling-e2e`)
+- `scripts/install/nginx.sh` patches the ingress-nginx manifest to add control-plane tolerations to admission Jobs, fixing scheduling on multi-node Kind clusters
+
 ## [0.0.6] - 2026-03-23
 
 ### Changed

From 6382dcd61dc4d3ec377b65593535ede271fb85f4 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Mon, 30 Mar 2026 11:29:36 -0400
Subject: [PATCH 2/8] updated v0.0.8

---
 CHANGELOG.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f2984c7..40faefc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.8] - 2026-03-30
+
+### Fixed
+
+- Fixed OOM errors in `modules/vegeta/run/run.sh` at high RPS by streaming attack output to a temp file before processing, reducing memory usage from O(n) to O(buffer_size) (e.g., 20k RPS for 3 minutes = 3.6M requests no longer causes OOMKilled)
+- Added `sleep 2` delay after streaming vegeta results to jaggr to ensure the final time bucket is flushed before the pipe closes
+
 ## [0.0.7] - 2026-03-26
 
 ### Added

From 600700bc8d9cb3eb3511057448c7dd08780964d6 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Mon, 30 Mar 2026 12:02:20 -0400
Subject: [PATCH 3/8] changed from a fix to a change

---
 CHANGELOG.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 40faefc..7f453b0 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,9 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [0.0.8] - 2026-03-30
 
-### Fixed
+### Changed
 
-- Fixed OOM errors in `modules/vegeta/run/run.sh` at high RPS by streaming attack output to a temp file before processing, reducing memory usage from O(n) to O(buffer_size) (e.g., 20k RPS for 3 minutes = 3.6M requests no longer causes OOMKilled)
+- `modules/vegeta/run/run.sh` now streams attack output to a temp file before processing, reducing memory usage from O(n) to O(buffer_size) for high-RPS workloads
 - Added `sleep 2` delay after streaming vegeta results to jaggr to ensure the final time bucket is flushed before the pipe closes
 
 ## [0.0.7] - 2026-03-26

From 2b228f783a12ac41c8abb06b331cccbd11988803 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Wed, 1 Apr 2026 14:39:23 -0400
Subject: [PATCH 4/8] update readme

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 4791f6c..924f365 100644
--- a/README.md
+++ b/README.md
@@ -117,6 +117,9 @@ docker run module/vegeta/install
 docker run module/vegeta/run --target-url http://localhost:8080 --rate 50 --duration 30s
 docker run module/kind/output host_port
 
+# Merge multiple vegeta .bin files into per-second JSON
+docker run merge --output-file merged.json pod0.bin pod1.bin pod2.bin
+
 # Run the server
 docker run -p 3333:3333 server
 ```
@@ -138,6 +141,7 @@ This repository is a collection of modules that follow consistent patterns to cr
 - /install contains a `install.sh` script that installs the required tool
 - /run contains `run.sh` that run the tool. These can be functions and modules can contain many different functions for running
 - /output collects the output of the run into a standardized json file
+- /merge (vegeta only) merges multiple raw `.bin` files from parallel pods into unified per-second JSON output
 - /test contains a `test.sh` script that tests and validates the module is working correctly
 
 Note: all modules expect to be **run from the root directory of this project**.

From 7bf3e38067e9e651bc690aef2d34bf9188dfa1ea Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Wed, 1 Apr 2026 14:39:34 -0400
Subject: [PATCH 5/8] update claude

---
 CLAUDE.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 41a3248..b6ffcc6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -76,6 +76,7 @@ Each module under `modules/` follows the same interface:
 - `install/install.sh` — installs the tool
 - `run/run.sh` — runs the tool (can be sourced as a library or executed directly)
 - `output/output.sh` — reads from `../statefile.json` and emits results
+- `merge/merge.sh` — (vegeta only) merges multiple `.bin` files into per-second JSON output using timestamp-based bucketing
 - `test/test.sh` — integration test
 
 Modules communicate via **statefile.json** (one per module, gitignored in practice). The kind module writes `{"cluster_name": "...", "host_port": "..."}`. The vegeta module writes streaming jaggr JSON (one line per second of the test).
@@ -88,6 +89,7 @@ The Dockerfile entrypoint routes commands to scripts via prefix matching:
 - `install/ [args...]` → `scripts/install/.sh`
 - `setup/ [args...]` → `scripts/setup/.sh`
 - `module// [args...]` → `modules///.sh`
+- `merge [args...]` → `modules/vegeta/merge/merge.sh`
 - `server` → starts the Go HTTP server
 
 ### Scenario interface
@@ -96,7 +98,9 @@ Scenarios accept CLI arguments (`--ingress-url`, `--rate`, `--duration`, `--work
 
 ### Data flow for results
 
-Vegeta attack → `vegeta encode` → `jaggr` (aggregates per-second) → `tee` to `modules/vegeta/statefile.json`. The scenario script then copies statefile content to the `--output-file`. The result file contains **one JSON line per second** of the test. The CI validation sums `code.hist["200"]` across **all lines** and fails only if the total is zero (i.e., no connection was ever successfully routed).
+The vegeta run pipeline streams results in real time: `vegeta attack | tee statefile.bin | vegeta encode | jaggr | tee statefile.json`. This saves raw binary results to `.bin` alongside the per-second jaggr JSON. The scenario script then copies statefile.json content to the `--output-file`. The result file contains **one JSON line per second** of the test. The CI validation sums `code.hist["200"]` across **all lines** and fails only if the total is zero (i.e., no connection was ever successfully routed).
+
+For multi-pod tests, `modules/vegeta/merge/merge.sh` combines multiple `.bin` files into unified per-second JSON. It uses actual request timestamps for bucketing (not wall-clock time), so it correctly interleaves results from pods that started at slightly different times. The first and last second-buckets may be partial.
 
 ### Helm chart (charts/server)
 
@@ -113,11 +117,11 @@ The server image is `ghcr.io/azure/aks-traffic-ingress-competitive-testing` (a G
 
 ### CI (validate.yaml)
 
-The validation workflow runs on PRs to main. The `test-scenarios` job uses a matrix over `traffic: [ingress, gateway]` × `scenario: [basic-rps, restarting-backend-rps]` × `variant: [default]` — each combination gets its own runner and Kind cluster. An additional `scheduling-e2e` variant tests pod placement with node selectors and tolerations on a multi-node Kind cluster. Other jobs: module tests (matrix over discovered modules), chart validation (including scheduling render checks), and project structure validation.
+The validation workflow runs on PRs to main. The `test-scenarios` job uses a matrix over `traffic: [ingress, gateway]` × `scenario: [basic-rps, restarting-backend-rps]` × `variant: [default]` — each combination gets its own runner and Kind cluster. An additional `scheduling-e2e` variant tests pod placement with node selectors and tolerations on a multi-node Kind cluster. The `test-merge` job validates multi-pod merge by running 4 simultaneous vegeta attacks against a Docker server and verifying the merged output (RPS totals, code histograms, JSON structure). Other jobs: module tests (matrix over discovered modules), chart validation (including scheduling render checks), and project structure validation.
 
 ## Conventions
 
 - All scripts expect to be **run from the repository root**.
-- All shell scripts use `set -ex` (or `set -e` for library-style scripts).
+- All shell scripts use `set -ex` (or `set -e` for library-style scripts). Scripts with data-processing pipelines also use `set -o pipefail` so that failures in any pipeline stage propagate correctly.
 - The `--traffic` flag accepts `ingress` or `gateway`. The `--scenario` flag accepts `basic-rps` or `restarting-backend-rps`.
 - Releases are driven by CHANGELOG.md — the Release workflow reads it to create GitHub releases. Follow [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format.

From 196a5121d145bf27ca737d71e528b34ae3513f58 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Wed, 1 Apr 2026 14:39:58 -0400
Subject: [PATCH 6/8] update changelog to v0.0.9

---
 CHANGELOG.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7f453b0..90f520f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.9] - 2026-04-01
+
+### Added
+
+- Multi-pod merge script (`modules/vegeta/merge/merge.sh`) that combines multiple vegeta `.bin` files into per-second JSON output using timestamp-based bucketing
+- CI `test-merge` job that validates multi-pod merge by running 4 simultaneous vegeta attacks and verifying merged output (RPS, code histograms, JSON structure)
+- `gawk` added to Dockerfile base image for merge script support
+- `merge` command added to Docker entrypoint routing
+
+### Changed
+
+- `modules/vegeta/run/run.sh` now uses a streaming tee pipeline (`vegeta attack | tee binfile | vegeta encode | jaggr`) instead of writing to a temp file then replaying, saving raw `.bin` results alongside jaggr output
+- Added `set -o pipefail` to `modules/vegeta/run/run.sh` and `modules/vegeta/merge/merge.sh` so pipeline failures propagate correctly
+- Expanded vegeta module tests (`.bin` file production, per-second bucketing, single-file merge, multi-file merge, synthetic percentile verification)
+
 ## [0.0.8] - 2026-03-30
 
 ### Changed

From f5ee8853f9701e1993dee67bd9e57720f8832ce8 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Tue, 7 Apr 2026 16:05:19 -0400
Subject: [PATCH 7/8] update log to v0.0.10

---
 CHANGELOG.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 90f520f..49ceaa1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.10] - 2026-04-07
+
+### Changed
+
+- Increased wait timeouts in `scripts/setup/gateway.sh` and `scripts/setup/ingress.sh` to 30 minutes for large-scale deployments (e.g., 50,000 replicas), including resource creation checks, readiness polling, and Helm install retries
+- Removed `--timeout` from `kubectl rollout status` in both setup scripts, relying on Kubernetes' built-in `progressDeadlineSeconds` (default 600s) to detect stalls while allowing arbitrarily long rollouts that are making progress
+- Gateway and HTTPRoute readiness checks now fail hard instead of silently continuing when not ready
+
+### Fixed
+
+- Added `kubectl wait` for Istio gateway proxy pod in `scripts/setup/gateway.sh` to prevent port-forward failures when the proxy pod is still Pending
+
 ## [0.0.9] - 2026-04-01
 
 ### Added

From 6a1be3ff8fa918ba7387ab47b2963fd1fc3e6a21 Mon Sep 17 00:00:00 2001
From: Mauricio Ferrari
Date: Thu, 23 Apr 2026 09:35:22 -0400
Subject: [PATCH 8/8] updated log to v0.0.11

---
 CHANGELOG.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 49ceaa1..0f3d6cf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,14 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.11] - 2026-04-23
+
+### Added
+
+- DNS test scripts (`scripts/setup/dns-ingresses.sh`, `scripts/setup/dns-httproutes.sh`, `scripts/cleanup/dns-ingresses.sh`, `scripts/cleanup/dns-httproutes.sh`) that bulk-create or delete N `Ingress` / `HTTPRoute` resources with unique `test-{i}.{domain}` hostnames (all labeled `dns-test=true`) for external-DNS reconciliation testing in downstream telescope pipelines
+- CI `test-dns-resources` job with matrix over `[ingresses, httproutes]` covering positive (count=5) and negative paths (missing service, invalid domain, count=0)
+- Docker entrypoint coverage and README/CLAUDE.md documentation for the new DNS subcommands
+
 ## [0.0.10] - 2026-04-07
 
 ### Changed