From 7c38140f9a35ba8e679dc27e30f592068811cd69 Mon Sep 17 00:00:00 2001
From: Travis Downs
Date: Thu, 14 May 2026 19:01:46 -0400
Subject: [PATCH] [c++] avrogencpp: emit deterministic include guard

CodeGen::guard() in avrogencpp.cc was suffixing the generated header's
include guard with the output of std::mt19937 seeded from
::time(nullptr). That produced a different guard on every avrogen
invocation, e.g.:

    #ifndef FOO_AVROGEN_H_3350718792_H
    #ifndef FOO_AVROGEN_H_2362587291_H

Two consequences:

1. Generated headers were non-deterministic. Repeated runs on the same
   schema produced different bytes, which is surprising for a codegen
   and makes side-by-side diff / git review difficult.

2. Build systems that key their cache on input-content digests (e.g.
   Bazel's remote cache, the Nix store) saw every consumer of the
   generated header miss the cache on every build, even when the
   schema was byte-identical. In a hermetic two-output-base Bazel
   build of a downstream project, this surfaced as a chain of cascade
   rebuilds that started at manifest_file.avrogen.h and propagated
   through every .cc that included it.

headerFile_ is already guaranteed-unique per output (it's the path of
the file we're about to write). makeCanonical(h, true) turns it into a
valid C identifier, which is already a fine guard name on its own; the
random suffix doesn't add uniqueness, only entropy.
---
 lang/c++/impl/avrogencpp.cc | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lang/c++/impl/avrogencpp.cc b/lang/c++/impl/avrogencpp.cc
index d6e914e9657..66281389f00 100644
--- a/lang/c++/impl/avrogencpp.cc
+++ b/lang/c++/impl/avrogencpp.cc
@@ -805,7 +805,14 @@ void CodeGen::emitGeneratedWarning() {
 string CodeGen::guard() {
     string h = headerFile_;
     makeCanonical(h, true);
-    return h + "_" + std::to_string(random_()) + "_H";
+    // headerFile_ is already a unique-per-output path, so the canonicalised
+    // form is already a valid, unique include guard. Avoid mixing in a
+    // time-seeded RNG here so the generated output is byte-deterministic
+    // across invocations -- otherwise build systems that key their cache on
+    // input-content digests (e.g. Bazel remote cache, Nix store paths) end
+    // up rebuilding every downstream consumer on every invocation, even on
+    // byte-identical schemas.
+    return h + "_H";
 }
 
 void CodeGen::generate(const ValidSchema &schema) {
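
Reviewer note: the difference between the old and new guard() can be
sketched in isolation. This is a minimal illustration under stated
assumptions, not the avrogencpp source. canonicalise() is a hypothetical
stand-in for makeCanonical(h, true), assumed here to upper-case letters
and replace every other non-digit character with '_'. randomGuard() and
deterministicGuard() are illustrative names, and the explicit seed
parameter stands in for ::time(nullptr).

```cpp
#include <cctype>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical stand-in for makeCanonical(h, true): assumed to upper-case
// letters and map anything that is not a letter or digit to '_'.
std::string canonicalise(std::string s) {
    for (char &c : s) {
        const unsigned char u = static_cast<unsigned char>(c);
        if (std::isalpha(u)) {
            c = static_cast<char>(std::toupper(u));
        } else if (!std::isdigit(u)) {
            c = '_';
        }
    }
    return s;
}

// Old shape (illustrative): the guard embeds the first output of an RNG,
// so it changes whenever the seed changes. The seed is passed in here to
// model a time-based seed without reading the clock.
std::string randomGuard(const std::string &headerFile, std::uint32_t seed) {
    std::mt19937 rng(seed);
    return canonicalise(headerFile) + "_" + std::to_string(rng()) + "_H";
}

// New shape (illustrative): the guard is a pure function of the header
// path, so identical inputs always give identical output.
std::string deterministicGuard(const std::string &headerFile) {
    return canonicalise(headerFile) + "_H";
}
```

Under these assumptions, deterministicGuard("foo.avrogen.h") returns
FOO_AVROGEN_H_H on every call, while randomGuard() returns a different
string for each distinct seed, which is what made the generated header's
bytes vary between runs.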