server: Set a pprof label on new stream goroutines#9082
Conversation
5b63d30 to
59a7845
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #9082 +/- ##
==========================================
- Coverage 83.30% 83.08% -0.23%
==========================================
Files 413 413
Lines 33482 33512 +30
==========================================
- Hits 27892 27842 -50
- Misses 4190 4243 +53
- Partials 1400 1427 +27
🚀 New features to boost your workflow:
|
To make stack-traces and some profiles more useful, change sets goroutine labels indicating which gRPC method is being handled. Goroutine labels are inherited by child goroutines, so this provides useful context in profiles and traces for work that's been farmed out to child goroutines. These currently show up in three places: - trace labels in pprof CPU profiles - trace labels in runtime/pprof (and http/pprof) debug=0 goroutine profiles (which are pprof format) - debug=1 aggregated text-format goroutine profiles For Go 1.27, golang/go#76349 adds goroutine labels to tracebacks and by extension debug=2 pprof text-based profiles for go 1.27+ modules. The naming of the goroutine label currently matches the opencensus RPC method tag, and has some similarities to the current proposal for goroutine tag naming for tests in golang/go#75047. I.e. this uses `grpc.server.method`. This change avoids setting anything on the client side due to the lower utility and goroutine lifetime issues. Include a GRPC_SERVER_METHOD_GOROUTINE_LABELS environment variable to allow users to easily opt-out of setting these goroutine labels. RELEASE NOTES: * server: Set runtime/pprof goroutine labels for incoming method streams. This may be disabled with the GRPC_SERVER_METHOD_GOROUTINE_LABELS=false environment variable. Fixes grpc#9010
59a7845 to
84dda06
Compare
| if envconfig.LabelServerStreamGoroutines { | ||
| // This method always runs in its own goroutine, so we can set a | ||
| // goroutine label without needing to restore a previous context. | ||
| ctx = pprof.WithLabels(ctx, pprof.Labels("grpc.server.method", stream.Method())) |
There was a problem hiding this comment.
Can we use the key "grpc.method" to be consistent with the OpenTelemetry plugins? gRPC is moving away from OpenCensus in favour of OpenTelemetry. See https://grpc.io/docs/guides/opentelemetry-metrics/#server-instruments and https://grpc.io/docs/guides/opentelemetry-metrics/#labelsattributes
| // LabelServerStreamGoroutines controls setting [runtime/pprof.Labels] on the | ||
| // goroutines spawned to handle incoming requests on the server. | ||
| // Set "GRPC_SERVER_METHOD_GOROUTINE_LABELS" to "false" to disable. | ||
| LabelServerStreamGoroutines = boolFromEnv("GRPC_SERVER_METHOD_GOROUTINE_LABELS", true) |
There was a problem hiding this comment.
Thinking ahead to when we might have more labels, maintaining one environment variable per label seems tedious. Could we instead use a single variable whose value is a comma-separated list of key-value pairs? For example: GRPC_GO_PPROF_LABELS_ENABLED="grpc.method=false".
This would be similar to the parsing of the GODEBUG environment variable
Add support for setting a comma-separated list of goroutine labels to enable/disable. Converting this configuration from a bool to a bitfield with a default instead of a solitary boolean made it clear that there needed to be ways to robustly enable disable and toggle the behavior. Tests will be in another commit.
Add a couple test-cases that verify that the contexts include the expected goroutine labels.
Add a test covering the goroutine label configuration parsing.
Simplify the new goroutine label tests a touch by using testutils.SetEnvConfig instead of directly setting the env var package-level vars.
Use strings.EqualFold instead of strconv.ParseBool and make some simplifications and clarifications. (rename entVal to bitDesignator)
Set the subtest name with tc.name in TestGoroutineLabelsFromEnv.
Include a trailing '.' at the end of the complete sentences.
dfinkel
left a comment
There was a problem hiding this comment.
Thanks for the excellent comments!
Here's the output of the benchmarks. It looks like it's adding ~3 new allocations, which is about a 2% increase, and that's causing a similar throughput reduction. (I was hoping it would be less noticeable than that.)
% go run benchmark/benchresult/main.go unary-before unary-after
unary-networkMode_Local-bufConn_false-keepalive_false-benchTime_1m0s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_200-reqSize_100B-respSize_100B-compressor_off-channelz_false-preloader_false-clientReadBufferSize_-1-clientWriteBufferSize_-1-serverReadBufferSize_-1-serverWriteBufferSize_-1-sleepBetweenRPCs_0s-connections_1-recvBufferPool_simple-sharedWriteBuffer_true
Title Before After Percentage
TotalOps 9535615 9316870 -2.29%
SendOps 0 0 NaN%
RecvOps 0 0 NaN%
Bytes/op 9936.66 10045.13 1.10%
Allocs/op 143.39 146.40 2.09%
ReqT/op 127141533.33 124224933.33 -2.29%
RespT/op 127141533.33 124224933.33 -2.29%
50th-Lat 1.191891ms 1.229903ms 3.19%
90th-Lat 1.560204ms 1.594499ms 2.20%
99th-Lat 2.221238ms 2.261885ms 1.83%
Avg-Lat 1.257797ms 1.287178ms 2.34%
GoVersion go1.26.2 go1.26.2
GrpcVersion 1.82.0-dev 1.82.0-dev
|
The benchmark's reported QPS typically fluctuates by ±1%, even with identical code. Using the I discussed this with the other maintainers, and we concluded that enabling this by default would cause minor regressions for all users, even those who aren't profiling or using these new labels. While the extra allocations seem unavoidable, developers who need the metadata will likely be willing to pay the performance cost. Therefore, we should make this feature opt-in rather than opt-out. Also, since users currently have no way to toggle all labels at once, I think we should support special keywords like "all" and "none" as valid environment variable values to simplify this experience. |
Make it easy to opt into all goroutine labels now that we're keeping it default disabled.
Add a "none" special value that pairs with "all" so users can opt out of any goroutine labels that happen to have been enabled elsewhere.
Another commit appears to have removed this import while this change was in-flight. Add it back now that it's needed again.
|
Ok, I've updated my branch to default-disabled, updated the PR description/release-notes and merged master because the envconfig import disappeared in a merged PR which caused the tests to fail in CI. |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to set runtime/pprof labels on gRPC server goroutines, specifically enabling the grpc.method label via the GRPC_GO_SERVER_GOROUTINE_LABELS environment variable. The implementation includes a bitfield-based configuration and comprehensive tests. Feedback highlights a compatibility issue where the use of strings.SplitSeq and range-over-func syntax breaks support for Go 1.22, suggesting a fallback to strings.Split. Additionally, it was noted that the 'all' and 'none' keywords are currently only recognized as standalone values and should ideally be supported within comma-separated lists for more flexible configuration.
arjan-bal
left a comment
There was a problem hiding this comment.
LGTM! Adding a second reviewer.
|
Thanks for the thorough review! |
| // requests on the server. | ||
| // Set "GRPC_GO_SERVER_GOROUTINE_LABELS" to "grpc.method=true" to | ||
| // enable this grpc.method label, or "all" to enable all valid labels. | ||
| // This variable is a bit-field. |
There was a problem hiding this comment.
Nit: Does it make sense to get rid of this last sentence? I'm wondering if it makes sense for a user reading this docstring.
There was a problem hiding this comment.
I'm not sure.
I think it's rare that a normal user would look at the doc-comments in this package, but I added that info to this doc-comment because some of the other package-level variables in this package have similar comments.
I'm definitely open to deleting this comment or moving some of the syntax/usage info to a doc-comment on goroutineLabelsFromEnv.
@arjan-bal , what do you think?
| ctxLabels := pprofCtxCollectLabels(ctx) | ||
| if val, ok := ctxLabels["grpc.method"]; !ok { | ||
| t.Errorf("missing \"grpc.method\" label; found labels: %v", ctxLabels) | ||
| } else if expVal := "/grpc.testing.TestService/EmptyCall"; val != expVal { |
There was a problem hiding this comment.
Nit: Here and elsewhere s/expVal/wantVal
See: https://google.github.io/styleguide/go/decisions#got-before-want
| _, err := ss.Client.EmptyCall(ctx, &testpb.Empty{}) | ||
| if err != nil { |
There was a problem hiding this comment.
Nit: The assignment and the conditional can be on the same line. Here and elsewhere in this test.
Fixes #9010
To make stack-traces and some profiles more useful, change sets goroutine labels indicating which gRPC method is being handled.
Goroutine labels are inherited by child goroutines, so this provides useful context in profiles and traces for work that's been farmed out to child goroutines.
These currently show up in three places:
For Go 1.27, golang/go#76349 adds goroutine labels to tracebacks and by extension debug=2 pprof text-based profiles for go 1.27+ modules.
The naming of the goroutine label currently matches the opentelemetry RPC method tag, and has some similarities to the current proposal for goroutine tag naming for tests in golang/go#75047. I.e. this uses
grpc.method.This change avoids setting anything on the client side due to the lower utility and goroutine lifetime issues.
Include a
GRPC_GO_SERVER_GOROUTINE_LABELSenvironment variable to allow users to easily opt-in of setting these goroutine labels. The value the formgrpc.method=trueto enable a specific goroutine label, and has special values ofallandnonewhich enable and disable all registered goroutine labels respectively.RELEASE NOTES:
GRPC_GO_SERVER_GOROUTINE_LABELS=grpc.method=trueenvironment variable for just the new goroutine label orGRPC_GO_SERVER_GOROUTINE_LABELS=allto enable all goroutine labels that might be added later.noneis available for blanket disabling, as well.