backend: Reuse a configured HTTP client in /externalproxy#5515
backend: Reuse a configured HTTP client in /externalproxy#5515VijayabaskarR-06 wants to merge 1 commit into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: VijayabaskarR-06 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
This PR improves the backend /externalproxy handler by reusing a single shared http.Client backed by a tuned http.Transport, addressing potential goroutine/FD leaks from slow or hanging upstreams and improving connection reuse under load.
Changes:
- Introduces a package-level
externalProxyClientwith transport-level timeouts and a larger per-host idle connection pool. - Updates the
/externalproxyhandler to use the shared client instead of allocating a newhttp.Clientper request. - Adds regression tests for response header timeout behavior and upstream connection reuse.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backend/cmd/headlamp.go | Adds a shared, configured HTTP client and switches /externalproxy to use it. |
| backend/cmd/headlamp_test.go | Adds tests validating response-header timeout behavior and keep-alive connection reuse. |
illume
left a comment
There was a problem hiding this comment.
Thanks for these changes.
It looks like this PR has git conflicts. Can you please fix them?
How to resolve conflicts
Rebase or merge the latest main into your branch, resolve the conflicts, and push the updated branch.
Can you please address the open review comments? Once you've resolved each one, please mark it as resolved.
83ee5a2 to
aaa6c8d
Compare
illume
left a comment
There was a problem hiding this comment.
Thanks for this PR.
A few of the commits don't quite follow the project guidelines. We use Linux kernel style for git commits — have a look at the contributing guide and previous commits with git log.
Commits that need attention
backend: Address externalproxy review comments— Only one file changed insidebackend/; add a sub-area so it's clear what was touched (e.g.backend: ComponentName: description).
Commit guidelines
- Use atomic commits focused on a single change.
- Use the title format
<area>: <Description of changes>— description must start with a capital letter. - Keep the title under 72 characters (soft requirement).
- Explain the intention and why the change is needed.
- Make commit titles meaningful and describe what changed.
- Do not add code that a later commit rewrites; squash or reorder commits instead.
- Do not include
Fixes #NNin commit messages.
Good examples:
frontend: HomeButton: Fix so it navigates to homebackend: config: Add enable-dynamic-clusters flag
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
backend/cmd/headlamp.go:639
- externalProxyClient adds a ResponseHeaderTimeout, but the handler still reads the entire response body via io.ReadAll (and the upstream request is created with context.Background()). If the upstream sends headers and then stalls (or the downstream client disconnects), this goroutine can still block indefinitely and keep the connection/file descriptor open. Consider creating proxyReq with r.Context() and adding a per-request deadline/cancellation strategy for the body read (or stream the body to w instead of io.ReadAll) so slow/hung bodies can’t leak goroutines.
resp, err := externalProxyClient.Do(proxyReq) //nolint:gosec
if err != nil {
logger.Log(logger.LevelError, nil, err, "making request")
http.Error(w, err.Error(), http.StatusBadGateway)
illume
left a comment
There was a problem hiding this comment.
Thanks for working on this.
Would you mind addressing the open Copilot review comments? Please mark each comment as resolved after addressing it.
|
|
illume
left a comment
There was a problem hiding this comment.
Thanks for these changes.
Can you please have a look at the git commits to see if they meet the contribution guidelines? We use a Linux kernel style of git commits. See the contributing guide for general context, and please see previous git commits with git log for examples.
Commits that need attention
backend: Address externalproxy review comments— Only one file changed insidebackend/; add a sub-area so it's clear what was touched (e.g.backend: ComponentName: description).Potential fix for pull request finding— Missingarea: descriptionprefix — e.g.frontend: HomeButton: Fix so it navigates to homeorbackend: config: Add enable-dynamic-clusters flag.
Commit guidelines
- Use atomic commits focused on a single change.
- Use the title format
<area>: <Description of changes>— description must start with a capital letter. - Keep the title under 72 characters (soft requirement).
- Explain the intention and why the change is needed.
- Make commit titles meaningful and describe what changed.
- Do not add code that a later commit rewrites; squash or reorder commits instead.
- Do not include
Fixes #NNin commit messages.
Good examples:
frontend: HomeButton: Fix so it navigates to homebackend: config: Add enable-dynamic-clusters flag
It looks like the Contributor License Agreement (CLA) has not been signed. Kubernetes requires all contributors to sign the CLA before a PR can be merged. Could you please take a look at the EasyCLA bot comment and sign it?
About the CLA
The Kubernetes project uses the Linux Foundation EasyCLA system. Signing takes only a minute through the link in the EasyCLA bot comment. Individual contributors sign electronically; corporate contributors may need their employer to sign a Corporate CLA.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
backend/cmd/headlamp_test.go:597
- This test issues multiple HTTP requests without a context timeout. Adding a short per-request timeout will prevent the test suite from hanging indefinitely if the handler blocks or the test server fails to respond.
req, err := http.NewRequestWithContext(context.Background(), "GET", srv.URL+"/externalproxy", nil)
if err != nil {
t.Fatal(err)
}
req.Header.Set("proxy-to", upstream.URL)
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatal(err)
}
skoeva
left a comment
There was a problem hiding this comment.
hi! could you address the failing CI test? you can run make backend-lint-fix locally
9874dda to
55cd6a0
Compare
55cd6a0 to
085747f
Compare
085747f to
adaa419
Compare
illume
left a comment
There was a problem hiding this comment.
Thanks for this PR.
The open review comments from Copilot still need attention — can you have a look? Once addressed, please mark them as resolved.
adaa419 to
a99ff82
Compare
|
Addressed the Copilot comments the handler now forwards the Location header so redirect 30x responses stay usable client side, and TestExternalProxyDoesNotFollowRedirects asserts it. Also rebased onto latest main. go test ./. and golangci lint both pass. Threads resolved. /cc @illume |
illume
left a comment
There was a problem hiding this comment.
Thanks for these changes.
can you please rebase against main to remove the merge main commit?
Why this matters
Merge commits from main make the PR history harder to review. Please rebase your branch on top of the latest main instead, then update the PR with the rebased commits.
The /externalproxy handler created a new http.Client{} on every request,
with no timeout and the default transport. A slow or hanging upstream
would block the request goroutine indefinitely, leaking goroutines and
file descriptors, and the default per-host connection pool meant TCP
connections were rebuilt instead of reused under load.
Replace the per-request client with a single shared package-level
http.Client backed by a tuned http.Transport: ResponseHeaderTimeout cuts
off stalling upstreams, dial / TLS / expect-continue timeouts cover
handshake failures, and a larger idle-connection pool lets Keep-Alive
reuse sockets.
The shared client also stops following redirects. The default client
follows 30x responses automatically, so an allowed upstream could return
a redirect to a disallowed or internal address and the client would
fetch it server-side, bypassing the /externalproxy allowlist. CheckRedirect
returns http.ErrUseLastResponse so the 30x response is handed back to the
handler, which forwards its status and Location header to the caller; the
redirect can still be followed client-side but is never chased server-side.
Client.Timeout is deliberately not set: it includes body read time and
would break legitimate long streaming responses. The transport-level
timeouts protect against the slow-headers case without that side effect.
Tests cover the behaviours: TestExternalProxyResponseHeaderTimeout drives
a slow upstream and confirms the proxy returns 502 fast,
TestExternalProxyReusesConnections asserts the upstream sees roughly one
new TCP connection, and TestExternalProxyDoesNotFollowRedirects confirms
a redirect target outside the allowlist is never fetched and that the
Location header is forwarded so the caller can still follow it.
1d70768 to
47df346
Compare
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Summary
The /externalproxy handler was newing up an http.Client{} on every incoming request, with no timeout and no custom transport. That meant a slow or hanging upstream would block the request goroutine forever, slowly leaking goroutines and file descriptors, and the default transport's pool limits applied per-host so under load the same TCP connections were getting torn down and rebuilt instead of reused.
This PR replaces that per-request client with a single shared package-level http.Client backed by a tuned http.Transport. The transport sets a ResponseHeaderTimeout so stalling upstreams get cut off, dial / TLS / continue timeouts so handshake failures don't hang either, and a larger idle-connection pool so Keep-Alive can actually reuse sockets.
I deliberately avoided setting a Client.Timeout because that includes body read time and would break legitimate long streaming responses (logs, anything tailed). The transport-level timeouts protect against the slow-headers case without that side effect, which is the same approach httputil.ReverseProxy takes.
Related Issue
Fixes #5512
Changes
Steps to Test
Screenshots (if applicable)
N/A, backend-only change.
Notes for the Reviewer