Summary
The Kafka and MySQL large-buffer handlers delete their per-connection LRU map entry and then keep reading through the map-value pointer obtained from the earlier lookup to finish constructing the event.
Impact
This is a real lifetime bug in product eBPF code, but the impact supported by the current code is low severity. The stale data is limited to small protocol header fragments that can be copied into EVENT_TCP_LARGE_BUFFER telemetry, causing corrupted telemetry or disclosure of a few stale bytes to telemetry consumers.
Environment
- First identified in: 281748d
- Reviewed against current main commit: d51a98d
Evidence
Steps to reproduce
-
Build the BPF programs and the agent from the repository root:
make generate
make compile
-
Add temporary debug prints to the two delete-then-use sites. For MySQL, in bpf/generictracer/protocol_mysql.h, add a bpf_dbg_printk call immediately before and immediately after bpf_map_delete_elem(&mysql_state, &pid_conn->conn); so that the surrounding block reads:
if (state_data) {
    bpf_dbg_printk("mysql before delete: ptr=%llx bytes=%02x %02x %02x %02x",
                   state_data,
                   ((const unsigned char *)state_data)[0],
                   ((const unsigned char *)state_data)[1],
                   ((const unsigned char *)state_data)[2],
                   ((const unsigned char *)state_data)[3]);
    bpf_map_delete_elem(&mysql_state, &pid_conn->conn);
    bpf_dbg_printk("mysql after delete: ptr=%llx bytes=%02x %02x %02x %02x",
                   state_data,
                   ((const unsigned char *)state_data)[0],
                   ((const unsigned char *)state_data)[1],
                   ((const unsigned char *)state_data)[2],
                   ((const unsigned char *)state_data)[3]);
    __builtin_memcpy(lb->buf, state_data, sizeof(*state_data));
    lb->len = sizeof(*state_data);
}
-
In bpf/generictracer/protocol_kafka.h, add a bpf_dbg_printk call immediately before and immediately after bpf_map_delete_elem(&kafka_state, &state_key); so that the surrounding block reads:
if (state_data && state_data->message_size > 0 && (u32)state_data->message_size == bytes_len) {
    bpf_dbg_printk("kafka before delete: ptr=%llx size=%d",
                   state_data,
                   state_data->message_size);
    bpf_map_delete_elem(&kafka_state, &state_key);
    bpf_dbg_printk("kafka after delete: ptr=%llx size=%d",
                   state_data,
                   state_data->message_size);
    const s32 message_size_be = bpf_htonl(state_data->message_size);
    __builtin_memcpy(lb->buf, &message_size_be, k_kafka_hdr_message_size);
}
-
Rebuild and restart the agent so the new bpf_dbg_printk statements are loaded:
make generate
make compile
Run the agent with protocol debug enabled, if your local setup supports it, so that the large-buffer chunk contents are also printed from userspace.
-
Start a TCP fragmenting proxy that splits the protocol header into its own short read. The following Python proxy is sufficient for both MySQL and Kafka because it forwards the first N bytes as one write, sleeps briefly, then forwards the remainder:
import selectors
import socket
import sys
import time

# Usage: python3 fragment_proxy.py LISTEN_PORT TARGET_HOST TARGET_PORT SPLIT_BYTES
LISTEN_HOST = "127.0.0.1"
LISTEN_PORT = int(sys.argv[1])
TARGET_HOST = sys.argv[2]
TARGET_PORT = int(sys.argv[3])
SPLIT_BYTES = int(sys.argv[4])


def forward_once(src, dst, split_bytes):
    """Relay one recv() from src to dst, sending the first split_bytes separately."""
    data = src.recv(65535)
    if not data:
        return False  # peer closed the connection
    head = data[:split_bytes]
    tail = data[split_bytes:]
    if head:
        dst.sendall(head)
        time.sleep(0.05)  # let the receiver read the head as its own short read
    if tail:
        dst.sendall(tail)
    return True


# Accept a single client connection and connect it to the target server.
lsock = socket.socket()
lsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
lsock.bind((LISTEN_HOST, LISTEN_PORT))
lsock.listen(1)
client, _ = lsock.accept()
upstream = socket.create_connection((TARGET_HOST, TARGET_PORT))

sel = selectors.DefaultSelector()
sel.register(client, selectors.EVENT_READ, ("client", client, upstream))
sel.register(upstream, selectors.EVENT_READ, ("server", upstream, client))

while True:
    for key, _ in sel.select():
        direction, src, dst = key.data
        # Only server-to-client data is fragmented; client data passes through whole.
        split = SPLIT_BYTES if direction == "server" else 65535
        if not forward_once(src, dst, split):
            sys.exit(0)
-
Use the proxy with a split size that matches the saved header state:
python3 fragment_proxy.py 3307 127.0.0.1 3306 4
python3 fragment_proxy.py 9093 127.0.0.1 9092 4
The MySQL command makes the first 4 bytes of a server packet arrive separately from the next bytes. The Kafka command makes the 4-byte response size prefix arrive separately from the rest of the response body.
-
Drive traffic through the proxy from an instrumented client:
mysql --host 127.0.0.1 --port 3307 -u <user> -p -e 'select 1'
kcat -b 127.0.0.1:9093 -t <topic> -C -c 1
Any equivalent MySQL request or Kafka fetch is fine as long as the response passes through the fragmenting proxy.
-
Observe the BPF debug output. A successful reproduction will show the same state_data pointer being logged before and after bpf_map_delete_elem(...), with the second log still dereferencing that pointer after deletion. Example output shape:
mysql before delete: ptr=0xffff... bytes=01 00 00 01
mysql after delete: ptr=0xffff... bytes=01 00 00 01
kafka before delete: ptr=0xffff... size=128
kafka after delete: ptr=0xffff... size=128
-
Confirm in the userspace protocol-debug output for EVENT_TCP_LARGE_BUFFER that the first emitted chunk begins with the same 4 bytes that were read from the deleted map value. That is the observable consequence of the bug:
- MySQL: the first large-buffer chunk starts with the 4-byte saved header fragment.
- Kafka: the first large-buffer chunk starts with the 4-byte saved message_size field.
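To help read the four bytes that show up in the debug and telemetry output, the following sketch decodes them according to each wire format: a MySQL packet header is a 3-byte little-endian payload length followed by a 1-byte sequence id, and Kafka prefixes each message with a 4-byte big-endian size. The helper names (decode_mysql_hdr, encode_kafka_size, decode_kafka_size) are illustrative assumptions and do not come from the repository.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of the 4-byte MySQL packet header. */
struct mysql_hdr {
    uint32_t payload_len; /* 3-byte little-endian payload length */
    uint8_t seq_id;       /* 1-byte sequence id */
};

/* Hypothetical helper: interpret 4 header bytes as a MySQL packet header. */
static struct mysql_hdr decode_mysql_hdr(const uint8_t b[4]) {
    struct mysql_hdr h;
    h.payload_len = (uint32_t)b[0] | ((uint32_t)b[1] << 8) | ((uint32_t)b[2] << 16);
    h.seq_id = b[3];
    return h;
}

/* Hypothetical helper: write a Kafka message size as a big-endian prefix. */
static void encode_kafka_size(int32_t size, uint8_t out[4]) {
    assert(size >= 0); /* a Kafka size prefix is never negative */
    out[0] = (uint8_t)((uint32_t)size >> 24);
    out[1] = (uint8_t)((uint32_t)size >> 16);
    out[2] = (uint8_t)((uint32_t)size >> 8);
    out[3] = (uint8_t)size;
}

/* Hypothetical helper: read a Kafka big-endian size prefix back. */
static int32_t decode_kafka_size(const uint8_t b[4]) {
    return (int32_t)(((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                     ((uint32_t)b[2] << 8) | (uint32_t)b[3]);
}
```

For the example output shape above, the MySQL bytes 01 00 00 01 decode to a payload length of 1 with sequence id 1, and a Kafka size of 128 would appear at the start of the chunk as 00 00 00 80.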
Suggested Fix Direction
Do not read map-value state after bpf_map_delete_elem. Copy the needed header fields into stack locals before deleting the map entry, then build the large-buffer event from those locals.
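The copy-then-delete ordering can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the eBPF code: struct hdr_state, the one-slot toy map, and the build_chunk_* functions are all hypothetical stand-ins, and toy_delete poisons the slot with 0xAA to model the element being reused after deletion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 4

/* Hypothetical stand-in for the saved per-connection header state. */
struct hdr_state {
    uint8_t buf[HDR_LEN];
};
static_assert(sizeof(struct hdr_state) == HDR_LEN,
              "state must stay small enough to snapshot on the stack");

/* Toy one-slot "map" standing in for the LRU hash map. */
static struct hdr_state slot;
static int slot_used;

static void toy_update(const uint8_t hdr[HDR_LEN]) {
    memcpy(slot.buf, hdr, HDR_LEN);
    slot_used = 1;
}

static struct hdr_state *toy_lookup(void) {
    return slot_used ? &slot : NULL;
}

/* Deleting poisons the slot to model reuse of the freed element. */
static void toy_delete(void) {
    memset(&slot, 0xAA, sizeof(slot));
    slot_used = 0;
}

/* Delete-then-use: the ordering this report describes. */
static int build_chunk_buggy(uint8_t out[HDR_LEN]) {
    struct hdr_state *st = toy_lookup();
    if (!st)
        return 0;
    toy_delete();
    memcpy(out, st->buf, HDR_LEN); /* reads the value after deletion */
    return HDR_LEN;
}

/* Copy-then-delete: snapshot into a stack local, then delete. */
static int build_chunk_fixed(uint8_t out[HDR_LEN]) {
    struct hdr_state *st = toy_lookup();
    if (!st)
        return 0;
    const struct hdr_state local = *st; /* copy before the entry goes away */
    toy_delete();
    memcpy(out, local.buf, HDR_LEN); /* build the event from the local only */
    return HDR_LEN;
}
```

In this model the buggy ordering emits the poison bytes while the fixed ordering emits the saved header. In the real handlers, the equivalent change would be to copy the header bytes (MySQL) or message_size (Kafka) into locals before calling bpf_map_delete_elem.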
Acceptance Criteria
- Kafka and MySQL large-buffer paths never dereference a map-value pointer after deleting its map entry.
- Any needed header state is copied before deletion and used from stack-local storage afterward.
- Tests cover the split-header cleanup path for both protocols.
Note
I have reviewed this issue before posting it. It was identified by OpenAI Codex, and the draft was prepared with its assistance, but it may still contain mistakes, missing context, or incorrect conclusions. Please independently validate the behavior, impact, and proposed fix before acting on it.