-
Notifications
You must be signed in to change notification settings - Fork 39k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added serialization from etcd error metric #114376
Added serialization from etcd error metric #114376
Conversation
Hi @baomingwang. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/assign @deads2k @lavalamp |
/ok-to-test |
/priority important-soon |
should we have a test that flips a bit in etcd, then calls storage accessor methods and ensures this metric increments? also, we decode in paths other than get and list (on every write, in watch, etc). should this metric get incremented if decode errors on existing data are encountered in those methods? |
I did some local testing to have targeting metric storage_decode_errors emitted by replace protoEncodingPrefix from []byte{0x6b, 0x38, 0x73, 0x00} to []byte{0x6b, 0x38, 0x73, 0x01}.
Is it enough?
Really nice callout. You remind me that it would be better to capture decode error inside of |
a16529c
to
88631a3
Compare
/lgtm will wait for @liggitt to remove hold! |
LGTM label has been added. Git tree hash: d9afc39922b772c077743e9e96bc034811c69380
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plumbing data to decode and recording the metric / error there makes sense. If we're intending to log the actual path in etcd, we should use preparedKey
in a bunch of places
88631a3
to
105e5f4
Compare
105e5f4
to
cac50a6
Compare
cac50a6
to
92c490f
Compare
/retest |
/lgtm |
LGTM label has been added. Git tree hash: 3ba83b0c0ca16d0f6403c6eaad6c68188c7aca18
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: baomingwang, dims, liggitt, logicalhan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass. This bot retests PRs for certain kubernetes repos according to the following rules:
You can:
/retest |
Cherry-pick of upstream Kubernetes PR kubernetes#114376
What type of PR is this?
/kind bug
/sig api-machinery
What this PR does / why we need it:
To revive previous PR #90612 for capturing data corruption issue #69579
Able to manually flip one bit and have targeting metric
storage_decode_errors
emitted by replaceprotoEncodingPrefix
from[]byte{0x6b, 0x38, 0x73, 0x00}
to[]byte{0x6b, 0x38, 0x73, 0x01}
.kube-apiserver log with --v=4 output
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: