-
Notifications
You must be signed in to change notification settings - Fork 3.9k
storage: fix nil pointer panic during replica changes #40961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: fix nil pointer panic during replica changes #40961
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix. I wish this stuff were easier to test.
Reviewed 3 of 3 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/storage/client_replica_test.go, line 2311 at r1 (raw file):
// We're going to have one goroutine trying to ADD the 4th node. // and another goroutine trying to move out of a joint config on both // sides and then
Sentence needs an end.
pkg/storage/client_replica_test.go, line 2320 at r1 (raw file):
require.NoError(t, err) // Manually construct the batch because the (*DB).AdminChangeReplicas does
It does, but you need this magical hack
context.WithValue(ctx, "testing", "testing"), |
c060a7b
to
f50f5d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that there's a bad case if we read a range descriptor for a different range. I added testing for that too.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/storage/client_replica_test.go, line 2311 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Sentence needs an end.
Done.
pkg/storage/client_replica_test.go, line 2320 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
It does, but you need this magical hack
context.WithValue(ctx, "testing", "testing"),
Done.
f50f5d8
to
ffccfde
Compare
Before this PR there was a bug whereby a command attempting to move a range out of a joint config would fail to find a RangeDescriptor for a range because it was racing with a merge which destroyed that range. The code used to panic. This fix is simply detecting the nil and not attempting to move out of a joint config on a range that no longer exists. Another slightly less problematic but still wrong case was if the merged range had been resplit at the same key. In this case we might attempt to modify a different range. This PR is almost exclusively testing. Fixes cockroachdb#40877. Release justification: Fixes a panic. Release note: None
ffccfde
to
1f55bb1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/storage/replica_command.go, line 1547 at r2 (raw file):
// NB: We might fail to find the range if the range has been merged away // in which case we definitely want to fail the check below. if kvDesc != nil && kvDesc.RangeID == referenceDesc.RangeID && chgs.leaveJoint() {
Good catch.
TFTR! bors r+ |
40961: storage: fix nil pointer panic during replica changes r=ajwerner a=ajwerner Before this PR there was a bug whereby a command attempting to move a range out of a joint config would fail to find a RangeDescriptor for a range because it was racing with a merge which destroyed that range. The code used to panic. This fix is simply detecting the nil and not attempting to move out of a joint config on a range that no longer exists. This PR is almost exclusively testing. Fixes #40877. Release justification: Fixes a panic. Release note: None Co-authored-by: Andrew Werner <[email protected]>
Build succeeded |
Before this PR there was a bug whereby a command attempting to move a range out
of a joint config would fail to find a RangeDescriptor for a range because it
was racing with a merge which destroyed that range. The code used to panic.
This fix is simply detecting the nil and not attempting to move out of a
joint config on a range that no longer exists.
This PR is almost exclusively testing.
Fixes #40877.
Release justification: Fixes a panic.
Release note: None