storage: don't crash when applying ChangeReplicas trigger with DeprecatedNextReplicaID #41148

Merged

nvanbenschoten
Contributor

Fixes #41145.

This bug was introduced in #40892.

This may force us to pick a new SHA for the beta. Any ChangeReplicas
Raft entry from 19.1 or before is going to crash a node without it.

Release justification: fixes a crash in mixed version clusters.

Release note: None

Contributor

@bdarnell bdarnell left a comment


Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

Contributor

@ajwerner ajwerner left a comment


Nice test. Thanks for picking this up.

Reviewed 1 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/replica_application_state_machine.go, line 675 at r1 (raw file):

		// providing a new range descriptor directly, which includes this info.
		var nextReplID roachpb.ReplicaID
		if change.Desc != nil {

I could see adding a method to the ChangeReplicasTrigger to hide this migration.
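
For illustration, the method suggested here might look roughly like the sketch below. The types are simplified stand-ins rather than the real `roachpb` definitions, the `NextReplicaID` method name is hypothetical, and the fallback to `DeprecatedNextReplicaID` when `Desc` is nil is an assumption drawn from the PR title and the quoted `change.Desc != nil` check.

```go
package main

import "fmt"

// ReplicaID, RangeDescriptor, and ChangeReplicasTrigger are simplified
// stand-ins for the roachpb types; only the fields discussed in this
// thread are modeled.
type ReplicaID int32

type RangeDescriptor struct {
	NextReplicaID ReplicaID
}

type ChangeReplicasTrigger struct {
	// Desc is assumed to be nil for triggers proposed by older nodes.
	Desc *RangeDescriptor
	// DeprecatedNextReplicaID is assumed to be what older triggers carry.
	DeprecatedNextReplicaID ReplicaID
}

// NextReplicaID is a hypothetical helper that hides the migration: it
// prefers the descriptor when present and otherwise falls back to the
// deprecated field instead of dereferencing a nil pointer.
func (c ChangeReplicasTrigger) NextReplicaID() ReplicaID {
	if c.Desc != nil {
		return c.Desc.NextReplicaID
	}
	return c.DeprecatedNextReplicaID
}

func main() {
	newStyle := ChangeReplicasTrigger{Desc: &RangeDescriptor{NextReplicaID: 7}}
	oldStyle := ChangeReplicasTrigger{DeprecatedNextReplicaID: 4}
	fmt.Println(newStyle.NextReplicaID(), oldStyle.NextReplicaID())
}
```

The trade-off is the one raised in this thread: a helper centralizes the nil check, while an inline branch keeps the migration visible at the call site that needs it.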

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/changeReplSM branch from 8f9657e to 15f5b81 Compare September 27, 2019 03:02
Contributor Author

@nvanbenschoten nvanbenschoten left a comment


bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)


pkg/storage/replica_application_state_machine.go, line 675 at r1 (raw file):

Previously, ajwerner wrote…

I could see adding a method to the ChangeReplicasTrigger to hide this migration.

This is only used here, so I think I'd rather call it out where it's needed.

craig bot pushed a commit that referenced this pull request Sep 27, 2019
41148: storage: don't crash when applying ChangeReplicas trigger with DeprecatedNextReplicaID r=nvanbenschoten a=nvanbenschoten

Co-authored-by: Nathan VanBenschoten <[email protected]>
Contributor

craig bot commented Sep 27, 2019

Build succeeded

@craig craig bot merged commit 15f5b81 into cockroachdb:master Sep 27, 2019
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Sep 27, 2019
… trigger

Fixes cockroachdb#41155.

The fix in cockroachdb#41148 avoided a crash when staging a ChangeReplicas trigger with
a DeprecatedNextReplicaID in an application batch, but there was another bug
where applying the side-effects of such a command still caused a crash. This
commit fixes the crash and extends the test added in cockroachdb#41148 to go through the
whole process of applying the command (which would have caught the second
crash as well).
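
As a rough sketch of why driving the command through both phases matters: if staging and side-effect application each read the next replica ID, a check that only exercises staging cannot notice a nil-descriptor bug in the second phase. Everything below is hypothetical (the `stage`, `applySideEffects`, and `applyAll` names and the simplified types are assumptions for illustration, not the actual CockroachDB code or test).

```go
package main

import "fmt"

// Simplified stand-ins for the roachpb types (assumptions, not the real API).
type ReplicaID int32

type RangeDescriptor struct{ NextReplicaID ReplicaID }

type ChangeReplicasTrigger struct {
	Desc                    *RangeDescriptor // assumed nil for old-style triggers
	DeprecatedNextReplicaID ReplicaID
}

// nextReplicaID tolerates old-style triggers that carry no descriptor.
func nextReplicaID(c ChangeReplicasTrigger) ReplicaID {
	if c.Desc != nil {
		return c.Desc.NextReplicaID
	}
	return c.DeprecatedNextReplicaID
}

// stage is a hypothetical stand-in for staging the trigger in a batch.
func stage(c ChangeReplicasTrigger) ReplicaID { return nextReplicaID(c) }

// applySideEffects is a hypothetical stand-in for the later phase.
func applySideEffects(c ChangeReplicasTrigger) ReplicaID { return nextReplicaID(c) }

// applyAll runs both phases and converts a panic in either one into an
// error, so a single check covers the whole application path.
func applyAll(c ChangeReplicasTrigger) (id ReplicaID, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("apply panicked: %v", r)
		}
	}()
	staged := stage(c)
	applied := applySideEffects(c)
	if staged != applied {
		return 0, fmt.Errorf("phases disagree: %d vs %d", staged, applied)
	}
	return applied, nil
}

func main() {
	old := ChangeReplicasTrigger{DeprecatedNextReplicaID: 4}
	id, err := applyAll(old)
	fmt.Println(id, err)
}
```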

Release justification: fixes a crash in mixed version clusters.

Release note: None
craig bot pushed a commit that referenced this pull request Sep 27, 2019
41171: storage: don't crash when applying side-effects of old ChangeReplicas trigger r=nvanbenschoten a=nvanbenschoten

Fixes #41155.
Fixes #41147.

The fix in #41148 avoided a crash when staging a ChangeReplicas trigger with
a DeprecatedNextReplicaID in an application batch, but there was another bug
where applying the side-effects of such a command still caused a crash. This
commit fixes the crash and extends the test added in #41148 to go through the
whole process of applying the command (which would have caught the second
crash as well).

Release justification: fixes a crash in mixed version clusters.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/changeReplSM branch October 14, 2019 03:10

Successfully merging this pull request may close these issues.

roachtest: version/mixed/nodes=5 failed
4 participants