Cluster verify: make "instance runs in wrong node" node-driven

Previously, the "instance should not be running in this node" error was
computed by verifying, for each instance, whether any node other than
its primary was running it. But this is not a well-suited approach if
we were to shard cluster verification (because, for each instance, we
won't have information whether it's running *outside* the current set
of nodes).

By reversing the logic of the check, and asking instead, for each node,
"is it running any instance for which it's not primary", we catch all
occurrences of the problem even if running sharded.

Because of this, we can also detect orphan instances at the same time
(instances that are not known in the cluster config). We warn about
them here too, and drop the later _VerifyOrphanInstances check.

Signed-off-by: Adeodato Simo <dato@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
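
The node-driven check described in the message can be illustrated with a minimal sketch. This is not the code from the patch: the function name find_misplaced_instances, the two dictionaries, and the "wrong-node" / "orphan" labels are hypothetical stand-ins chosen for illustration. It assumes the cluster config gives each instance's primary node and that each verified node reports the instances it is running.

```python
from typing import Dict, List, Tuple


def find_misplaced_instances(
    primary_of: Dict[str, str],
    running_on: Dict[str, List[str]],
) -> List[Tuple[str, str, str]]:
    """Hypothetical node-driven check (illustration only).

    primary_of: instance name -> primary node, as known to the config.
    running_on: node name -> instances that node reports as running.
                May cover only a shard of the cluster's nodes.
    Returns (node, instance, kind) tuples, where kind is "orphan" or
    "wrong-node".
    """
    problems = []
    # Iterate over nodes, not instances: every node we verify is fully
    # checked, no matter which other nodes are outside this shard.
    for node in sorted(running_on):
        for inst in sorted(running_on[node]):
            if inst not in primary_of:
                # Running, but unknown to the cluster config.
                problems.append((node, inst, "orphan"))
            elif primary_of[inst] != node:
                # Known instance running on a node that is not its primary.
                problems.append((node, inst, "wrong-node"))
    return problems


if __name__ == "__main__":
    config = {"web1": "node1", "db1": "node2"}
    reports = {"node1": ["web1", "db1"], "node2": ["ghost"]}
    for problem in find_misplaced_instances(config, reports):
        print(problem)
```

Because the outer loop is over the reported nodes, passing only a subset of nodes still yields every problem on that subset, which is the property the message relies on for sharded verification.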