Commit b544cfe0 authored by Iustin Pop

Reduce chance of ssh failures in verify cluster

The cluster verify builds a sorted list of nodes and passes that to all
the nodes (in parallel) for ssh checks. This means that for a cluster
with N nodes, there will be approximately N simultaneous connections to
the first node, then to the second node, etc. This, coupled with the
ssh daemon's “MaxStartups” parameter, can create false alarms about ssh
connectivity.

This patch randomizes the node list in the backend (so each node should
have its own order of ssh-ing to the other nodes), which should reduce
the chance of these alarms.
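
To illustrate the effect described above, here is a minimal simulation sketch. It is not part of the patch. It assumes a simplified lockstep model in which every node contacts one target per step, and the function name `peak_simultaneous_connections` and its parameters are hypothetical. It compares the worst-case number of nodes hitting the same target at once when all nodes share a sorted list versus when each node shuffles its own copy.

```python
import random
from collections import Counter


def peak_simultaneous_connections(num_nodes, shuffle, seed=0):
    """Return the largest number of nodes contacting one target in a single step.

    Simplified model (an assumption, not Ganeti's real timing): all nodes
    walk their node list in lockstep, one ssh check per step.
    """
    rng = random.Random(seed)
    nodes = ["node%d" % i for i in range(num_nodes)]

    # Build the order in which each node will contact the others.
    orders = []
    for _ in nodes:
        order = sorted(nodes)
        if shuffle:
            rng.shuffle(order)  # per-node random order, as in the patch
        orders.append(order)

    # For each step, count how many nodes target the same host.
    peak = 0
    for step in range(num_nodes):
        hits = Counter(order[step] for order in orders)
        peak = max(peak, max(hits.values()))
    return peak


if __name__ == "__main__":
    print("sorted:  ", peak_simultaneous_connections(50, shuffle=False))
    print("shuffled:", peak_simultaneous_connections(50, shuffle=True))
```

With a shared sorted list the peak equals the number of nodes, since every node contacts the same target at every step. With per-node shuffling the connections spread out, so the peak is far less likely to exceed what the ssh daemon will accept concurrently.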

Reviewed-by: ultrotter
parent 6c896e2f
@@ -30,6 +30,7 @@ import stat
 import errno
 import re
 import subprocess
+import random
 from ganeti import logger
 from ganeti import errors
@@ -200,6 +201,7 @@ def VerifyNode(what):
   if 'nodelist' in what:
     result['nodelist'] = {}
+    random.shuffle(what['nodelist'])
     for node in what['nodelist']:
       success, message = _GetSshRunner().VerifyNodeHostname(node)
       if not success:
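
One detail worth noting about the change: `random.shuffle` reorders the list in place, so `what['nodelist']` itself is modified. If a caller ever needed the original order preserved, a non-mutating variant could look like the sketch below. The helper name `shuffled_copy` is hypothetical and does not appear in the patch.

```python
import random


def shuffled_copy(items, rng=random):
    """Return a new list with the items in random order; the input is left untouched."""
    items = list(items)
    # random.sample with k == len(items) yields a random permutation as a new list.
    return rng.sample(items, len(items))


if __name__ == "__main__":
    nodelist = ["node1", "node2", "node3", "node4"]
    for node in shuffled_copy(nodelist):
        print("would check ssh to", node)
    print("original order kept:", nodelist)
```

In the patch itself the in-place shuffle is presumably fine, since the list is only iterated once for the ssh checks.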