- Feb 18, 2011
-
-
Iustin Pop authored
And also enable verbose display via the, well, verbose option. Man page and tests are updated, and the formatting is moved from 4 if statements to a data structure. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 17, 2011
-
-
Iustin Pop authored
Currently, there is at least one LU that does wrong validation of HV parameters (against all nodes, LUClusterSetParams). It's possible to fix this case, but I went and modified the base functions to filter out non-vm_capable nodes so all callers are protected. Note: the _CheckOSParams function is never called with the full node list, so modifying it shouldn't be needed. However, I think it's safe to do so (and it shouldn't hurt, as an instance's node should never lack the vm_capable bit). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
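The approach described above, filtering inside the base function so that every caller is protected, can be sketched as follows. This is a minimal illustration and not Ganeti's code: the `Node` class, `check_hv_params` and the `validate` callback are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node object."""
    name: str
    vm_capable: bool = True


def check_hv_params(nodes, hv_params, validate):
    """Validate hypervisor parameters on VM-capable nodes only.

    The filtering happens here, in the shared helper, so callers that
    pass the full node list do not have to remember to do it themselves.
    """
    vm_nodes = [node for node in nodes if node.vm_capable]
    for node in vm_nodes:
        validate(node.name, hv_params)
    return [node.name for node in vm_nodes]
```

A caller such as a cluster-wide parameter change could then pass every node and rely on the helper to skip the ones that cannot run instances.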
-
Iustin Pop authored
Since we don't have the data per design, UNAVAIL is appropriate here, while NODATA is not. The patch also adds a comment: if we extend the live fields list to contain other data in the future, we need to reevaluate this solution. This should fix issue 143.

The listing now shows (node2==offline, node3==not vm_capable):

  Node  DTotal    DFree     MTotal    MNode     MFree     Pinst Sinst
  node1 698.6G    630.5G    32.0G     1.0G      30.0G         8     7
  node2 (offline) (offline) (offline) (offline) (offline)     9     4
  node3 (unavail) (unavail) (unavail) (unavail) (unavail)     0     0

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
Non-vm_capable nodes most likely don't have a hypervisor and/or storage configured, so the call would fail anyway. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Stephen Shirley authored
The condition is already covered by the previous requirement. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Stephen Shirley authored
Prevents lots of spurious warnings like:

  2011-02-10 17:00:22,776: CRITICAL Configuration data is not consistent: Not enough master candidates: actual 3, target 4

Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Stephen Shirley authored
ECID was being calculated completely differently in __MergeNodeGroups() and _MergeConfig() Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 16, 2011
-
-
Iustin Pop authored
The “-A” (use agent) option was not documented, and instead of listing it manually, I converted the tool to optparse like the other CLI tools. I also cleaned up the usage and help texts a bit. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
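As an illustration of why moving to optparse documents the option automatically, here is a minimal sketch. Only the “-A” flag and its "use agent" meaning come from the message above; the usage string, the `use_agent` destination name and the help text are assumptions made for this example.

```python
import optparse


def build_parser():
    """Build a parser whose --help output lists every option."""
    # The usage string is hypothetical; the real tool's arguments may differ.
    parser = optparse.OptionParser(usage="%prog [options] <node...>")
    parser.add_option("-A", dest="use_agent", action="store_true",
                      default=False,
                      help="use the SSH agent for keys (assumed wording)")
    return parser
```

With this in place, `build_parser().parse_args(["-A", "node1"])` returns the parsed options and the remaining positional arguments, and the generated help text always stays in sync with the options actually defined.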
-
Iustin Pop authored
By delaying the agent key query until after the fork, we prevent the problem of simultaneous access to the agent. Tested that it works against 80 hosts in parallel without error; the current version breaks already at 20 hosts. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Feb 14, 2011
-
-
Stephen Shirley authored
This reverts commit c0711f2c. Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Stephen Shirley authored
cli.RunWhileClusterStopped() stops noded on all of the nodes in the original cluster. This prevents /etc/hosts updates on the master, and config redistribution doesn't reach the other nodes in the original cluster. As all we want to do is merge while the master is stopped, simply stop it and start it again after. Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 09, 2011
-
-
Iustin Pop authored
Currently, for both primary and secondary offline nodes, we give the same message:

  - ERROR: instance instance14: instance lives on offline node(s) node3
  - ERROR: instance instance15: instance lives on offline node(s) node3
  - ERROR: instance instance16: instance lives on offline node(s) node3
  - ERROR: instance instance17: instance lives on offline node(s) node3

This is confusing, as an offline primary is in a different category than a secondary. The patch changes the warnings to have different error messages:

  - ERROR: instance instance14: instance has offline secondary node(s) node3
  - ERROR: instance instance15: instance has offline secondary node(s) node3
  - ERROR: instance instance16: instance lives on offline node node3
  - ERROR: instance instance17: instance lives on offline node node3

Thanks to Alexander Schreiber <als@google.com> for reporting this issue. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Alexander Schreiber <als@google.com>
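The distinction between an offline primary and offline secondaries can be sketched like this. The message wording is taken from the examples above; the function name, its parameters and the comma separator for multiple secondaries are assumptions for illustration, not the actual cluster-verify code.

```python
def offline_node_errors(instance, primary, secondaries, offline):
    """Return separate messages for offline primary and secondary nodes."""
    errors = []
    if primary in offline:
        # An offline primary means the instance itself is unreachable.
        errors.append("instance %s: instance lives on offline node %s"
                      % (instance, primary))
    offline_secondaries = [node for node in secondaries if node in offline]
    if offline_secondaries:
        # An offline secondary only affects redundancy.
        errors.append("instance %s: instance has offline secondary node(s) %s"
                      % (instance, ", ".join(offline_secondaries)))
    return errors
```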
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Currently, cluster-verify says:

  - ERROR: instance instance14: couldn't retrieve status for disk/0 on node3: node offline
  - ERROR: instance instance14: instance lives on offline node(s) node3
  - ERROR: instance instance15: couldn't retrieve status for disk/0 on node3: node offline
  - ERROR: instance instance15: instance lives on offline node(s) node3

This is redundant as the “lives on offline node” message should be all we need to understand the cluster situation. The patch fixes this and also corrects a very old idiom. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Stephen Shirley <diamond@google.com>
-
Iustin Pop authored
Currently, cluster verify shows N+1 warnings for offline nodes that have any redundant instances: the memory data that we have for those nodes is zero, so any instance will trigger the warning. As the comment says, we already list secondary instances on offline nodes, so that warning is enough, and we skip the N+1 one. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Stephen Shirley <diamond@google.com>
-
- Feb 08, 2011
-
-
Stephen Shirley authored
The current code gives:

  Failure: prerequisites not met for this operation: error type: wrong_input, error details: Selection filter does not match any instances

Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 04, 2011
-
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Stephen Shirley authored
This is needed so cluster-merge can add nodes from other clusters. Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Stephen Shirley authored
The current line tries to unpack a dict incorrectly. Signed-off-by:
Stephen Shirley <diamond@google.com> Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
Also bump up the version. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Iustin Pop authored
Hopefully this can be fixed before the final 2.4 release… Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Stephen Shirley <diamond@google.com>
-
- Feb 03, 2011
-
-
Iustin Pop authored
Currently, the export timeout is 10 times 20 seconds, but the import is only 30 seconds. I'm raising this to 60 seconds with two goals in mind:

  - when debugging manually, this allows for easier synchronisation of the processes
  - 60 seconds equals three full 20-second intervals, which I think is better than just one and a half

This change shouldn't make a big difference either way (at most, it will possibly delay the job in case of failures by half a minute). Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
In case of failures, the recent daemon output is logged as %r on a list of unicode strings, which results in the (ugly):

  Thu Feb 3 05:13:34 2011 snapshot/0 failed to send data: Exited with status 1 (recent output: [u' DUMP: Date of this level 0 dump: Thu Feb 3 05:13:18 2011', u' DUMP: Dumping /dev/mapper/6369a5f7-1e67-4d0d-a4f0-956b3649c6d7.disk0_data.snap-1 (an unlisted file system) to standard output', u' DUMP: Label: none', u' DUMP: Writing 10 Kilobyte records', u' DUMP: mapping (Pass I) [regular files]', u' DUMP: mapping (Pass II) [directories]', u' DUMP: estimated 54301 blocks.', u' DUMP: Volume 1 started with block 1 at: Thu Feb 3 05:13:19 2011', u' DUMP: dumping (Pass III) [directories]', u' DUMP: dumping (Pass IV) [regular files]', u'socat: E SSL_write(): Connection reset by peer', u"dd: dd: writing `standard output': Broken pipe", u' DUMP: Broken pipe', u' DUMP: The ENTIRE dump is aborted.'])

This patch joins this list and makes it a non-unicode string, thus resulting in the more readable (and ~10% shorter):

  Thu Feb 3 05:16:04 2011 snapshot/0 failed to send data: Exited with status 1 (recent output: DUMP: Date of this level 0 dump: Thu Feb 3 05:15:58 2011\n DUMP: Dumping /dev/mapper/6369a5f7-1e67-4d0d-a4f0-956b3649c6d7.disk0_data.snap-1 (an unlisted file system) to standard output\n DUMP: Label: none\n DUMP: Writing 10 Kilobyte records\n DUMP: mapping (Pass I) [regular files]\n DUMP: mapping (Pass II) [directories]\n DUMP: estimated 54350 blocks.\n DUMP: Volume 1 started with block 1 at: Thu Feb 3 05:15:59 2011\n DUMP: dumping (Pass III) [directories]\nsocat: E SSL_write(): Connection reset by peer\ndd: dd: writing `standard output': Broken pipe\n DUMP: Broken pipe\n DUMP: The ENTIRE dump is aborted.)

Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
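The idea of joining the output lines into one string before logging can be sketched as below. This is an illustration written for Python 3 (where the u'' prefixes from the log above no longer appear), not the code of the patch; the function name and the choice of the `unicode_escape` codec to keep the message on a single line are assumptions.

```python
def format_recent_output(lines):
    """Join daemon output lines into a single-line, escaped string.

    Newlines become a literal backslash-n, so the whole message stays on
    one log line without the brackets and quotes of a list repr.
    """
    joined = "\n".join(lines)
    return joined.encode("unicode_escape").decode("ascii")
```

Compared with formatting the list via `%r`, this drops the per-item quoting and the list brackets, which is where the length saving mentioned above comes from.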
-
Iustin Pop authored
This adds a message and nice handling of ^C, especially useful for ``gnt-job watch``. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
Michael Hanselmann authored
* devel-2.3:
  backend: Disable compression in export info file
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
The new import/export infrastructure in Ganeti 2.2 and up handles compression differently. It no longer writes compressed files to the destination. Unfortunately changing this behaviour would be non-trivial, so in the meantime setting “compression = none” will hopefully avoid some confusion. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Feb 02, 2011
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
This function can be used from a SIGHUP handler to reopen log files. Initial, simple unittests are included. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
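A function that reopens log files from a SIGHUP handler, as described above, could look roughly like this with Python's standard logging module. All names here (`ReopenableFileHandler`, `reopen_log_files`, `install_sighup_handler`) are hypothetical and chosen for this sketch; it is not the function added by the commit.

```python
import logging
import signal


class ReopenableFileHandler(logging.FileHandler):
    """File handler that can close and reopen its log file on demand."""

    def reopen(self):
        self.acquire()
        try:
            if self.stream is not None:
                self.stream.flush()
                self.stream.close()
            # _open() reopens self.baseFilename in the handler's mode.
            self.stream = self._open()
        finally:
            self.release()


def reopen_log_files(logger=None):
    """Reopen every reopenable handler attached to the given logger."""
    target = logger if logger is not None else logging.getLogger()
    for handler in target.handlers:
        if isinstance(handler, ReopenableFileHandler):
            handler.reopen()


def install_sighup_handler(logger=None):
    """Reopen log files whenever the process receives SIGHUP."""
    signal.signal(signal.SIGHUP,
                  lambda signum, frame: reopen_log_files(logger))
```

This is the usual pattern for cooperating with log rotation: the rotation tool renames the old file and sends SIGHUP, and the daemon then starts writing to a fresh file under the original name.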
-
Michael Hanselmann authored
It's passed in by most users (daemons, CLI scripts) and for the others (burnin, watcher) it certainly doesn't hurt, especially when using syslog. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
The I/O error will occur while opening the file, not while adding and configuring the handler. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
  - Use constant for exit value
  - Configure logging from main function, not from class' “__init__”
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Instead of using its own, burnin can use cli.SetGenericOpcodeOpts. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
- Feb 01, 2011
-
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Stephen Shirley authored
Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Stephen Shirley authored
This allows calling of _UnlockedLookupNodeGroup() from within AddNodeGroup() Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
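The locked/unlocked split mentioned above follows a common pattern: the public method takes the lock and delegates to an "unlocked" helper, so other methods that already hold the lock can call the helper without deadlocking. The sketch below illustrates the pattern only; the class, the snake_case method names and the duplicate-name check in `add_node_group` are assumptions, not Ganeti's implementation.

```python
import threading


class ConfigSketch:
    """Hypothetical config holder demonstrating locked/unlocked helpers."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant on purpose
        self._groups = {}  # maps group UUID -> group name

    def _unlocked_lookup_node_group(self, target):
        """Resolve a UUID or name to a UUID; caller must hold the lock."""
        if target in self._groups:
            return target
        for uuid, name in self._groups.items():
            if name == target:
                return uuid
        return None

    def lookup_node_group(self, target):
        """Public lookup: acquires the lock, then delegates."""
        with self._lock:
            return self._unlocked_lookup_node_group(target)

    def add_node_group(self, uuid, name):
        """Add a group, reusing the unlocked lookup while holding the lock."""
        with self._lock:
            if self._unlocked_lookup_node_group(name) is not None:
                raise ValueError("node group %r already exists" % name)
            self._groups[uuid] = name
```

Calling `lookup_node_group` from inside `add_node_group` would block forever on a non-reentrant lock, which is the problem the unlocked variant avoids.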
-
- Jan 31, 2011
-
-
Stephen Shirley authored
Also fix type of Merger.cluster_name from list to string. This would have triggered an error in sshRunner if cluster keys were in use. Signed-off-by:
Stephen Shirley <diamond@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-