- Dec 20, 2010
-
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Dec 17, 2010
-
-
René Nussbaumer authored
As we can't test this on master anymore (if we flag the node offline we would change master role on master) we use the first non master node we find in the configuration Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
This patch makes sure we cross verify the state the node is in with our view: power off -> Node has to be set offline modify -O no -> Node has to be powered Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* devel-2.3: QA: Run cluster-verify as part of all instance tests QA: Fix typo and add “not” ensure-dirs: Speed up when using big queues Fix gnt-cluster verify with diskless instances Conflicts: lib/cmdlib.py: Trivial qa/ganeti-qa.py: Trivial Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
This patch changes utils.NiceSort to use the built-in “sorted()” and gets rid of the intermediate list. Instead of wrapping the items ourselves, a key function is used. The caller can specify another key function (useful to sort objects by their name, e.g. “utils.NiceSort(instances, key=operator.attrgetter("name"))”. Unittests are provided. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
“gnt-cluster verify” looks at some per-instance information as well, so it should be run for each instance type QA tests. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Iustin Pop authored
For secondary node that is offline, we should not consider that the disk shutdown has failed, as it can never succeed under this cluster state and (by virtue of the fact that the secondary node is offline) the disks are already "shutdown". The patch also fixes a tiny typo. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Iustin Pop authored
data ≫ code, eom. Signed-off-by:
Iustin Pop <iustin@google.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
- Dec 16, 2010
-
-
Michael Hanselmann authored
The “ensure-dirs” script as included in Ganeti 2.3 is very slow when working with big queues requiring a change of permissions on many or all files. $ find /var/lib/ganeti/queue/ | wc -l 52354 Before this change: $ time /usr/local/lib/ganeti/ensure-dirs -f real 16m4.739s While not adressed in this patch, I'd like to record the overall ineffiency of the “ensure-dirs” script, even after this change: $ time /usr/local/lib/ganeti/ensure-dirs -f real 5m57.362s […] $ strace -e clone,execve -f -c /usr/local/lib/ganeti/ensure-dirs -f % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 50.08 5.147090 49 104774 clone 49.92 5.131094 49 104739 execve More changes will be needed. Just for comparision, a small Python snippet changing permissions on all files (“ensure-dirs” changes the owner too): $ time python -c 'import os; from ganeti import utils; [os.chmod(i, 0644) for i in utils.ListVisibleFiles("/var/lib/ganeti/queue/archive/big")]' real 0m0.605s […] Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Dec 15, 2010
-
-
Adeodato Simo authored
`gnt-cluster verify` was failing with KeyError if there was any diskless instance in the cluster. This was because _CollectDiskInfo() was not including these instances in the returned dictionary, but they were expected to be present in LUVerifyCluster.Exec(). With this commit, we ensure that the dictionary returned by _CollectDiskInfo includes entries for diskless instances as well. Signed-off-by:
Adeodato Simo <dato@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Miguel Di Ciurcio Filho authored
The error contained a typo and is slightly cumbersome. It changes from: - ERROR: node a: not enough memory on to accommodate failovers should peer node b fail to: - ERROR: node a: not enough memory to accomodate instance failovers should node b fail Signed-off-by:
Miguel Di Ciurcio Filho <miguel.filho@gmail.com> Reviewed-by:
Guido Trotter <ultrotter@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
* devel-2.3: jqueue: Keep jobs in “waitlock” while returning to queue Improve jqueue unittests Update manpages to display version 2.3 Conflicts: man/ganeti-cleaner.sgml: Removed man/ganeti-confd.sgml: Removed man/ganeti-masterd.sgml: Removed man/ganeti-noded.sgml: Removed man/ganeti-os-interface.sgml: Removed man/ganeti-rapi.sgml: Removed man/ganeti-watcher.sgml: Removed man/ganeti.sgml: Removed man/gnt-backup.sgml: Removed man/gnt-cluster.sgml: Removed man/gnt-debug.sgml: Removed man/gnt-instance.sgml: Removed man/gnt-job.sgml: Removed man/gnt-node.sgml: Removed man/gnt-os.sgml: Removed Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
Iustin Pop reported that a job's file is updated many times while it waits for locks held by other thread(s). After an investigation it was concluded that the reason was a design decision for job priorities to return jobs to the “queued” status if they couldn't acquire all locks. Changing a jobs' status or priority requires an update to permanent storage. In a high-level view this is what happens: 1. Mark as waitlock 2. Write to disk as permanent storage (jobs left in this state by a crashing master daemon are resumed on restart) 3. Wait for lock (assume lock is held by another thread) 4. Mark as queued 5. Write to disk again 6. Return to workerpool Another option originally discussed was to leave the job in the “waitlock” status. Ignoring priority changes, this is what would happen: 1. If not in waitlock 1.1. Assert state == queued 1.2. Mark as waitlock 1.3. Set start_timestamp 1.4. Write to disk as permanent storage 3. Wait for locks (assume lock is held by another thread) 4. Leave in waitlock 5. Return to workerpool Now let's assume the lock is released by the other thread: […] 3. Wait for locks and get them 4. Assert state == waitlock 5. Set state to running 6. Set exec_timestamp 7. Write to disk As this change reduces the number of writes from two per lock acquire attempt to two per opcode and one per priority increase (as happens after 24 acquire attempts (see mcpu._CalculateLockAttemptTimeouts) until the highest priority is reached), here's the patch to implement it. Unittests are updated. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
Michael Hanselmann authored
- Verify job file updates - Ensure queue lock is released while executing opcode Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Iustin Pop <iustin@google.com>
-
- Dec 14, 2010
-
-
Michael Hanselmann authored
Pylint complained that the “lambda may not be necessary”. Turns out it was right. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Michael Hanselmann authored
A new function for formatting the query results is added, ``FormatTable``. This was determined to be easier and safer than modifying the existing ``GenerateTable`` function while keeping backwards compatibility for code not yet converted. The new code makes use of the enhanced information provided by query2. Signed-off-by:
Michael Hanselmann <hansmi@google.com> Reviewed-by:
René Nussbaumer <rn@google.com>
-
Miguel Di Ciurcio Filho authored
Signed-off-by:
Miguel Di Ciurcio Filho <miguel.filho@gmail.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Balazs Lecz authored
Signed-off-by:
Balazs Lecz <leczb@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
René Nussbaumer authored
This is just a state of record field and does not necessary reflect the reality. Signed-off-by:
René Nussbaumer <rn@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
- Dec 13, 2010
-
-
Adeodato Simo authored
Signed-off-by:
Adeodato Simo <dato@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Adeodato Simo authored
This adds QA tests for the SetGroupParams operation, both for CLI and RAPI. Additionally, it adds tests for add/rename/remove groups via RAPI, which had not been included in a previous patch series. Finally, it also tests setting "alloc_policy" (and, for the CLI, "ndparams") at group creation time. Signed-off-by:
Adeodato Simo <dato@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Adeodato Simo authored
Signed-off-by:
Adeodato Simo <dato@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-
Adeodato Simo authored
This creates the /2/groups/<name>/modify resource; at the moment, only the "alloc_policy" attribute can be modified. Signed-off-by:
Adeodato Simo <dato@google.com> Reviewed-by:
Michael Hanselmann <hansmi@google.com>
-