- Dec 01, 2010
-
-
Iustin Pop authored
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
Iustin Pop authored
For now, we don't support instances allocated across two groups, and we will reject such clusters. The isClusterConsistent function will return a list of inconsistent instances, potentially allowing operation on the rest of the cluster without touching them.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
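A minimal sketch of the kind of check described above, written in Python for illustration rather than in the project's Haskell. The `Instance` record, the `node_group` mapping and the name `inconsistent_instances` are assumptions made for this example; they are not the actual isClusterConsistent signature.

```python
from typing import Dict, List, NamedTuple, Optional


class Instance(NamedTuple):
    """Hypothetical instance record: name plus primary/secondary node names."""
    name: str
    pnode: str
    snode: Optional[str]  # None for instances without a secondary node


def inconsistent_instances(node_group: Dict[str, str],
                           instances: List[Instance]) -> List[Instance]:
    """Return the instances whose primary and secondary nodes are in different groups."""
    bad = []
    for inst in instances:
        if inst.snode is None:
            continue  # a single-node instance cannot span two groups
        if node_group[inst.pnode] != node_group[inst.snode]:
            bad.append(inst)
    return bad


# Illustrative data: node3 is in a different group than node1 and node2.
node_group = {"node1": "uuid-a", "node2": "uuid-a", "node3": "uuid-b"}
instances = [
    Instance("inst1", "node1", "node2"),
    Instance("inst2", "node1", "node3"),
    Instance("inst3", "node3", None),
]
split = inconsistent_instances(node_group, instances)
```

A caller could reject the cluster when the returned list is non-empty, or exclude the listed instances and continue with the rest.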
-
Iustin Pop authored
Unit tests included. The function will be needed for consistency checks in the algorithms.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
Iustin Pop authored
This is to potentially allow easier changes later.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
- Nov 24, 2010
-
-
Iustin Pop authored
This depends on future support from Ganeti (2.4+).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
Iustin Pop authored
This makes the code incompatible with JSON files from Ganeti pre-2.4.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
Iustin Pop authored
Compatibility with old text files is kept by using the default UUID if the file (or even just some of its records) lacks a UUID.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
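The fallback described above can be sketched as follows, in Python for illustration. The two- or three-field, pipe-separated record layout and the value of `DEFAULT_GROUP_UUID` are assumptions for this example and do not reproduce the real text file format.

```python
# Placeholder value, assumed for this sketch only.
DEFAULT_GROUP_UUID = "00000000-0000-0000-0000-000000000000"


def parse_node_line(line):
    """Parse 'name|memory' (old style) or 'name|memory|group-uuid' (new style)."""
    fields = line.strip().split("|")
    if len(fields) == 2:
        # Old-style record without a UUID: fall back to the default.
        name, mem = fields
        group = DEFAULT_GROUP_UUID
    elif len(fields) == 3:
        name, mem, group = fields
    else:
        raise ValueError("expected 2 or 3 fields, got %d" % len(fields))
    return {"name": name, "mem": int(mem), "group": group}


old_record = parse_node_line("node1|4096")
new_record = parse_node_line("node2|8192|uuid-b")
```

Because the decision is made per record, a file that mixes old and new records still loads.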
-
Iustin Pop authored
This makes the code incompatible with Ganeti pre-2.4.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
Iustin Pop authored
This is not used anywhere yet, and the backends are all just adding the default UUID, not the real one. The patch also allows displaying the group UUID in the node list.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
Iustin Pop authored
This will be used as a placeholder for the cases when we need a UUID (any UUID), but we don't have one handy.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
-
- Nov 23, 2010
-
-
Iustin Pop authored
This does just two passes, instead of three, over the list. This reduces the overall runtime noticeably (~25%) in some tests, but the gain is not reproducible under profiling, so I don't know how much the function itself is being sped up. Note: this is written via `seq`s, and not BangPatterns. Since it's just one case, adding BangPatterns just for it wasn't a big gain. Thanks to Lécz Balázs for the impetus to improve this!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
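The commit does not name the function being optimised, so the following is only a generic illustration of going from three passes to two. It assumes a standard deviation computation, where a naive version walks the list once for the length, once for the sum and once for the squared deviations. It is written in Python, so the strictness concerns behind `seq` and BangPatterns do not apply here.

```python
import math


def std_dev(values):
    """Population standard deviation in two passes over `values` (a sequence)."""
    # Pass 1: accumulate the count and the sum together,
    # instead of calling len() and sum() as two separate traversals.
    count = 0
    total = 0.0
    for v in values:
        count += 1
        total += v
    if count == 0:
        raise ValueError("std_dev of an empty sequence")
    mean = total / count

    # Pass 2: sum of squared deviations from the mean.
    squared = 0.0
    for v in values:
        squared += (v - mean) ** 2
    return math.sqrt(squared / count)


result = std_dev([2, 4, 4, 4, 5, 5, 7, 9])
```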
-
- Nov 19, 2010
-
-
Iustin Pop authored
While we don't actually have IO code in the Simu loader, we do have the same interface. So we move the code again to a separate parseData function which is exported.
-
Iustin Pop authored
-
Iustin Pop authored
The change is similar to the text loader change.
-
Iustin Pop authored
This change, which will be followed by similar changes in the other loaders, splits the parsing of the data from the actual loading from disk. Since the parsing doesn't usually involve IO actions, we will be able to better test the parsing. The loading becomes a smaller part of the code and thus inability to test it has a smaller impact.
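A minimal sketch of this split, in Python for illustration. The names `parse_data` and `load_data` and the two-field record layout are assumptions for this example; only the idea of separating the pure parsing step from the IO step comes from the commit message.

```python
def parse_data(text):
    """Pure step: turn file contents into (name, memory) records. No IO here."""
    records = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("|")
        if len(fields) != 2:
            raise ValueError("line %d: expected 2 fields, got %d"
                             % (lineno, len(fields)))
        records.append((fields[0], int(fields[1])))
    return records


def load_data(path):
    """IO step: read the file, then delegate everything else to parse_data."""
    with open(path, encoding="utf-8") as handle:
        return parse_data(handle.read())


# The parser can be exercised directly on a string, without touching the disk.
parsed = parse_data("node1|1024\n\nnode2|2048\n")
```

Only `load_data` needs a real file, so the untested part shrinks to a thin wrapper.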
-
- Nov 11, 2010
-
-
Iustin Pop authored
This breaks compatibility with Ganeti pre-2.3.
-
- Nov 09, 2010
-
-
Iustin Pop authored
Currently, the tag exclusion metric has a weight of one, which means there might be cases where we won't move instances around because it upsets the cluster metrics. However, we do want to make a higher effort for cleaning up tag collisions, so we increase the weight to an empirically-determined value of 2.
-
- Oct 07, 2010
-
-
Iustin Pop authored
-
- Oct 06, 2010
-
-
Iustin Pop authored
-
- Sep 03, 2010
-
-
Iustin Pop authored
Also adds them in hbal.
-
Iustin Pop authored
Recent hbal seems to run many steps for small improvements (< 1e-3), so we should stop early in this case. We add a new option (-g), that will be used for the minimum gain during balancing. This check will only become active when the cluster score is below a threshold (--min-gain-limit), so as to not stop rebalances too early.
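The stopping rule described above can be sketched as a small predicate, in Python for illustration. The function name and the choice to compare the score before the step against the limit are assumptions; the numeric values in the usage lines are illustrative only and are not the tool's defaults.

```python
def gain_too_small(old_score, new_score, min_gain, min_gain_limit):
    """Return True when balancing should stop because the last step gained too little.

    The check is only active once the cluster score is below min_gain_limit,
    so that a badly balanced cluster is not abandoned too early.
    """
    if old_score >= min_gain_limit:
        return False  # score still high: keep going regardless of the gain
    return (old_score - new_score) < min_gain


# Illustrative values (lower scores are better).
high_score_small_gain = gain_too_small(5.0, 4.9995, min_gain=1e-3, min_gain_limit=1e-1)
low_score_small_gain = gain_too_small(0.05, 0.0495, min_gain=1e-3, min_gain_limit=1e-1)
low_score_good_gain = gain_too_small(0.05, 0.04, min_gain=1e-3, min_gain_limit=1e-1)
```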
-
- Sep 02, 2010
-
-
Iustin Pop authored
These are just variations of the standard debug, but are provided for simpler code, since laziness can cause debug statements to never be evaluated.
-
Iustin Pop authored
The addition of a new secondary on a node performs two memory tests:
- in strict mode, reject if we get into N+1 failure
- reject if the new instance memory is greater than the free memory (not available memory) on the node

The last check is designed to ensure that, irrespective of the other secondary instances on this node, we are able to failover/migrate the newly-added instance. However, we should allow the addition if the instance comes from an offline node, which doesn't offer anything (not even disk replication). Therefore this patch makes this check conditional on the strict mode.
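A sketch of the resulting behaviour, in Python for illustration: after this patch both tests are applied only in strict mode. The function name, its parameters and the failure labels are hypothetical, and the order of the two tests is an assumption.

```python
def check_add_secondary(node_free_mem, inst_mem, fails_n1_after, strict=True):
    """Return None if the secondary can be added, else a failure label.

    fails_n1_after: whether the node would fail N+1 once the instance is added.
    """
    if not strict:
        # Non-strict mode, e.g. the instance comes from an offline node:
        # neither memory test is applied.
        return None
    if fails_n1_after:
        return "FailN1"
    if inst_mem > node_free_mem:
        # The instance could not be failed over/migrated to this node.
        return "FailMem"
    return None


strict_result = check_add_secondary(1024, 2048, False, strict=True)
relaxed_result = check_add_secondary(1024, 2048, False, strict=False)
```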
-
- Aug 30, 2010
-
-
Iustin Pop authored
The Cluster.iterateAlloc and tieredAlloc functions are changed to also return the updated instance list, since it is needed to have a “full” cluster view.
-
Iustin Pop authored
This is currently hardcoded in an internal function in hscan.hs, and we move it to Text.hs for later use.
-
- Aug 25, 2010
-
-
Iustin Pop authored
This option will in the future be used to serialize the cluster state in hbal and hspace after the rebalance/allocation steps.
-
Iustin Pop authored
This checks that the Node text serialization and deserialization operations are idempotent when combined with each other.
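A round-trip check of this kind can be sketched as below, in Python for illustration. The `Node` fields, the pipe-separated format and the hand-rolled random loop are assumptions for this example; the project's own tests are written in Haskell.

```python
import random
from typing import NamedTuple


class Node(NamedTuple):
    """Hypothetical node record with a few representative fields."""
    name: str
    total_mem: int
    free_mem: int
    offline: bool


def serialize_node(node):
    return "|".join([node.name, str(node.total_mem), str(node.free_mem),
                     "Y" if node.offline else "N"])


def deserialize_node(line):
    name, total_mem, free_mem, offline = line.split("|")
    return Node(name, int(total_mem), int(free_mem), offline == "Y")


def check_round_trip(trials=200, seed=0):
    """Return True if deserialize(serialize(node)) == node for random nodes."""
    rng = random.Random(seed)
    for _ in range(trials):
        total = rng.randrange(1, 1 << 20)
        node = Node("node%d.example.com" % rng.randrange(1000),
                    total, rng.randrange(0, total + 1), rng.random() < 0.5)
        if deserialize_node(serialize_node(node)) != node:
            return False
    return True


round_trip_ok = check_round_trip()
```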
-
Iustin Pop authored
Currently, the hostnames are almost fully arbitrary chars, which breaks the assumption that nodes/instances will be normal DNS hostnames. This patch adds some custom generators for these hostnames, that will allow better testing of text loader serialization/deserialization.
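A sketch of a generator restricted to DNS-style names, in Python for illustration. The label alphabet, the length limits and the function names are assumptions for this example and are not taken from the patch.

```python
import random
import string

# Characters allowed at the start and end of a label (assumed: lowercase only).
LABEL_CHARS = string.ascii_lowercase + string.digits


def random_label(rng, max_len=8):
    """One DNS-style label: alphanumeric ends, hyphens allowed only inside."""
    length = rng.randint(1, max_len)
    chars = [rng.choice(LABEL_CHARS)]
    for _ in range(length - 2):
        chars.append(rng.choice(LABEL_CHARS + "-"))
    if length > 1:
        chars.append(rng.choice(LABEL_CHARS))
    return "".join(chars)


def random_hostname(rng, max_labels=4):
    """Join one to max_labels labels with dots."""
    count = rng.randint(1, max_labels)
    return ".".join(random_label(rng) for _ in range(count))


rng = random.Random(42)
hostnames = [random_hostname(rng) for _ in range(100)]
```

Names produced this way never contain a field separator such as `|`, so they cannot corrupt a delimited text record on their own.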
-
- Aug 24, 2010
-
-
Iustin Pop authored
Currently these are in hscan, and cannot be reused easily.
-
- Jul 27, 2010
-
-
Iustin Pop authored
Currently we show the instance index, but this makes no sense outside the current running program. Instead, we show the instance name.
-
- Jul 22, 2010
-
-
Iustin Pop authored
-
Iustin Pop authored
-
- Jul 21, 2010
-
-
Iustin Pop authored
This is needed so that in the coverage report we list all modules, even the ones we don't test at all, such that we get the complete results.
-
Iustin Pop authored
Currently, this metric tracks the nodes failing the N+1 check. While this helps (in some cases) to evacuate such nodes, it's not a good metric since it will rarely change during a step (only when the last instance moves away). Therefore we replace it with the count of instances living on such nodes, which is much better because:
- moving an instance away while the node is still N+1 failing will still reflect in the score as an optimization
- moving the last instance causing an N+1 failure will result in a heavy decrease of this score, thus giving the right bonus to clear this status
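The difference between the two metrics can be illustrated with a small example, in Python. The dictionary layout and the function names are assumptions; the point is only that the old count stays flat while the new one drops when an instance leaves a node that is still failing N+1.

```python
def failing_node_count(nodes):
    """Old metric: number of nodes failing the N+1 check."""
    return sum(1 for node in nodes if node["fails_n1"])


def instances_on_failing_nodes(nodes):
    """New metric: number of instances living on N+1 failing nodes."""
    return sum(len(node["instances"]) for node in nodes if node["fails_n1"])


# node1 fails N+1 both before and after moving instance "c" to node2.
before = [
    {"name": "node1", "fails_n1": True, "instances": ["a", "b", "c"]},
    {"name": "node2", "fails_n1": False, "instances": ["d"]},
]
after = [
    {"name": "node1", "fails_n1": True, "instances": ["a", "b"]},
    {"name": "node2", "fails_n1": False, "instances": ["d", "c"]},
]

old_metric = (failing_node_count(before), failing_node_count(after))
new_metric = (instances_on_failing_nodes(before), instances_on_failing_nodes(after))
```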
-
Iustin Pop authored
Currently all metrics have the same weight (we just sum them together). However, for the hard constraints (N+1 failures, offline nodes, etc.) we should handle the metrics differently based on their meaning. For example, an instance living on a primary offline node is worse than an instance having its secondary node offline, which in turn is worse than an instance having its secondary node failing N+1. To express this case in our code, we introduce a table of weights for the metrics, with which we can influence their relative importance.
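A weighted score of this kind can be sketched as follows, in Python for illustration. The metric names and the weight values are hypothetical; only their relative order (offline primary worse than offline secondary, worse than N+1 failing secondary) follows the text above.

```python
# Hypothetical weights: only the ordering reflects the commit message.
METRIC_WEIGHTS = {
    "offline_primary": 16.0,
    "offline_secondary": 4.0,
    "n1_failing_secondary": 1.0,
}


def cluster_score(metrics):
    """Weighted sum of the metric values (lower is better)."""
    return sum(METRIC_WEIGHTS[name] * value for name, value in metrics.items())


# One affected instance in each situation.
score_offline_primary = cluster_score({"offline_primary": 1})
score_offline_secondary = cluster_score({"offline_secondary": 1})
score_n1_secondary = cluster_score({"n1_failing_secondary": 1})
```

With equal weights the three situations would score the same, so the balancer would have no reason to prefer fixing the worst one first.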
-
Iustin Pop authored
This patch switches the applyMove function to the extended versions of Node.addPri and addSec, and passes the override flag based on the state of the node that we're moving away from.
-
Iustin Pop authored
In case an instance is living on an offline node, it doesn't make sense to refuse moving it because that would create N+1 failures; failing N+1 is still much better than not running at all. Similarly, if the secondary node of an instance is offline, the instance doesn't have any redundancy, which is worse than having a secondary that is failing N+1: such a node could not accept the instance as primary, but it still provides redundancy for it. To allow this, we rename Node.addPri to addPriEx and introduce an extra parameter (addPri is a partial application of addPriEx and keeps the same signature). Node.addSec gets the same treatment.
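The API shape described above, an extended function with an override flag plus a partially applied default, can be sketched in Python with `functools.partial`. The node representation, the N+1 rule based on `reserved_mem` and the exception type are assumptions for this illustration, not the real Node module.

```python
from functools import partial


class AllocationError(Exception):
    """Raised when the instance cannot be placed on the node."""


def add_pri_ex(force, node, inst_mem):
    """Add a primary instance; `force` overrides the N+1 check only."""
    if inst_mem > node["free_mem"]:
        raise AllocationError("not enough free memory")
    new_free = node["free_mem"] - inst_mem
    # Assumed N+1 rule: free memory must still cover the reserved memory.
    if new_free < node["reserved_mem"] and not force:
        raise AllocationError("N+1 failure")
    return {**node, "free_mem": new_free}


# The strict variant keeps the old two-argument signature.
add_pri = partial(add_pri_ex, False)

node = {"free_mem": 4096, "reserved_mem": 3072}
forced = add_pri_ex(True, node, 2048)  # allowed despite the N+1 failure
```

Existing callers keep using `add_pri(node, inst_mem)`, while code that moves instances off an offline node can pass the override explicitly.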
-
- Jul 19, 2010
-
-
Iustin Pop authored
This was only used in one place (hbal), and was made obsolete by the change to the dual name/alias structure.
-
Iustin Pop authored
This was a regression from the name handling changes, as we started using the original names for the solution list (which is not designed for parsing/feeding back into Ganeti).
-
Iustin Pop authored
printSolution is no longer used, as we print the solution iteratively now.
-