1. 27 Jan, 2009 1 commit
      Implement disk verify checks in config verify · 332d0e37
      This patch adds a simple check that the 'mode' attribute of top-level disks is
      correct. It does not recurse over children.
      The framework could be extended with other checks in the future.
      Reviewed-by: imsnah
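      A minimal sketch of the top-level check described in 332d0e37. The dict
      layout, the function name and the set of valid modes ("ro", "rw") are
      assumptions made for illustration, not the actual Ganeti code.

```python
VALID_DISK_MODES = ("ro", "rw")  # assumed set of valid access modes


def verify_disk_modes(instances):
    """Return error strings for top-level disks with an invalid 'mode'.

    instances: list of dicts with 'name' and 'disks' keys (hypothetical
    stand-ins for the real instance objects). Children are not visited.
    """
    errors = []
    for instance in instances:
        for idx, disk in enumerate(instance["disks"]):
            mode = disk.get("mode")
            if mode not in VALID_DISK_MODES:
                errors.append("Instance '%s' has invalid mode '%s' on disk %d"
                              % (instance["name"], mode, idx))
    return errors
```

      Because the loop never descends into a disk's children, an invalid
      mode on a child device goes unreported, matching the limitation stated
      in the commit message.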
  2. 23 Jan, 2009 2 commits
      Relax the restrictions on temporary DRBD minors · 79b26a7a
      Currently the restrictions are too harsh: there is a time interval
      between when an instance gets a new disk and when it is added to the
      configuration during which the restriction is not met. We solve this by
      allowing temporary DRBD minors to match existing minors (for the same
      instance), such that parallel creations/minor allocations are OK.
      The change is done by moving the addition of temporary minors to the
      minor map after the instance minors are computed, and only considering
      them duplicates if the instance name doesn't match.
      Reviewed-by: ultrotter
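      The relaxed rule from 79b26a7a can be sketched roughly as follows. The
      function name and the input shapes (tuples and a dict keyed by
      (node, minor)) are hypothetical; only the ordering and the
      instance-name comparison come from the commit message.

```python
def compute_drbd_map(instance_minors, temporary_minors):
    """Build {node: {minor: instance}} plus a list of duplicates.

    instance_minors: iterable of (node, minor, instance_name) tuples taken
        from the saved configuration.
    temporary_minors: dict {(node, minor): instance_name} of reservations
        that are not yet part of the configuration.
    """
    minor_map = {}
    duplicates = []
    # Configured minors go in first; any clash here is a real duplicate.
    for node, minor, instance in instance_minors:
        used = minor_map.setdefault(node, {})
        if minor in used:
            duplicates.append((node, minor, instance, used[minor]))
        else:
            used[minor] = instance
    # Temporary minors are added afterwards. A clash only counts as a
    # duplicate when the existing minor belongs to a different instance.
    for (node, minor), instance in temporary_minors.items():
        used = minor_map.setdefault(node, {})
        if minor in used and used[minor] != instance:
            duplicates.append((node, minor, instance, used[minor]))
        else:
            used[minor] = instance
    return minor_map, duplicates
```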
      Introduce more configuration consistency checks · 4a89c54a
      This patch enhances the duplicate DRBD minors checks (currently just a
      few) and adds automatic checks of configuration consistency at
      configuration file writing time.
      In order to do so and show meaningful error messages, the
      _UnlockedComputeDRBDMap function is changed to not raise errors in case
      of duplicates, but instead return both the minors map and the duplicate
      list, and its callers now raise the error. This allows the VerifyConfig
      function to return a complete list of duplicates.
      The new checks required some small updates to the unittests for the
      config module.
      Reviewed-by: ultrotter
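      As an illustration of the calling convention described in 4a89c54a, the
      sketch below has the map computation return duplicates instead of
      raising, so that one caller can raise and another can report every
      problem. All names here are hypothetical and the data layout is
      simplified.

```python
class ConfigurationError(Exception):
    """Raised when the configuration is inconsistent."""


def compute_minor_map(entries):
    """Return (minor_map, duplicates) without raising on duplicates."""
    minor_map = {}
    duplicates = []
    for node, minor, instance in entries:
        used = minor_map.setdefault(node, {})
        if minor in used:
            duplicates.append((node, minor, instance, used[minor]))
        else:
            used[minor] = instance
    return minor_map, duplicates


def get_minor_map_or_raise(entries):
    """Caller that needs a clean map: it raises on the first sign of trouble."""
    minor_map, duplicates = compute_minor_map(entries)
    if duplicates:
        raise ConfigurationError("Duplicate DRBD minors: %r" % (duplicates,))
    return minor_map


def verify_config(entries):
    """Caller that reports: it returns one message per duplicate found."""
    _, duplicates = compute_minor_map(entries)
    return ["DRBD minor %d on node %s is used by both %s and %s"
            % (minor, node, first, second)
            for node, minor, second, first in duplicates]
```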
  3. 21 Jan, 2009 2 commits
      Automatically release DRBD minors on success · 61cf6b5e
      This patch converts the DRBD minors reservation protocol from explicit
      release to automatic release on the success paths. On the error paths,
      a manual release is still needed.
      The patch doesn't bring much by itself, but is needed for a future patch
      which enhances the automatic verification of configuration consistency.
      Reviewed-by: ultrotter
      Change the instance status attribute to boolean · 0d68c45d
      For historical reasons, the “should run or not” attribute of an
      instance was denoted by its “status” attribute having a string value of
      either ‘up’ or ‘down’. Checking this in code was done by hardcoding
      the strings.
      This was long due for a redo, and this patch changes this attribute to
      “admin_up” having a boolean value. The patch is in fact shorter than I
      expected, and passes burnin.
      The patch also fixes an error in BuildInstanceHookEnvByObject where the
      instance.os was passed as the status value.
      Reviewed-by: ultrotter
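      A minimal sketch of the attribute change in 0d68c45d, assuming an
      instance is represented as a plain dict. The helper name is
      hypothetical and is not the conversion code used by Ganeti.

```python
def upgrade_instance_status(instance):
    """Return a copy of an instance dict that uses a boolean 'admin_up'.

    The old string-valued 'status' key ('up' or 'down') is dropped.
    """
    upgraded = dict(instance)
    if "status" in upgraded:
        status = upgraded.pop("status")
        if status not in ("up", "down"):
            raise ValueError("unexpected status %r" % (status,))
        upgraded["admin_up"] = (status == "up")
    return upgraded
```

      Callers can then test `instance["admin_up"]` directly instead of
      comparing against hardcoded strings.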
  4. 20 Jan, 2009 2 commits
      Fix adding of disks to an instance · 32388e6d
      The ConfigWriter.AllocateDRBDMinor requires the instance name, not the
      instance object. LUSetInstanceParms wrongly passes the instance
      object, which can cause breakage.
      The patch also adds asserts to check for this mismatch in ConfigWriter.
      Reviewed-by: ultrotter
      Make cluster-verify check the drbd minors space · 6d2e83d5
      Iustin Pop authored
      This patch adds support for verification of drbd minors space in cluster
      verify: minors which belong to running instances and should be online
      but are not, and minors which do not belong to any instance but are in
      The patch requires exposing some methods from bdev.DRBD8 and
      config.ConfigWriter which were until now private methods.
      Reviewed-by: ultrotter
  5. 09 Jan, 2009 2 commits
      Add a new ssconf file with the ganeti version · 8a113c7a
      The patch adds a new ssconf file containing the ganeti version.
      Reviewed-by: imsnah
      Iustin Pop authored
      We shouldn't query offline nodes in gnt-os. This patch adds a utility
      function to ConfigWriter that returns the names of online nodes and uses
      it in LUDiagnoseOS to query only the good nodes.
      Reviewed-by: imsnah
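      The helper described above might look roughly like this, assuming nodes
      are stored as a name-to-dict mapping with a boolean 'offline' flag. Both
      the layout and the function name are illustrative assumptions.

```python
def get_online_node_list(nodes):
    """Return the sorted names of all nodes that are not marked offline.

    nodes: dict mapping node name to a dict with a boolean 'offline' key
    (a hypothetical stand-in for the real node objects).
    """
    return sorted(name for name, node in nodes.items()
                  if not node["offline"])
```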
  6. 14 Dec, 2008 1 commit
  7. 11 Dec, 2008 1 commit
      Fix epydoc format warnings · c41eea6e
      This patch should fix all outstanding epydoc parsing errors; as such, we
      switch epydoc into verbose mode so that any new errors will be visible.
      Reviewed-by: imsnah
  8. 08 Dec, 2008 1 commit
      Fix _AdjustCandidatePool · ee513a66
      Currently the ConfigWriter.MaintainCandidatePool returns node names, and
      _AdjustCandidatePool uses them as such, but then it passes these to
      context.ReaddNode which in turn passes them to jqueue.JobQueue.AddNode which
      uses them as objects.Node instances.
      Since this is currently the only usage, we change the return type of
      ConfigWriter.MaintainCandidatePool to objects and adjust the logging of
      their names, so that the auto-adjustment works.
      Reviewed-by: ultrotter
  9. 05 Dec, 2008 2 commits
      Add function to compute the master candidates · ec0292f1
      Since some nodes can be offline, we can't just take the length of the
      node list as the maximum possible number of master candidates.
      The patch adds a utility function to correctly compute this value and
      replaces hardcoded computations with the use of this function. It then
      adds utility functions to automate the maintenance of the node lists.
      Reviewed-by: ultrotter
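      A sketch of the computation described in ec0292f1, assuming each node is
      a dict with boolean 'offline' and 'master_candidate' flags. The function
      name and return shape are hypothetical.

```python
def master_candidate_stats(nodes):
    """Return (current, max_possible) master candidate counts.

    Offline nodes are skipped entirely, so the maximum is the number of
    online nodes rather than the length of the node list.
    """
    current = 0
    max_possible = 0
    for node in nodes:
        if node["offline"]:
            continue
        max_possible += 1
        if node["master_candidate"]:
            current += 1
    return current, max_possible
```

      A caller would then cap the desired pool size with something like
      `min(candidate_pool_size, max_possible)` instead of using the raw node
      count.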
      Add the offline node list to ssconf · a3316e4a
      Iustin Pop authored
      The patch also changes the various node list generation to be more
      Reviewed-by: imsnah
  10. 03 Dec, 2008 1 commit
      A few fixes related to master candidates · 3a26773f
      This patch:
        - fixes cluster verify when all nodes are master candidates, but the
          candidate_pool_size is higher
        - warns when the master node is not marked as candidate
        - disables setting the master node to a regular node
        - stops passing the master node to context.ReaddNode, since the job
          queue doesn't like getting our own node name
      Reviewed-by: ultrotter
  11. 02 Dec, 2008 4 commits
  12. 27 Nov, 2008 2 commits
      Fix logic bug in rev 2072 · f34901f8
      In revision 2072 "ConfigWriter: change cluster serial meaning" I misread
      the serial_no update logic: it was about updating the serial number on
      the object itself, not on the cluster.
      So we don't actually increase the cluster serial number at all when a
      node is changed (as opposed to removed/added).
      This patch reverts to the original behaviour of always increasing the
      target's serial number, and adds an increase of the cluster serial
      number when a node has been changed.
      Reviewed-by: ultrotter
      ConfigWriter: change cluster serial meaning · cff4c037
      Currently, we increase the cluster serial number for instance additions,
      removals and renames. This conforms to the REST paradigm, however
      it means that for each of these operations, we need to push ssconf
      updates to all nodes.
      In order to support future cases with a reduced set of master-eligible
      nodes, we want to reduce the ssconf pushes (which need to be to all
      nodes). This patch changes the meaning for the cluster serial number so
      that it doesn't track instance operations at all.
      This means that addition of an instance can fail due to concurrent
      additions, even if the cluster serial has not changed. It slightly
      breaks the REST paradigm, but IMHO it's better for actual usage.
      Reviewed-by: ultrotter
  13. 25 Nov, 2008 1 commit
  14. 24 Nov, 2008 1 commit
  15. 23 Nov, 2008 1 commit
  16. 21 Nov, 2008 1 commit
  17. 23 Oct, 2008 1 commit
  18. 20 Oct, 2008 1 commit
      Convert rpc.call_upload_file to use addresses · 6b294c53
      This patch allows rpc.call_upload_file to use addresses (if passed), and
      also converts the ConfigWriter._DistributeConfig to pass them, since
      this is an often-done operation.
      Reviewed-by: imsnah
  19. 10 Oct, 2008 1 commit
      Convert rpc module to RpcRunner · 72737a7f
      This big patch changes the call model used in inter-node RPC from
      standalone function calls in the rpc module to calls via an RpcRunner class,
      that holds all the methods. This can be used in the future to enable
      smarter processing in the RPC layer itself (some quick examples are not
      setting the DiskID from cmdlib code, but only once in each rpc call,
      There are a few RPC calls that are made outside of the LU code, and
      these calls are left as staticmethods, so they can be used without a
      class instance (which requires a ConfigWriter instance).
      Reviewed-by: imsnah
  20. 06 Oct, 2008 1 commit
      Disable re-reading of config file · 3d3a04bc
      Since the objects read from the config file are passed to the various
      threads, it's unsafe to re-read the config file (and throw away
      ConfigWriter._config_data). As such, we disable the re-reading of the
      file (since the master is now the owner of the file, it makes no sense to
      re-read it), and any modifications to the file must be done offline,
      otherwise they will be overwritten.
      Reviewed-by: imsnah
  21. 01 Oct, 2008 5 commits
      Convert config.py · 5b263ed7
      The configuration version is now again in the configuration file.
      Reviewed-by: iustinp
      Add new query to get cluster config values · ae5849b5
      This can be used to retrieve certain cluster config values from
      within clients.
      OpDumpClusterConfig was not used anywhere, hence I'm just reusing
      it. The way ConfigWriter.DumpConfig returned the configuration
      was not thread-safe, anyway (no deepcopy).
      Reviewed-by: iustinp
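      The thread-safety remark in ae5849b5 can be illustrated with a small
      sketch: a dump that hands out a deep copy taken under a lock, so callers
      cannot mutate the shared structure. The class and method names are
      hypothetical and do not reproduce the ConfigWriter API.

```python
import copy
import threading


class MiniConfig:
    """Toy configuration holder, used only to illustrate the copy."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._config_data = data

    def dump_config(self):
        # Copy while holding the lock so that no writer changes the
        # structure halfway through, and so that the caller receives
        # objects that are not shared with other threads.
        with self._lock:
            return copy.deepcopy(self._config_data)
```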
      Move functions from ssconf.py elsewhere · 4a8b186a
      These functions will be used to access config values instead of using
      Reviewed-by: iustinp
      Add cluster options from ssconf to configuration · f6bd6e98
      ssconf will become write-only from ganeti-masterd's point of view,
      therefore all settings in there need to go into the main configuration
      Reviewed-by: iustinp
      Move instantiation of config into bootstrap.py · b9eeeb02
      Future patches will add even more variables to the cluster config.
      Adding more parameters wouldn't make the function easier to use and
      it doesn't make sense to pass them to another function, as it's
      only done once in bootstrap.py on cluster initialization.
      Reviewed-by: iustinp
  22. 29 Sep, 2008 1 commit
      Extend DRBD disks with shared secret attribute · f9518d38
      This patch, which is similar to r1679 (Extend DRBD disks with minors
      attribute), extends the logical and physical id of the DRBD disks with a
      shared secret attribute. This is generated at disk creation time and
      saved in the config file.
      The generation of the secret is done so that we don't have duplicates in
      the configuration (otherwise the goal of preventing cross-connection
      will not be reached), so we add to config.py more than just a simple
      call to utils.GenerateSecret().
      The patch does not yet enable the use of the secrets.
      Reviewed-by: imsnah
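      A sketch of duplicate-free secret generation as described in f9518d38.
      Python's standard `secrets.token_hex` is used here as a stand-in for
      utils.GenerateSecret(), and the function name, the retry limit and the
      set-based bookkeeping are assumptions.

```python
import secrets


def generate_unique_secret(existing, attempts=64):
    """Return a new secret not present in 'existing' and record it there.

    existing: set of secrets already used in the configuration.
    """
    for _ in range(attempts):
        candidate = secrets.token_hex(20)  # 40 hexadecimal characters
        if candidate not in existing:
            existing.add(candidate)
            return candidate
    raise RuntimeError("could not generate a unique DRBD secret")
```

      Recording the secret in the set before returning keeps two allocations
      made in quick succession from receiving the same value.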
  23. 28 Sep, 2008 1 commit
      Fix a bug related to static minors · d48663e4
      When the node does not yet have any minors allocated, the first minor
      (0) will not be entered in the ConfigWriter._temporary_drbds structure.
      This does not happen for our current usage, since we always ask for two
      minors (so the next call will not match this case), but it will be
      triggered if we only ask for one minor, and then ask again before adding
      the instance to the config file.
      Reviewed-by: ultrotter
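      The bug fixed in d48663e4 concerns the bookkeeping of reservations. A
      corrected allocation can be sketched as below, where every allocated
      minor, including the first one on an empty node, is recorded as
      temporary. The names and data shapes are hypothetical.

```python
def allocate_minor(used_minors, temporary, node, instance):
    """Reserve and return the lowest free DRBD minor on 'node'.

    used_minors: dict {node: iterable of minors} from the configuration.
    temporary: dict {(node, minor): instance} of pending reservations;
        it is updated in place.
    """
    taken = set(used_minors.get(node, ()))
    taken.update(minor for (n, minor) in temporary if n == node)
    minor = 0
    while minor in taken:
        minor += 1
    # Record the reservation in every case, including minor 0 on a node
    # that had no minors allocated before.
    temporary[(node, minor)] = instance
    return minor
```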
  24. 27 Sep, 2008 4 commits
      Add checks for tcp/udp port collisions · 48ce9fd9
      In case the config file is manually modified, or in case of bugs, the
      tcp/udp ports could be reused, which will create various problems
      (instances not able to start, or drbd disks not able to communicate).
      This patch extends the ConfigWriter.VerifyConfig() method (which is used
      in cluster verify) to check for duplicates between:
        - the ports used for DRBD disks
        - the ports used for network console
        - the ports marked as free in the config file
      Also, if the cluster parameter ‘highest_used_port’ is actually lower
      than the computed highest used port, this is also flagged as an error.
      The output from gnt-cluster verify will show (output manually wrapped):
      node1 # gnt-cluster verify
      * Verifying global settings
        - ERROR: tcp/udp port 11006 has duplicates: instance3.example.com/network port,
      instance2.example.com/drbd disk sda
        - ERROR: tcp/udp port 11017 has duplicates: instance3.example.com/drbd disk sda,
      instance3.example.com/drbd disk sdb, cluster/port marked as free
        - ERROR: Highest used port mismatch, saved 11010, computed 11017
      * Gathering data (2 nodes)
      Reviewed-by: ultrotter
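      The checks listed in 48ce9fd9 can be sketched as follows. The function
      name and argument shapes are hypothetical, the order of owners within a
      message may differ from the real output, and computing the highest port
      over all ports seen (including those marked as free) is an assumption.

```python
def find_port_problems(drbd_ports, console_ports, free_ports,
                       highest_used_port):
    """Return error strings for port collisions and a stale highest port.

    drbd_ports, console_ports: lists of (port, owner description) pairs.
    free_ports: list of ports marked as free in the configuration.
    highest_used_port: the value saved in the cluster configuration.
    """
    users = {}
    for port, owner in list(drbd_ports) + list(console_ports):
        users.setdefault(port, []).append(owner)
    for port in free_ports:
        users.setdefault(port, []).append("cluster/port marked as free")

    problems = []
    for port in sorted(users):
        owners = users[port]
        if len(owners) > 1:
            problems.append("tcp/udp port %d has duplicates: %s"
                            % (port, ", ".join(owners)))

    computed = max(users) if users else 0
    if computed > highest_used_port:
        problems.append("Highest used port mismatch, saved %d, computed %d"
                        % (highest_used_port, computed))
    return problems
```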
      Update the cluster serial_no on certain operations · b9f72b4e
      This patch adds update of the cluster serial number for:
        - add/remove node (as the cluster's node list is changed)
        - add/remove/rename instance (as the cluster's instance list is changed)
        - change the volume group name
      The rule is to update this attribute when cluster-wide properties are
      changed, but not individual node/instance ones.
      There are other remaining cases to handle, pending on the ssconf
      Reviewed-by: ultrotter
      Initialize and update the serial_no on objects · b989e85d
      This patch adds initialization of the serial_no on instance and nodes,
      and update of the field whenever an object is updated in the generic
      case, via ConfigWriter.Update(obj) and in the specific case of
      instances' state being modified manually.
      Reviewed-by: ultrotter
      Switch the global serial_no to the top object · 9d38c6e1
      Currently the serial_no that is incremented every time the configuration
      file is written is located on the 'cluster' object in the configuration
      structure. However, this is wrong as the cluster serial_no should be
      incremented only when the cluster state is changed (for whatever
      definition of “changed” we will use), not simply because the
      configuration file is written.
      This patch changes this so that the ConfigWriter._BumpSerialNo affects the
      top-level ConfigData object.
      Reviewed-by: ultrotter
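      The three serial numbers discussed in the 27 Sep, 2008 commits can be
      illustrated with a toy model. It reflects the rules as stated in those
      commit messages only (later commits change them), and every name in it
      is hypothetical.

```python
class MiniConfig:
    """Toy model of top-level, cluster and per-object serial numbers."""

    def __init__(self):
        self.serial_no = 1               # top-level: bumped on every write
        self.cluster = {"serial_no": 1}  # bumped on cluster-wide changes
        self.nodes = {}                  # each node has its own serial_no

    def _write(self):
        # Stand-in for writing the configuration file.
        self.serial_no += 1

    def add_node(self, name):
        self.nodes[name] = {"serial_no": 1}
        # The node list is a cluster-wide property.
        self.cluster["serial_no"] += 1
        self._write()

    def update_node(self, name, **changes):
        node = self.nodes[name]
        node.update(changes)
        # Only the object's own serial changes on a per-object update.
        node["serial_no"] += 1
        self._write()
```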