1. 01 Oct, 2008 9 commits
    • Move functions from ssconf.py elsewhere · 4a8b186a
      Michael Hanselmann authored
      These functions will be used to access config values instead of using
      ssconf.
      
      Reviewed-by: iustinp
    • Add simple configuration reader/writer classes · 856c67e1
      Michael Hanselmann authored
      This will be used to read the configuration file in the node daemon.
      The write functionality is needed for master failover.
      
      Reviewed-by: iustinp
    • Fix the watcher with down nodes · 37b77b18
      Iustin Pop authored
      The watcher didn't handle the down nodes, fix this by ignoring (in
      secondary node reboot checks) any node that doesn't return a boot id.
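
The check described above can be sketched as follows. This is a minimal illustration, not the watcher's code: the function name and the node-name-to-boot-id mappings are hypothetical, and treating a node with no previously recorded boot id as "not rebooted" is an assumption.

```python
def rebooted_nodes(old_boot_ids, new_boot_ids):
    """Return the nodes whose boot id changed, skipping down nodes.

    Both arguments map node name -> boot id string (hypothetical shape).
    A missing or empty boot id is taken to mean the node did not answer.
    """
    rebooted = []
    for name, new_id in sorted(new_boot_ids.items()):
        if not new_id:
            # Down node: it returned no boot id, so it is ignored.
            continue
        old_id = old_boot_ids.get(name)
        if old_id is not None and old_id != new_id:
            rebooted.append(name)
    return rebooted
```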
      
      Reviewed-by: imsnah
    • Fix the watcher not restarting instance bug · b7309a0d
      Iustin Pop authored
      The watcher was using conflicting attributes of the instance:
        - it queried the admin_/oper_state, which are booleans
        - but it compared those to the status (which is a text field)
      
      The code was changed to query the aggregated 'status' field, as that
      will also return an indication of node problems, and we can use this
      single field for all decisions. We still ask for the admin_state field,
      as that is needed for the activate disks check (in secondary node restart).
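
A small sketch of deciding from the text status alone. The status strings, set names and the decide() helper are assumptions made for illustration; the real values are defined in Ganeti itself.

```python
# Hypothetical status strings, chosen only for this sketch.
RESTARTABLE_STATES = frozenset(["ERROR_down"])
HELPLESS_STATES = frozenset(["ERROR_nodedown"])


def decide(status):
    """Map the aggregated text status to a watcher action (sketch)."""
    if status in RESTARTABLE_STATES:
        return "restart"
    if status in HELPLESS_STATES:
        return "skip"  # node problem, nothing useful to do for the instance
    return "ok"


# The old bug in miniature: a boolean never matches a status string,
# so a check written this way could never trigger a restart.
oper_state = False
assert oper_state not in RESTARTABLE_STATES
```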
      
      The patch also touches the watcher in some other parts:
        - log exceptions nicer
        - convert a method to @staticmethod
        - remove unused imports
      
      Reviewed-by: imsnah
    • Remove last use of utils.RunCmd from the watcher · 5188ab37
      Iustin Pop authored
      The watcher has one last use of ganeti commands as opposed to sending
      requests via luxi. The patch changes this to use the cli functions.
      
      The patch also has two other changes:
        - fix the docstring for OpVerifyDisks (found out while converting
          this)
        - enable stderr logging on the watcher when “-d” is passed
      
      Reviewed-by: imsnah
    • Fix unittests broken by revision 1727 · 36b8c2c1
      Michael Hanselmann authored
      Reviewed-by: iustinp
    • Add cluster options from ssconf to configuration · f6bd6e98
      Michael Hanselmann authored
      ssconf will become write-only from ganeti-masterd's point of view,
      therefore all settings in there need to go into the main configuration
      file.
      
      Reviewed-by: iustinp
    • Move instantiation of config into bootstrap.py · b9eeeb02
      Michael Hanselmann authored
      Future patches will add even more variables to the cluster config.
      Adding more parameters wouldn't make the function easier to use and
      it doesn't make sense to pass them to another function, as it's
      only done once in bootstrap.py on cluster initialization.
      
      Reviewed-by: iustinp
    • Change the results from cli.PollJob · 53c04d04
      Iustin Pop authored
      Currently PollJob accepts a generic job, but will return (as a
      historical artifact) only the first opcode result. This is wrong, as it
      doesn't allow polling of a job with multiple results.
      
      Its only caller (for now) is also changed, so no functional changes
      should happen.
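
The shape of the change can be sketched as below. These helpers are hypothetical and take the opcode results directly; the real cli.PollJob talks to the master daemon, which is left out here.

```python
def poll_job_old(opcode_results):
    """Old behaviour: only the first opcode result is returned."""
    return opcode_results[0]


def poll_job_new(opcode_results):
    """New behaviour: one result per opcode, in submission order."""
    return list(opcode_results)


def submit_one_opcode(opcode_results):
    """Hypothetical single-opcode caller: unwraps the one-element list."""
    return poll_job_new(opcode_results)[0]
```

A caller that submits a single opcode sees the same value before and after, which matches the "no functional changes" note above.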
      
      Reviewed-by: ultrotter, amishchenko
  2. 30 Sep, 2008 12 commits
  3. 29 Sep, 2008 9 commits
    • Add job queue design document · b2cee5e5
      Michael Hanselmann authored
      Reviewed-by: iustinp
    • Add an 'index' of design documents · 84f4dc28
      Iustin Pop authored
      This will be an overview document, enumerating the changes without going
      into details and pointing to the actual documents.
      
      Reviewed-by: ultrotter
    • Add opcode execution log in job info · 5b23c34c
      Iustin Pop authored
      This patch adds the job execution log in “gnt-job info” and also allows
      its selection in “gnt-job list” (however here it's not very useful as
      it's not easy to parse). It does this by adding a new field in the query
      job call, named ‘oplog’.
      
      With this, one can get a very clear examination of the job. What remains
      to be added would be timestamps for start/stop of the processing for the
      job itself and its opcodes.
      
      Reviewed-by: imsnah
    • Move a hardcoded constant to constants.py · 3c03759a
      Iustin Pop authored
      For now we only use the ‘C’ protocol so we can put it in constants.py
      instead of hardcoding it.
      
      Reviewed-by: imsnah
    • Enable the use of shared secrets · 2899d9de
      Iustin Pop authored
      This patch enables the use of the shared secrets for DRBD8 disks, using
      (hardcoded in constants.py) the md5 digest algorithm.
      
      To make this more flexible, we can either implement a cluster parameter
      (once the new model is in place) or make it ./configure-time
      selectable.
      
      Reviewed-by: imsnah
    • Extend DRBD disks with shared secret attribute · f9518d38
      Iustin Pop authored
      This patch, which is similar to r1679 (Extend DRBD disks with minors
      attribute), extends the logical and physical id of the DRBD disks with a
      shared secret attribute. This is generated at disk creation time and
      saved in the config file.
      
      The generation of the secret is done so that we don't have duplicates in
      the configuration (otherwise the goal of preventing cross-connection
      will not be reached), so we add to config.py more than just a simple
      call to utils.GenerateSecret().
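
The duplicate-avoiding generation can be sketched like this. It is only an illustration: Python's secrets module stands in for utils.GenerateSecret(), whose implementation is not shown here, and the function name, the set of existing secrets and the retry limit are all hypothetical.

```python
import secrets


def generate_unique_secret(existing, make_secret=None, max_attempts=64):
    """Return a secret that is not yet in `existing`, and record it there.

    `existing` is a set of secrets already present in the configuration.
    `make_secret` is a zero-argument callable producing one candidate;
    by default a 40-character hex string (an assumption of this sketch).
    """
    if make_secret is None:
        make_secret = lambda: secrets.token_hex(20)
    for _ in range(max_attempts):
        candidate = make_secret()
        if candidate not in existing:
            # Reserve it so a later call cannot hand out the same value.
            existing.add(candidate)
            return candidate
    raise RuntimeError("could not generate a unique secret")
```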
      
      The patch does not yet enable the use of the secrets.
      
      Reviewed-by: imsnah
    • Add a info subcommand to gnt-job · 191712c0
      Iustin Pop authored
      Currently, it is hard to examine a job in detail; the output of ‘gnt-job
      list’ is not easy to parse.
      
      The patch adds a ‘gnt-job info’ command that is (vaguely) similar to
      ‘gnt-instance info’ in that it shows in a somewhat easy to understand
      format the details of a job.
      
      The result formatter is the most complicated part, since the results are
      not standardized; the code attempts to format nicely the most common
      result types (as taken from a random job list), via a generic algorithm.
      
      Reviewed-by: imsnah
    • Implement job summary in gnt-job list · 60dd1473
      Iustin Pop authored
      It is not currently possible to show a summary of the job in the output
      of “gnt-job list”. The closest is listing the whole opcode(s), but that
      is too verbose. Also, the default output (id, status) is not very
      useful, unless one looks for (and knows about) an exact job ID.
      
      The patch adds a “summary” description of a job composed of the list of
      OP_ID of the individual opcodes. Moreover, if an opcode has a ‘logical’
      target in a certain opcode field (e.g. start instance has the instance
      name as the target), then it is included in the formatting also. It's
      easier to explain via a sample output:
      
      gnt-job list
      ID Status  Summary
      1  error   NODE_QUERY
      2  success NODE_ADD(gnta2)
      3  success CLUSTER_QUERY
      4  success NODE_REMOVE(gnta2.example.com)
      5  error   NODE_QUERY
      6  success NODE_ADD(gnta2)
      7  success NODE_QUERY
      8  success OS_DIAGNOSE
      9  success INSTANCE_CREATE(instance1.example.com)
      10 success INSTANCE_REMOVE(instance1.example.com)
      11 error   INSTANCE_CREATE(instance1.example.com)
      12 success INSTANCE_CREATE(instance1.example.com)
      13 success INSTANCE_SHUTDOWN(instance1.example.com)
      14 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
      15 error   INSTANCE_CREATE(instance2.example.com)
      16 error   INSTANCE_CREATE(instance2.example.com)
      17 success INSTANCE_CREATE(instance2.example.com)
      18 success INSTANCE_ACTIVATE_DISKS(instance1.example.com)
      19 success INSTANCE_ACTIVATE_DISKS(instance2.example.com)
      20 success INSTANCE_SHUTDOWN(instance1.example.com)
      21 success INSTANCE_SHUTDOWN(instance2.example.com)
      
      This is done by a simple change to the opcode classes, which allows an
      opcode to format itself. The additional function is small enough that it
      can go in opcodes.py, where it could also be used by a client if needed.
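
A minimal sketch of an opcode formatting itself, under stated assumptions: the OP_DSC_FIELD attribute, the summary() method name, the "OP_" prefix on OP_ID and the two example classes are hypothetical choices made to reproduce the sample output above, not a copy of opcodes.py.

```python
class OpCode:
    """Base class for the sketch; real opcodes carry more machinery."""
    OP_ID = "OP_ABSTRACT"
    OP_DSC_FIELD = None  # name of the field holding the 'logical' target

    def __init__(self, **fields):
        for key, value in fields.items():
            setattr(self, key, value)

    def summary(self):
        """Return the opcode ID, plus the target in parentheses if any."""
        text = self.OP_ID
        if text.startswith("OP_"):
            text = text[3:]
        field = self.OP_DSC_FIELD
        if field is not None and hasattr(self, field):
            text = "%s(%s)" % (text, getattr(self, field))
        return text


class OpAddNode(OpCode):
    OP_ID = "OP_NODE_ADD"
    OP_DSC_FIELD = "node_name"


class OpQueryNodes(OpCode):
    OP_ID = "OP_NODE_QUERY"
```

With these definitions, OpAddNode(node_name="gnta2").summary() gives "NODE_ADD(gnta2)" and OpQueryNodes().summary() gives "NODE_QUERY". A job summary would then join the summaries of its opcodes.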
      
      Reviewed-by: imsnah
    • Nicely sort the job list · 3b87986e
      Iustin Pop authored
      Unless we decide to change the job identifiers to integers, we should at
      least sort the list returned by _GetJobIDsUnlocked.
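
Why plain string sorting is not enough for string job IDs, and one way to sort them "nicely". This is a generic sketch; nice_sort_key is a hypothetical helper and not necessarily how _GetJobIDsUnlocked does it.

```python
import re


def nice_sort_key(text):
    """Split text into digit and non-digit runs so numbers compare as ints."""
    parts = re.split(r"(\d+)", text)
    # With a capturing group, odd positions hold the digit runs.
    return [int(part) if index % 2 else part
            for index, part in enumerate(parts)]


job_ids = ["10", "9", "2", "1"]
plain = sorted(job_ids)                     # lexicographic: "10" before "2"
nice = sorted(job_ids, key=nice_sort_key)   # numeric order
```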
      
      Reviewed-by: imsnah
  4. 28 Sep, 2008 2 commits
    • Move the pseudo-secret generation to utils.py · 33081d90
      Iustin Pop authored
      The bootstrap code needs a pseudo-secret and this is currently generated
      inside the InitGanetiServerSetup function. Since more users will need
      this, move it to utils.py
      
      Reviewed-by: ultrotter
    • Fix a bug related to static minors · d48663e4
      Iustin Pop authored
      When the node does not yet have any minors allocated, the first minor
      (0) will not be entered in the ConfigWriter._temporary_drbds structure.
      This does not happen for our current usage, since we always ask for two
      minors (so the next call will not match this case), but it will be
      triggered if we only ask for one minor, and then ask again before adding
      the instance to the config file.
      
      Reviewed-by: ultrotter
  5. 27 Sep, 2008 8 commits
    • Add checks for tcp/udp port collisions · 48ce9fd9
      Iustin Pop authored
      In case the config file is manually modified, or in case of bugs, the
      tcp/udp ports could be reused, which will create various problems
      (instances not able to start, or drbd disks not able to communicate).
      
      This patch extends the ConfigWriter.VerifyConfig() method (which is used
      in cluster verify) to check for duplicates between:
        - the ports used for DRBD disks
        - the ports used for network console
        - the ports marked as free in the config file
      
      Also, if the cluster parameter ‘highest_used_port’ is actually lower
      than the computed highest used port, this is also flagged as an error.
      
      The output from gnt-cluster verify will show (output manually wrapped):
      
      node1 # gnt-cluster verify
      * Verifying global settings
        - ERROR: tcp/udp port 11006 has duplicates: instance3.example.com/network port,
      instance2.example.com/drbd disk sda
        - ERROR: tcp/udp port 11017 has duplicates: instance3.example.com/drbd disk sda,
      instance3.example.com/drbd disk sdb, cluster/port marked as free
        - ERROR: Highest used port mismatch, saved 11010, computed 11017
      * Gathering data (2 nodes)
      ...
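
The duplicate and highest-port checks can be sketched as follows. The function name and the (port, description) input shape are assumptions for illustration; this is not the VerifyConfig() code, only the idea behind the messages shown above.

```python
from collections import defaultdict


def find_port_problems(port_users, highest_used_port):
    """Return error strings for reused ports and a stale highest port.

    `port_users` is an iterable of (port, description) pairs covering
    DRBD disks, network consoles and ports marked as free.
    """
    owners = defaultdict(list)
    for port, description in port_users:
        owners[port].append(description)

    errors = []
    for port in sorted(owners):
        if len(owners[port]) > 1:
            errors.append("tcp/udp port %s has duplicates: %s"
                          % (port, ", ".join(owners[port])))
    if owners:
        computed = max(owners)
        if computed > highest_used_port:
            errors.append("Highest used port mismatch, saved %s, computed %s"
                          % (highest_used_port, computed))
    return errors
```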
      
      Reviewed-by: ultrotter
    • Update the cluster serial_no on certain operations · b9f72b4e
      Iustin Pop authored
      This patch adds update of the cluster serial number for:
        - add/remove node (as the cluster's node list is changed)
        - add/remove/rename instance (as the cluster's instance list is changed)
        - change the volume group name
      
      The rule for updating this attribute is when cluster-wide properties are
      changed, but not individual node/instance ones.
      
      There are other remaining cases to handle, pending the ssconf
      changes.
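
The rule above, sketched with hypothetical classes: a cluster-wide change (adding a node) bumps the cluster serial, a per-node change bumps only that node's serial, and every write bumps the top-level configuration serial. The class and method names and the starting value of 1 are assumptions of this sketch.

```python
class Versioned:
    """Any object carrying a serial_no attribute."""
    def __init__(self):
        self.serial_no = 1


class ConfigSketch(Versioned):
    def __init__(self):
        super().__init__()
        self.cluster = Versioned()
        self.nodes = {}

    def _write(self):
        # The top-level serial changes on every write of the file.
        self.serial_no += 1

    def add_node(self, name):
        self.nodes[name] = Versioned()
        # The cluster's node list changed: cluster-wide property.
        self.cluster.serial_no += 1
        self._write()

    def update_node(self, name):
        # Individual node change: the cluster serial stays untouched.
        self.nodes[name].serial_no += 1
        self._write()
```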
      
      Reviewed-by: ultrotter
    • Allow listing of the serial_no via gnt-* list · 38d7239a
      Iustin Pop authored
      This patch adds listing of the serial_no attribute in gnt-instance and
      gnt-node list, and updates the manpages to reflect the change.
      
      Reviewed-by: ultrotter
    • Initialize and update the serial_no on objects · b989e85d
      Iustin Pop authored
      This patch adds initialization of the serial_no on instances and nodes,
      and updates the field whenever an object is updated, both in the generic
      case (via ConfigWriter.Update(obj)) and in the specific case of an
      instance's state being modified manually.
      
      Reviewed-by: ultrotter
    • Switch the global serial_no to the top object · 9d38c6e1
      Iustin Pop authored
      Currently the serial_no that is incremented every time the configuration
      file is written is located on the 'cluster' object in the configuration
      structure. However, this is wrong as the cluster serial_no should be
      incremented only when the cluster state is changed (for whatever
      definition of “changed” we will use), not simply because the
      configuration file is written.
      
      This patch changes this so that ConfigWriter._BumpSerialNo affects the
      top-level ConfigData object.
      
      Reviewed-by: ultrotter
    • Add serial_no attributes to objects · be1fa613
      Iustin Pop authored
      This patch adds the ‘serial_no’ attribute to the other top-level objects
      (the configuration object itself, the nodes and the instances).
      
      Reviewed-by: ultrotter
    • Replace a cfg.AddInstance with UpdateInstance · 97abc79f
      Iustin Pop authored
      This seems to be the last (deprecated) use of AddInstance in order to
      update an instance.
      
      The patch also removes a whitespace-at-eol case.
      
      Reviewed-by: ultrotter
    • Add design doc for the disk changes · fbd6f863
      Iustin Pop authored
      Reviewed-by: imsnah