1. 11 Apr, 2012 3 commits
    • Fix extra whitespace · 612f7fd4
      Iustin Pop authored
      
      
      Sorry, didn't catch this before…
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      (cherry picked from commit 54b010ca)
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
    • Further fixes concerning drbd port release · 42f25b0b
      Dimitris Aragiorgis authored
      Commit 3b3b1bca does not entirely fix the bug introduced in commit
      f396ad8c. It fixes consistency of config data in permanent storage, but
      does not ensure consistency in data held in runtime memory of masterd.
      
      The bug of duplicate ports is still triggered when LUInstanceRemove()
      invokes _RemoveDisks() and this returns False (in case the
      call_blockdev_remove RPC fails). The drbd ports get returned to the
      pool, but execution is aborted and RemoveInstance() is never invoked.

      Because port handling is not done with TemporaryReservationManager,
      ensure that ports are released only if the disk-related config data is
      deleted.
      
      In _RemoveDisks() release ports only if all RPCs succeed.
      
      Extend _RemoveDisks() to include ignore_failures argument passed by
      _RemoveInstance() to handle the ports appropriately.
      Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
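The port-release rule from commit 42f25b0b can be sketched roughly as follows. This is an illustration only, not Ganeti's code: `remove_disks`, `remove_blockdev` and `port_pool` are hypothetical stand-ins, and it assumes that `ignore_failures` means the caller will delete the disk config regardless of RPC failures.

```python
def remove_disks(disks, remove_blockdev, port_pool, ignore_failures=False):
    """Try to remove every disk; return True only if all removals succeeded.

    Hypothetical sketch: `disks` is a list of dicts with an optional "port"
    key, `remove_blockdev` is a callable standing in for the removal RPC,
    and `port_pool` is a set of free ports.
    """
    all_ok = True
    for disk in disks:
        if not remove_blockdev(disk):
            all_ok = False

    # Hand the ports back only when the disk config is certain to go away:
    # either every removal succeeded, or the caller ignores failures.
    if all_ok or ignore_failures:
        for disk in disks:
            port = disk.get("port")
            if port is not None:
                port_pool.add(port)
    return all_ok
```

With a failing removal and `ignore_failures=False`, the pool stays untouched, so a later retry cannot hand out a port that is still referenced by the instance.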
    • Fix a bug concerning TCP port release · 2522b7c4
      Dimitris Aragiorgis authored
      Commit f396ad8c returns the TCP port used by a DRBD disk back to the
      TCP/UDP port pool using AddTcpUdpPort().
      
      However, AddTcpUdpPort() writes the config on every invocation,
      using _WriteConfig(). This causes two problems:
      
       * it causes critical errors logged by VerifyConfig(), after the DRBD
         disk removal, and until the actual instance removal.
       * if the code following AddTcpUdpPort() fails, the port has already
         been returned to the pool, which causes the port to have duplicates
         (inconsistent config).
      
      AddTcpUdpPort() is invoked in three cases:
      
       * during InstanceRemove() through _RemoveDisks().
       * during InstanceSetParams() in case of disk removal.
       * during InstanceSetParams() through _ConvertDrbdToPlain().
      
      This commit fixes the problem by removing the _WriteConfig() call from
      AddTcpUdpPort(), delegating it to Update() via the
      TemporaryReservationManager, and ensuring that AddTcpUdpPort() precedes
      Update().
      Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
      [iustin@google.com: small comments adjustments]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      (cherry picked from commit 3b3b1bca)
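The idea behind commit 2522b7c4, releasing a port without writing the config and letting a later update persist it, might look like the sketch below. The class and its attributes are hypothetical; the pending set only approximates what the commit describes as going through the TemporaryReservationManager.

```python
class ConfigSketch:
    """Hypothetical stand-in for a config writer; not Ganeti's class."""

    def __init__(self):
        self.free_ports = set()     # ports recorded as free in the config
        self.pending_ports = set()  # released, but not yet persisted
        self.writes = 0             # number of config writes performed

    def add_tcp_udp_port(self, port):
        """Queue a port for release; deliberately does not write the config."""
        self.pending_ports.add(port)

    def update(self):
        """Commit the queued ports and write the config exactly once."""
        self.free_ports |= self.pending_ports
        self.pending_ports.clear()
        self._write_config()

    def _write_config(self):
        # A real implementation would serialize to disk; here we only count.
        self.writes += 1
```

If the code between `add_tcp_udp_port()` and `update()` fails, nothing has been persisted, so the stored config never shows a port as both free and in use.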
  2. 24 Nov, 2011 1 commit
    • LUGroupAssignNodes: Fix node membership corruption · 218f4c3d
      Michael Hanselmann authored
      
      
      Note: This bug only manifests itself in Ganeti 2.5, but since the
      problematic code also exists in 2.4, I decided to fix it there.
      
      If a node was assigned to a new group using “gnt-group assign-nodes” the
      node object's group would be changed, but not the duplicate member list
      in the group object. The latter is an optimization to require fewer
      locks for other operations. The per-group member list is only kept in
      memory and not written to disk.
      
      Ganeti 2.5 starts to make use of the data kept in the per-group member
      list and consequently fails when it is out of date. The following
      commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
      confirmed using additional logging):
      
        $ gnt-group add foo
        $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
        $ gnt-cluster verify  # Fails with KeyError
      
      This patch moves the code modifying node and group objects into
      “config.ConfigWriter” to do the complete operation under the config
      lock, and also to avoid making use of side-effects of modifying objects
      without calling “ConfigWriter.Update”. A unittest is included.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
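The fix in commit 218f4c3d keeps the node's group and the per-group member list in step by changing both under one lock. A minimal sketch of that pattern, with a hypothetical class and method name rather than the real config.ConfigWriter API:

```python
import threading


class ConfigWriterSketch:
    """Hypothetical illustration of updating both views under one lock."""

    def __init__(self, node_groups, group_members):
        self._lock = threading.Lock()
        # node name -> group name (the persisted view)
        self.node_groups = dict(node_groups)
        # group name -> set of node names (the in-memory duplicate)
        self.group_members = {g: set(m) for g, m in group_members.items()}

    def assign_group_nodes(self, changes):
        """Apply (node, new_group) pairs; both views change together."""
        with self._lock:
            # Validate everything first so a bad entry changes nothing.
            for node, new_group in changes:
                if node not in self.node_groups:
                    raise KeyError(node)
                if new_group not in self.group_members:
                    raise KeyError(new_group)
            for node, new_group in changes:
                old_group = self.node_groups[node]
                self.group_members[old_group].discard(node)
                self.group_members[new_group].add(node)
                self.node_groups[node] = new_group
```

Because the member list is derived in the same critical section, a reader holding the lock can no longer observe a node whose group disagrees with the group's own member set.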
  3. 14 Nov, 2011 1 commit
  4. 23 Aug, 2011 1 commit
  5. 22 Jul, 2011 1 commit
  6. 28 Jun, 2011 2 commits
    • Fix bug in recreate-disks for DRBD instances · b768099e
      Iustin Pop authored
      
      
      The new functionality in 2.4.2 for recreate-disks to change nodes is
      broken for DRBD instances: it simply changes the nodes without caring
      for the DRBD minors mapping, which will lead to conflicts in non-empty
      clusters.
      
      This patch changes the Exec() method of this LU significantly, to both
      fix the DRBD minor usage and make sure that we don't have partial
      modifications to the instance objects:
      
      - the first half of the method makes all the checks and computes the
        needed configuration changes
      - the second half then performs the configuration changes and
        recreates the disks
      
      This way, instances will either be fully modified or not at all;
      whether the disks are successfully recreated is another point, but at
      least we'll have the configuration sane.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
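The two-phase structure described in commit b768099e (check and compute first, modify afterwards) can be illustrated with a small sketch. Everything here is hypothetical: the dict layout, `recreate_disks` and the `allocate_minor` callable are assumptions for illustration and do not reflect the LU's real data structures.

```python
def recreate_disks(instance, new_nodes, allocate_minor):
    """Move an instance's disks to new_nodes, all-or-nothing.

    `instance` is a dict with "nodes" and "disks" keys; `allocate_minor`
    is a callable returning a free DRBD minor for a node (it may raise).
    """
    # Phase 1: run all checks and compute the changes. Nothing on
    # `instance` is touched here, so a failure leaves it unmodified.
    planned = []
    for _ in instance["disks"]:
        minors = [allocate_minor(node) for node in new_nodes]
        planned.append({"nodes": list(new_nodes), "minors": minors})

    # Phase 2: apply the precomputed changes in one go.
    for disk, change in zip(instance["disks"], planned):
        disk.update(change)
    instance["nodes"] = list(new_nodes)
    return instance
```

A real implementation would also have to release minors already allocated when phase 1 fails part-way; the sketch leaves that out.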
    • Fix a lint warning · 78ff9e8f
      Iustin Pop authored
      Patch db8e5f1c removed the use of feedback_fn, hence pylint warns
      now.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
  7. 27 Jun, 2011 1 commit
    • Fix bug in drbd8 replace disks on current nodes · db8e5f1c
      Iustin Pop authored
      
      
      Currently the drbd8 replace-disks on the same node (i.e. -p or -s) has
      a bug in that it modifies the instance disk temporarily before
      changing it back to the same value. However, we don't need to, and
      shouldn't, do that: what this operation does is simply change the LVM
      configuration on the node, but otherwise the instance disks keep the
      same configuration as before.
      
      In the current code, this change back-and-forth is fine *unless* we
      fail during attaching the new LVs to DRBD; in which case, we're left
      with a half-modified disk, which is entirely wrong.
      
      So we change the code in two ways:
      
      - use temporary copies of the disk children in the old_lvs var
      - stop updating disk.children
      
      Which means that the instance should not be modified anymore (except
      maybe for SetDiskID, which is a legacy and unfortunate decision that
      will have to be cleaned up sometime).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  8. 23 Jun, 2011 1 commit
  9. 17 Jun, 2011 4 commits
  10. 26 May, 2011 1 commit
    • TLReplaceDisks: Move assertion checking locks · a9b42993
      Michael Hanselmann authored
      Commit 1bee66f3 added assertions for ensuring only the necessary locks
      are kept while replacing disks. One of them makes sure locks have been
      released during the operation. Unfortunately the commit added the check
      as part of a “finally” branch, which is also run when an exception is
      thrown (in which case the locks may not have been released yet). Errors
      could be masked by the assertion error. Moving the check out of the
      “finally” branch fixes the issue.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
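The masking effect described in commit a9b42993 is general Python behaviour and is easy to reproduce. The function names below are hypothetical and only model the shape of the problem, not TLReplaceDisks itself.

```python
def replace_disks_buggy(do_work, locks_released):
    try:
        do_work()
    finally:
        # Runs even while do_work()'s exception is propagating; a failed
        # assertion here replaces the original error seen by the caller.
        assert locks_released(), "locks still held"


def replace_disks_fixed(do_work, locks_released):
    do_work()
    # Reached only when do_work() succeeded, so a failure in do_work()
    # is reported as-is.
    assert locks_released(), "locks still held"


def failing_work():
    raise ValueError("RPC failed")


def error_type(func, *args):
    """Return the type of the exception raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as err:
        return type(err)
    return None
```

Calling `error_type(replace_disks_buggy, failing_work, lambda: False)` yields `AssertionError`, while the fixed variant yields the original `ValueError`.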
  11. 24 May, 2011 1 commit
  12. 16 May, 2011 1 commit
  13. 11 May, 2011 1 commit
  14. 09 May, 2011 2 commits
    • Add --no-wait-for-sync when converting to drbd · 456798ab
      Iustin Pop authored
      
      
      Currently, when converting an instance from plain to DRBD, the
      instance is blocked during the entire resync period. This patch adds
      the --no-wait-for-sync option so that the operation finishes as soon as
      the DRBD sync has started, without waiting for the entire sync. This
      makes the instance available much faster.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Recreate instance disks: allow changing nodes · c8a96ae7
      Iustin Pop authored
      
      
      This patch introduces the option of changing an instance's nodes when
      doing the disk recreation. The rationale is that currently if an
      instance lives on a node that has gone down and is marked offline,
      it's not possible to re-create the disks and reinstall the instance on
      a different node without hacking the config file.
      
      Additionally, the LU now locks the instance's nodes (which was not
      done before), as we most likely allocate new resources on them.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
  15. 06 May, 2011 2 commits
  16. 03 May, 2011 2 commits
  17. 02 May, 2011 2 commits
  18. 29 Apr, 2011 2 commits
  19. 28 Apr, 2011 2 commits
  20. 27 Apr, 2011 5 commits
    • Replace disks: keep the meta device in the same VG · fd09d178
      Iustin Pop authored
      
      
      This patch enhances the multi-VG support in replace disks, by keeping
      the meta device in the same VG, as opposed to moving it to the data
      device VG (note that we don't have a way to create the meta in a
      different VG in the first place, but at least we correctly handle a
      custom config).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Fix for multiple VGs - PlainToDrbd and replace-disks · 88aa7f66
      Doug Dumitru authored
      
      
      When converting an instance from 'plain' to 'drbd', the old code would
      create the drbd volumes in the default VG and then the renames would
      fail. This fix pulls the plain VG names from the existing volumes and
      places them into the new disk template.

      Running 'replace-disks' has a similar issue, with the new disks going
      into the wrong VG and then the rename failing.

      There might be a similar issue with 'recreate-disks', but I actually
      have no idea what recreate-disks does, so did not look into it.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Improve error messages in cluster verify/OS · 2db04578
      Iustin Pop authored
      
      
      A few issues in the clarity of the error messages are fixed:
      
      - "ERROR: node node3: OS API version lenny-image": no preposition
        between the parameter type and the OS name, changed to "for
        lenny-image"
      
      - "API version lenny-image differs from reference node node1: 10, 5
        vs. 10, 20, 5, 15": parameters not sorted in display
      
      - "OS variants list lenny-image differs from reference node node1:
        vs. default, i386": empty sets are not clearly delimited, changed to
        add [] around the sets: "node node1: [] vs. [default, i386]"
      
      - "OS parameters lenny-image differs from reference node node1:
        vs. (u'dhcp', u'Whether to enable (yes) or disable (dhcp)')": ugly
        formatting in the OS parameters list, as we used to just "%s" the
        tuple; now it is "reference node node1: [] vs. [dhcp: Whether to
        enable (yes) or disable (dhcp)]"
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
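The formatting changes listed in commit 2db04578 (sorted values, explicit [] delimiters, "name: description" pairs instead of a raw tuple) could be produced by helpers along these lines. The function names are hypothetical and the real patch may structure this differently.

```python
def format_list(values):
    """Render values sorted and bracketed, so an empty set shows as []."""
    return "[%s]" % ", ".join(str(v) for v in sorted(values))


def format_params(params):
    """Render (name, description) pairs as 'name: description' entries."""
    return "[%s]" % ", ".join(
        "%s: %s" % (name, desc) for name, desc in sorted(params)
    )


def variants_message(os_name, ref_node, ours, theirs):
    """Build a message in the style quoted in the commit text."""
    return "OS variants list for %s differs from reference node %s: %s vs. %s" % (
        os_name, ref_node, format_list(ours), format_list(theirs))
```

For example, `format_list([])` gives `[]` and `format_list(["i386", "default"])` gives `[default, i386]`, which makes the empty side of a comparison visible.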
    • Prevent readding of the master node · d833acc6
      Iustin Pop authored
      
      
      Readding the master node breaks Ganeti in multiple ways. If we don't
      make the check in gnt-node itself, then bootstrap.SetupNodeDaemon will
      restart the master daemon, making the operation fail:
      
        node1# gnt-node add --readd node1
        Cannot communicate with the master daemon.
        Is it running and listening for connections?
      
      The check in cmdlib is more of a safety check, as we shouldn't reach
      it. If we do (via a bad client), then it will prevent breakage in the
      job queue/config handling.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
    • Fix punctuation in an error message · cce6f357
      Iustin Pop authored
      
      
      IIRC we don't use punctuation at the end of error messages.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
  21. 20 Apr, 2011 1 commit
  22. 19 Apr, 2011 1 commit
    • disk wiping: fix bug in chunk size computation · 6e7f0cd9
      Iustin Pop authored
      
      
      The current wipe_chunk_size computation is doing min(int_value,
      float_value). For small disks (below 10GiB), the actual formula will
      result in the float value being chosen. This results in very
      interesting behaviour:
      
      Wiping disk 0, offset 102.4, chunk 102.4
      Wiping disk 0, offset 204.8, chunk 102.4
      …
      Wiping disk 0, offset 921.6, chunk 102.4
      Wiping disk 0, offset 1024.0, chunk 1.13686837722e-13
      
      Since these are passed to dd via %d, this will result in the call to
      dd specifying offset 1024 and count 0, which will fail.
      
      We just need to enforce conversion to int, in order to not get bitten
      by floating point rounding errors.
      
      The patch also reorders some logging messages in order to log the
      chunk size.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
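The integer conversion described in commit 6e7f0cd9 can be sketched as below. The constants and their values (a 1024 MiB cap and a 10% minimum) are assumptions chosen to match the log excerpt in the commit message, and the `max(1, ...)` guard for very small disks is an addition of this sketch, not something the commit mentions.

```python
MAX_WIPE_CHUNK = 1024         # MiB; assumed upper bound per dd call
MIN_WIPE_CHUNK_PERCENT = 10   # assumed: a chunk is 10% of the disk size


def wipe_chunk_size(disk_size_mib):
    """Chunk size in whole MiB; int() avoids fractional offsets."""
    raw = min(MAX_WIPE_CHUNK, disk_size_mib / 100.0 * MIN_WIPE_CHUNK_PERCENT)
    # Guard added for this sketch: never return 0 for tiny disks.
    return max(1, int(raw))


def wipe_plan(disk_size_mib):
    """Return (offset, size) pairs, all integers, covering the whole disk."""
    chunk = wipe_chunk_size(disk_size_mib)
    plan = []
    offset = 0
    while offset < disk_size_mib:
        step = min(chunk, disk_size_mib - offset)
        plan.append((offset, step))
        offset += step
    return plan
```

For a 1024 MiB disk this yields ten chunks of 102 MiB followed by a final 4 MiB chunk, so no dd call ends up with a zero count.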
  23. 14 Apr, 2011 1 commit
  24. 13 Apr, 2011 1 commit