  1. May 09, 2011
  2. May 02, 2011
    • Cluster verify: check for missing bridges · 20d317d4
      Iustin Pop authored
      
      Currently cluster verify doesn't check for bridge information; the
      only checks are done at instance create and failover/migrate
      time. This means a cluster that seems healthy will fail creation jobs.
      
      This patch implements a simple verification that all nodes have all
      the required bridges: the default one plus any instance bridge. The
      check covers the entire cluster, so it doesn't work well for
      multi-group setups.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
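The bridge check described in this commit can be sketched as follows. This is only an illustration: the function name and the three input mappings are hypothetical and are not taken from the Ganeti code.

```python
def find_missing_bridges(default_bridge, instance_bridges, node_bridges):
    """Return {node: sorted missing bridges} for nodes lacking a required bridge.

    All arguments are hypothetical stand-ins for cluster configuration data.
    """
    # Required set: the cluster default plus every bridge used by an instance.
    required = {default_bridge}
    for bridges in instance_bridges.values():
        required.update(bridges)

    missing = {}
    for node, present in node_bridges.items():
        absent = required - set(present)
        if absent:
            missing[node] = sorted(absent)
    return missing


if __name__ == "__main__":
    print(find_missing_bridges(
        "xen-br0",
        {"inst1": ["xen-br0"], "inst2": ["br-dmz"]},
        {"node1": ["xen-br0", "br-dmz"], "node2": ["xen-br0"]},
    ))
```

Because the required set is computed once for the whole cluster, every node is held to the same list, which is the multi-group limitation the commit message mentions.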
  3. Apr 20, 2011
    • Shared storage instance migration · b9187ba2
      Apollon Oikonomopoulos authored
      
      Modify LUMigrateInstance and TLMigrateInstance to allow instance migrations for
      instances with DTS_EXT_MIRROR disk templates.
      
      Migrations of shared storage instances require either a target node, or an
      iallocator to determine the target node. If none is given, the cluster default
      iallocator is used.
      
      Locking behaviour: If the iallocator is used, then initially all nodes are
      locked and subsequently only the locks on the source node and the target node
      selected by the iallocator are retained.
      
      Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
      [iustin@google.com: small changes in cmdlib.py]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
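A rough sketch of the target selection and lock narrowing described above. The function names are hypothetical, and letting an explicit target node take precedence over an iallocator is an assumption of this sketch, not something the commit message states.

```python
def pick_target_source(target_node, iallocator, default_iallocator):
    """Decide how the migration target is determined (hypothetical helper)."""
    if target_node is not None:
        # Assumption: an explicit target node wins over any iallocator.
        return ("node", target_node)
    chosen = iallocator or default_iallocator
    if chosen is None:
        raise ValueError("no target node and no iallocator available")
    return ("iallocator", chosen)


def locks_to_keep(acquired_nodes, source_node, target_node):
    """Return the node locks retained once the target node is known."""
    return sorted(n for n in acquired_nodes if n in (source_node, target_node))


if __name__ == "__main__":
    print(pick_target_source(None, None, "hail"))
    print(locks_to_keep(["node1", "node2", "node3"], "node1", "node3"))
```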
    • Add bdev_sizes RPC call · 69266fae
      Apollon Oikonomopoulos authored
      
      The bdev_sizes multi-node RPC call returns the sizes of the requested
      block devices on the desired nodes. Its intended use is to verify the
      existence of a block device on a given node for shared block storage
      support.
      
      Block device paths are expected to lie under constants.BLOCKDEV_DIR
      ("/dev/disk" by default), where persistent symlinks for block devices
      are assumed to exist.
      
      Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
      [iustin@google.com: small changes in backend.py]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
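A minimal sketch of what the node-side part of such a call might look like. The function name, the choice to report sizes in bytes, and the decision to silently skip unusable paths are assumptions of this sketch; only the "/dev/disk" default comes from the commit message.

```python
import os
import stat

BLOCKDEV_DIR = "/dev/disk"  # default quoted in the commit message


def get_blockdev_sizes(paths, base_dir=BLOCKDEV_DIR):
    """Return {path: size in bytes} for block devices found under base_dir.

    Paths outside base_dir, missing paths and non-block-devices are skipped.
    """
    sizes = {}
    prefix = os.path.join(base_dir, "")  # base_dir with a trailing separator
    for path in paths:
        # Reject anything that does not normalise to a path below base_dir.
        if not os.path.normpath(path).startswith(prefix):
            continue
        try:
            mode = os.stat(path).st_mode  # follows the persistent symlink
        except OSError:
            continue
        if not stat.S_ISBLK(mode):
            continue
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue
        try:
            # Seeking to the end of a block device yields its size.
            sizes[path] = os.lseek(fd, 0, os.SEEK_END)
        finally:
            os.close(fd)
    return sizes
```

A device that is absent from the result can then be treated by the caller as "not present on this node".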
    • Core shared file storage support · 53197381
      Apollon Oikonomopoulos authored
      
      This patch introduces core shared file storage support, consisting of the following:
      
      A configure-time switch for enabling/disabling shared file storage
      support and controlling the shared file storage location:
      --with-shared-file-storage-dir=.  Shared file storage configuration is then
      available as _autoconf.ENABLE_SHARED_FILE_STORAGE and
      _autoconf.SHARED_FILE_STORAGE_DIR and there is a cluster-wide ssconf
      key named "shared_file_storage_dir" for changing the file location.
      
      A new disk template named "sharedfile" (DT_SHARED_FILE), using
      ganeti.bdev.FileStorage.
      
      Auxiliary functions in lib/config.py to handle shared file storage.
      
      Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
      [iustin@google.com: small style fixes]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
  4. Feb 03, 2011
  5. Jan 28, 2011
    • Re-create instance disk symlinks on activate · c417e115
      Iustin Pop authored
      
      This patch implements recreation of instance disk symlinks when the
      activate-disks operation is run. Until now, it was not possible to
      re-create these symlinks without stopping and starting or migrating an
      instance as the RPC call where this is done was in instance startup
      and migration.
      
      In order to do this, the blockdev_assemble RPC call needs the disk
      index too, which is added to the protocol. This is a protocol change
      from 2.3 and makes instance startup incompatible between the two
      versions.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
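Re-creating a symlink at disk activation time needs to be idempotent, since the link may already exist and may be stale. A minimal sketch of such a helper, with a hypothetical name and behaviour that is not taken from the Ganeti source:

```python
import os


def ensure_symlink(link_path, device_path):
    """(Re-)create link_path -> device_path, replacing a stale symlink.

    Only existing symlinks are replaced; any other kind of file at
    link_path makes os.symlink raise an error.
    """
    if os.path.islink(link_path):
        if os.readlink(link_path) == device_path:
            return  # already correct, nothing to do
        os.remove(link_path)
    os.symlink(device_path, link_path)
```

The link path would presumably be derived from the instance name and the disk index, which would explain why the RPC call has to carry the index.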
  6. Jan 27, 2011
    • cluster verify: add hvparams verification · 58a59652
      Iustin Pop authored
      
      Currently, the validity of the hypervisor parameters is only checked
      at init/modification time, and not in the cluster verify. This is bad,
      as it can lead to inconsistent state that is only detected when the
      next modification (which can be unrelated) is made, leading to
      unexpected error messages.
      
      This patch adds both syntax verification (in masterd) and validity
      verification on remote nodes. The downside of the patch is that on
      clusters with many instances which have custom parameters, it will be
      slow. A possible improvement would be to detect duplicate (identical)
      sets of parameters and collapse these into a single verification, but
      that is left as a TODO (in case it becomes problematic).
      
      An additional change is in utils.ForceDict: its error message said
      'key', but since this function is always used with parameter dicts,
      I changed it to "Unknown parameter".
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
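The improvement left as a TODO in this commit (collapsing identical parameter sets into one verification) could look roughly like the sketch below. It is not part of the patch; the function names are hypothetical, and it assumes all parameter values are hashable.

```python
def group_identical_params(instance_params):
    """Map each distinct parameter set to the instances that use it."""
    groups = {}
    for name, params in instance_params.items():
        # A sorted tuple of items is a hashable, order-independent key.
        key = tuple(sorted(params.items()))
        groups.setdefault(key, []).append(name)
    return groups


def verify_all(instance_params, validate):
    """Run validate once per distinct parameter set.

    validate(params) returns an error string or None. The result maps
    each failing instance name to its error.
    """
    errors = {}
    for key, names in group_identical_params(instance_params).items():
        err = validate(dict(key))
        if err is not None:
            for name in names:
                errors[name] = err
    return errors
```

With this grouping, a cluster where most instances share the same custom parameters would pay for one verification per distinct set rather than one per instance.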
  7. Jan 26, 2011
    • Verify disks: increase parallelism and other fixes · 397693d3
      Iustin Pop authored
      
      The recent work on multi-VG support has converted LUClusterVerifyDisks
      into doing serialised calls to each node, as each node can have
      different VGs. This is suboptimal, especially for big clusters, where
      this LU is executed by the watcher very often.
      
      This patch changes the logic based on the observation that querying a
      node for its VGs and then requesting a LV list for those VGs is
      equivalent to simply asking for all LVs, without specifying the VG
      name(s). So backend.py needs changes to accept an empty VG list, and
      the LU itself partially reverts to the previous version.
      
      Additionally, we do two other fixes to this LU:
      
      - small improvement in getting the instance list from the config
      - MapLVsByNode works for all disk types, hence no need to restrict to
        the DRBD template, especially as today we can "recreate" disks for
        plain volumes too (the warning message in gnt-cluster is updated
        too)
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
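The observation behind this change is that `lvs` without any volume group argument lists the logical volumes of all VGs. A sketch of how a backend helper might accept an empty VG list follows; the function names, the selected output fields and the result layout are assumptions of this sketch, not the actual backend.py code.

```python
def build_lvs_command(vg_names=()):
    """Build an lvs command line; an empty vg_names means "all VGs"."""
    cmd = ["lvs", "--noheadings", "--units=m", "--nosuffix",
           "--separator=|", "-ovg_name,lv_name,lv_size,lv_attr"]
    cmd.extend(vg_names)  # no VG names appended: lvs reports every VG
    return cmd


def parse_lvs_output(output):
    """Parse the '|'-separated output into {"vg/lv": (size_mib, attr)}."""
    lvs = {}
    for line in output.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        vg, lv, size, attr = fields
        lvs["%s/%s" % (vg, lv)] = (float(size), attr)
    return lvs
```

Since no per-node VG list is needed up front, the same request can be sent to all nodes in a single parallel call.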
  8. Jan 20, 2011
  9. Jan 11, 2011
  10. Jan 05, 2011
  11. Dec 21, 2010
    • Allow customisation of the disk index separator · 3536c792
      Iustin Pop authored
      
      As per issue 124, some Xen versions (or packaging) don't deal nicely
      with the colon being part of a disk name. Therefore we add a
      configure-time option for customising this.
      
      Note: setting the separator to interesting values like / is not
      handled by the code. This being a configure-time option (e.g. to be
      set by distribution packagers), we assume the person building the code
      knows what they are doing.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
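To illustrate why a separator such as "/" is problematic, here is a small sketch of a disk name built from an instance name, a separator and a disk index. The constant and function names are hypothetical, and the exact name layout is an assumption of this sketch.

```python
import os

DISK_SEPARATOR = ":"  # hypothetical name for the configure-time value


def disk_link_name(instance_name, disk_index, separator=DISK_SEPARATOR):
    """Build a per-disk name from the instance name and the disk index."""
    return "%s%s%d" % (instance_name, separator, disk_index)


if __name__ == "__main__":
    print(disk_link_name("inst1.example.com", 0))
    print(disk_link_name("inst1.example.com", 0, separator="-"))
    # With "/" the name turns into a path with an extra directory level:
    print(os.path.basename(disk_link_name("inst1.example.com", 0, "/")))
```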
  12. Dec 02, 2010
  13. Dec 01, 2010
  14. Nov 29, 2010
  15. Nov 28, 2010
  16. Nov 26, 2010
    • RPC call_node_info: change protocol · cb6a0296
      Iustin Pop authored
      
      Currently, the call_node_info RPC always checks both the VG free
      space and the hypervisor information. However, in ⅔ of the uses, we only
      care about one or the other. Therefore, we change it so that if any of
      the passed parameters is None, we don't perform the respective check. We
      also modify its callers to only pass in what they need.
      
      This also helps if the "default" hypervisor is broken and we want to
      create an instance for another hypervisor.
      
      With this patch, the duration of this rpc changes from 500ms to 90ms for
      a normal LVM+Xen PVM node, when we only require the LVM data; when we
      only require the hypervisor data, it doesn't change (as the “xm list”
      time is dominant).
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
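The "None means skip" convention can be sketched as below. The function signature and the injected query callables are hypothetical; they stand in for the real LVM and hypervisor queries.

```python
def node_info(vg_name, hypervisor, query_vg, query_hv):
    """Collect node information, skipping any check whose parameter is None.

    query_vg and query_hv are hypothetical callables returning dicts.
    """
    info = {}
    if vg_name is not None:
        info.update(query_vg(vg_name))
    if hypervisor is not None:
        info.update(query_hv(hypervisor))
    return info


if __name__ == "__main__":
    # Only the LVM data is requested, so the hypervisor is never queried.
    print(node_info("xenvg", None,
                    lambda vg: {"vg_free": 1024},
                    lambda hv: {"memory_free": 512}))
```

Skipping the hypervisor query is also what lets a caller work around a broken default hypervisor, as the commit message notes.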
  17. Nov 03, 2010
  18. Oct 28, 2010
    • Add support for vm_capable in cluster verify · 8964ee14
      Iustin Pop authored
      
      The method to make vm_capable integrate easily into cluster verify is as follows:
      
      - we add a new NV_VMNODES that represents *non*-vm-capable nodes
      - the LU populates this list (it's expected that non-vm_capable nodes
        are few compared to vm_capable nodes)
      - backend skips the checks that are related to VM hosting
      - in the LU, we reorder the VM-related checks so that they occur after
        the non-VM (generic) tests, and we only execute them conditionally
      
      Additionally, we add some support to the instance checks to detect
      instances living on bad nodes.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
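The ordering described above (generic checks first, VM-related checks only for vm_capable nodes) can be sketched as follows. The function and argument names are hypothetical; the list of non-vm-capable nodes plays the role that NV_VMNODES has in the commit message.

```python
def run_node_checks(node, non_vm_nodes, generic_checks, vm_checks):
    """Run generic checks, then VM checks unless the node is not vm_capable.

    non_vm_nodes lists the *non*-vm-capable nodes, which are expected to
    be few compared to the vm_capable ones.
    """
    results = [check(node) for check in generic_checks]
    if node not in non_vm_nodes:
        results.extend(check(node) for check in vm_checks)
    return results


if __name__ == "__main__":
    generic = [lambda n: "ssh ok on %s" % n]
    vm_only = [lambda n: "hypervisor ok on %s" % n]
    print(run_node_checks("node1", ["node3"], generic, vm_only))
    print(run_node_checks("node3", ["node3"], generic, vm_only))
```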
  19. Oct 26, 2010
  20. Oct 25, 2010
  21. Oct 22, 2010
  22. Sep 30, 2010
    • Abstract OS name/variant functions · 870dc44c
      Iustin Pop authored
      
      Currently, the computation of the 'pure' name or the variant is
      hardcoded and spread around the functions that need it. This is not
      nice, and in the future we'd spread it even more with more usage of
      variants/pure os names.
      
      This patch abstracts these functions into the OS class, and then
      replaces the hardcoded uses with the new functions.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
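A sketch of what such helpers might look like when gathered in one place. It assumes that a full OS name has the form "name+variant" with "+" as the separator; neither that separator nor the function names appear in the commit message.

```python
def get_os_name(full_name):
    """Return the 'pure' OS name, i.e. the part before the variant."""
    return full_name.split("+", 1)[0]


def get_os_variant(full_name):
    """Return the variant part, or an empty string if there is none."""
    parts = full_name.split("+", 1)
    return parts[1] if len(parts) > 1 else ""


if __name__ == "__main__":
    print(get_os_name("debootstrap+lenny"), get_os_variant("debootstrap+lenny"))
```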
  23. Sep 23, 2010
  24. Sep 13, 2010
  25. Sep 07, 2010
  26. Sep 03, 2010
  27. Aug 23, 2010
  28. Aug 20, 2010
  29. Aug 19, 2010
  30. Aug 18, 2010
    • Support for resolving hostnames to IPv6 addresses · b705c7a6
      Manuel Franceschini authored
      
      This patch enables IPv6 name resolution by using socket.getaddrinfo
      instead of socket.gethostbyname_ex.
      
      It renames the HostInfo class to Hostname and unifies its use throughout
      the code by using static calls where no object is needed. It also
      removes some obsolete code.
      
      For now, we just resolve to IPv4 addresses, but this will change once it
      is needed.
      
      Signed-off-by: Manuel Franceschini <livewire@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
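For illustration, resolving a name with socket.getaddrinfo while still restricting the result to IPv4, as the commit message describes for now, might look like this. The function name is hypothetical and this is not the Hostname class itself.

```python
import socket


def resolve_ipv4(hostname):
    """Resolve hostname to its first IPv4 address using getaddrinfo.

    Raises socket.gaierror if the name cannot be resolved.
    """
    results = socket.getaddrinfo(hostname, None,
                                 socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return results[0][4][0]


if __name__ == "__main__":
    print(resolve_ipv4("127.0.0.1"))
```

Switching to IPv6 later would then mostly be a matter of passing a different address family, which gethostbyname_ex cannot do.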
    • Introduce new IPAddress classes · 8b312c1d
      Manuel Franceschini authored
      
      This patch unifies the netutils functions dealing with IP addresses into
      three classes:
      - IPAddress: Common IP address functionality
      - IPv4Address: IPv4-specific functionality
      - IPv6address: IPv6-specific functionality
      
      Furthermore it adds methods to check whether an address is a loopback
      address, replacing the .startswith("127") for IPv4 and adding IPv6
      support.
      
      It also provides the basis for future IPv6 address handling. Methods to
      convert IP strings to their corresponding integer values will make it
      possible to canonicalize IPv6 addresses.
      
      Signed-off-by: Manuel Franceschini <livewire@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
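The two ideas in this commit (integer conversion and a loopback test that does not rely on string prefixes) can be sketched with the standard socket module. These are hypothetical free functions written for Python 3, not the classes introduced by the patch.

```python
import socket


def ip_to_int(address, family):
    """Convert an IP address string to its integer value."""
    packed = socket.inet_pton(family, address)  # raises OSError if invalid
    return int.from_bytes(packed, "big")


def is_loopback(address):
    """Return True for 127.0.0.0/8 (IPv4) or ::1 (IPv6)."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    value = ip_to_int(address, family)
    if family == socket.AF_INET:
        return value >> 24 == 127  # top octet selects 127.0.0.0/8
    return value == 1  # ::1 is the only IPv6 loopback address


if __name__ == "__main__":
    print(is_loopback("127.0.0.1"), is_loopback("::1"), is_loopback("10.0.0.1"))
```

Comparing integer values also means that different spellings of the same IPv6 address, such as "::1" and "0:0:0:0:0:0:0:1", are treated identically.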
  31. Jul 29, 2010
    • Instance migration: remove error on missing link · b8ebd37b
      Iustin Pop authored
      
      Since we don't support upgrades from 1.2.4 without restarting the
      instance, the 'not restarted since 1.2.5' check/error is
      wrong/misleading.
      
      Since the live migration works without the links (it recreates
      them during the disk reconfiguration anyway), we remove the error and
      replace it with a warning (to the node daemon log only,
      unfortunately).
      
      For 2.3, we'll need to change the symlink creation from instance start
      time to disk activation time (but that requires more RPC changes).
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
  32. Jul 26, 2010
    • Change the meaning of call_node_start_master · 91492e57
      Iustin Pop authored
      
      Currently, backend.StartMaster (the function behind this RPC call) will
      activate the master IP and then, if the start_daemons parameter is true,
      it will also activate the master role.
      
      While this works, it has two issues:
      
      - first, it will activate the master IP unconditionally, even if this
        node will not start the master daemon due to missing votes
      - second, the activation of the IP is done twice if start_daemons is
        true, because the master daemon does its own activation too
      
      This behaviour seems to be unmodified since Summer 2008, so probably any
      rationale on why this is done in two places is forgotten.
      
      The patch changes this function so that it does *either* IP activation or
      master role activation but not both. So the IP will be activated only
      once (from the master daemon or from LURenameCluster), and it will only
      be done if the masterd got enough votes for startup.
      
      I can see only one downside to this change: if masterd won't actually
      start (due to missing votes), RAPI will still start, and without the
      master IP activated. But this is no worse than before, when both RAPI
      was running and the IP was activated.
      
      Note that the behaviour of StopMaster remains the same, as no one else
      does the IP removal.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
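The "either, but not both" behaviour after this patch can be sketched as below. The function signature and the two injected callables are hypothetical; they stand in for the real IP activation and daemon startup code.

```python
def start_master(start_daemons, activate_ip, start_master_daemons):
    """Do either master role activation or IP activation, never both.

    When the daemons are started, the master daemon is expected to
    activate the IP itself once it has enough votes.
    """
    if start_daemons:
        start_master_daemons()
        return "daemons"
    activate_ip()
    return "ip"


if __name__ == "__main__":
    events = []
    start_master(True, lambda: events.append("ip"),
                 lambda: events.append("daemons"))
    print(events)
```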