1. 11 Apr, 2012 2 commits
    • Further fixes concerning drbd port release · 42f25b0b
      Dimitris Aragiorgis authored
      Commit 3b3b1bca does not entirely fix the bug introduced in commit
      f396ad8c. It fixes consistency of config data in permanent storage, but
      does not ensure consistency in data held in runtime memory of masterd.
      
      The duplicate-port bug is still triggered when LUInstanceRemove()
      invokes _RemoveDisks() and it returns False (which happens when the
      call_blockdev_remove RPC fails). The DRBD ports are returned to the
      pool, but execution is aborted and RemoveInstance() is never invoked.
      
      Because port handling is not done with TemporaryReservationManager,
      ensure that ports are released only if the disk-related config data
      is deleted.
      
      In _RemoveDisks() release ports only if all RPCs succeed.
      
      Extend _RemoveDisks() with an ignore_failures argument, passed by
      _RemoveInstance(), to handle the ports appropriately.
      Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
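The port-release rule described in this commit can be illustrated with a minimal sketch. It is not the Ganeti implementation: `remove_disks`, `remove_blockdev` and `release_port` are hypothetical names, and the assumption that `ignore_failures=True` still releases the ports (because the instance removal then proceeds anyway) is one reading of the message above.

```python
def remove_disks(disks, remove_blockdev, release_port, ignore_failures=False):
    """Remove all disks and release their ports only when it is safe.

    remove_blockdev(disk) stands in for the blockdev-remove RPC and returns
    True on success; release_port(port) stands in for the port pool.
    Both are hypothetical callables used only for this sketch.
    """
    all_ok = True
    ports = []
    for disk in disks:
        if not remove_blockdev(disk):
            all_ok = False
        if disk.get("port") is not None:
            ports.append(disk["port"])

    # Release ports only if every removal succeeded, or if the caller has
    # said that failures are ignored (assumed: the config is deleted anyway).
    if all_ok or ignore_failures:
        for port in ports:
            release_port(port)
    return all_ok
```

With this shape, a failed removal leaves the pool untouched, so an aborted operation cannot hand out a port that the config still references.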
    • Fix a bug concerning TCP port release · 2522b7c4
      Dimitris Aragiorgis authored
      Commit f396ad8c returns the TCP port used by a DRBD disk to the
      TCP/UDP port pool using AddTcpUdpPort().
      
      However, AddTcpUdpPort() writes the config on every invocation,
      using _WriteConfig(). This causes two problems:
      
       * it causes critical errors to be logged by VerifyConfig() after the
         DRBD disk removal and until the actual instance removal.
       * if the code following AddTcpUdpPort() fails, the port has already
         been returned to the pool, which leads to duplicate ports
         (inconsistent config).
      
      AddTcpUdpPort() is invoked in three cases:
      
       * during InstanceRemove() through _RemoveDisks().
       * during InstanceSetParams() in case of disk removal.
       * during InstanceSetParams() through _ConvertDrbdToPlain().
      
      This commit fixes the problem by removing the _WriteConfig() call from
      AddTcpUdpPort(), delegating the write to Update() via the
      TemporaryReservationManager, and ensuring that AddTcpUdpPort() precedes
      Update().
      Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
      [iustin@google.com: small comment adjustments]
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      (cherry picked from commit 3b3b1bca)
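The idea of deferring the write can be sketched as follows. This is a simplified illustration, not Ganeti's ConfigWriter: `PortConfig` and its methods are hypothetical, and the TemporaryReservationManager mentioned above is left out entirely.

```python
class PortConfig:
    """Toy config object: port release is in-memory, update() persists."""

    def __init__(self):
        self.port_pool = set()
        self.instances = {}
        self.writes = 0  # counts simulated writes to permanent storage

    def _write_config(self):
        # Stand-in for serializing the config to disk.
        self.writes += 1

    def add_tcp_udp_port(self, port):
        # Only touches memory; no write happens here any more.
        self.port_pool.add(port)

    def update(self, name, instance):
        # The single place where the pending state is written out.
        self.instances[name] = instance
        self._write_config()


cfg = PortConfig()
cfg.add_tcp_udp_port(11000)               # port back in the pool, not yet on disk
cfg.update("instance1", {"disks": []})    # disk removal and port release written together
```

Because the port release and the disk removal reach permanent storage in one write, there is no window in which the stored config lists the port both as free and as in use.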
  2. 30 Nov, 2011 2 commits
  3. 24 Nov, 2011 2 commits
    • ConfigWriter: Fix epydoc error · 1730d4a1
      Michael Hanselmann authored
      
      
      The parameter is called “mods”, not “modes”.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
    • LUGroupAssignNodes: Fix node membership corruption · 218f4c3d
      Michael Hanselmann authored
      
      
      Note: This bug only manifests itself in Ganeti 2.5, but since the
      problematic code also exists in 2.4, I decided to fix it there.
      
      If a node was assigned to a new group using “gnt-group assign-nodes” the
      node object's group would be changed, but not the duplicate member list
      in the group object. The latter is an optimization to require fewer
      locks for other operations. The per-group member list is only kept in
      memory and not written to disk.
      
      Ganeti 2.5 starts to make use of the data kept in the per-group member
      list and consequently fails when it is out of date. The following
      commands can be used to reproduce the issue in 2.5 (in 2.4 the issue was
      confirmed using additional logging):
      
        $ gnt-group add foo
        $ gnt-group assign-nodes foo $(gnt-node list --no-header -o name)
        $ gnt-cluster verify  # Fails with KeyError
      
      This patch moves the code modifying node and group objects into
      “config.ConfigWriter” to do the complete operation under the config
      lock, and also to avoid making use of side-effects of modifying objects
      without calling “ConfigWriter.Update”. A unittest is included.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
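A rough sketch of the approach described in this commit, keeping the node's group and the per-group member list in step under a single lock. `GroupConfig` and `assign_group_nodes` are hypothetical names chosen for the illustration; the real code lives in “config.ConfigWriter” and differs in detail.

```python
import threading


class GroupConfig:
    """Toy config holding node -> group and the duplicate member lists."""

    def __init__(self, node_group, groups):
        self._lock = threading.Lock()
        self.node_group = dict(node_group)            # node name -> group name
        self.group_members = {g: set() for g in groups}
        for node, group in self.node_group.items():
            self.group_members[group].add(node)

    def assign_group_nodes(self, mods):
        """Apply a list of (node, new_group) pairs as one operation."""
        with self._lock:
            # Validate everything first so a bad entry changes nothing.
            for node, new_group in mods:
                if node not in self.node_group:
                    raise KeyError(node)
                if new_group not in self.group_members:
                    raise KeyError(new_group)
            # Update both views of the membership together.
            for node, new_group in mods:
                old_group = self.node_group[node]
                self.group_members[old_group].discard(node)
                self.group_members[new_group].add(node)
                self.node_group[node] = new_group
```

Doing both updates inside one method means callers never modify the node object on its own, which is the side effect the commit message warns about.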
  4. 14 Nov, 2011 2 commits
  5. 27 Oct, 2011 1 commit
  6. 20 Oct, 2011 1 commit
  7. 18 Oct, 2011 1 commit
  8. 12 Oct, 2011 1 commit
    • rpc: Disable HTTP client pool and reduce memory consumption · 05927995
      Michael Hanselmann authored
      
      
      We noticed that “ganeti-masterd” can use large amounts of memory,
      especially on large clusters. Measurements showed a single PycURL client
      using about 500 kB of heap memory (the actual usage depends on versions,
      build options and settings).
      
      The RPC client uses a per-thread HTTP client pool with one client per
      node. At this time there are 41 non-main threads (25 for the job queue
      and 16 for client requests). This means the HTTP client pools use a lot
      of memory (ca. 200 MB for 10 nodes, ca. 1 GB for 50 nodes).
      
      This patch disables the per-thread HTTP client pool. No cleanup of
      unused code is done. That will be done in the master branch only.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
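The memory figures quoted in this commit follow from simple multiplication, reproduced here as a back-of-the-envelope check. The constants come from the message; the function name and the choice of 1 MB = 1000 kB are assumptions of this sketch.

```python
JOB_QUEUE_THREADS = 25   # from the commit message
CLIENT_THREADS = 16      # from the commit message
CLIENT_KB = 500          # measured heap use of one PycURL client (approximate)


def pool_memory_mb(nodes):
    """Estimate memory used by per-thread pools with one client per node."""
    threads = JOB_QUEUE_THREADS + CLIENT_THREADS  # 41 non-main threads
    return threads * nodes * CLIENT_KB / 1000.0


print(pool_memory_mb(10))  # 205.0, i.e. ca. 200 MB
print(pool_memory_mb(50))  # 1025.0, i.e. ca. 1 GB
```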
  9. 06 Sep, 2011 1 commit
  10. 26 Aug, 2011 1 commit
  11. 23 Aug, 2011 5 commits
  12. 19 Aug, 2011 1 commit
  13. 05 Aug, 2011 3 commits
  14. 04 Aug, 2011 1 commit
  15. 03 Aug, 2011 3 commits
  16. 28 Jul, 2011 1 commit
  17. 26 Jul, 2011 1 commit
  18. 25 Jul, 2011 3 commits
  19. 22 Jul, 2011 2 commits
  20. 11 Jul, 2011 2 commits
    • ht: Add new check for numbers · 697f49d5
      Michael Hanselmann authored
      
      
      Places which receive floats can usually also deal with integers, e.g.
      OpTestDelay. Tests are added, and the new check function is used for
      the aforementioned opcode and for verifying query results.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
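A check of this kind might look roughly like the following. `is_number` is a hypothetical helper, not the function added to the `ht` module, and rejecting booleans is a choice made only for this sketch.

```python
def is_number(value):
    """Return True for integers and floats, so either can be passed."""
    # bool is a subclass of int in Python; this sketch excludes it.
    if isinstance(value, bool):
        return False
    return isinstance(value, (int, float))
```

A field validated this way accepts both `5` and `5.0`, which is the behaviour the message describes for a delay duration.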
    • Fix off-by-one bug in job serial generation · 3c88bf36
      Michael Hanselmann authored
      Commit 009e73d0 (September 2009) changed the job queue to generate
      multiple job serials at once. Ever since, it has returned one more than
      requested.
      
      The “serial” file in the job queue directory is defined to contain the
      “last job ID used” (design-2.0). With the change above, the serial file
      would always contain the next serial number. The first value returned by
      the generating function was the one contained in the file, so during the
      switch in 2009 one job may have been overwritten.
      
      This patch changes the code to always return the exact number of
      serials requested and to keep the last used serial on disk, and it
      adds an assertion.
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
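The intended invariant, where the stored value is the last job ID used and exactly the requested number of serials is returned, can be sketched like this. `new_serials` is a hypothetical function; the real code also reads and writes the “serial” file, which is omitted here.

```python
def new_serials(last_used, count):
    """Return (serials, new_last_used) for `count` fresh job serials.

    `last_used` is the value that would be stored in the serial file,
    i.e. the last job ID already handed out.
    """
    assert count > 0
    # Start one past the last used ID and produce exactly `count` values.
    serials = list(range(last_used + 1, last_used + count + 1))
    assert len(serials) == count
    # The value to store is the last serial handed out, not the next one.
    return serials, serials[-1]


serials, last = new_serials(41, 3)
# serials == [42, 43, 44] and last == 44
```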
  21. 01 Jul, 2011 2 commits
  22. 28 Jun, 2011 2 commits
    • Fix bug in recreate-disks for DRBD instances · b768099e
      Iustin Pop authored
      
      
      The new functionality in 2.4.2 for recreate-disks to change nodes is
      broken for DRBD instances: it simply changes the nodes without caring
      for the DRBD minors mapping, which will lead to conflicts in non-empty
      clusters.
      
      This patch changes the Exec() method of this LU significantly, both to
      fix the DRBD minor usage and to make sure that we don't have partial
      modifications to the instance objects:
      
      - the first half of the method makes all the checks and computes the
        needed configuration changes
      - the second half then performs the configuration changes and
        recreates the disks
      
      This way, instances will either be fully modified or not at all;
      whether the disks are successfully recreated is another point, but at
      least we'll have the configuration sane.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
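The two-halves structure described in this commit, with all checks first and all modifications afterwards, is sketched below in generic form. `plan_and_apply` and `validate` are hypothetical; the actual Exec() method deals with DRBD minors and disk objects, which are not modelled here.

```python
def plan_and_apply(instance, requested, validate):
    """Apply `requested` changes to `instance` only if all of them validate.

    validate(key, value) is a hypothetical callable that raises ValueError
    for an unacceptable change.
    """
    # First half: run every check and compute the full set of changes.
    planned = {}
    for key, value in requested.items():
        validate(key, value)
        planned[key] = value

    # Second half: all checks have passed, so apply everything at once.
    instance.update(planned)
    return instance
```

If any check raises, the function exits before the second half, so the instance is left exactly as it was.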
    • Fix a lint warning · 78ff9e8f
      Iustin Pop authored
      Patch db8e5f1c removed the use of feedback_fn, hence pylint warns
      now.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>