1. 18 Sep, 2012 1 commit
  2. 18 Jul, 2012 1 commit
      Fix inconsistency in the LUXI protocol w.r.t. args · 734a2a7c
      René Nussbaumer authored
      
      
      This inconsistency was found during rebalancing. Hbal failed because
      Ganeti couldn't load the opcode. Digging into the cause revealed an
      inconsistency in the "args" field of the LUXI protocol, exposed by the
      Template Haskell side, where the field is handled uniformly.
      
      For SubmitJob and SubmitManyJobs we treat args as a single argument
      containing the job definition, while in every other LUXI call args is a
      list of arguments. This patch fixes the inconsistency (a sketch of the
      uniform encoding follows this entry).
      
      This change is NOT backwards compatible.
      Signed-off-by: René Nussbaumer <rn@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      734a2a7c
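
      A minimal sketch of the uniform encoding this change enforces on the
      wire (illustrative only, not the actual luxi client code):

        import json

        def encode_request(method, args):
            # Under the fixed protocol, "args" is always a JSON list, for
            # every LUXI method.
            assert isinstance(args, list)
            return json.dumps({"method": method, "args": args})

        # SubmitJob's single argument (the job definition, itself a list of
        # opcodes) is wrapped in a one-element args list like any other call:
        job_def = [{"OP_ID": "OP_TEST_DELAY", "duration": 0}]
        print(encode_request("SubmitJob", [job_def]))
        # Other calls simply pass their arguments as list elements:
        print(encode_request("CancelJob", ["1234"]))
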
  3. 08 May, 2012 1 commit
      Correct capitalisation of two Luxi calls · 83c046a2
      Iustin Pop authored
      
      
      Two Luxi calls have an inconsistent name/value mapping (in the Python
      code):
      
      - REQ_AUTOARCHIVE_JOBS versus AutoArchiveJobs (versus AutoarchiveJobs)
      - REQ_QUEUE_SET_DRAIN_FLAG versus SetDrainFlag (no Queue)
      
      While these are only a consistency issue, let's fix them so that the
      Haskell code (which uses the auto-generated camel-case form) doesn't
      need to special-case them and looks more like the Python code (hah,
      joke!).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
      83c046a2
  4. 26 Apr, 2012 1 commit
  5. 22 Feb, 2012 1 commit
  6. 21 Dec, 2011 1 commit
      serializer: Remove JSON indentation and dict key sorting · a182a3ed
      Michael Hanselmann authored
      
      
      Serializing to JSON using “simplejson” is significantly slower when
      indentation and/or sorting of dictionary keys is used. In simplejson 1.x
      the difference isn't that big, but with simplejson 2.x the difference
      can be up to a factor of 7.5. The reason is that the latter no longer
      uses C functions when sorting or indentation is used.
      
      With this patch we revert everything to simplejson's defaults, which
      should give us the best available performance (a rough illustration
      follows this entry).
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      a182a3ed
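
      A rough way to see the effect locally (assuming the simplejson module
      is installed; the measured factors are those in the message above):

        import timeit
        import simplejson

        data = {"nodes": ["node%d" % i for i in range(1000)],
                "params": dict(("key%d" % i, i) for i in range(1000))}

        # Indentation and key sorting force the slower encoder paths:
        slow = timeit.timeit(
            lambda: simplejson.dumps(data, indent=2, sort_keys=True),
            number=100)
        # The defaults can use simplejson's C speedups:
        fast = timeit.timeit(lambda: simplejson.dumps(data), number=100)
        print("with indent+sort: %.3fs, defaults: %.3fs" % (slow, fast))
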
  7. 12 Oct, 2011 2 commits
      Standardise LUXI call argument types · a629ecb9
      Iustin Pop authored
      
      
      Currently, we have 4 types of arguments in LUXI calls:
      
      - most common, a list of values
      - a single argument that is sent as a list of one element
      - a single argument that is sent by itself
      - a dictionary (only Query and QueryFields)
      
      This inconsistency makes it harder not only to auto-generate the
      HTools LUXI interface, but also, in general, to check the arguments and
      (if we ever want to) auto-generate the Python LUXI client.
      
      Compare this with the node daemon, which consistently uses a list for
      its arguments and, even with far more changes over time, has had no
      issues extending the interface.
      
      In case we want to extend a call, there are two options:
      
      - preferred: add a new call, keep the old one unchanged
      - possible: add further parameters to the current argument list
      
      The patch against HTools will follow; it is sent separately, as the
      Python changes are clear on their own (a sketch of the standardised
      argument checking follows this entry).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
      a629ecb9
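
      A sketch of why a uniform list makes argument checking and
      auto-generation straightforward (the table and helper below are
      hypothetical, not the actual Ganeti code):

        # With every call taking a list, a single table is enough to
        # describe and validate the interface.
        CALL_ARGS = {
            "CancelJob": 1,        # [job_id]
            "ArchiveJob": 1,       # [job_id]
            "Query": 3,            # [what, fields, qfilter]
        }

        def check_call(method, args):
            if not isinstance(args, list):
                raise TypeError("LUXI arguments must be a list")
            expected = CALL_ARGS[method]
            if len(args) != expected:
                raise ValueError("%s expects %d arguments, got %d" %
                                 (method, expected, len(args)))

        check_call("Query", ["instance", ["name"], []])
        print("Query arguments accepted")
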
      Rename filter and filter_ to qfilter · 2e5c33db
      Iustin Pop authored
      
      
      We currently use 'filter' as the OpCode, QueryRequest and RAPI field
      name for representing a query filter. However, since 'filter' is a
      built-in function, we actually have to use filter_ throughout the code
      in order not to shadow the built-in.
      
      This patch simply does a global sed over the code. Because the RAPI
      interface already exposed this field, we add compatibility code for now
      that handles both forms (sketched after this entry).
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Michael Hanselmann <hansmi@google.com>
      2e5c33db
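
      A sketch of the kind of compatibility handling this needs on the RAPI
      side (extract_qfilter is a hypothetical helper, not the actual RAPI
      code):

        def extract_qfilter(body):
            # Accept both the old "filter" key and the new "qfilter" key,
            # rejecting requests where the two disagree.
            old = body.get("filter")
            new = body.get("qfilter")
            if old is not None and new is not None and old != new:
                raise ValueError("'filter' and 'qfilter' disagree")
            return new if new is not None else old

        print(extract_qfilter({"qfilter": ["=", "name", "inst1.example.com"]}))
        print(extract_qfilter({"filter": ["=", "name", "inst1.example.com"]}))
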
  8. 30 Aug, 2011 1 commit
  9. 02 May, 2011 1 commit
  10. 15 Mar, 2011 1 commit
  11. 07 Jan, 2011 1 commit
  12. 06 Jan, 2011 1 commit
  13. 20 Dec, 2010 1 commit
  14. 13 Dec, 2010 1 commit
  15. 01 Dec, 2010 1 commit
  16. 01 Nov, 2010 1 commit
  17. 28 Oct, 2010 2 commits
  18. 24 Aug, 2010 1 commit
      Add simple lock monitor · 19b9ba9a
      Michael Hanselmann authored
      
      
      This patch adds an initial implementation of a lock monitor, accessible
      for the user through “gnt-debug locks”. It currently shows all resource
      locks: BGL, nodes and instances. Config and job queue locks could be
      shown too, but wouldn't be of much help.  The current owner(s) and mode
      are also shown.
      
      Showing pending acquires will require further changes on the SharedLock
      internals and is not yet implemented.
      
      Example output:
      $ gnt-debug locks -o name,mode,owner
      Name            Mode      Owner
      BGL/BGL         shared    JobQueue19/Job147
      instances/inst1 exclusive JobQueue19/Job147
      instances/inst2 -         -
      instances/inst3 -         -
      instances/inst4 -         -
      nodes/node1     exclusive JobQueue19/Job147
      nodes/node2     exclusive JobQueue19/Job147
      Signed-off-by: Michael Hanselmann <hansmi@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      19b9ba9a
  19. 28 Jul, 2010 1 commit
  20. 18 May, 2010 1 commit
      Abstract the LUXI eom into a constant · 25942a6c
      Guido Trotter authored
      
      
      Currently the EOM terminator is hardcoded on the server side, and is
      customizable in the Transport object (with the default being the same as
      the value found in the server), but not in the luxi client.
      
      With this patch we move the value to constants and remove the "fake"
      customizability, which would just break client/server communication. If
      we ever need a luxi transport with a different terminator, it's easy
      enough to add it back (a sketch of the framing follows this entry).
      Signed-off-by: Guido Trotter <ultrotter@google.com>
      Reviewed-by: Iustin Pop <iustin@google.com>
      25942a6c
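
      A minimal sketch of the framing this touches (LUXI_EOM here is an
      illustrative name and value; the real definition lives in the shared
      constants module):

        # One terminator shared by the server and the client Transport.
        LUXI_EOM = "\x03"

        def frame_message(payload):
            # Sender side: append the terminator to every serialised message.
            return payload + LUXI_EOM

        def split_messages(buf):
            # Receiver side: split one complete message off the buffer, if a
            # full terminator has already arrived.
            msg, sep, rest = buf.partition(LUXI_EOM)
            return (msg, rest) if sep else (None, buf)

        print(split_messages(frame_message("hello") + "partial"))
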
  21. 11 May, 2010 1 commit
  22. 22 Feb, 2010 2 commits
  23. 22 Jan, 2010 2 commits
  24. 05 Jan, 2010 1 commit
      Introduce a Luxi call for GetTags · 7699c3af
      Iustin Pop authored
      
      
      This changes the cli scripts from submitting jobs to get the tags to
      using queries, which (since the tags query is a cheap one) should be
      much faster.
      
      The tags queries are already done without locks (in the generic query
      paths for instances/nodes/cluster), so this shouldn't break tags
      queries via gnt-* list-tags.
      
      On a small cluster, the runtime of gnt-cluster/gnt-instance list-tags
      more than halves; on a big cluster (with many MCs) I expect it to be
      more than 5 times faster. The speed of getting the tags is not the main
      gain; the gain is eliminating a job when a simple query is enough.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: René Nussbaumer <rn@google.com>
      7699c3af
  25. 04 Jan, 2010 1 commit
  26. 13 Oct, 2009 1 commit
  27. 25 Sep, 2009 1 commit
  28. 27 Aug, 2009 1 commit
  29. 26 Aug, 2009 1 commit
  30. 19 Jul, 2009 1 commit
      Add a luxi call for multi-job submit · 56d8ff91
      Iustin Pop authored
      
      
      As a workaround for the job submit timeouts that we have, this patch
      adds a new luxi call for multi-job submit; the advantage is that all the
      jobs are added to the queue first, and only afterwards can the workers
      start processing them (a sketch of the two submission styles follows
      this entry).
      
      This is definitely faster than per-job submit, where the submission of
      new jobs competes with the workers processing jobs.
      
      On a pure no-op OpDelay opcode (not on master, not on nodes), we have:
        - 100 jobs:
          - individual: submit time ~21s, processing time ~21s
          - multiple:   submit time 7-9s, processing time ~22s
        - 250 jobs:
          - individual: submit time ~56s, processing time ~57s
                        run 2:      ~54s                  ~55s
          - multiple:   submit time ~20s, processing time ~51s
                        run 2:      ~17s                  ~52s
      
      which shows that we indeed gain on the client side, and maybe even on
      the total processing time for a high number of jobs. For just 10 or so I
      expect the difference to be just noise.
      
      This will probably require increasing the timeout a little when
      submitting too many jobs - 250 jobs at ~20 seconds is close to the
      current rw timeout of 60s.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      (cherry picked from commit 2971c913)
      56d8ff91
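
      A sketch of the difference from the client's point of view (FakeClient
      is a stand-in just to make the sketch runnable; the real luxi client
      exposes SubmitJob and the new SubmitManyJobs):

        def submit_individually(client, jobs):
            # One round-trip per job; each submission competes with the
            # workers already processing earlier jobs.
            return [client.SubmitJob(job) for job in jobs]

        def submit_in_one_call(client, jobs):
            # A single round-trip; all jobs are queued before any worker
            # starts on them, which is where the submit-time gain comes from.
            return client.SubmitManyJobs(jobs)

        class FakeClient(object):
            # Minimal stand-in for the luxi client.
            def __init__(self):
                self.counter = 0
            def SubmitJob(self, job):
                self.counter += 1
                return str(self.counter)
            def SubmitManyJobs(self, jobs):
                return [self.SubmitJob(job) for job in jobs]

        jobs = [[{"OP_ID": "OP_TEST_DELAY", "duration": 0}]] * 3
        print(submit_individually(FakeClient(), jobs))
        print(submit_in_one_call(FakeClient(), jobs))
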
  31. 07 Jul, 2009 3 commits
  32. 21 May, 2009 1 commit
      Add a luxi call for multi-job submit · 2971c913
      Iustin Pop authored
      
      
      As a workaround for the job submit timeouts that we have, this patch
      adds a new luxi call for multi-job submit; the advantage is that all the
      jobs are added to the queue first, and only afterwards can the workers
      start processing them.
      
      This is definitely faster than per-job submit, where the submission of
      new jobs competes with the workers processing jobs.
      
      On a pure no-op OpDelay opcode (not on master, not on nodes), we have:
        - 100 jobs:
          - individual: submit time ~21s, processing time ~21s
          - multiple:   submit time 7-9s, processing time ~22s
        - 250 jobs:
          - individual: submit time ~56s, processing time ~57s
                        run 2:      ~54s                  ~55s
          - multiple:   submit time ~20s, processing time ~51s
                        run 2:      ~17s                  ~52s
      
      which shows that we indeed gain on the client side, and maybe even on
      the total processing time for a high number of jobs. For just 10 or so I
      expect the difference to be just noise.
      
      This will probably require increasing the timeout a little when
      submitting too many jobs - 250 jobs at ~20 seconds is close to the
      current rw timeout of 60s.
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Guido Trotter <ultrotter@google.com>
      2971c913
  33. 04 Feb, 2009 2 commits
      Add one new luxi query: cluster info · 66baeccc
      Iustin Pop authored
      This is the last query that RAPI executes via opcodes, and it is purely
      static (config values only). As such, we can safely convert it to a
      query instead of a job.
      
      Reviewed-by: imsnah
      66baeccc
      Implement lockless query operations · ec79568d
      Iustin Pop authored
      This patch adds the framework for, and enables, lockless
      OpQueryInstances. This means that instances will be shown in ERROR_up
      or ERROR_down state, even though this is not an error (but just an
      in-progress job).
      
      The framework is implemented as follows:
        - the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take
          an additional “use_locking” flag which will denote whether to lock
          or not; this patch only implements this for LUQueryInstances
        - the luxi query functions take an additional argument use_locking
          which is passed to the master daemon, and then passed to the above
          opcodes
        - cli.py exports a new SYNC_OPT command line option which implements
          setting this flag to true
        - except for gnt-instance list, which uses this option, and for
          name-only queries (e.g. QueryNodes(fields=["names"])), all other
          callers set this flag to True
        - RAPI also sets the flag to True
      
      The patch was tested with a continuous (0.2s sleep in-between)
      gnt-instance list during a burnin, and no problems were observed.
      
      Reviewed-by: ultrotter
      ec79568d
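
      A sketch of how the use_locking flag travels from the client call down
      to the opcode (the dict below is only illustrative; the real flow goes
      through the luxi client, the master daemon and OpQueryInstances):

        def QueryInstances(fields, names, use_locking):
            # The luxi client just forwards the flag; the master daemon
            # copies it into the opcode before running the LU.
            return {"op": "OpQueryInstances",
                    "output_fields": fields,
                    "names": names,
                    "use_locking": use_locking}

        # SYNC_OPT sets use_locking=True; the default, lockless path may
        # transiently report instances as ERROR_up/ERROR_down.
        print(QueryInstances(["name", "status"], [], False))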