    Optimise cli.JobExecutor with many pending jobs · 11705e3d
    Iustin Pop authored
    
    When we submit many pending jobs (> 100) to masterd, the JobExecutor
    'spams' the master daemon with status requests for all of the jobs,
    even though in the end it will choose only a single job for polling.
    
    This is very sub-optimal, because when the master is busy processing
    small/fast jobs, this query forces reading all the jobs from
    disk. Restricting the 'window' of jobs that we query from the entire
    set to a smaller subset makes a huge difference (masterd only, 0s
    delay jobs, all jobs on tmpfs, thus no I/O involved):
    
    - submitting/waiting for 500 jobs:
      - before: ~21 s
      - after:   ~5 s
    - submitting/waiting for 1K jobs:
      - before: ~76 s
      - after:   ~8 s
    
    This is with a batch of 25 jobs. With a batch of 50 jobs, the time
    goes from ~8 s to ~12 s. I think that choosing the 'best' job for nice
    output only matters with a small number of jobs, and that with more
    than that people will not actually watch the jobs. So changing from
    'perfect job' to 'best job in the first 25' should be OK.
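
    The windowing idea can be sketched roughly as follows. This is an
    illustration only, not the actual cli.JobExecutor code: the names
    choose_job, query_status, STATUS_RANK and WINDOW_SIZE, as well as the
    status strings and their ordering, are assumptions made for the
    example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical status names and preference order (lower is "better" to poll);
# these are assumptions for the sketch, not taken from the Ganeti source.
STATUS_RANK = {"running": 0, "waiting": 1, "queued": 2}

# Batch size mentioned in the commit message.
WINDOW_SIZE = 25


def choose_job(pending: Sequence[int],
               query_status: Callable[[List[int]], Dict[int, str]],
               window: int = WINDOW_SIZE) -> int:
    """Pick one job to poll, querying only the first `window` pending jobs."""
    if not pending:
        raise ValueError("no pending jobs")
    # Only this subset is sent to the master daemon, not the whole list.
    subset = list(pending[:window])
    statuses = query_status(subset)

    def rank(job_id: int) -> int:
        # Unknown or missing statuses sort after all known ones.
        return STATUS_RANK.get(statuses.get(job_id, ""), len(STATUS_RANK))

    # min() returns the first minimal element, so ties keep submission order.
    return min(subset, key=rank)
```

    With 1000 pending jobs, a single call asks for the status of 25 jobs
    instead of 1000; the price is that a 'better' job outside the window
    is not considered until earlier jobs have finished.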
    
    Note that most jobs won't execute as fast as 0 delay, but this is
    still a good improvement.
    
    Signed-off-by: Iustin Pop <iustin@google.com>
    Reviewed-by: Guido Trotter <ultrotter@google.com>
    Reviewed-by: Michael Hanselmann <hansmi@google.com>