Commit f828f4aa authored by Iustin Pop

Parallelise instance allocation/capacity computation

This patch finally enables parallelisation in instance placement.

My original try for enabling this didn't work well, but it took a
while (and liberal use of threadscope) to understand why. The attempt
was simply to `parMap rwhnf` over allocateOnPair; however, this is not
good: for a 100-node cluster it creates roughly 100*100 sparks, which
is far too many, and each individual spark is too small to be worth
the overhead. Furthermore, the combining of the
allocateOnPair results was done single-threaded, losing even more
parallelism. So we had O(n²) sparks to run in parallel, each spark of
size O(1), and we combine single-threadedly a list of O(n²) length.
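To make the spark arithmetic concrete, here is a tiny illustrative sketch (the names are hypothetical, not part of the patch): the naive scheme yields one spark per (primary, secondary) pair, while the grouped scheme yields one spark per primary node.

```haskell
-- Hypothetical spark accounting for an n-node cluster, returning
-- (number of sparks, relative work per spark). Illustrative only.
naiveSparks :: Int -> (Int, Int)
naiveSparks n = (n * n, 1)   -- one tiny spark per (primary, secondary) pair

groupedSparks :: Int -> (Int, Int)
groupedSparks n = (n, n)     -- one spark per primary, each doing O(n) work
```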

The new algorithm does a two-stage process: we group the list of valid
pairs per primary node, relying on the fact that usually the secondary
nodes are somewhat balanced (it's definitely true for 'blank' cluster
computations). We then run in parallel over all primary nodes, doing
both the individual allocateOnPair calls *and* the concatAllocs
summarisation. This leaves only the summing of the primary group
results together for the main execution thread. The new numbers are:
O(n) sparks, each of size O(n), and we combine single-threadedly a
list of O(n) length.
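The two-stage scheme can be sketched with a toy model. Everything here is a stand-in: the "solution" is just (allocations tried, best score), the pair score replaces the real allocateOnPair, and a sequential `map` stands in for the `parMap rwhnf` used in the actual patch.

```haskell
import Data.Function (on)
import Data.List (foldl', groupBy)

type Pair = (Int, Int)   -- (primary, secondary) node pair
type Sol  = (Int, Int)   -- (allocations tried, best score); toy stand-in

-- Toy stand-in for allocateOnPair: score a pair.
allocateOnPairToy :: Pair -> Int
allocateOnPairToy (p, s) = p + s

-- Toy stand-in for concatAllocs: fold one score into a running solution.
concatToy :: Sol -> Pair -> Sol
concatToy (n, best) pair = (n + 1, max best (allocateOnPairToy pair))

-- Toy stand-in for sumAllocs: combine two per-group solutions.
sumToy :: Sol -> Sol -> Sol
sumToy (na, ba) (nb, bb) = (na + nb, max ba bb)

-- Stage 1: group pairs by primary node; stage 2: fold each group
-- (sequentially here; the real code runs parMap rwhnf over the groups),
-- then combine the per-group results on the main thread.
tryAllocToy :: [Pair] -> Sol
tryAllocToy pairs =
  let pgroups = groupBy ((==) `on` fst) pairs
      psols   = map (foldl' concatToy (0, minBound)) pgroups
  in foldl' sumToy (0, minBound) psols
```

With the parallel `map` swapped in, each group becomes one spark of O(n) work, and only the final `foldl' sumToy` over O(n) results stays on the main thread.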

This translates directly into a reasonable speedup (relative numbers
for allocation of 3 instances on a 120-node cluster):

- original code (non-threaded): 1.00 (baseline)
- first attempt (2 threads):    0.81 (20% slowdown)

- new code (non-threaded):      1.00 (no slowdown)
- new code (threaded/1 thread): 1.00
- new code (2 threads):         1.65 (65% faster)

We don't get a 2x speedup, because the GC time increases. Fortunately
the code should scale well to more cores, so on many-core machines we
should get a nice overall speedup. On a different machine with 4
cores, we get 3.29x.
Signed-off-by: Iustin Pop <>
Reviewed-by: Agata Murawska <>
parent d7339c99
@@ -74,6 +74,7 @@ module Ganeti.HTools.Cluster
   ) where

 import qualified Data.IntSet as IntSet
+import Data.Function (on)
 import Data.List
 import Data.Maybe (fromJust, isNothing)
 import Data.Ord (comparing)
@@ -627,6 +628,19 @@ concatAllocs as (OpGood ns) =
       -- elements of the tuple
   in nsols `seq` nsuc `seq` as { asAllocs = nsuc, asSolution = nsols }

+-- | Sums two 'AllocSolution' structures.
+sumAllocs :: AllocSolution -> AllocSolution -> AllocSolution
+sumAllocs (AllocSolution aFails aAllocs aSols aLog)
+          (AllocSolution bFails bAllocs bSols bLog) =
+  -- note: we add b first, since usually it will be smaller; when
+  -- fold'ing, a will grow and grow whereas b is the per-group
+  -- result, hence smaller
+  let nFails = bFails ++ aFails
+      nAllocs = aAllocs + bAllocs
+      nSols = bestAllocElement aSols bSols
+      nLog = bLog ++ aLog
+  in AllocSolution nFails nAllocs nSols nLog
+
 -- | Given a solution, generates a reasonable description for it.
 describeSolution :: AllocSolution -> String
 describeSolution as =
@@ -684,10 +698,12 @@ tryAlloc :: (Monad m) =>
          -> AllocNodes       -- ^ The allocation targets
          -> m AllocSolution  -- ^ Possible solution list
 tryAlloc nl _ inst (Right ok_pairs) =
-  let sols = foldl' (\cstate (p, s) ->
-                       concatAllocs cstate $ allocateOnPair nl inst p s
-                    ) emptyAllocSolution ok_pairs
+  let pgroups = groupBy ((==) `on` fst) ok_pairs
+      psols = parMap rwhnf (foldl' (\cstate (p, s) ->
+                                      concatAllocs cstate $
+                                      allocateOnPair nl inst p s)
+                            emptyAllocSolution) pgroups
+      sols = foldl' sumAllocs emptyAllocSolution psols
   in if null ok_pairs -- means we have just one node
      then fail "Not enough online nodes"
      else return $ annotateSolution sols

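One detail worth noting about the grouping step: Data.List's groupBy only merges *adjacent* equal elements, so grouping per primary node relies on ok_pairs listing each primary's pairs contiguously (presumably guaranteed by how the pair list is generated). A quick check of that semantics:

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- groupBy merges only adjacent equal keys, never non-adjacent ones.
adjacent, scattered :: [[(Int, Char)]]
adjacent  = groupBy ((==) `on` fst) [(1, 'a'), (1, 'b'), (2, 'c')]
scattered = groupBy ((==) `on` fst) [(1, 'a'), (2, 'c'), (1, 'b')]
```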