From f828f4aa7158424563deda1207934a4e98faa2ab Mon Sep 17 00:00:00 2001
From: Iustin Pop <iustin@google.com>
Date: Fri, 23 Sep 2011 14:53:35 +0900
Subject: [PATCH] Parallelise instance allocation/capacity computation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch finally enables parallelisation in instance placement.

My original attempt at enabling this didn't work well, and it took a
while (and liberal use of threadscope) to understand why. That attempt
simply ran `parMap rwhnf` over allocateOnPair, which is bad: for a
100-node cluster it creates roughly 100*100 sparks, each of which does
far too little work to be worth scheduling. Furthermore, the combining
of the allocateOnPair results was done single-threaded, losing even
more parallelism. So we had O(n²) sparks to run in parallel, each
spark of size O(1), and we combined a list of O(n²) length
single-threadedly.

The new algorithm is a two-stage process: we group the list of valid
pairs per primary node, relying on the fact that the secondary nodes
are usually somewhat balanced (this is definitely true for 'blank'
cluster computations). We then run in parallel over all primary nodes,
doing both the individual allocateOnPair calls *and* the concatAllocs
summarisation. This leaves only the summing of the per-primary-group
results to the main execution thread. The new numbers are: O(n)
sparks, each of size O(n), and a single-threaded combination of a list
of O(n) length.

This translates directly into a reasonable speedup (relative numbers
for allocation of 3 instances on a 120-node cluster):

- original code (non-threaded): 1.00 (baseline)
- first attempt (2 threads): 0.81 (20% slowdown‼)
- new code (non-threaded): 1.00 (no slowdown)
- new code (threaded/1 thread): 1.00
- new code (2 threads): 1.65 (65% faster)

We don't get a 2x speedup because the GC time increases.
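The two-stage scheme above can be sketched in isolation. This is a
simplified illustration, not the patch's code: plain Ints stand in for
allocation results, a made-up scorePair stands in for allocateOnPair,
and plain map stands in for parMap rwhnf (Control.Parallel.Strategies,
from the 'parallel' package), so the sketch needs nothing beyond base:

```haskell
import Data.Function (on)
import Data.List (foldl', groupBy)

-- stand-in for allocateOnPair: score one (primary, secondary) pair
scorePair :: (Int, Int) -> Int
scorePair (p, s) = p * 100 + s

-- Stage 1: one unit of work per primary-node group; this is the map
-- that the real patch evaluates in parallel via parMap rwhnf, giving
-- O(n) sparks of size O(n) instead of O(n^2) sparks of size O(1).
-- Stage 2: a single sequential fold over the O(n) group results.
twoStage :: [(Int, Int)] -> Int
twoStage pairs =
  let pgroups = groupBy ((==) `on` fst) pairs
      psols   = map (foldl' (\acc pr -> acc + scorePair pr) 0) pgroups
  in foldl' (+) 0 psols
```

Swapping map for parMap rwhnf parallelises exactly the per-group work,
leaving only the final foldl' on the main thread.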
Fortunately the code should scale well to more cores, so on many-core
machines we should get a nice overall speedup. On a different machine
with 4 cores, we get a 3.29x speedup.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>
---
 htools/Ganeti/HTools/Cluster.hs | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/htools/Ganeti/HTools/Cluster.hs b/htools/Ganeti/HTools/Cluster.hs
index d8c2713f7..5f4a8d1e2 100644
--- a/htools/Ganeti/HTools/Cluster.hs
+++ b/htools/Ganeti/HTools/Cluster.hs
@@ -74,6 +74,7 @@ module Ganeti.HTools.Cluster
   ) where

 import qualified Data.IntSet as IntSet
+import Data.Function (on)
 import Data.List
 import Data.Maybe (fromJust, isNothing)
 import Data.Ord (comparing)
@@ -627,6 +628,19 @@ concatAllocs as (OpGood ns) =
       -- elements of the tuple
   in nsols `seq` nsuc `seq` as { asAllocs = nsuc, asSolution = nsols }

+-- | Sums two 'AllocSolution' structures.
+sumAllocs :: AllocSolution -> AllocSolution -> AllocSolution
+sumAllocs (AllocSolution aFails aAllocs aSols aLog)
+          (AllocSolution bFails bAllocs bSols bLog) =
+  -- note: we add b first, since usually it will be smaller; when
+  -- fold'ing, a will grow and grow whereas b is the per-group
+  -- result, hence smaller
+  let nFails = bFails ++ aFails
+      nAllocs = aAllocs + bAllocs
+      nSols = bestAllocElement aSols bSols
+      nLog = bLog ++ aLog
+  in AllocSolution nFails nAllocs nSols nLog
+
 -- | Given a solution, generates a reasonable description for it.
 describeSolution :: AllocSolution -> String
 describeSolution as =
@@ -684,10 +698,12 @@ tryAlloc :: (Monad m) =>
           -> AllocNodes      -- ^ The allocation targets
           -> m AllocSolution -- ^ Possible solution list
 tryAlloc nl _ inst (Right ok_pairs) =
-  let sols = foldl' (\cstate (p, s) ->
-                        concatAllocs cstate $ allocateOnPair nl inst p s
-                    ) emptyAllocSolution ok_pairs
-
+  let pgroups = groupBy ((==) `on` fst) ok_pairs
+      psols = parMap rwhnf (foldl' (\cstate (p, s) ->
+                                       concatAllocs cstate $
+                                       allocateOnPair nl inst p s)
+                            emptyAllocSolution) pgroups
+      sols = foldl' sumAllocs emptyAllocSolution psols
   in if null ok_pairs -- means we have just one node
      then fail "Not enough online nodes"
      else return $ annotateSolution sols
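For reference, the grouping step added to tryAlloc can be illustrated
standalone (hypothetical helper name pgroupsOf, Ints standing in for
node indices). Note that groupBy only merges adjacent equal elements,
so this relies on ok_pairs listing each primary node's pairs
contiguously:

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- hypothetical helper: split valid (primary, secondary) pairs into
-- one sublist per primary node, as tryAlloc now does inline
pgroupsOf :: [(Int, Int)] -> [[(Int, Int)]]
pgroupsOf = groupBy ((==) `on` fst)
```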