From c3c7a0c123ac9a95e4292b1beee92129320fa14f Mon Sep 17 00:00:00 2001
From: Iustin Pop <iustin@google.com>
Date: Wed, 21 Jul 2010 17:47:25 +0200
Subject: [PATCH] Change the meaning of the N+1 fail metric

Currently, this metric tracks the nodes failing the N+1 check. While
this helps (in some cases) to evacuate such nodes, it's not a good
metric since rarely it will change during a step (only at the last
instance moving away).

Therefore we replace it with the count of instances living on such
nodes, which is much better because:

- moving an instance away while the node is still N+1 failing will still
  reflect in the score as an optimization
- moving the last instance causing an N+1 failure will result in a heavy
  decrease of this score, thus giving the right bonus to clear this
  status
---
 Ganeti/HTools/Cluster.hs | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Ganeti/HTools/Cluster.hs b/Ganeti/HTools/Cluster.hs
index ab573645f..1e0df9ec6 100644
--- a/Ganeti/HTools/Cluster.hs
+++ b/Ganeti/HTools/Cluster.hs
@@ -233,9 +233,10 @@ compDetailedCV nl =
         mem_cv = varianceCoeff mem_l
         -- metric: disk covariance
         dsk_cv = varianceCoeff dsk_l
-        n1_l = length $ filter Node.failN1 nodes
-        -- metric: count of failN1 nodes
-        n1_score = fromIntegral n1_l::Double
+        -- metric: count of instances living on N1 failing nodes
+        n1_score = fromIntegral . sum . map (\n -> length (Node.sList n) +
+                                                   length (Node.pList n)) .
+                   filter Node.failN1 $ nodes :: Double
         res_l = map Node.pRem nodes
         -- metric: reserved memory covariance
         res_cv = varianceCoeff res_l
-- 
GitLab
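
To illustrate the reasoning in the commit message, here is a minimal,
self-contained sketch comparing the old and the new metric on a toy
cluster. The Node record below is a hypothetical stand-in that models
only the three fields the patch touches (failN1, pList, sList); it is
not the real Ganeti.HTools.Node type. The sample data, including the
assumption that the first node still fails N+1 after one instance
leaves, is invented for illustration.

```haskell
module Main where

-- Hypothetical stand-in for Ganeti's Node type (assumption: only the
-- fields referenced by the patch are modelled).
data Node = Node
  { failN1 :: Bool   -- whether the node fails the N+1 check
  , pList  :: [Int]  -- indices of primary instances on the node
  , sList  :: [Int]  -- indices of secondary instances on the node
  }

-- Old metric: count of nodes failing the N+1 check.
oldScore :: [Node] -> Double
oldScore = fromIntegral . length . filter failN1

-- New metric: count of instances (secondary plus primary) living on
-- nodes that fail the N+1 check.
newScore :: [Node] -> Double
newScore = fromIntegral . sum
         . map (\n -> length (sList n) + length (pList n))
         . filter failN1

-- Toy cluster before and after moving secondary instance 5 off the
-- failing node. We assume the node still fails N+1 afterwards.
clusterBefore, clusterAfter :: [Node]
clusterBefore = [ Node True [1, 2] [3, 4, 5], Node False [6] [] ]
clusterAfter  = [ Node True [1, 2] [3, 4],    Node False [6] [5] ]

main :: IO ()
main = do
  -- Old metric: unchanged by the move.
  print (oldScore clusterBefore, oldScore clusterAfter)
  -- New metric: decreases by one, so the move registers as progress.
  print (newScore clusterBefore, newScore clusterAfter)
```

In this sketch the old metric stays at 1.0 across the move, while the
new one drops from 5.0 to 4.0, which is the behaviour the first bullet
of the commit message describes.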