Commit 7570569e authored by Iustin Pop

Improve the standard deviation computation



This makes just two passes over the list instead of three. It reduces
the overall runtime noticeably (~25%) in some tests, but the gain is not
reproducible under profiling, so I don't know how much the function
itself is being sped up.

Note: this is written via `seq`s, and not BangPatterns. Since it's just
one case, adding BangPatterns just for it wasn't a big gain.
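
For comparison, here is a minimal sketch of the same single-pass length-and-sum fold written both ways. The helper names `lengthAndSumSeq` and `lengthAndSumBang` are hypothetical and exist only for this illustration; the commit itself inlines the `seq` variant into `varianceCoeff`.

```haskell
{-# LANGUAGE BangPatterns #-}
-- Illustrative only: both helper names below are assumptions,
-- not functions from the patched module.
import Data.List (foldl')

-- One pass over the list, forcing both accumulators with `seq`
-- before building the result tuple (the style used in the commit).
lengthAndSumSeq :: [Double] -> (Int, Double)
lengthAndSumSeq = foldl' step (0, 0)
  where
    step (rl, rs) e =
      let rl' = rl + 1
          rs' = rs + e
      in rl' `seq` rs' `seq` (rl', rs')

-- The same pass with strict let bindings from BangPatterns;
-- `let !x = e in body` forces x before body is evaluated.
lengthAndSumBang :: [Double] -> (Int, Double)
lengthAndSumBang = foldl' step (0, 0)
  where
    step (rl, rs) e =
      let !rl' = rl + 1
          !rs' = rs + e
      in (rl', rs')

main :: IO ()
main = do
  let xs = [1, 2, 3, 4] :: [Double]
  print (lengthAndSumSeq xs)
  print (lengthAndSumBang xs)
```

Both variants keep the counter and the running sum evaluated at every step, so neither builds up a chain of thunks inside the tuple. The only difference is the language extension the BangPatterns version needs.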

Thanks to Lécz Balázs for the impetus to improve this!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
parent 543e859d
@@ -87,15 +87,21 @@ sepSplit sep s
 -- Simple and slow statistical functions, please replace with better
 -- versions
 
--- | The covariance of the list
+-- | Our modified standard deviation function (not, it's not the variance)
 varianceCoeff :: [Double] -> Double
 varianceCoeff lst =
-    let ll = fromIntegral (length lst)::Double -- length of list
-        mv = sum lst / ll   -- mean value
-        av = foldl' (\accu em -> let d = em - mv in accu + d * d) 0.0 lst
-        bv = sqrt (av / ll) -- stddev
-        cv = bv / ll        -- covariance
-    in cv
+  -- first, calculate the list length and sum lst in a single step,
+  -- for performance reasons
+  let (ll', sx) = foldl' (\(rl, rs) e ->
+                           let rl' = rl + 1
+                               rs' = rs + e
+                           in rl' `seq` rs' `seq` (rl', rs')) (0::Int, 0) lst
+      ll = fromIntegral ll'::Double
+      mv = sx / ll
+      av = foldl' (\accu em -> let d = em - mv in accu + d * d) 0.0 lst
+      bv = sqrt (av / ll) -- stddev
+      cv = bv / ll        -- standard deviation divided by list length
+  in cv
 
 -- * JSON-related functions
 