### Improve the standard deviation computation

```

This makes just two passes over the list instead of three. In some tests
this reduces the overall runtime noticeably (~25%), but the gain is not
reproducible under profiling, so I don't know how much the function
itself is being sped up.

Note: this is written using `seq`, not BangPatterns. Since it's just
one case, enabling BangPatterns just for it wasn't a big gain.

Thanks to Lécz Balázs for the impetus to improve this!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
```
parent 543e859d
    @@ -87,15 +87,21 @@ sepSplit sep s
     -- Simple and slow statistical functions, please replace with better
     -- versions
     
    --- | The covariance of the list
    +-- | Our modified standard deviation function (not, it's not the variance)
     varianceCoeff :: [Double] -> Double
     varianceCoeff lst =
    -    let ll = fromIntegral (length lst)::Double -- length of list
    -        mv = sum lst / ll   -- mean value
    -        av = foldl' (\accu em -> let d = em - mv in accu + d * d) 0.0 lst
    -        bv = sqrt (av / ll) -- stddev
    -        cv = bv / ll        -- covariance
    -    in cv
    +  -- first, calculate the list length and sum lst in a single step,
    +  -- for performance reasons
    +  let (ll', sx) = foldl' (\(rl, rs) e ->
    +                           let rl' = rl + 1
    +                               rs' = rs + e
    +                           in rl' `seq` rs' `seq` (rl', rs')) (0::Int, 0) lst
    +      ll = fromIntegral ll'::Double
    +      mv = sx / ll
    +      av = foldl' (\accu em -> let d = em - mv in accu + d * d) 0.0 lst
    +      bv = sqrt (av / ll) -- stddev
    +      cv = bv / ll        -- standard deviation divided by list length
    +  in cv
     
     -- * JSON-related functions
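For comparison, here is a minimal sketch of the same two-pass idea written with BangPatterns rather than `seq`, next to a three-pass version shaped like the code this commit replaces. The names `lengthAndSum`, `varianceCoeffTwoPass` and `varianceCoeffThreePass` are hypothetical and do not appear in the commit; the sketch only illustrates the technique and is not the committed code.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')

-- | Length and sum of a list in a single traversal.
-- Hypothetical helper; the commit inlines this fold in varianceCoeff.
lengthAndSum :: [Double] -> (Int, Double)
lengthAndSum = foldl' step (0, 0)
  where
    -- foldl' only forces the tuple constructor, so the bang patterns
    -- force both components at each step and keep thunks from piling up.
    step (!n, !s) x = (n + 1, s + x)

-- | Two passes: one for length and sum, one for the squared deviations.
varianceCoeffTwoPass :: [Double] -> Double
varianceCoeffTwoPass lst =
  let (n', sx) = lengthAndSum lst
      n  = fromIntegral n' :: Double
      mv = sx / n                          -- mean value
      av = foldl' (\acc x -> let d = x - mv in acc + d * d) 0 lst
  in sqrt (av / n) / n                     -- stddev divided by list length

-- | Three passes (length, sum, squared deviations), as before the change.
varianceCoeffThreePass :: [Double] -> Double
varianceCoeffThreePass lst =
  let n  = fromIntegral (length lst) :: Double
      mv = sum lst / n
      av = foldl' (\acc x -> let d = x - mv in acc + d * d) 0 lst
  in sqrt (av / n) / n
```

Both functions should agree on any non-empty list. The only difference is that the two-pass version walks the list once to obtain the length and the sum together, where the three-pass version walks it once for each.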