Talk:Statistics/Summary/Variance

Hi! I came here looking for an explanation of why the differences are squared when calculating standard deviation. The text gives the same explanation as many other texts on statistics I've seen: to get rid of the minus signs. However, if this were the whole story, it would be sufficient to take the absolute value of each difference and divide by N, giving what I think is called the "average deviation".

So what makes standard deviation better than average deviation?

Greetings, Simon
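
For reference, the two quantities being compared in the question above can be written out as follows (population form, dividing by N; whether one divides by N or N-1 is a separate convention question, and the symbols here are just labels for this note):

 \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
 \qquad\text{versus}\qquad
 d = \frac{1}{N}\sum_{i=1}^{N}\left|x_i - \bar{x}\right|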


 * I'm not sure I would say that the standard deviation is "better" or "worse" than the mean deviation. Most of the standard equations are built around the standard deviation, so that is what we use. This is just a guess, but since the advent of computers it may also have to do with the fact that subtracting nearly equal numbers is inaccurate in floating point, and that the absolute value is awkward to work with if you are trying to keep a "running estimate" of the deviation.


 * For example, with the standard deviation we only need to keep the sum, the sum of squares, and the number of entries in order to find the variance (and from there, the standard deviation) for any number of values. This running-totals method is convenient on a computer and is mathematically identical to the "standard method".


 * With the mean deviation, I don't know of any cute way of rewriting it to avoid going back over the data, so every time a number is added the mean has to be recalculated, along with each individual differential. Thus, with the standard-deviation approach, for the list of numbers {5, 4, 3, 2, 1} we only have to store the numbers 55 (the sum of squares), 15 (the sum), and 5 (the number of entries). If we add the number 6 to the list, we can calculate 91, 21, and 6 without having to go back to the original list of numbers. This is much simpler than for the mean deviation, where we need to go back and recompute every individual differential ({2, 1, 0, -1, -2} in the first case and {1.5, 0.5, -0.5, -1.5, -2.5, 2.5} in the second); a short sketch of this bookkeeping appears after this comment.


 * Does that make any sense? --Llywelyn 04:17, 28 Sep 2004 (UTC)
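
To make the running-totals bookkeeping described above concrete, here is a minimal Python sketch (the class and variable names are my own illustration, not anything from the article or a standard library):

 class RunningVariance:
     """Keep only the count, the sum, and the sum of squares."""
     def __init__(self):
         self.n = 0
         self.total = 0.0
         self.total_sq = 0.0

     def add(self, x):
         self.n += 1
         self.total += x
         self.total_sq += x * x

     def variance(self):
         # population variance via E[x^2] - (E[x])^2
         mean = self.total / self.n
         return self.total_sq / self.n - mean * mean

 rv = RunningVariance()
 for x in (5, 4, 3, 2, 1):
     rv.add(x)
 print(rv.n, rv.total, rv.total_sq)   # 5 15.0 55.0 -- the three stored numbers
 print(rv.variance())                 # 2.0

 rv.add(6)                            # extend the list without revisiting it
 print(rv.n, rv.total, rv.total_sq)   # 6 21.0 91.0
 print(rv.variance())                 # roughly 2.917

One caveat: in floating point, the sum-of-squares shortcut can lose precision when the deviations are tiny compared to the values themselves, which is why library implementations often prefer an updating formula such as Welford's.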

The standard deviation is a good estimate of scale when the data are normally distributed. In practice, it is common for data to be non-normal. They can often be transformed to something resembling normality (often, taking logs is a good choice). However, even after transformation it is common that the data will be heavy tailed and subject to outliers.

In these circumstances, the standard deviation is not "better" than the mean absolute deviation. In fact, it is much less efficient (i.e. its sampling variance is higher). There are, however, even better estimates of scale than the mean absolute deviation. One is the MAD (median absolute deviation).
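
As a small, self-contained illustration of this sensitivity (the data below are made up purely for the example, with a single outlier added to an otherwise tight cluster):

 import statistics

 data = [4.8, 5.1, 4.9, 5.0, 5.2, 4.7, 5.3, 25.0]   # one gross outlier
 mean = statistics.fmean(data)
 med = statistics.median(data)

 sd = statistics.pstdev(data)                              # standard deviation
 mean_ad = statistics.fmean(abs(x - mean) for x in data)   # mean absolute deviation
 mad = statistics.median(abs(x - med) for x in data)       # median absolute deviation

 print(sd, mean_ad, mad)   # roughly 6.6, 4.4, 0.2 -- the outlier inflates the SD the most

(To compare the MAD directly with the standard deviation for normally distributed data, it is conventionally rescaled by a factor of about 1.4826.)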

This issue is used by Huber (1981) as an example of why robust methods can be preferable to classical methods (Huber's book has recently been republished in paperback and is more affordable than it used to be). I believe that Tukey also drew attention to this shortcoming of the standard deviation as far back as the 1960s. This article on robust statistics might be of interest.

P. J. Huber, Robust Statistics, Wiley, 1981.

 * Hey, that's interesting to hear, because I just posted a comment on the article saying that the "get rid of the minus signs" argument is tosh - just that I had assumed there would be sound reasoning behind the convention. I'm writing a statistics test soon, so I'm sure I'll be using the conventional method there, but I may look into this later on. Sean Heron (talk) 20:29, 11 July 2009 (UTC)