Lance wrote:
>My understanding of the difference in calculating the population vs
>sample standard deviation is that subtracting one from 'n' (n-1),
>in the case of sample stddev, essentially amounts to introducing
>an error factor in the equation to compensate for the fact that
>you are basing your analysis on a small "sample" of a larger
>"population".
That's my understanding too.
>Basically, subtracting one from 'n' decreases the denominator &
>increases the final stddev value.
Yes, and what I said earlier about it being impossible to derive the
exponential version of the sample standard deviation was wrong.  If Ds
is the sample standard deviation and Dp is the population standard
deviation, then Ds = sqrt(n/(n-1)) * Dp.  They differ by a factor that
depends only on n, and when n is large that factor is approximately 1.
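Here is a quick numerical check of that relationship, as a minimal
sketch in Python.  The price values and variable names are made up
for illustration; pstdev and stdev come from Python's standard
statistics module.

```python
import math
import statistics

# Hypothetical data, for illustration only
prices = [10.0, 12.0, 11.5, 13.0, 12.5, 14.0]
n = len(prices)

dp = statistics.pstdev(prices)   # population stddev (divides by n)
ds = statistics.stdev(prices)    # sample stddev (divides by n-1)
factor = math.sqrt(n / (n - 1))  # the factor relating the two

# ds and factor * dp should agree up to floating-point rounding
print(ds, factor * dp)
```

The two printed numbers should match, which is all the formula above
claims.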
>As Alex said though, "Fortunately these two formulas are
>approximately equal when n is large enough (like >20)".
As they should be.
>Based on Alex's observation, and if my understanding of the
>reasoning behind subtracting one from 'n' is correct, wouldn't it
>make more sense to subtract a percentage of n from n? Ex. (n - (n
>* .1)) or simply (n * .9). This would prevent sample stddev from
>converging with the more "precise" population stddev at larger 'n'
>values
No.  You *want* them to converge at large values of n.  A large n
implies you have the population (or at least a sample that represents
it well), rather than just a small sample.  I don't know why you'd
want to prevent the sample standard deviation from converging.
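To make the difference concrete, here is a rough sketch comparing the
usual n-1 factor with the percentage idea quoted above.  The function
names and the 10% default are my own, chosen only for illustration.

```python
import math

def bessel_factor(n):
    # Ratio Ds/Dp when the denominator is n - 1
    return math.sqrt(n / (n - 1))

def percent_factor(n, pct=0.1):
    # Ratio when the denominator is n - pct*n (hypothetical variant)
    return math.sqrt(n / (n - pct * n))

for n in (5, 20, 100, 1000):
    print(n, round(bessel_factor(n), 4), round(percent_factor(n), 4))
```

The n-1 factor shrinks toward 1 as n grows, while the percentage
version is sqrt(1/0.9) no matter what n is, so it would inflate the
result by the same amount (roughly 5%) even when you have plenty of
data.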
>I know all this amounts to splitting hairs, but I'd be interested
>to hear what one of you "math guys" has to say about this.
I'm not a math guy by any means, but that's what I think.
-Alex