PureBytes Links
Trading Reference Links
|
Hopefully one *last* post on this topic...
The SumX/SumX2 non-array calculation of SD that I posted works,
but you should be aware of a limitation with it.
The final SD calculation is
SD = sqrt( (SumX2 - SumX^2/Length) / (Length-1) )
Note that this includes the term "SumX2 - SumX^2/Length". This
is usually a subtraction of two very close large numbers.
So what? Remember that TS uses single-precision floats, which
have a precision of roughly 7 decimal digits. Anything beyond
about 7 digits is fantasy resulting from the translation of the
internal binary representation to the printed decimal value. See
http://www.purebytes.com/archives/omega/2003/msg00597.html for a
more complete explanation.
I tried computing SD with my algorithm, with the TS StdDevS()
function (which divides by Length-1 instead of Length, like my
algorithm does and like Excel does), and with the Excel STDEV()
function. My algorithm is close to Excel's STDEV() and TS's
StdDevS, but not exactly the same. Why? Because of the
precision problem.
Example: I calculated the SD of the Close of a day's worth of
5min bars in the S&P. That's 81 bars in the sample. My SD's
results were close to Excel's, but not identical: 3.67 vs.
3.6797, 3.71 vs. 3.6966, 2.83 vs. 2.8236, etc.
Then I calculated the SumX2 and SumX^2/Length terms in Excel.
For the first example I looked at, SumX2 = 56102168.640, and
SumX^2/Length = 56101431.563. The (correct) difference is
737.077. However, the first digit that differs in that sum is
the 6th digit of the large numbers -- dangerously close to TS's
limit of precision. The "same" values, as calculated by TS, were
SumX2 = 56102172.00, SumX^2/Length = 56101428.00, Diff =
744.000.
Notice that TS's values for the sums differ from Excel's, and
thus the Diff term is wrong. TS calculated an SD of
sqrt(744.000/(Length-1)), instead of sqrt(737.077/(Length-1)).
That's why TS comes up with a slightly different answer.
Can you solve this? Sometimes, yes, you can. As I described in
the archive posting referenced above, you can accumulate the
"difference of two big numbers" term as you go along, avoiding
the subtraction of two big numbers.
But I don't think you can do that with the SD calculation. You
could add in the SumX2 term, but the square(SumX)/Ntrades term
presents a problem. You might know Ntrades ahead of time --
let's say you always want to look at the SD of the last 20
samples -- but how do you add in one piece of the square(SumX)
term, without knowing the sum of all X's? square(SumX)/Ntrades
is eqivalent to SumX*Avg, so you could subtract the SumX term,
but you don't know Avg until the end of the sample period. I may
be missing something, but I don't think there's any way to
calculate the SD accurately, in a sample-by-sample sum-it-up-as-
you-go approach, given TS's precision limitations. I think you
have to do it after-the-fact, the way TS's functions do.
The built-in TS function, BTW, *is* accurate. StdDevS divides by
Length-1 so its calculation matches Excel's, AND it doesn't
suffer from the precision problems that TS's coeffR function has.
Instead of subtracting large numbers, is uses the "root mean
square" algorithm, subtracting the average from each X value:
For Counter = 0 To Length - 1 Begin
SumSqr = SumSqr+(Price[Counter]-Avg)*(Price[Counter]-Avg);
End;
The subtractions happen with small numbers -- Price-Avg -- so
there is very little loss of precision.
So: if you want to calculate your own SD, and precision is
important to you, you shouldn't use my algorithm. Use TS's
StdDevS if you want to calculate the SD of a value on the last N
bars, or StdDevS_a if you want to capture N widely-spaced events
(like trade results) in an array. Either one should match
Excel's results very closely.
Gary
|