> There are two classes of issues in your AFL - an algorithm/math issue and a looping issue.
Commentary on the math issue:
(I have always liked the teaching method of historically preserved aphorisms, with additional commentary added at a later date, since I first read some of the eastern manuscripts in Comparative Religion 101 ... a la Confucius' commentary on the I Ching).
In some ways I benefit from being a naive programmer. I don't think the positives outweigh the negatives overall, but when I started with AB I took the concept of array processing into my programming heart and soul. There are times when the fact that I don't know anything else, and hence practice a loop avoidance strategy as a matter of self preservation, has paid off.
It seems as if AB is a unique program in that, at its core, it is pure array processing software?
I have been comfortable with the concept of array processing for quite a while (I think the Excel examples of array processing are one of the better teaching aids to be found in the AB manual).
Sometimes I wonder if we don't pay enough attention to AB's core philosophy i.e. machine level processing of arrays.
Perhaps the status quo, in programming, hasn't elevated array processing to the level that AB demands. For one thing maths seems to have developed over a period of time before computing arrived on the scene and so some/many of the maths expressions that are traditional may not be the most suitable for array processing.
For example:
When Z and I were mulling over how to calculate the moments and related statistics of a distribution (median, mode, mean, skew, kurtosis, variance/StDev) in AB, especially if we want to do it in a hurry (on the fly in an RT window?), it highlighted the fact, to me, that a re-arrangement of the maths was the key to a performance gain (speed).
Luckily I recalled that in one textbook that I had skim read there were some calculations, for the moments of a normal distribution, that used only basic maths operators (+ - * /) ... my assumption is that these operations are grist for the mill for array processors (how good was that 'lucky find' because I have never read a maths book in my life ... good old Ralph knows the pointy end of the trading stick from the blunt end).
In this case variance was the sticking point. In the maths expressions that I am familiar with, variance has to look ahead, at each value, to a future lastvalue(mean) to calculate each squared deviation, (currentvalue - mean)^2, which is then averaged. The problem there is that if we want to calc variance on the fly (a running value) we get a new mean every bar and we have to go back and recalc the deviation, for each element, relative to the new mean ... very SLOW.
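For what it's worth, the rearrangement I have in mind is the standard textbook identity below (nothing specific to AB). The symbols are my own labels: x_j are the values up to bar n and \mu_n is their mean at bar n. Expanding the square shows that the running variance needs only two running sums, so nothing has to be revisited when the mean changes:

```latex
\sigma_n^2
  = \frac{1}{n}\sum_{j=1}^{n} (x_j - \mu_n)^2
  = \frac{1}{n}\sum_{j=1}^{n} x_j^2 \;-\; \frac{2\mu_n}{n}\sum_{j=1}^{n} x_j \;+\; \mu_n^2
  = \frac{1}{n}\sum_{j=1}^{n} x_j^2 \;-\; \mu_n^2 ,
\qquad
\mu_n = \frac{1}{n}\sum_{j=1}^{n} x_j .
```

One caveat I am fairly sure of: this form subtracts two large, nearly equal numbers when the variance is small relative to the mean, so it can lose precision in floating point. Subtracting a fixed reference value from every x_j before squaring (the variance is unchanged by a shift) is a common way to reduce that.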
So the question now is:
- if I want to find maths expressions that have been rearranged for array processing is there anywhere I can do that?
The references I have so far are:
http://en.wikipedia.org/wiki/Numerical_Recipes
http://en.wikipedia.org/wiki/GNU_Scientific_Library
These are very big 'reads' though...... haven't started on them yet ... I am looking for some time saving tips....my wife has plans for my spare time :-)
Before I go off hunting through sites/books like that, looking for array libraries of maths expressions, can anyone suggest the best place to learn array processing, as it applies to AFL, and the best place to find algorithms that convert the common maths expressions used in trading/stats for traders into an array format?
Does anything like that exist anywhere?
If I do read, or reference, Numerical Recipes, should I read the C++ version, C version, all of them, none of them ... are any of them likely to contain the pseudocode for variance expressed in an array friendly format? (I don't really want to go off and learn how to write a C++ plugin to calc the moments of the dist on the fly ... anyway it would still be SLOW wouldn't it, unless the C++ version is array optimized?)
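To make the question concrete, here is a minimal sketch of what I mean by "array friendly" for the volume-weighted case. It is written in Python rather than AFL purely so it can be run and checked outside AB, and the function names (running_weighted_stats, brute_force_variance) and the sample data are hypothetical. It assumes every bar has positive volume, and it includes the current bar in the variance, whereas the loop quoted below stops one bar short.

```python
from itertools import accumulate
from math import isclose


def running_weighted_stats(prices, volumes):
    """Per-bar VWAP and volume-weighted variance from three running sums.

    Uses var = sum(v * p^2) / sum(v) - vwap^2, so no bar is ever revisited.
    Assumes every volume is positive (hypothetical helper, not an AB function).
    """
    cum_v = accumulate(volumes)
    cum_pv = accumulate(p * v for p, v in zip(prices, volumes))
    cum_p2v = accumulate(p * p * v for p, v in zip(prices, volumes))

    vwap, variance = [], []
    for w, pv, p2v in zip(cum_v, cum_pv, cum_p2v):
        mean = pv / w
        vwap.append(mean)
        # Clamp tiny negative values caused by floating-point rounding.
        variance.append(max(p2v / w - mean * mean, 0.0))
    return vwap, variance


def brute_force_variance(prices, volumes, i):
    """Reference calculation: restart from the first bar for bar i (slow)."""
    p, v = prices[: i + 1], volumes[: i + 1]
    w = sum(v)
    mean = sum(pj * vj for pj, vj in zip(p, v)) / w
    return sum(vj * (pj - mean) ** 2 for pj, vj in zip(p, v)) / w


# Made-up sample data for one trading day.
prices = [10.0, 10.5, 10.25, 9.75, 10.0]
volumes = [100, 200, 150, 50, 300]

vwap, variance = running_weighted_stats(prices, volumes)
matches = all(
    isclose(variance[i], brute_force_variance(prices, volumes, i), abs_tol=1e-9)
    for i in range(len(prices))
)
print(matches)
```

The brute-force function does the restart-from-the-first-bar calculation, so the comparison checks that the running sums give the same numbers. My guess is that in AFL the three accumulate calls would become per-day cumulative sums, but I have not verified that in AB.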
--- In amibroker@xxxxxxxxx ps.com, "bruce1r" <brucer@xxx> wrote:
>
> Larry -
>
> Good first attempt. I thought I'd tackle this for two reasons. First,
> you did a fair amount of work before asking. But, second, I had been
> about to write an article for a web site about array processing and
> happened upon your post. If you don't mind, I'd like to use a variation
> of it as an example for that. There are several great teaching points
> in the performance issue that you ran into.
>
> There are two classes of issues in your AFL - an algorithm/math issue
> and a looping issue. Let's deal with the major one first.
>
> As you might suspect the inner loop is the problem. If you assume
> approx. 400 x 1 minute bars per day, then on average for each bar, you
> will execute the inner (j) loop 200 times. This has to go!
>
> Here's the original code for reference -
>
> // now the hard part...calculate the variance...
> // a separate calc from the start of each day - note it requires the vwap from above
> // also note, we calculate starting at the first bar in the new day to today to the current bar
> Variance = 0;
>
> for ( j = newdayindex; j < i; j++ )
> {
> AvgPrice = ( O[j] + H[j] + L[j] + C[j] ) / 4;
> Variance += ( Volume[j] / totVolume ) *
> ( Avgprice - Vwap2temp ) * ( Avgprice - Vwap2temp );
> }
>
> The way to get rid of it may not be what you expect, though. You are
> calculating a variance at each bar by restarting the calculation from
> the first bar of the day. At first glance, it probably appeared that
> this was required because of the volume weighting.
>
> With some fairly straightforward algebraic manipulation, the formula can
> be converted into a "running" variance calculation. I'll just show the
> result, and you can work through it. Replace the code above with -
>
> variance = ( prevvar * prevtotvol / totvolume ) +
> ( Volume[i] / totvolume ) *
> ( Avgprice - Vwap2temp ) * ( Avgprice - Vwap2temp );
>
> prevtotvol = totvolume;
> prevvar = Variance;
>
> Finally, to support this code, there are two areas of initialization. I
> did it this way to minimize changes to your original code. Insert the
> following code in two places -
>
> prevtotvol = 0;
> prevvar = 0;
>
> Put it once above the outer (i) loop before the for( i=0; i<Barcount;
> i++). Then, put it in the newday initialization found in the if (
> newday[i] == True) block.
>
> This should yield an improvement of over two orders of magnitude.
> Actually, a little more could be wrung out as you have a few
> calculations that could be streamlined. I didn't post the entire code,
> because the endgame is really an AFL with NO LOOPS that uses purely
> array processing. That should yield at least a further 2x
> improvement at the smaller intervals (5000 bars) that you are probably
> using, and much more as the number of bars increases. Just as
> importantly, it is only 7 or 8 lines of code in total. I'll show you
> that next and later.
>
> -- BruceR
>
>
> --- In amibroker@xxxxxxxxx ps.com, "shakerlr" <ljr500@> wrote:
> >
> > I just created the following code to calculate the VWAP + std
> > deviation bands, but have found that it is extremely slow. I posted
> > the original code to the amibroker study site and was wondering if
> > anyone has any suggestions to speed it up for display on 1 minute
> > charts.
> >
> > Also, I noticed that if I DO NOT USE:
> > SetBarsRequired( 1000, 0 );
> >
> > The bands show up incorrectly... (sometimes expanding/shrinking as I
> > scroll on the 1 minute chart)
> >
> > Note that I have about 100000 bars in my stock/ticker being
> > studied...so that may be the reason it is slow...
> >
> > ----
> > /// VWAP code that also plots standard deviations...if you want a 3rd...it should be fairly simple to add
> > //
> > // NOTE: the code is SLOOOOWWWW...can someone help speed it up?
> > // I tried my best, but can't really do much with the two for-loops...
> > //
> > // LarryJR
> >
> >
> > SetBarsRequired( 1000, 0 );
> >
> > // this stores true/false based on a new day...
> > newday=Day() != Ref(Day(), -1);
> >
> > SumPriceVolume= 0;
> > totVolume=0;
> > Vwap2=0;
> > stddev=0;
> > newdayindex= 0;
> > Variance =0;
> >
> > // we must use a loop here because we need to save the vwap for each bar to calc the variance later
> > for( i= 0; i < BarCount; i++ )
> > {
> > // only want to reset our values at the start of a new day
> > if (newday[i]== True)
> > {
> > SumPriceVolume= 0;
> > totVolume=0;
> > newdayindex= i; // this is the index at the start of a new day
> > Variance=0;
> > //Vwap2=0;
> > }
> > AvgPrice=(O[i] + H[i] + L[i] + C[i])/4;
> >
> > // Sum of Volume*price for each bar
> > sumPriceVolume += AvgPrice * (Volume[i]);
> >
> > // running total of volume each bar
> > totVolume += (Volume[i]);
> >
> > if (totVolume[i] >0)
> > {
> > Vwap2[i]=SumPriceVolume / totVolume;
> > Vwap2temp=Vwap2[i];
> > }
> >
> > // now the hard part...calculate the variance...
> > // a separate calc from the start of each day - note it requires the vwap from above
> > // also note, we calculate starting at the first bar in the new day to today to the current bar
> > Variance=0;
> > for (j=newdayindex; j < i; j++)
> > {
> > AvgPrice=(O[j] + H[j] + L[j] + C[j])/4;
> > Variance += (Volume[j]/totVolume) *
> > (Avgprice-Vwap2temp)*(Avgprice-Vwap2temp);
> > }
> > stddev_1_pos[i]=Vwap2temp + sqrt(Variance);
> > stddev_1_neg[i]=Vwap2temp - sqrt(Variance);
> >
> > stddev_2_pos[i]=Vwap2temp + 2*sqrt(Variance);
> > stddev_2_neg[i]=Vwap2temp - 2*sqrt(Variance);
> > }
> > Plot (Vwap2, "VWAP2", colorDarkGrey, styleLine);
> > Plot (stddev_1_pos, "VWAP_std+1", colorGrey50, styleDashed);
> > Plot (stddev_1_neg, "VWAP_std-1", colorGrey50, styleDashed);
> > Plot (stddev_2_pos, "VWAP_std+2", colorGrey40, styleDashed);
> > Plot (stddev_2_neg, "VWAP_std-2", colorGrey40, styleDashed);
> >
>