Re: [amibroker] Re: Paul Ho: Memory Challenges with Great Ranking Tool, AmiBroker Email List Archive

2008/7/12 Paul Ho <paul.tsho@xxxxxxxxx>:

upgrade to the latest version of AB

From: amibroker@xxxxxxxxxxxxxxx [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of rijnaars
Sent: Saturday, 12 July 2008 3:32 PM

To: amibroker@xxxxxxxxxxxxxxx
Subject: [amibroker] Re: Paul Ho: Memory Challenges with Great Ranking Tool

Can someone help me? I am trying to run this ranking tool but get the following
errors:
RestorePriceArrays();

n = !IsNull(mRoc);

m = !IsNull(mRSI);

roccount +=
-------------^

Error 30.
Syntax error

n = !IsNull(mRoc);

m = !IsNull(mRSI);

roccount += n;

rsicount +=
-------------^

Error 30.
Syntax error

Then the result shows up for only one ticker in my watchlist.

regards Rene

--- In amibroker@xxxxxxxxxxxxxxx, "bruce1r" <brucer@xx> wrote:
>
> Tomasz -
>
> Great diagnosis. I should have thought of it. I've seen it happen
> to a number of the FT users. That is why we stressed cache settings
> in the 2007 conference class that you helped with.
>
> Ken -
>
> There is no reason to run anything > 5200 bars with a FT database.
> Soon that will change to a lower number at next weekend's conversion.
> And, the cache size setting needs to be big enough to handle > 500
> symbols. For FT, that is at least > 83 MB. FWIW, I run 5200 bars,
> 1000 symbols, and 160MB cache.
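
The 83 MB figure above can be reproduced with simple arithmetic if one assumes roughly 32 bytes per stored bar. That per-bar size is an assumption, not something stated in this thread, and the helper name below is made up for illustration. A rough sketch in Python (not AFL):

```python
# ASSUMPTION: about 32 bytes per stored bar. The thread does not state this;
# it is simply the value that reproduces the "83 MB" figure quoted above.
ASSUMED_BYTES_PER_BAR = 32


def cache_megabytes(symbols, bars_per_symbol, bytes_per_bar=ASSUMED_BYTES_PER_BAR):
    """Approximate RAM needed to keep every bar of every symbol cached.

    Returns megabytes, counting 1 MB as 1,000,000 bytes.
    """
    return symbols * bars_per_symbol * bytes_per_bar / 1_000_000


# 500 symbols at 5200 bars each, the case discussed above.
ft_500 = cache_megabytes(500, 5200)
```

If the cache holds fewer symbols than the watchlist under test, each pass over the list has to go back to disk, which is the slowdown described in this thread.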
>
> Paul -
>
> In the interest of conveying some info constructively, let me
explain
> why I posted what I did.
>
> First, though, I think you might want to reconsider whether a range
> algorithm is slower. A 2*N algorithm like the range algorithm will
> almost ALWAYS be faster than a N*N algorithm - it will be much faster
> for large N.
>
> I hope that Ken is able to get a N*N approach fast enough. I totally
> agree with you that it can be. He is still too slow, though.
>
> But, the other reason that I suggested range values to Ken was that he
> is using it for ranking and probably rotational systems. This is what
> many from the FastTrack community gravitate toward. He will have to
> confirm that part. But, in that setting, the range percentages can
> capture info about closeness of mixed scores that ordinal ranking can
> mask. If it fits the score distribution characteristics, this can be
> used to minimize rotation with little impact on fitness - which is
> typically a goal.
>
> Lastly, about the Pad and Align, I can only tell you that my
> experience is different. In a 20-year database like the FastTrack
> database that I think Ken is using, if you only need a few years
> of backtest, then aligning to a symbol with a short history will
> dramatically speed things up (thanks to Tomasz's implementation).
>
> Anyway, this has been an interesting discussion. Let me know if you
> need any details about the above. I'm getting back to some other
> trading work.
>
> -- BruceR
>
>
> --- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@> wrote:
> >
> > Hello,
> >
> > Indeed, I have run this code on a larger watch list and can confirm
> > Paul's timings.
> >
> > I am running an Athlon 64 X2 @ 2GHz.
> >
> > For 100 symbols it is just 50 seconds.
> > For 500 symbols (SP500) and history from 1992 till now (16 years) it
> > is just 13 minutes.
> > If the same 500 symbols are explored with QuickAFL turned on and the
> > last 4 years only, the time shrinks to 6 minutes 49 seconds.
> > These are actual timings, not progress bar estimates.
> > Yes, the time grows slightly as the process progresses, but that is not
> > surprising considering that it outputs
> > 500 * number of quotes lines (1.3 million lines for a 10 year history).
> >
> > I think these timings are quite OK for an N*N algorithm.
> >
> > Memory is also not an issue. Running it on 500 symbols with 16 years
> > of history and full caching enabled
> > caused AmiBroker to consume 175MB of RAM.
> >
> > If you are getting timings in hours, I suspect that you are using
> > sub-optimal cache settings. Please go to
> > Tools->Preferences, "Data" tab and increase the "in-memory" cache to at
> > least 500 symbols (the size of the watch list under test).
> > If the cache is too small it will force many disk accesses. With a
> > cache that is large enough, everything will be in RAM.
> >
> > Best regards,
> > Tomasz Janeczko
> > amibroker.com
> > ----- Original Message -----
> > From: Paul Ho
> > To: amibroker@xxxxxxxxxxxxxxx
> > Sent: Monday, July 07, 2008 9:29 AM
> > Subject: RE: [amibroker] Re: Paul Ho: Memory Challenges with Great Ranking Tool
> >
> >
> > Ken
> > As you know, the run time of the algorithm that I gave you increases
> > dramatically with the size of the watchlist. However, the times that
> > you stated aren't in line with what I experience. I tried my code
> > (which is similar to yours, except that it also stores the value in OI)
> > on 2 machines. One is a very old machine, a single-core AMD about
> > 5 years old; Windows tells me it is an AMD XP 2600 with 1G of RAM. I
> > used a watchlist of 472 symbols, running from 1/1/2000 to now. The time
> > taken was 6.5 minutes (from the progress bar), and on a newer machine,
> > a Core 2 Duo E6600, it took just over 1 minute. So this is very
> > different from the one and a half hours that you are talking about.
> > So I am wondering what machine you're running on, and what kind of
> > AA settings you use. What I suggest is that you DON'T check Pad
> > and Align; this will increase the number of bars by quite a lot. You
> > can check QuickAFL though.
> > I have also given you a few suggestions in one of the other posts,
> > including monthly bars, normalisation of scores etc.
> > http://finance.groups.yahoo.com/group/amibroker/message/126336. Did
> > you take a look at that?
> > Despite its shortcomings, I think the N^2 algorithm will still
> > perform better/faster than Bruce's suggestions. Pad and Align &/or ATC
> > will slow it down even more. In the past, I have made a ranking dll
> > which uses a 2 dimensional array and basically inserts the stock into
> > the right ranking order as each symbol is scanned, a little bit like
> > Fred's algorithm but on a total array basis. That is certainly very
> > fast. In addition, Tomasz's custom backtester code has the potential
> > also to be quite fast without the use of a dll. I have ported the code
> > so it stores directly in the OI during after optimization. However, it
> > comes back with an internal error in certain instances if the
> > watchlist gets over 1400 symbols and the number of bars is more than
> > 2500. I have sent it to Tomasz for him to have a look at when he comes
> > back. I can share that with you. But my point is that the N^2 idea
> > should be fast enough for what you are talking about, ~500 stocks. You
> > see, there is a big difference between 1 minute and 1 and a half hours.
> > Send me a private email if you like; I'm curious why there is such a
> > big difference.
> > Cheers
> >
> >
> >
> >
> >
> >
> > ----------------------------------------------------------
> > From: amibroker@xxxxxxxxxxxxxxx
> [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of Ken Close
> > Sent: Monday, 7 July 2008 7:40 AM
> > To: amibroker@xxxxxxxxxxxxxxx
> > Subject: RE: [amibroker] Re: Paul Ho: Memory Challenges with
> Great Ranking Tool
> >
> >
> > Bruce:
> >
> > Thanks for giving me another way to go.
> >
> > In case you have been following (or remember) this topic from a
> > ways back, Fred was generous enough to write me a code concept for
> > doing all of this (but end of range only) and I successfully converted
> > it to my "endpoint" recipe of 11 or so indicators. It did OK for
> > relatively large watchlists (even the RUT) but could not handle very
> > well some much larger watchlists in the many 1000s. But, it gave me a
> > combined ranking on any end point date I set in the AA window. My
> > absolute minimum requirement is to create combo-rank-scores on a
> > monthly basis, but I was pleasantly surprised when Paul Ho served up a
> > concept to create combo-rank-scores on a daily basis. Then euphoria
> > changed to despair as I encountered the n^2 time factor.
> >
> > So, thanks to you and others I have a variety of ways to consider
> > getting to the end of this problem.
> >
> > 1. Your suggestion of normalized indicators and using a final
> > percentage value as the combo rank.
> >
> > 2. Taking Fred's code and finding a way to manipulate the EndofRange
> > date, basically repeating his code over and over on the same watchlist
> > but with changing EndofRange dates. (I still have to do something with
> > the collection of combo-scores I will accumulate by date, but that is
> > another issue.*** see below) I have even considered manually repeating
> > the process to the end point (only 12 runs per year x the 8 years I
> > want to test over).
> >
> > 3. Taking Paul Ho's code which ranks daily and either living with the
> > limitation in the Watchlist population or running the thing overnight.
> > Since I have speed problems now with 2 indicators and 150+ symbols, I
> > probably will drop off the cliff with 11 indicators and the same 150+
> > symbols. (An alternative which I plan to test next is to see how Paul's
> > code performs on a Weekly or even Monthly compressed basis, although
> > if symbol number is controlling and not barcount, then this will not do
> > much good.)
> >
> > 4. Using Tomasz's suggestion of the custom backtester, making 11
> > separate runs, then somehow combining the 11 different output reports,
> > coming up with a combo-rank that way.
> >
> > If you were approaching this, can you guess and say which approach you
> > would concentrate on? Right now, number 4 looks like it actually might
> > be the least programming and execution intensive, but I am not sure. I
> > also have to have a way of updating the entire system as time goes
> > forward. That will bring an additional set of challenges I am sure.
> >
> > Thanks for stepping in.
> >
> > Ken
> >
> > *** Paul Ho shared a small COM code snippet that sticks an indicator
> > nicely into the OI field of a symbol, so that is the approach I want to
> > take once I have the combo-rank to stick in the right place. Talk about
> > complex......
> >
> > PS: Bruce, if you are still reading, would I have a better chance of
> > executing my task in Trade vs AmiBroker (sorry Tomasz)?
> >
> > -----Original Message-----
> > From: amibroker@xxxxxxxxxxxxxxx
> [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf
> > Of bruce1r
> > Sent: Sunday, July 06, 2008 5:03 PM
> > To: amibroker@xxxxxxxxxxxxxxx
> > Subject: [amibroker] Re: Paul Ho: Memory Challenges with Great
> Ranking Tool
> >
> > Ken -
> >
> > I'm too involved with something else right now, but let me see if I
> > can offer a quick suggestion. First -
> >
> > 1. Tomasz is pointing out that solutions in (N^2) time are never
> > practical past some limit. That means that the execution time goes up
> > with the square of the number of items - tickers in this case. There
> > are a couple of programming tricks that you can play, but I don't think
> > that they are going to get you where you want to go -
> >
> > For example, programming tricks can be used to make the N^2 comparison
> > matrix "triangular". This reduces the comparisons by half.
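
The "triangular" trick can be sketched for a single bar as follows. This is a Python illustration rather than AFL, the function names are made up for the example, and rank here means "number of tickers with a strictly higher value" (0 is best), which is one reasonable reading of the ranking code in this thread.

```python
def ranks_full(values):
    """Rank each value by visiting every ordered pair: N*(N-1) visits."""
    n = len(values)
    ranks = [0] * n
    pair_visits = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            pair_visits += 1
            if values[j] > values[i]:
                ranks[i] += 1  # j beats i, so i drops one place
    return ranks, pair_visits


def ranks_triangular(values):
    """Same ranks, but each unordered pair is visited once: N*(N-1)/2 visits."""
    n = len(values)
    ranks = [0] * n
    pair_visits = 0
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only
            pair_visits += 1
            if values[j] > values[i]:
                ranks[i] += 1
            elif values[i] > values[j]:
                ranks[j] += 1  # update the other side of the pair as well
    return ranks, pair_visits


scores = [0, 20, 21, 22, 200]
full_ranks, full_visits = ranks_full(scores)
tri_ranks, tri_visits = ranks_triangular(scores)
```

Halving the visits helps, but the work still grows with the square of the number of tickers.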
> >
> > You might use Pad and Align to a ticker with a short history to cut
> > the time further.
> >
> > But, this is still going to leave you in a long timeframe.
> >
> > 2. It looks like you are trying to add unbounded indicators and use
> > the ordinal values to normalize them so that they can be combined.
> > Use of the custom backtester would still require that you generate
> > output for each indicator and then combine them.
> >
> > Another approach might be to go out of the box a little and question
> > your basic assumption. Here's what I mean.
> >
> > Ordinal values can be used to convert unbounded ranges (such as ROC)
> > to bounded values. But they can do some strange things to outliers.
> > For example, consider these points. Say they are for tickers A,B,C,D,E
> > on a particular day -
> >
> > 0, 20, 21, 22, 200
> >
> > The point 22 is ranked #2 (higher value better) when it is not near
> > the top.
> >
> > ON THE OTHER HAND, range values can also be used to convert unbounded
> > data to bounded. THEY REQUIRE A PRE-SCAN TO KNOW THE MIN AND MAX.
> > For the range above, it would convert to the following percentages -
> >
> > 0, 10, 10.5, 11, 100
> >
> > This has some advantages for certain data distributions, but some
> > disadvantages for others. For data where the probability of outliers
> > is low, it yields similar results.
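
The two conversions above can be checked with a few lines of Python (used here instead of AFL only to verify the arithmetic; the function names are illustrative):

```python
def ordinal_ranks(scores):
    """Ordinal rank per score, 1 = highest value."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def range_percent(scores):
    """Each score as a percentage of the min..max range (needs a pre-scan)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)  # degenerate range: every score is equal
    # Multiply before dividing so the example values come out exact.
    return [(s - lo) * 100 / (hi - lo) for s in scores]


scores = [0, 20, 21, 22, 200]        # tickers A, B, C, D, E
print(ordinal_ranks(scores))         # [5, 4, 3, 2, 1]
print(range_percent(scores))         # [0.0, 10.0, 10.5, 11.0, 100.0]
```

Ticker D (22) is rank 2 of 5 ordinally, yet it sits at only 11% of the range, which is the outlier effect described above.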
> >
> > SO, HERE'S WHAT YOU MIGHT DO.
> >
> > 1. Take a watchlist and start an Exploration pass. When
> > Status("stocknum") == 0, loop through the list and find the global Min
> > and Max for each bar across all of the tickers for a given indicator
> > and store it in an ATC in the H and L fields. For RSI and ROC, you
> > would have 2 ATC's - say ~MINMAX_ROC and ~MINMAX_RSI. This is 1 pass
> > of all N tickers.
> >
> > 2. Continuing on for stocknum 0 and for 1 - N, calculate the ROC and
> > RSI and convert it to a percentage of the MIN and MAX range that you
> > stored in the ATC's for each bar -
> >
> > rangepcnt = ( tickscore - tickglobalmin ) / ( tickglobalmax - tickglobalmin ) * 100;
> >
> > 3. Now you can combine the range values because they are normalized.
> > If you add them and divide by the number of indicators, you'll end up
> > with a combined percentage.
> >
> > Now, while this is not an ordinal rank, it works perfectly well for
> > scoring and is a solution in 2*N time. BTW - this reference won't mean
> > much to most here, but should to you - Ed Gilbert detailed this in
> > Trade doc almost a decade ago.
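
The three steps above can be sketched outside AmiBroker as two passes over plain lists. This is Python, not AFL; the data layout (a dict of per-bar lists per symbol), the function names, and the sample numbers are all assumptions made for illustration, and the ATC storage step is replaced by ordinary variables.

```python
def global_min_max(scores_by_symbol):
    """Pass 1: per-bar min and max of one indicator across all symbols."""
    series = list(scores_by_symbol.values())
    n_bars = len(series[0])
    mins = [min(s[b] for s in series) for b in range(n_bars)]
    maxs = [max(s[b] for s in series) for b in range(n_bars)]
    return mins, maxs


def range_percent(series, mins, maxs):
    """Pass 2: one symbol's scores as a percentage of the per-bar range."""
    return [0.0 if hi == lo else (v - lo) * 100 / (hi - lo)
            for v, lo, hi in zip(series, mins, maxs)]


def combined_percent(indicators):
    """Average the range percentages of every indicator, per symbol and bar.

    `indicators` maps indicator name -> {symbol: [value per bar]}.
    """
    first = next(iter(indicators.values()))
    symbols = list(first)
    n_bars = len(first[symbols[0]])
    totals = {sym: [0.0] * n_bars for sym in symbols}
    for scores_by_symbol in indicators.values():
        mins, maxs = global_min_max(scores_by_symbol)      # one pass over N
        for sym in symbols:                                # second pass over N
            pct = range_percent(scores_by_symbol[sym], mins, maxs)
            totals[sym] = [t + p for t, p in zip(totals[sym], pct)]
    count = len(indicators)
    return {sym: [t / count for t in totals[sym]] for sym in symbols}


# Made-up sample data: 3 symbols, 2 bars, 2 indicators.
sample = {
    "ROC": {"A": [0, 10], "B": [20, 30], "C": [200, 20]},
    "RSI": {"A": [30, 50], "B": [70, 40], "C": [50, 60]},
}
combined = combined_percent(sample)
```

Each indicator costs two passes over the N symbols, so the work grows linearly with the watchlist size instead of with its square.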
> >
> > -- BruceR
> >
> > --- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@> wrote:
> > >
> > > Hello,
> > >
> > > No, look again. The code I provided does the sort ON A BAR BY BAR
> > > basis.
> > >
> > > Best regards,
> > > Tomasz Janeczko
> > > amibroker.com
> > > ----- Original Message -----
> > > From: Ken Close
> > > To: amibroker@xxxxxxxxxxxxxxx
> > > Sent: Sunday, July 06, 2008 9:08 PM
> > > Subject: RE: [amibroker] Paul Ho: Memory Challenges with Great Ranking Tool
> > >
> > >
> > > Tomasz:
> > >
> > > Thanks for all the help you give to so many people, me included.
> > >
> > > However, while I did as you suggested with the custom backtester and
> > > looked into the output file it produces, I am at a loss to know how to
> > > use the data it contains. It is not all of the data that I need.
> > >
> > > I want the ordinal ranking of multiple indicators, added all together
> > > per bar and per symbol, and to use the final sum of the ORDINAL ranks
> > > as the ranking value for all symbols.
> > >
> > > This output represents what I want (but it is only for two
> > > indicators). I want to turn this into my "recipe" which will have
> > > approximately 8 to 10 indicators.
> > >
> > >
> > >
> > > I ran the custom backtest, opened the output.html file, and see that
> > > the symbols are sorted by the ranking value and it is indeed an
> > > ordinal value. But, the sort is done only once (probably on a lastbar
> > > basis), whereas Paul Ho's sorting algorithm gives me ordinal values
> > > for each bar for each symbol (displayed above using a lastbar basis).
> > >
> > > You say Paul's code is inefficient, and maybe it is because it sorts
> > > all symbols by all bars. Can you suggest a change to the specific code
> > > that would do what I want, but more efficiently?
> > >
> > > Again, thanks for all that you do.
> > >
> > > Ken
> > >
> > >
> > >
> > >
> > ----------------------------------------------------------
> > --
> > > From: amibroker@xxxxxxxxxxxxxxx
[mailto:amibroker@xxxxxxxxxxxxxxx]
> > On Behalf Of Tomasz Janeczko
> > > Sent: Sunday, July 06, 2008 1:39 PM
> > > To: amibroker@xxxxxxxxxxxxxxx
> > > Subject: Re: [amibroker] Paul Ho: Memory Challenges with Great Ranking Tool
> > >
> > >
> > > Hello,
> > >
> > > The code is inefficient because it repeats the sorting N*N times,
> > > where N is the number of symbols, while only N times is enough.
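
To illustrate why comparing every symbol with every other is more work than needed, here is a sketch for a single bar showing that one sort yields the same ordinal ranks as pairwise counting. It is written in Python, not AFL, the names are illustrative, and it is not a description of how AmiBroker ranks signals internally.

```python
def rank_by_counting(values):
    """Pairwise approach: for each value, count how many others are higher."""
    return [sum(1 for other in values if other > v) for v in values]


def rank_by_sorting(values):
    """Single sort: a value's rank is its position in descending order."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, i in enumerate(order):
        previous = order[position - 1] if position > 0 else None
        if previous is not None and values[i] == values[previous]:
            ranks[i] = ranks[previous]  # tied values share the better rank
        else:
            ranks[i] = position
    return ranks


one_bar = [3.0, 7.5, 7.5, -1.0, 12.0]  # made-up scores for five symbols
counted = rank_by_counting(one_bar)
sorted_ranks = rank_by_sorting(one_bar)
```

Both functions return the same list; the counting version touches every pair of symbols, while the sorting version does one sort per bar.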
> > >
> > > Ranking is a process that is done during the first pass of a
> > > backtest. It is implemented efficiently.
> > > We can use this built-in process easily using a custom backtest
> > > procedure as shown in the formula below.
> > >
> > > Note that this formula will not produce output in AA directly.
> > > Instead it will produce an HTML file (output.html) that you can later
> > > import to AA using AA, File->Import.
> > >
> > > Also please be warned that the produced files are huge and an attempt
> > > to load such a big HTML file into Internet Explorer instead will
> > > easily hang IE.
> > >
> > > PositionScore = ROC( C, 14 ) + 1000; // WHAT YOU WANT TO RANK
> > >
> > > SetOption("MaxOpenPositions", 10 );
> > > SetBacktestMode( backtestRegularRaw );
> > > Buy=1;
> > > Sell=0;
> > > SetCustomBacktestProc("");
> > > if( Status("action")==actionPortfolio )
> > > {
> > >     bo = GetBacktesterObject();
> > >
> > >     bo.PreProcess();
> > >
> > >     dt = DateTime();
> > >
> > >     fh = fopen("output.html", "w" );
> > >
> > >     fputs("<TABLE><TR><TH>Symbol</TH><TH>Date/Time</TH><TH>Rank</TH></TR>\n", fh );
> > >
> > >     // Ranking is done by the backtester's first pass, so the position
> > >     // of a signal in the per-bar list (k) is its rank on that bar.
> > >     for( i = 0; i < BarCount; i++ )
> > >     {
> > >         k = 1;
> > >         for( sig = bo.GetFirstSignal( i ); sig; sig = bo.GetNextSignal( i ) )
> > >         {
> > >             Line = "<TR><TD>" + sig.Symbol + "</TD><TD>" + DateTimeToStr( dt[ i ] ) + "</TD><TD>" + k + "</TD></TR>\n";
> > >             fputs( Line, fh );
> > >             k++;
> > >         }
> > >     }
> > >
> > >     bo.PostProcess();
> > >
> > >     fputs( "</TABLE>", fh );
> > >     fclose( fh );
> > > }
> > >
> > >
> > > Best regards,
> > > Tomasz Janeczko
> > > amibroker.com
> > > ----- Original Message -----
> > > From: Ken Close
> > > To: amibroker@xxxxxxxxxxxxxxx
> > > Sent: Sunday, July 06, 2008 5:35 PM
> > > Subject: [amibroker] Paul Ho: Memory Challenges with Great
> > Ranking Tool
> > >
> > >
> > > Paul:
> > >
> > > My initial euphoria has turned somewhat downward as I attempt to
> > > apply the code below (just two indicators) to larger Watchlists. You
> > > sounded (from other messages) like someone who knows the ins and outs
> > > of memory management with AB, and perhaps can comment on how to keep
> > > the code below from "bogging down".
> > >
> > > In spite of my many years with AB and its array processing, my mind
> > > still has a problem wrapping around what this code is doing and why
> > > (and whether) larger populated Watchlists will ever be able to work.
> > >
> > > I initially tested against the DJ-30 (30 symbols) and all went well,
> > > fairly quickly, perhaps 10-15 seconds.
> > >
> > > I then tried the NDX (100 symbols) and things went more slowly but
> > > finished. I noticed the symbols appearing in the AA window more slowly.
> > >
> > > I have not been able to nor wanted to wait for the SP-500, as the
> > > symbols appear more and more slowly and the est time counter was
> > > saying something like 1 1/2 hours to complete 500 symbols.
> > >
> > > I was assuming that the code had to collect or process all symbols
> > > before it could make comparisons among them---this is probably false,
> > > or else why would processed symbols start to appear in the AA window
> > > while it is still accessing symbols?
> > >
> > > What suggestions can you make, given your understanding of the code
> > > and AB, that would minimize the processing of large member watchlists?
> > >
> > > Can adding a SetBarsRequired in the right place limit the number of
> > > lookback bars that are processed, and thus speed up execution?
> > >
> > > As the number of indicators I wish to process into a "Total Rank"
> > > score increases, I imagine that executing this code will get slower
> > > and slower and may not be possible at all. Would you agree?
> > >
> > > Thanks for any added help.
> > >
> > > Ken
> > >
> > >
> > >
> > >
> > ----------------------------------------------------------
> > > From: amibroker@xxxxxxxxxxxxxxx
> > [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of Ken Close
> > > Sent: Saturday, July 05, 2008 10:47 AM
> > > To: amibroker@xxxxxxxxxxxxxxx
> > > Subject: [amibroker] What a Great Ranking Tool
> > >
> > >
> > > Paul Ho has come up with a superb ranking tool. I have expanded it to
> > > two indicators. Feel free to expand the code structure to any number
> > > of indicators.
> > >
> > > Possible next step: stick the Tot_Rank values into the OI field for
> > > the symbols, then Plot the Ranks for a visual representation of "where
> > > the symbol is over time".
> > >
> > > The possibilities are endless (or at least enlarged because of Paul's
> > > code idea). Thanks Paul for your creative input.
> > >
> > > Ken
> > >
> > > // Ranking_Alt01.afl KSC 07/05/2008
> > > // Original code by Paul Ho, Amibroker list 07/05/2008
> > > // Modifications and expansions by Ken Close 07/05/2008
> > >
> > > // Will ordinal rank every symbol in watchlist for every bar.
> > >
> > > mOwnROC = ROC(C, 14);
> > > mOwnRSI = RSIa(C, 14);
> > > mRoc = 0;
> > > mRSI = 0;
> > > list = CategoryGetSymbols(categoryWatchlist, 16);
> > > ROCcount[0] = rocrank[0] = 0;
> > > RSIcount[0] = RSIrank[0] = 0;
> > >
> > > // Compare the current symbol with every symbol in the watchlist.
> > > for(i = 0; (sym = StrExtract(list, i)) != ""; i++)
> > > {
> > >     SetForeign(sym);
> > >     mRoc = ROC(C, 14);
> > >     mRSI = RSIa(C, 14);
> > >     RestorePriceArrays();
> > >     n = !IsNull(mRoc);
> > >     m = !IsNull(mRSI);
> > >     roccount += n;
> > >     rsicount += m;
> > >     // Rank goes up by one for each symbol with a higher value on that bar.
> > >     rocrank = IIf(Nz(mRoc) > mOwnROC, Rocrank + n, rocrank);
> > >     rsirank = IIf(Nz(mRsi) > mOwnRSI, Rsirank + m, rsirank);
> > >     Totrank = rocrank + rsirank;
> > > }
> > >
> > > ROCn = ROC(C, 14);
> > > RSIn = RSIa(C, 14);
> > > Filter = 1;
> > > Buy = Sell = 0;
> > > AddColumn(ROCn, "ROCn",1.2);
> > > AddColumn(RSIn, "RSIn",1.2);
> > > AddColumn(mRoc, "MROC", 1.2);
> > > AddColumn(ROCrank, "ROCRank", 1.0);
> > > AddColumn(RSIrank, "rsirank",1.0);
> > > AddColumn(Totrank, "Totrank", 1.0);
> > >
> > > // To check the sorting, run on a watchlist, then click once on the date column,
> > > // then shift click on one of the indicators, ie, RSIn, and you will see the
> > > // ordinal values in order.
> > >
> >