[amibroker] Re: Paul Ho: Memory Challenges with Great Ranking Tool

To: amibroker@xxxxxxxxxxxxxxx
Subject: [amibroker] Re: Paul Ho: Memory Challenges with Great Ranking Tool
From: "bruce1r" <brucer@xxxxxxxxx>
Date: Mon, 07 Jul 2008 17:06:50 -0000
Tomasz -

Great diagnosis.  I should have thought of it. I've seen it happen
to a number of the FT users.  That is why we stressed cache settings
in the 2007 conference class that you helped with.

Ken -

There is no reason to run anything > 5200 bars with a FT database. 
That will change to a lower number at next weekend's conversion.
And the cache size setting needs to be big enough to handle > 500
symbols.  For FT, that is at least 83 MB.  FWIW, I run 5200 bars,
1000 symbols, and 160MB cache.
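
As a rough check on the cache figure above, the arithmetic can be sketched as bars times symbols times bytes per bar. The 32 bytes per bar used here is an assumption back-solved from the 83 MB figure, not a documented AmiBroker value, and the function names are hypothetical.

```python
# Rough cache sizing: bars per symbol * number of symbols * bytes per bar.
# ASSUMPTION: 32 bytes per bar is inferred from the 83 MB figure quoted
# for 5200 bars and 500 symbols; it is not taken from AmiBroker docs.
BYTES_PER_BAR = 32


def cache_bytes(bars: int, symbols: int, bytes_per_bar: int = BYTES_PER_BAR) -> int:
    """Return the bytes needed to hold every bar of every symbol in RAM."""
    return bars * symbols * bytes_per_bar


def to_mb(n_bytes: int) -> float:
    """Convert bytes to decimal megabytes."""
    return n_bytes / 1_000_000


if __name__ == "__main__":
    for symbols in (500, 1000):
        print(symbols, "symbols:", to_mb(cache_bytes(5200, symbols)), "MB")
```

Under that assumption, 500 symbols need about 83 MB and 1000 symbols about 166 MB, so the cache has to grow in step with the watch list.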

Paul -

In the interest of conveying some info constructively, let me explain
why I posted what I did.

First, though, I think you might want to reconsider the claim about a range
algorithm being slower.  A 2*N algorithm like the range algorithm will
almost ALWAYS be faster than an N*N algorithm - and it will be much faster
for large N.  
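
To make the 2*N versus N*N comparison concrete, here is a minimal Python sketch (not AFL) that counts the work each approach does for the scores of one bar. The function names are hypothetical, and per-item cost is treated as equal, which is a simplification.

```python
def ordinal_ranks_pairwise(scores):
    """Rank 1 = highest score. Compares every item with every item (N*N)."""
    ranks = []
    comparisons = 0
    for own in scores:
        rank = 1
        for other in scores:
            comparisons += 1
            if other > own:
                rank += 1
        ranks.append(rank)
    return ranks, comparisons


def range_percent_two_pass(scores):
    """Pass 1 finds the min and max; pass 2 rescales each score (2*N)."""
    visits = 0
    lo = hi = scores[0]
    for s in scores:
        visits += 1
        lo, hi = min(lo, s), max(hi, s)
    span = hi - lo
    pct = []
    for s in scores:
        visits += 1
        # Guard against a flat bar where every score is identical.
        pct.append(100.0 * (s - lo) / span if span else 0.0)
    return pct, visits


if __name__ == "__main__":
    for n in (100, 500):
        scores = list(range(n))
        _, comparisons = ordinal_ranks_pairwise(scores)
        _, visits = range_percent_two_pass(scores)
        print(n, "items:", comparisons, "comparisons vs", visits, "visits")
```

For 500 items that is 250,000 comparisons against 1,000 visits, and the gap widens as the watch list grows.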

I hope that Ken is able to get an N*N approach fast enough.  I totally
agree with you that it can be.  His run is still too slow, though.

But, the other reason that I suggested range values to Ken was that he
is using it for ranking and probably rotational systems.  This is what
many from the FastTrack community gravitate toward.  He will have to
confirm that part.  But, in that setting, the range percentages can
capture info about closeness of mixed scores that ordinal ranking can
mask.  If it fits the score distribution characteristics, this can be
used to minimize rotation with little impact on fitness - which is
typically a goal.
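
A minimal Python sketch of that idea, using the five example scores from my earlier message quoted below (0, 20, 21, 22, 200 for tickers A to E). The `should_rotate` rule and its 5-point `min_gap` are hypothetical, shown only to illustrate how closeness in range percentage could damp rotation.

```python
def ordinal_rank(scores):
    """Map each symbol to its ordinal rank, 1 = highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {sym: pos + 1 for pos, sym in enumerate(ordered)}


def range_percent(scores):
    """Map each symbol to its position within the min..max range, 0 to 100."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {s: (100.0 * (v - lo) / span if span else 0.0) for s, v in scores.items()}


def should_rotate(pct, held, candidate, min_gap=5.0):
    """HYPOTHETICAL rule: swap only if the candidate leads by min_gap points."""
    return pct[candidate] - pct[held] >= min_gap


if __name__ == "__main__":
    scores = {"A": 0, "B": 20, "C": 21, "D": 22, "E": 200}
    rank = ordinal_rank(scores)
    pct = range_percent(scores)
    # B and D are two ordinal places apart but only 1 point apart in range terms.
    print(rank["B"], rank["D"], pct["B"], pct["D"])
    print(should_rotate(pct, held="B", candidate="D"))
    print(should_rotate(pct, held="D", candidate="E"))
```

Ordinal rank alone would swap B for D; the range percentages show the two are nearly tied, so a gap threshold leaves the position alone and only rotates into E.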

Lastly, about the Pad and Align, I can only tell you that my
experience is different.  In a 20-year database such as the FastTrack
database that I think Ken is using, if you only need a few years
of backtest, then aligning to a symbol with a short history will
dramatically speed things up (thanks to Tomasz's implementation).

Anyway, this has been an interesting discussion.  Let me know if you
need any details about the above.  I'm getting back to some other
trading work.

-- BruceR


--- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@xxx> wrote:
>
> Hello,
> 
> Indeed, I have run this code on a larger watch list and can confirm Paul's timings. 
> 
> I am running an Athlon 64 X2 @ 2 GHz.
> 
> For 100 symbols it is just 50 seconds.
> For 500 symbols (SP500) and history from 1992 till now (16 years) it is just 13 minutes.
> If the same 500 symbols are explored with QuickAFL turned on and the last 4 years only, the time shrinks to 6 minutes 49 seconds.
> These are actual timings, not progress bar estimates. 
> Yes, the time grows slightly as the process progresses, but that is not surprising considering that it outputs
> 500 * number of quotes lines (1.3 million lines for a 10-year history).
> 
> I think these timings are quite OK for an N*N algorithm. 
> 
> Memory is also not an issue. Running it on 500 symbols with 16 years of history and full caching enabled
> caused AmiBroker to consume 175 MB of RAM.
> 
> If you are getting timings in hours, I suspect that you are using sub-optimal cache settings. Please go to
> Tools->Preferences, "Data" tab and increase the "in-memory" cache to at least 500 symbols (the size of the watch list under test).
> If the cache is too small, it will force many disk accesses. With a large enough cache, everything will be in RAM.
> 
> Best regards,
> Tomasz Janeczko
> amibroker.com
>   ----- Original Message ----- 
>   From: Paul Ho 
>   To: amibroker@xxxxxxxxxxxxxxx 
>   Sent: Monday, July 07, 2008 9:29 AM
>   Subject: RE: [amibroker] Re: Paul Ho: Memory Challenges with Great
Ranking Tool
> 
> 
>   Ken
>   As you know, the run time of the algorithm that I gave you increases dramatically with the size of the watchlist. However, the times that you stated aren't in line with what I experience. I tried my code (which is similar to yours, except that it also stores the value in OI) on 2 machines. One is a very old machine, a single-core AMD about 5 years old; Windows tells me it is an AMD XP 2600 with 1 GB of RAM. I used a watchlist of 472 symbols, running from 1/1/2000 to now. The time taken was 6.5 minutes (from the progress bar), and on a newer machine, a Core 2 Duo E6600, it took just over 1 minute. So this is very different from the one and a half hours that you are talking about.
>   So I am wondering what machine you're running on, and what kind of AA settings you use. What I suggest is that you DON'T check Pad and Align; this will increase the number of bars by quite a lot. You can check QuickAFL, though.
>   I have also given you a few suggestions in one of the other posts,
including monthly bars, normalisation of scores etc
http://finance.groups.yahoo.com/group/amibroker/message/126336. Did
you take a look at that?
>   Despite its shortcoming, I think the N^2 algorithm will still
perform better/faster than Bruce's suggestions. Pad and Align &/or ATC
will slow it down even more. In the past, I have made a ranking dll
which uses a 2 dimensional array, and basically insert the stock into
the right ranking order as each symbol is scanned, a little bit like
Fred's algorithm but on a total array basis. That is certainly very
fast. In addition, Tomasz's custom Backtester code has the potential
also to be quite fast without the use of dll. I have ported the code
so it stores directly in the OI during after optimization. However, it
comes back with an internal error in certain instances if the
watchlist gets over 1400 symbols and no of bars is more than 2500. I
have sent it to Tomasz for him to have a look at, when he comes back.
I can share that with you. But my point is that the N^2 idea should be
fast enough for what you are talking about ~500 stocks. You see there
is a big difference between 1 minute and 1 and half hour.
>   Send me a private email if you like, I'm curious why there such a
big difference.
>   Cheers
> 
> 
> 
> 
> 
>
----------------------------------------------------------------------------
>     From: amibroker@xxxxxxxxxxxxxxx
[mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of Ken Close
>     Sent: Monday, 7 July 2008 7:40 AM
>     To: amibroker@xxxxxxxxxxxxxxx
>     Subject: RE: [amibroker] Re: Paul Ho: Memory Challenges with
Great Ranking Tool
> 
> 
>     Bruce:
> 
>     Thanks for giving me another way to go.
> 
>     In case you have been following (or remember) this topic from a
ways back,
>     Fred was generous enough to write me a code concept for doing
all of this
>     (but end of range only) and I successfully converted it to my
"endpoint"
>     recipe of 11 or so indicators. It did ok for relatively large
watchlists
>     (even the RUT) but could not handle very well some much larger
watchlists in
>     the many 1000s. But, it gave me a combined ranking on any end
point date I
>     set in the AA window. My absolute minimum requirement is to create
>     combo-rank-scores on a monthly basis but I was pleasantly
surprised when
>     Paul Ho served up a concept to create daily combo-rank-scores on
a daily
>     basis, but then euphoria changed to despair as I encountered the
n^2 time
>     factor.
> 
>     So, thanks to you and others I have a variety of ways to
consider getting to
>     the end of this problem.
> 
>     1. Your suggestion of normalized indicators and using a final
percentage
>     value as the combo rank.
> 
>     2. Taking Fred's code and finding a way to manipulate the
EndofRange date,
>     basically repeating his code over and over on the same watchlist
but with
>     changing EndofRange dates. (I still have to do something with
the collection
>     of combo-scores I will accumulate by date, but that is another
issue.*** see
>     below) I have even considered manually repeating the process to
the end
>     point (only 12 runs per year x the 8 years I want to test over).
> 
>     3. Taking Paul Ho's code which ranks daily and either living
with the
>     limitation in the Watchlist population or running the thing over
night.
>     Since I have speed problems now with 2 indicators and 150+
symbols, I
>     probably will drop off the cliff with 11 indicators and the same
150+
>     symbols. An alternate which I plan to test next is to see how
Paul's code
>     performs on a Weekly or even Monthly compressed basis, although
if symbol
>     number is controlling and not barcount, then this will not do
much good.)
> 
>     4. Using Tomasz's suggestion of the custombacktester, making 11
separate
>     runs, then somehow combining the 11 different output reports,
coming up with
>     a combo-rank that way.
> 
>     If you were approaching this, can you guess and say which
approach you would
>     concentrate on. Right now, number 4 looks like it actually might
be the
>     least programming and execution intensive, but I am not sure. I
also have
>     to have a way of updating the entire system as time goes
forward. That will
>     bring an additional set of challenges I am sure.
> 
>     Thanks for stepping in.
> 
>     Ken
> 
>     *** Paul Ho shared a small COM code snippet that sticks an indicator nicely
>     into the OI field of a symbol, so that is the approach I want to take once I
>     have the combo-rank to stick in the right place. Talk about complex......
> 
>     PS: Bruce, if you are still reading, would I have a better chance of
>     executing my task in Trade vs Amibroker (sorry Tomasz)? 
> 
>     -----Original Message-----
>     From: amibroker@xxxxxxxxxxxxxxx
[mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf
>     Of bruce1r
>     Sent: Sunday, July 06, 2008 5:03 PM
>     To: amibroker@xxxxxxxxxxxxxxx
>     Subject: [amibroker] Re: Paul Ho: Memory Challenges with Great
Ranking Tool
> 
>     Ken -
> 
>     I'm too involved with something else right now, but let me see
if I can
>     offer quick suggestion. First -
> 
>     1. Tomasz is pointing out that solutions in O(N^2) time are never practical
>     past some limit. That means that the execution time goes up with the square
>     of the number of items - tickers in this case. There are a couple of
>     programming tricks that you can play, but I don't think that
they are going
>     to get you where you want to go -
> 
>     For example, programming tricks can be used to make the N^2
comparison
>     matrix "triangular". This reduces the comparisons by half.
> 
>     You might use Pad and Align to a ticker with a short history to
cut the time
>     further.
> 
>     But, this is still going to leave you in a long timeframe.
> 
>     2. It looks like you are trying to add unbounded indicators and
use the
>     ordinal values to normalize them so that they can be combined. 
>     Use of the custom backtester would still require that you
generate output
>     for each indicator and then combine them.
> 
>     Another approach might be to go out of the box a little and
question your
>     basic assumption. Here's what I mean.
> 
>     Ordinal values can be used to convert unbounded ranges (such as
ROC) to
>     bounded values. But they can do some strange things to outliers. 
>     For example, consider these points. Say they are for tickers
A,B,C,D,E on a
>     particular day -
> 
>     0, 20, 21, 22, 200
> 
>     The point 22 is ranked #2 (higher value better) when it is not
near the top.
> 
>     ON THE OTHER HAND, range value can be used also to convert
unbounded data to
>     bounded. THEY REQUIRE A PRE-SCAN TO KNOW THE MIN AND MAX. 
>     For the range above, it would convert to the following percentages -
> 
>     0, 10, 10.5, 11, 100
> 
>     This has some advantages for certain data distributions, but some
>     disadvantages for others. For data where the probability of
outliers is
>     low, it yields similar results.
> 
>     SO, HERE'S WHAT YOU MIGHT DO.
> 
>     1. Take a watchlist and start a Exploration pass. When
>     Status("stocknum") == 0, loop through the list and find the
global Min and
>     Max for each bar across all of the tickers for a given indicator
and store
>     it in an ATC in the H and L fields. For RSI and ROC, you would
have 2 ATC's
>     - say ~MINMAX_ROC and ~MINMAX_RSI. This is 1 pass of all N tickers.
> 
>     2. Continuing on for stocknum 0 and for 1 - N, calculate the ROC
and RSI and
>     convert it to a percentage of the MIN and MAX range that you
stored in the
>     ATC's for each bar -
> 
>     rangepcnt = ( tickscore - tickglobalmin ) / ( tickglobalmax - tickglobalmin ) * 100;
> 
>     3. Now you can combine the range values because they are
normalized. 
>     If you divide by the number of indicators, you'll end up with a
combined
>     percentage.
> 
>     Now, while this is not an ordinal rank, it works perfectly well
for scoring
>     and is a solution in 2*N time. BTW - this reference won't mean
much to most
>     here, but should to you - Ed Gilbert detailed this in Trade doc
almost a
>     decade ago.
> 
>     -- BruceR
> 
>     --- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@> wrote:
>     >
>     > Hello,
>     > 
>     > No, look again. The code I provided gives the sort is ON BAR
BY BAR
>     basis.
>     > 
>     > Best regards,
>     > Tomasz Janeczko
>     > amibroker.com
>     > ----- Original Message ----- 
>     > From: Ken Close 
>     > To: amibroker@xxxxxxxxxxxxxxx 
>     > Sent: Sunday, July 06, 2008 9:08 PM
>     > Subject: RE: [amibroker] Paul Ho: Memory Challenges with Great
>     Ranking Tool
>     > 
>     > 
>     > Tomasz:
>     > 
>     > Thanks for all the help you give to so many people, me included.
>     > 
>     > However, while I did as you suggested with the custombacktester,
>     and looked into the output file it produces, I am at a loss to
know how to
>     use the data it contains. It is not all of the data that I need.
>     > 
>     > I want the ordinal ranking of multiple indicators, add them all
>     together, per bar and per symbol, and use the final sum, of the
ORDINAL
>     ranks, as the ranking value for all symbols.
>     > 
>     > This output represents what I want (but it is only for two
>     indicators). I want to turn this into my "recipe" which will have
>     approximately 8 to 10 indicators.
>     > 
>     > 
>     > 
>     > I ran the custom backtest, opened the output.html file, and see
>     that the symbols are sorted by the ranking value and it is
indeed an ordinal
>     value. But, the sort is done only once (probably as a lastbar
>     basis) and Paul Ho sorting algorithm gives me ordinal values for
each bar
>     for each symbol (displayed above using a lastbar basis).
>     > 
>     > You say Paul's code is inefficient, and maybe it is because it
>     sorts all symbols by all bars. Can you suggest a change to the
specific
>     code that would do what I want, but more efficiently?
>     > 
>     > Again, thanks for all that you do.
>     > 
>     > Ken
>     > 
>     > 
>     > 
>     >
>     ----------------------------------------------------------
>     --
>     > From: amibroker@xxxxxxxxxxxxxxx [mailto:amibroker@xxxxxxxxxxxxxxx]
>     On Behalf Of Tomasz Janeczko
>     > Sent: Sunday, July 06, 2008 1:39 PM
>     > To: amibroker@xxxxxxxxxxxxxxx
>     > Subject: Re: [amibroker] Paul Ho: Memory Challenges with Great
>     Ranking Tool
>     > 
>     > 
>     > Hello,
>     > 
>     > The code is inefficient because it repeats the sorting N*N times
>     where N is number of symbols, while
>     > only N times is enough.
>     > 
>     > Ranking is a process that is done during first pass of backtest.
>     It is implemented efficiently. 
>     > We can use this built-in process easily using custom backtest
>     procedure as shown here:
>     > 
>     > Note that this formula will not produce output in AA directly.
>     Instead it will produce a HTML
>     > file (output.html) that you can later import to AA using AA,
>     File->Import
>     > 
>     > Also please be warned that produced files are huge and attempt to
>     load such big HTML file
>     > into Internet Explorer instead will easily hang IE. 
>     > 
>     > PositionScore = ROC( C, 14 ) + 1000; // WHAT YOU WANT TO RANK
>     > 
>     > SetOption("MaxOpenPositions", 10 ); 
>     > SetBacktestMode( backtestRegularRaw ); 
>     > Buy=1; 
>     > Sell=0; 
>     > SetCustomBacktestProc(""); 
>     > if( Status("action")==actionPortfolio ) 
>     > { 
>     > bo = GetBacktesterObject();
>     > 
>     > bo.PreProcess();
>     > 
>     > dt = DateTime();
>     > 
>     > fh = fopen("output.html", "w" );
>     > 
>     > 
>     > fputs( "<TABLE><TR><TH>Symbol</TH><TH>Date/Time</TH><TH>Rank</TH></TR>\n", fh ); 
>     > 
>     > for( i = 0; i < BarCount; i++ ) 
>     > { 
>     > k = 1; 
>     > for( sig = bo.GetFirstSignal( i ); sig; sig = bo.GetNextSignal( i ) ) 
>     > { 
>     > Line = "<TR><TD>" + sig.Symbol + "</TD><TD>" + 
>     > DateTimeToStr( dt[ i ] ) + "</TD><TD>" + k + "</TD></TR>\n"; 
>     > fputs( Line, fh ); 
>     > k++; 
>     > } 
>     > }
>     > 
>     > bo.PostProcess();
>     > 
>     > fputs( "</TABLE>", fh ); 
>     > fclose( fh ); 
>     > }
>     > 
>     > 
>     > Best regards,
>     > Tomasz Janeczko
>     > amibroker.com
>     > ----- Original Message ----- 
>     > From: Ken Close 
>     > To: amibroker@xxxxxxxxxxxxxxx 
>     > Sent: Sunday, July 06, 2008 5:35 PM
>     > Subject: [amibroker] Paul Ho: Memory Challenges with Great
>     Ranking Tool
>     > 
>     > 
>     > Paul:
>     > 
>     > my initial euphoria has turned somewhat downward as I attempt to
>     apply the code below (just two indicators) to larger Watchlists. You
>     sounded (from other messages) like someone who knows the ins and
outs of
>     memory management with AB, and perhaps can comment on how to
keep the code
>     below from "bogging down".
>     > 
>     > In spite of my many years with AB and its array processing, my
>     mind still has a problem wrapping around what this code is doing
and why
>     (and whether) larger populated Watchlists will ever be able to work.
>     > 
>     > I initially tested against the DJ-30 (30 symbols) and all went
>     well, fairly quickly, perhaps 10-15 seconds.
>     > 
>     > I then tried the NDX (100 symbols) and things went more slowly
>     but finished. I noticed the symbols appearing in the AA window
more slowly.
>     > 
>     > I have not been able to nor wanted to wait for the SP-500, as
>     the symbols appear more and more slowly and the est time counter
was saying
>     something like 1 1/2 hours to complete 500 symbols.
>     > 
>     > I was assuming that the code had to collect or process all
>     symbols before it could make comparisons among them---this is
probably false
>     or else why would processed symbols start to appear in the AA
window while
>     it is still accessing symbols. 
>     > 
>     > What suggestions can you make, given your understanding of the
>     code and AB, that would minimize the processing of large member
watchlists?
>     > 
>     > Can adding a SetBarsRequired in the right place limit the number
>     of lookback bars that are processed, and thus speed up execution?
>     > 
>     > As the number of indicators I wish to process into a "Total
>     Rank" score increases, I imagine that executing this code will
get slower
>     and slower and may not be possible at all. Would you agree?
>     > 
>     > Thanks for any added help.
>     > 
>     > Ken
>     > 
>     > 
>     > 
>     >
>     ----------------------------------------------------------
>     > From: amibroker@xxxxxxxxxxxxxxx
>     [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of Ken Close
>     > Sent: Saturday, July 05, 2008 10:47 AM
>     > To: amibroker@xxxxxxxxxxxxxxx
>     > Subject: [amibroker] What a Great Ranking Tool
>     > 
>     > 
>     > Paul Ho has come up with a superb ranking tool. I have expanded
>     > it to two indicators. Feel free to expand the code structure to any number
>     > of indicators.
>     > 
>     > Possible next step: stick the Tot_Rank values into the OI field
>     for the symbols, then Plot the Ranks for a visual representation
of "where
>     the symbol is over time".
>     > 
>     > The possibilities are endless (or at least enlarged because of
>     Paul's code idea). Thanks Paul for your creative input.
>     > 
>     > Ken
>     > 
>     > // Ranking_Alt01.afl KSC 07/05/2008
>     > 
>     > // Original code by Paul Ho, Amibroker list 07/05/2008
>     > 
>     > // Modifications and expansions by Ken Close 07/05/2008
>     > 
>     > 
>     > 
>     > // Will ordinal rank every symbol in watchlist for every bar.
>     > 
>     > 
>     > 
>     > 
>     > 
>     > mOwnROC = ROC(C, 14);
>     > 
>     > mOwnRSI = RSIa(C, 14);
>     > 
>     > mRoc = 0;
>     > 
>     > mRSI = 0;
>     > 
>     > list = CategoryGetSymbols(categoryWatchlist, 16);
>     > 
>     > ROCcount[0] = rocrank[0] = 0;
>     > 
>     > RSIcount[0] = RSIrank[0] = 0;
>     > 
>     > for(i = 0; (sym = StrExtract(list, i)) != ""; i++)
>     > 
>     > {
>     > 
>     > SetForeign(sym);
>     > 
>     > mRoc = ROC(C, 14);
>     > 
>     > mRSI = RSIa(C, 14);
>     > 
>     > RestorePriceArrays();
>     > 
>     > n = !IsNull(mRoc);
>     > 
>     > m = !IsNull(mRSI);
>     > 
>     > roccount += n;
>     > 
>     > rsicount += m;
>     > 
>     > rocrank = IIf(Nz(mRoc) > mOwnROC, Rocrank + n, rocrank);
>     > 
>     > rsirank = IIf(Nz(mRsi) > mOwnRSI, Rsirank + m, rsirank);
>     > 
>     > Totrank = rocrank + rsirank;
>     > 
>     > }
>     > 
>     > ROCn = ROC(C, 14);
>     > 
>     > RSIn = RSIa(C, 14);
>     > 
>     > Filter = 1;
>     > 
>     > Buy = Sell = 0;
>     > 
>     > AddColumn(ROCn, "ROCn",1.2);
>     > 
>     > AddColumn(RSIn, "RSIn",1.2);
>     > 
>     > AddColumn(mRoc, "MROC", 1.2);
>     > 
>     > AddColumn(ROCrank, "ROCRank", 1.0);
>     > 
>     > AddColumn(RSIrank, "rsirank",1.0);
>     > 
>     > AddColumn(Totrank, "Totrank", 1.0);
>     > 
>     > 
>     > 
>     > // To check the sorting, run on a watchlist, then click once on the date column, 
>     > 
>     > // then shift-click on one of the indicators, e.g. RSIn, and you will see the 
>     > 
>     > // ordinal values in order.
>     >
> 
>     ------------------------------------
> 
>     Please note that this group is for discussion between users only.
> 
>     To get support from AmiBroker please send an e-mail directly to
SUPPORT {at}
>     amibroker.com
> 
>     For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
>     http://www.amibroker.com/devlog/
> 
>     For other support material please check also:
>     http://www.amibroker.com/support.html
>     Yahoo! Groups Links
>


