
[amibroker] Re: Optimization speed increase in 5.01




I think you misunderstood what I was trying to say. When I talk about
signals, I don't mean the signals that you actually see in your
Optimization/Backtester results pane; I mean the RAW signals that AB
generates during the first pass of backtesting. These are just
temporary signals that can be saved until the end of processing. Let
me try to be a little clearer. This is how I understand the optimizer
currently works when optimizing over a portfolio:

- Loop through the parameter combinations
  - For each parameter combination, loop through every symbol
    - Calculate the signals from the AFL for this one symbol
  - Once the raw signals have been generated for all the
    symbols for this particular combination, generate the
    final buy/sell signals and results of the portfolio
    backtest for the given parameter combination.

When I run a portfolio optimization I can see that this is the order
of events, because as soon as one combination of parameters has been
backtested, its results immediately appear in the results pane of the
Optimization/Backtest window.

This method switches between symbols (# Symbols * # Optimization
combinations) times.


After reorganizing the order of events:
- Loop through every symbol
  - For each symbol, loop through each parameter combination
    - Calculate the signals from the AFL for this one symbol
  - After the AFL has been run for this symbol for every
    optimization combination, save these raw signals in
    memory and move on to the next symbol
- Once the raw signals have been generated for all the
  symbols, for all combinations, then the final buy/sell
  signals and results are generated for all parameter
  combinations.
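To make the comparison concrete, here is a small Python sketch (purely
illustrative -- none of these function names exist in AmiBroker, and the
real engine is native code) that just counts how many times each loop
ordering has to switch which symbol's data is the "hot" working set:

```python
# Hypothetical sketch (not AmiBroker internals): count how many times
# the engine must swap a different symbol's price data into the working
# set under each loop ordering.

def count_switches_current(n_symbols, n_combos):
    """Current order: parameter combinations outer, symbols inner."""
    switches = 0
    for combo in range(n_combos):
        for symbol in range(n_symbols):
            switches += 1  # load this symbol's data, run the AFL once
    return switches

def count_switches_reordered(n_symbols, n_combos):
    """Reorganized order: symbols outer, parameter combinations inner."""
    switches = 0
    for symbol in range(n_symbols):
        switches += 1  # load this symbol's data once...
        for combo in range(n_combos):
            pass  # ...then run the AFL for every combination against it
    return switches

print(count_switches_current(5000, 100))    # 500000
print(count_switches_reordered(5000, 100))  # 5000
```

The work done is identical either way (symbols x combinations AFL runs);
only the order, and therefore the number of data reloads, changes.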

The downside of this is that the optimization results are not obtained
for any parameter combination until all the combinations have been
backtested (not a big deal in my opinion).

The upside is that it switches between symbols only (# Symbols) times.
Say you have a 5000-symbol database and are running a 100-combination
optimization: that means 500,000 symbol switches with the current
method, but only 5000 after reorganizing the sequence of events. You
would have to keep more raw signals in memory, but that would probably
be cheaper than doing all those symbol switches. The goal is to keep
processing one region of memory for a longer period of time, which
improves the cache hit ratio and makes it less problematic to expand
to a multi-core processor.
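Since each symbol's raw-signal pass is independent of every other
symbol's, the symbol-outer ordering also decomposes naturally into
per-symbol tasks. Here is a minimal, hypothetical sketch of that
decomposition (all names are made up for illustration; a real engine
would use native threads, and the toy threshold formula merely stands
in for an AFL run):

```python
# Hypothetical sketch: each worker keeps one symbol's price data in its
# working set while running every parameter combination against it,
# which is the cache-friendly access pattern described above.
from concurrent.futures import ThreadPoolExecutor

def raw_signals_for_symbol(args):
    symbol_prices, combos = args
    # Run the stand-in "formula" once per parameter combination while
    # symbol_prices stays resident; here the parameter is a threshold
    # and the "signal count" is how many bars exceed it.
    return [sum(p > threshold for p in symbol_prices) for threshold in combos]

def generate_all_raw_signals(price_data, combos, workers=4):
    """price_data: dict mapping symbol -> list of prices."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(raw_signals_for_symbol,
                           [(prices, combos) for prices in price_data.values()])
    return dict(zip(price_data.keys(), results))
```

The raw signals for all symbols and all combinations then sit in memory,
ready for the single final portfolio-backtest phase.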

Nick


--- In amibroker@xxxxxxxxxxxxxxx, "vlanschot" <vlanschot@xxx> wrote:
>
> Nick, you make one wrong assumption: that the default optimization 
> process is based on separate optimizations per individual symbol. 
> Instead, default optimization is at the portfolio level, which is why 
> ALL symbols of the backtest-universe need to be included. Optimized 
> parameters for one symbol don't have any value if I do not know what 
> their impact is at the portfolio level, i.e. on all the other 
> symbols. Hope this makes sense.
> 
> PS 
> 
> --- In amibroker@xxxxxxxxxxxxxxx, "nhall" <c-yahoo@> wrote:
> >
> > Hello Tomasz,
> > 
> > Thanks for all you've done with AmiBroker. It is a great program. In
> > your response about dual-core optimization, there's something I've
> > been wondering about. You said that the core cache becomes a
> > limitation for AFL because backtesting is so memory intensive and
> > the memory interface speed is fixed no matter the number of cores
> > on a processor.
> > 
> > Currently, if I do an Optimization over a large watchlist, AB will
> > run the AFL with the given parameters for the entire watchlist, all
> > the way until it has the overall system result for the entire
> > watchlist for those parameters. Then it alters the parameters
> > according to the Optimization and reruns the AFL for the entire
> > watchlist again. This continues for all the Optimization iterations.
> > 
> > As I understand you, the problem with this is that the data for the
> > watchlist takes up far more memory than will fit in a core's cache,
> > so if you have multiple cores doing this processing simultaneously,
> > they will be fighting each other for memory bandwidth.
> > 
> > Would it be possible to alter the order of events? If I'm running an
> > Optimization with 100 combinations, I don't need to see the results
> > from each combination until the entire set has been processed. What
> > if the Optimization sequence of events was changed to run the AFL
> > for just one symbol from the watchlist, then alter the parameters
> > and run the AFL again for the *same symbol*, just with different
> > parameters, and continue this for all the combinations of the
> > Optimization? After signals have been generated for this particular
> > symbol for all parameter combinations, the signals can be stored in
> > memory and then it can move on to the next symbol. After all the
> > symbols have been processed, AB can do the backtesting for all the
> > signals.
> > 
> > The advantage of this is that if I am Optimizing 100 combinations
> > of parameters on a watchlist of 5000 symbols, hopefully one symbol
> > can fit in the processor's cache and it can do 100 runs through the
> > AFL, generating signals, before it has to fetch more data from
> > memory. This could provide some concurrency, as another core could
> > do the same thing for a different symbol. The Optimization would
> > become more efficient the more combinations there are.
> > 
> > Does this make sense? I know that I glossed over details of how the
> > cache really works, and also that I do not know the internals of
> > AmiBroker and could be missing some critical information, so this
> > idea may not work at all. But maybe it could help. Thanks for
> > listening!
> > 
> > Nick
> > 
> > 
> > --- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@> wrote:
> > >
> > > Hello,
> > > 
> > > It is a perfectly valid question.
> > > 
> > > First, it does not really matter whether the process goes through
> > > both ends, or sequentially with one core taking the odd steps and
> > > another the even steps; at first look it seems like this would
> > > give a significant speed-up.
> > > 
> > > BUT... in the real world things are uglier than in theory.
> > > I did lots of testing and profiling (measuring execution time of
> > > code at the function level), and dual-thread execution on a
> > > dual-core processor is faster if and only if each core can
> > > execute accessing data only from its own on-chip data cache.
> > > This is unfortunately NOT the case for backtesting/optimization.
> > > On-chip caches are usually limited to well below 1MB. Almost
> > > every backtest requires way more than 1MB.
> > > Now what happens if you run code that uses more memory:
> > > BOTH cores need to access on-board (regular) RAM. Both cores do
> > > this through a single memory interface that is SHARED between
> > > cores, accessing one memory that runs at fixed speed (no matter
> > > if 1 or 8 cores access the memory, it cannot respond quicker than
> > > its factory limit, and one core is fast enough to actually need
> > > to WAIT for memory).
> > > 
> > > Now if you run on 2 or more cores, they have to wait for the
> > > same, single shared memory that runs at a constant pace, slow
> > > enough for one core, not to mention more.
> > > 
> > > The net result is that if you actually try to run something that
> > > needs more than 1MB of data and does not fit into the individual
> > > data caches, the performance drops down to effectively
> > > single-core. What's more, it can run slower because of the
> > > additional overhead of thread management.
> > > 
> > > And it is not imagination or theory. I did actual code profiling
> > > and I was surprised when I tested multi-threaded code. It works
> > > up to 2x faster on dual core, BUT ONLY IF you don't access more
> > > than the size of the on-chip per-core data cache, or your code
> > > needs way more calculation than memory access.
> > > If your code does a LOT of memory access (more than 1MB) and does
> > > it QUICKLY (backtesting is extremely memory intensive and AFL
> > > scans through memory like crazy), all advantages of running on
> > > multiple cores are gone.
> > > 
> > > BTW: what I did in this upgrade to speed up the
> > > backtest/optimization was to reduce the COUNT of memory accesses
> > > to the absolute minimum required. As it turns out, even a single
> > > CPU core was waiting for memory.
> > > 
> > > Best regards,
> > > Tomasz Janeczko
> > > amibroker.com
> > > ----- Original Message ----- 
> > > From: "tipequity" <tagroups@>
> > > To: <amibroker@xxxxxxxxxxxxxxx>
> > > Sent: Friday, October 05, 2007 4:20 AM
> > > Subject: [amibroker] Re: Optimization speed increase in 5.01
> > > 
> > > 
> > > > Tomasz, at the risk of sounding stupid, I am gonna run this
> > > > idea by you. Since AB during backtests and optimizations works
> > > > on a list of stocks, why not have one core of a dual-core CPU
> > > > work on symbols from the top of the list and another core work
> > > > on symbols from the bottom of the list. Like burning candles
> > > > from both ends.
> > > > 
> > > > Regards
> > > > 
> > > > Kam
> > > > 
> > > > 
> > > > --- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@> 
> > > > wrote:
> > > >>
> > > >> Hello,
> > > >> 
> > > >> If you are running optimizations using the new version I would
> > > >> love to hear about the timings you get compared with the old
> > > >> one.
> > > >> Note that optimization with the new version may run even 2
> > > >> times faster (or more), but the actual speed increase depends
> > > >> on how complex the formula is, how often the system trades,
> > > >> and how large the baskets are. Speed increases are larger with
> > > >> simpler formulas, because AFL execution speed did NOT change.
> > > >> The only things that have changed are the collection of
> > > >> signals (1st backtest phase) and the entire 2nd phase of the
> > > >> backtest.
> > > >> As it turns out, when backtesting very simple formulas the AFL
> > > >> code execution is less than 20% of the total time; the rest is
> > > >> collecting signals, sorting them according to score, and the
> > > >> 2nd phase of the backtest (the actual trading simulation).
> > > >> These latter areas were the subject of performance tweaking.
> > > >> 
> > > >> Best regards,
> > > >> Tomasz Janeczko
> > > >> amibroker.com
> > > >>
> > > > 
> > >
> >
>




Please note that this group is for discussion between users only.

To get support from AmiBroker please send an e-mail directly to 
SUPPORT {at} amibroker.com

For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
http://www.amibroker.com/devlog/

For other support material please check also:
http://www.amibroker.com/support.html
 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/amibroker/

<*> Your email settings:
    Individual Email | Traditional

<*> To change settings online go to:
    http://groups.yahoo.com/group/amibroker/join
    (Yahoo! ID required)

<*> To change settings via email:
    mailto:amibroker-digest@xxxxxxxxxxxxxxx 
    mailto:amibroker-fullfeatured@xxxxxxxxxxxxxxx

<*> To unsubscribe from this group, send an email to:
    amibroker-unsubscribe@xxxxxxxxxxxxxxx

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/