Re: [amibroker] Re: The EASIEST way to use new optimizer engines, AmiBroker Email List Archive

On Jun 28, 2008, at 3:30 PM, Tomasz Janeczko wrote:

Yes, you are right. Windows is a black box. It truly is. And several people were so worried about that that they developed Linux :-)

Visual Studio - fortunately Microsoft DOES provide source code for MFC, and you can single-step into the source down to C++ or assembly level
on ANY code produced by Visual Studio. Without that I wouldn't use it. Many times I have needed to investigate the MFC sources
to find out why things work one way or another. There are zillions of details that do not exist in the Microsoft help files but
can be revealed by analysing the MFC sources.

The main thing that determines whether you need sources or not is the ability to validate results.

If you have a Windows function such as SetWindowText that just sets the text for a window, it is easy to verify whether it works or not without having the source code.
The same goes for AmiBroker's AFL functions. It is easy to verify whether a 10-day simple moving average is correct or not.

With 2*2 the result is 4 and everyone knows that. With an "intelligent search" algorithm the answers are not that obvious. Very few evolutionary algorithms
have strong mathematical proofs behind them.

That's why having the source files is a huge advantage, and that's why all non-exhaustive optimizers in AmiBroker come with source code.

Best regards,
Tomasz Janeczko
amibroker.com
----- Original Message -----
From: Fred Tonetti
To: amibroker@xxxxxxxxxxxxxxx
Sent: Saturday, June 28, 2008 9:01 PM
Subject: RE: [amibroker] Re: The EASIEST way to use new optimizer engines

Thanks for your comments about the individual engines … What I have or haven’t read has nothing to do with how someone chose to implement something. IO too is capable of performing multiple runs or what I termed “passes” but this is implemented somewhat differently as documented.

One last thought about “black boxes” … By your definition Windows would be a black box as would Visual Studio or for that matter all of AmiBroker with the exception of the plugins where the code was supplied.

Should we not use these tools because they are black boxes ?

From: amibroker@xxxxxxxxxxxxxxx [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of Tomasz Janeczko
Sent: Saturday, June 28, 2008 6:12 AM
To: amibroker@xxxxxxxxxxxxxxx
Subject: Re: [amibroker] Re: The EASIEST way to use new optimizer engines

Hello,

Fred>"fairly standard simple algorithms with some tweaks that after tons of experimentation "

PSO was first described in 1995 by James Kennedy and Russell C. Eberhart; since then
LOTS of people have developed their own algorithms based on PSO.
There are at least 20 DIFFERENT public PSO algorithms that I know of, all producing different results. What is "fairly standard" then?
I am pretty sure that you are not using Standard PSO 2007 (as "spso" does), are you???
Unless source code for IO is provided, it *IS* a black box.

Fred> How does one intelligently decide how many runs and tests to use for PSO & Tribes based on differing number of variables to be optimized ?

You actually answered that yourself: you decide "after tons of experimentation", depending on the problem under test, its complexity, etc.
No stochastic non-exhaustive method guarantees finding the global max/min, regardless of the number of tests, if that number is smaller
than an exhaustive search would need. The easiest answer is to specify as large a number of tests as is reasonable for you in terms of the time required to complete.
Another simple piece of advice is to multiply the number of tests by 10 for each added dimension. That may overestimate the number
of tests required, but it is quite safe.
In case you did not notice, this is the very first version and it is subject to improvement. I want to keep things simple to use and not require
people to read a 60+ page doc to be able to run a first optimization. Therefore work is being done to provide "reasonable" default/automatic values
so an optimization can be run without specifying anything.
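
To illustrate the "multiply by 10 per added dimension" advice, here is a minimal Python sketch. The function name suggested_tests and the baseline of 1000 tests at 2 dimensions (borrowed from the "default 1000 is good for 2 or maximum 3 dimensions" remark quoted elsewhere in this thread) are assumptions made for illustration only; this is not how AmiBroker computes anything.

```python
def suggested_tests(dimensions, base_tests=1000, base_dimensions=2):
    """Rule-of-thumb test budget: x10 for every dimension beyond the baseline.

    base_tests and base_dimensions are illustrative assumptions, not values
    taken from AmiBroker's source code.
    """
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    extra_dimensions = max(0, dimensions - base_dimensions)
    return base_tests * 10 ** extra_dimensions


for n in (2, 3, 5):
    print(n, "dimensions ->", suggested_tests(n), "tests")
```

The budget grows by a factor of ten per dimension, which is why the rule tends to overestimate and quickly becomes impractical for many parameters.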

Fred> What happens differently for these two engines when one specifies 5 runs of 1000 tests versus 1 run of 5000 tests ?

Well, if you have read that many scientific papers on intelligent methods, you should already know the difference, as it is the most basic thing.
A TEST (or evaluation) is a single backtest (one evaluation of the objective function value).
A RUN is one full run of the algorithm (finding an optimum value).
Each run simply RESTARTS the entire optimization process from a new beginning (a new initial random population).
Therefore each run may lead to a different local max/min (if it does not find the global one).

Once you know the basics the difference is obvious.
5 RUNS of 1000 tests is simply doing the 1000-backtest PSO optimization 5 times.
1 RUN of 5000 tests is simply doing a 5000-backtest PSO optimization ONCE only.

Now if the problem is relatively simple and 1000 tests are enough to find the global max, 5x1000 is more likely to find the global maximum
because there is less chance of getting stuck in a local max, as subsequent runs start from a different initial random population.

The difference shows if the problem is complex enough (has many dimensions). In that case running 1x5000 is more likely to
produce a better result.
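
The RUN/TEST distinction can be sketched in a few lines of Python. This is not PSO or any AmiBroker engine: it is a plain random-restart hill climber on a made-up one-dimensional objective, and every name in it is hypothetical. It only shows that a run restarts from a fresh random point, and that 5 runs of 1000 tests spend the same total budget as 1 run of 5000 tests.

```python
import random


def objective(x):
    """Toy objective (assumed for illustration): global maximum 0.0 at x = 3."""
    return -(x - 3.0) ** 2


def one_run(tests, rng):
    """One RUN: start from a fresh random point, then spend `tests` evaluations."""
    best_x = rng.uniform(-10.0, 10.0)
    best_f = objective(best_x)              # first TEST of this run
    for _ in range(tests - 1):
        candidate = best_x + rng.gauss(0.0, 1.0)
        f = objective(candidate)            # every evaluation counts as one TEST
        if f > best_f:
            best_x, best_f = candidate, f
    return best_x, best_f


def optimize(runs, tests, seed=0):
    """Restart the search `runs` times; return the best result and the total tests used."""
    rng = random.Random(seed)
    results = [one_run(tests, rng) for _ in range(runs)]
    return max(results, key=lambda r: r[1]), runs * tests


print(optimize(runs=5, tests=1000))   # five independent restarts
print(optimize(runs=1, tests=5000))   # one long search, same total budget
```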

Actually this can be used as a stop condition. You can for example say that you want to restart (make another run) until
two (or three) subsequent runs produce the same maximum.
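
A minimal Python sketch of such a stop condition follows. The run_once function is a hypothetical stand-in for one full optimizer run, and the cap on the number of runs and the tolerance are assumptions added so that the loop always terminates.

```python
import random


def run_once(rng):
    """Hypothetical stand-in for one full optimizer RUN; returns the best value found.

    It picks one of a few made-up local maxima at random to mimic a
    stochastic search that sometimes gets stuck.
    """
    return rng.choice([0.8, 0.9, 1.0, 1.0, 1.0])


def restart_until_stable(run, repeats=2, max_runs=50, tol=1e-9, seed=0):
    """Keep restarting until `repeats` consecutive runs agree on the maximum."""
    rng = random.Random(seed)
    history = []
    while len(history) < max_runs:
        history.append(run(rng))
        tail = history[-repeats:]
        if len(tail) == repeats and max(tail) - min(tail) <= tol:
            break
    return max(history), len(history)


best, runs_used = restart_until_stable(run_once)
print(best, runs_used)
```

Agreement between consecutive runs does not prove that the global maximum was found; it is only a practical signal that further restarts are unlikely to help.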

CMA-ES is slightly different in terms of how RUN is interpreted.

Currently the CMA-ES plugin implements G-CMA-ES flavour (i.e. global search with increasing population size).
As it is written in the READ ME
http://www.amibroker.com/devlog/wp-content/uploads/2008/06/readme5130.html

You may vary it using OptimizerSetOption("Runs", N ) call, where N should be in range 1..10.
Specifying more than 10 runs is not recommended, although possible.
**** Note that each run uses TWICE the size of population of previous run so it grows exponentially.
Therefore with 10 runs you end up with population 2^10 greater (1024 times) than the first run. ****

So each subsequent CMA-ES run will take TWICE as much time as previous one and TWICE the population size.
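
A small Python sketch of the doubling described in the README excerpt above. The initial population of 10 is a made-up value, and the sketch assumes the first run uses the initial population unchanged, so run k uses 2^(k-1) times that size; counting ten doublings instead gives the 1024 figure quoted above.

```python
def population_per_run(initial_population, runs):
    """Population size of each run when it doubles on every restart."""
    return [initial_population * 2 ** k for k in range(runs)]


sizes = population_per_run(initial_population=10, runs=10)  # 10 is hypothetical
print(sizes)
print("last run vs first run:", sizes[-1] // sizes[0])
```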

Of course this can be changed (the source code is available and well documented).

Fred>     How should one set up CMA-ES so that it produces superior results in less time for problems like the one I outlined i.e. that are of a type that can not be solved by exhaustive search ?

Just use one run.

OptimizerSetOption("Runs", 1 );

it will produce results in less time.
Doing so is actually equivalent to running L-CMA-ES (local search).

Best regards,
Tomasz Janeczko
amibroker.com
----- Original Message -----
From: Fred Tonetti
To: amibroker@xxxxxxxxxps.com
Sent: Saturday, June 28, 2008 10:46 AM
Subject: RE: [amibroker] Re: The EASIEST way to use new optimizer engines

TJ,

IO, which was preceded by PSO, was initially an experiment to determine whether or not it could even be done and then whether or not it was a worthwhile tool to have.

Following that it was and is for the most part a give back to the community as most of the bells and whistles are FREEWARE in a user friendly format. Stating that it is a black box is absurd as it uses fairly standard simple algorithms with some tweaks that after tons of experimentation I know to be of benefit and users have control over all aspects of how the algorithms work from their AFL if they choose to use them without having to research them on the internet as there’s 60+ pages of documentation about what has been implemented, how it works and the associated feature/functions …

Frankly I could care less if anyone ever bought a copy with the more advanced features as the fees associated with those features were put on simply to reduce the amount of support that would no doubt be required if the entire community used them.

What I want to compare is the usefulness of the different engines for different types of problems and how long they take to arrive at relatively decent results to solve problems that can not be solved by exhaustive search and to that end I have already asked several straight forward questions that for whatever reason you have chosen to ignore … So I’ll try them again …

-          How does one intelligently decide how many runs and tests to use for PSO & Tribes based on differing number of variables to be optimized ?

-          What happens differently for these two engines when one specifies 5 runs of 1000 tests versus 1 run of 5000 tests ?

-          How should one set up CMA-ES so that it produces superior results in less time for problems like the one I outlined i.e. that are of a type that can not be solved by exhaustive search ?

These are basic questions about the use of the intelligent optimization engines that you have chosen to include in the product which I would think lots of folks would want the answers to without having to search the internet.

Personally I’ve already read way beyond my share of scientific papers on intelligent optimization.

From: amibroker@xxxxxxxxxps.com [mailto:amibroker@yahoogroups.com] On Behalf Of Tomasz Janeczko
Sent: Saturday, June 28, 2008 3:59 AM
To: amibroker@xxxxxxxxxps.com
Subject: Re: [amibroker] Re: The EASIEST way to use new optimizer engines

Fred,

I don't know why you have taken on some kind of mission of criticizing the latest developments; maybe it is because
you are selling IO while the AB optimizer is offered as a free upgrade and that makes you angry.
I don't know why this is so, because you can actually benefit from it too - I have provided
full source code so everything is open for innovation and improvement, unlike the black-box IO.

The fact is that you are comparing APPLES TO ORANGES.

You should really READ the documentation I have provided and visit links I have provided.

CMA-ES DEFAULTS are well suited for tests that are replacements for exhaustive searches.

They are however too large for 15 variables. For example, CMA-ES by default will use
900 * (N + 3) * (N + 3) max evaluations. It converges much quicker, therefore the estimate
displayed in the progress bar is calculated as 30 * (N + 3) * (N + 3).
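
Plugging numbers into the two formulas above shows why the defaults are large for 15 variables. In this Python sketch N is assumed to be the number of optimized parameters, and the function names are hypothetical.

```python
def cmaes_default_max_evaluations(n):
    """Default evaluation cap quoted above: 900 * (N + 3) * (N + 3)."""
    return 900 * (n + 3) * (n + 3)


def cmaes_progress_estimate(n):
    """Progress-bar estimate quoted above: 30 * (N + 3) * (N + 3)."""
    return 30 * (n + 3) * (n + 3)


for n in (3, 15):
    print(n, cmaes_default_max_evaluations(n), cmaes_progress_estimate(n))
```

For N = 15 the cap is 291600 evaluations and the estimate is 9720, against 32400 and 1080 for N = 3.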

You are comparing 1000 evaluations of PSO with CONSTANT population size
to 10000+ evaluations of CMAE with GROWING population size default settings.

You are comparing an elephant to an ant.

If you want to COMPARE things you need to set up IDENTICAL conditions.
That would be:

OptimizerSetOption("Runs", 1 );
OptimizerSetOption("MaxEval", 10000 );

With *IDENTICAL* conditions, CMA-ES will run faster.

Best regards,
Tomasz Janeczko
amibroker.com
----- Original Message -----
From: Fred Tonetti
To: amibroker@xxxxxxxxxps.com
Sent: Saturday, June 28, 2008 7:15 AM
Subject: RE: [amibroker] Re: The EASIEST way to use new optimizer engines

It is somewhat meaningless to compare intelligent optimizers with exhaustive search due to the fact that for most real world problems exhaustive search would need more time than the universe has been around to solve them … It is also somewhat meaningless to compare intelligent optimizers with each other based on problems that are solvable by exhaustive search.

In regards to the embedded PSO & Tribes algorithms you state …

“You should increase the number of evaluations with increasing number of dimensions. The default 1000 is good for 2 or maximum 3 dimensions” …

Can you provide any guidance as to what relationship should exist between the number of dimensions and the number of tests ? i.e. what’s a reasonable number of tests for 5 dimensions, 10, 100 ?

Can you explain the difference between 1 run with 5000 tests and 5 runs with 1000 tests ?

As far as CMAE is concerned … Maybe I’m missing something but it doesn’t seem that CMAE has anything in terms of speed over AB’s PSO or Tribes …

I tried CMAE out on a real world intelligent optimization problem with 15 variables trading 100 symbols by adding the required statement to the AFL …

Run time for CMAE to complete was 459 minutes …

Run times for AB’s PSO and Tribes to complete with 5 runs and 1000 tests were in the neighborhood of 75 minutes each with results being the same as CMAE.

As an FYI …

Run times for IO’s DE and PSO to complete via their own internal decision making process w/o the help of additional cores ( servers ) were in the same neighborhood with times of 72 and 53 minutes respectively.

With the help of additional cores ( 7 ) IO’s DE and PSO ran to completion in 11 and 8 minutes respectively …

From: amibroker@xxxxxxxxxps.com [mailto:amibroker@yahoogroups.com] On Behalf Of Tomasz Janeczko
Sent: Friday, June 27, 2008 8:05 PM
To: amibroker@xxxxxxxxxps.com
Subject: Re: [amibroker] Re: The EASIEST way to use new optimizer engines

FYI: using the new optimizer engine (cmae) to optimize a seemingly
simple 3-parameter system (each parameter ranging 1..100) gives a speed-up
of more than 1000 times, as the cmae optimizer is able to find the best
value in less than 1000 backtests compared to one million backtests
using exhaustive search. It also usually outperforms PSO by a factor of 10.

That is 500 times faster than you would get from exhaustive optimization using your dual core
and 5 times faster than PSO on dual core.
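
The arithmetic behind these figures can be checked with a short Python sketch. It assumes each of the three parameters is stepped by 1 across 1..100, and it takes 1000 backtests as the upper bound for cmae from the "less than 1000 backtests" statement above; the function name is hypothetical.

```python
from math import prod


def exhaustive_backtests(ranges):
    """Number of combinations for inclusive (start, stop, step) parameter ranges."""
    return prod((stop - start) // step + 1 for start, stop, step in ranges)


total = exhaustive_backtests([(1, 100, 1)] * 3)   # three parameters, 1..100, step 1 assumed
cmae_backtests = 1000                             # upper bound quoted in the message
speedup = total / cmae_backtests
dual_core_speedup = speedup / 2                   # exhaustive search split across two cores

print(total, speedup, dual_core_speedup)
```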

CMA-ES delivers MORE in terms of speed with LESS development time.

Best regards,
Tomasz Janeczko
amibroker.com

