Hi Ton,

I haven't set the runs, so it runs only once. I cannot afford to run more because each run takes quite long. I optimize 10 variables, although I could fix some of them. An exhaustive optimizer should give the same results every time, since it scans all possibilities, but I haven't tried it.

Is this the normal behavior of the CMAE optimizer? How many times am I supposed to run it? Or does it mean my system is not robust?

Please comment on the following statement:

A system optimized with different profit/risk type fitness functions can be claimed robust if each result shows strength.

My feeling is that this is true.

Zozu
--- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@...> wrote:
>
> Hi Z^4,
>
> Did you try to optimize the same AFL with CMAE several times? So a standard optimize, not WF. Did you get different results for every optimization?
>
> Regards, Ton.
>
> ----- Original Message -----
> From: zozuzoza
> To: amibroker@xxxxxxxxxps.com
> Sent: Sunday, October 11, 2009 10:22 PM
> Subject: [amibroker] Re: Is the Walk forward study useful?
>
>
> I found an interesting behavior of WF testing in Amibroker. Using the same AFL code, same parameters, same environment, same fitness function, everything the same, the results are completely different when I run it a second time. I say completely: good WF results turned weak when I ran the WF test a second time. I did not expect identical results, given the nature of a non-exhaustive optimizer, but the results I got eliminated my faith in the usefulness of Amibroker WF. I used the CMAE optimizer.
>
> Running the WF test twice turned the average CAR of 18% into 4% on the second run. There were about 50 trades in the IS period.
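
The run-to-run variation described above is a general property of non-exhaustive, randomized optimizers. The following is only a minimal Python sketch of that effect: it uses a plain random search as a stand-in (it is NOT CMA-ES and not AmiBroker code), and `toy_fitness`, `random_search` and all numbers are hypothetical. It shows that different random seeds give different "best" parameters, that a fixed seed reproduces a run, and that the spread across several runs can be summarized.

```python
import random
from statistics import mean, stdev

def toy_fitness(x):
    """Hypothetical 1-D fitness surface with a single peak at x = 2."""
    return -(x - 2.0) ** 2

def random_search(seed, evaluations=20, low=-10.0, high=10.0):
    """Plain random search: a stand-in for any non-exhaustive optimizer (not CMA-ES)."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(evaluations)]
    return max(candidates, key=toy_fitness)

# Ten independent runs, each with its own seed.
best_per_run = [random_search(seed) for seed in range(10)]

print("best parameter per run:", [round(x, 2) for x in best_per_run])
print("mean across runs:", round(mean(best_per_run), 2))
print("std across runs :", round(stdev(best_per_run), 2))
```

With only a few evaluations per run the spread is wide; raising `evaluations` narrows it, which is the trade-off between run time and repeatability raised in the reply at the top of this thread.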
>
> Try it yourself!
>
> --- In amibroker@xxxxxxxxxps.com, "zozuzoza" <zozuka@> wrote:
> >
> > Aronson quote "Each strategy will have its own best values for IS/OOS periods" - and its own fitness function. For me, different systems give different results depending on the fitness function. I have developed 7 fitness functions and I test them on 4 systems in order to find the best fitness function, but it seems mission impossible. All my fitness functions are profit/risk type ones like UPI, CAR/MDD etc.
> >
> > I think you put too much weight on the IS/OOS time period. I think the fitness function and the parameter range are much more important.
> >
> > So far, I must agree with Tony, who does not believe in WF. I only use it for verification, just as another way to look at the results, nothing more, so far.
> >
> > --- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@> wrote:
> > >
> > > Thanks again Mike ... See also my previous answer. Just one more remark. Here you are suggesting to take 1 to 3 years for the OOS period. When using commodity time series, this is more or less what I am doing. Why? Because a lot of commodities coming from the agricultural sector have these typical yearly cycles. But when using time series based upon stocks (S&P500 etc.), I am using a 5 to 7 year OOS period, simply because of the economic cycle. I am telling you this because it shows how I am thinking. Just taking a period because somebody gave me a rule of thumb is rather tricky in my eyes. For me there must be a good explanation for the length of that period ...
> > >
> > > Regards, Ton.
> > >
> > >
> > > ----- Original Message -----
> > > From: Mike
> > > To: amibroker@xxxxxxxxxps.com
> > > Sent: Monday, October 05, 2009 11:32 AM
> > > Subject: [amibroker] Re: Is the Walk forward study useful?
> > >
> > >
> > > Ton,
> > >
> > > You said "If you can help me to get things done in an objective way then I will be delighted to know how you want to do that"
> > >
> > > What I was suggesting was:
> > >
> > > 1. Identify what measure you will use to judge the IS/OOS period sizes (i.e. in my case I used consistency of CAR).
> > >
> > > 2. Run walk forward with IS ranging from 1 year to 3 years and OOS ranging from 1/8 to 1/3 of the IS period.
> > >
> > > 3. Calculate summary statistics for each IS/OOS combination for the measure that you decided upon in step 1 (i.e. in my case I calculated the average CAR and the standard deviation of CAR from the OOS samples). It may help to plot a distribution to visualize the data.
> > >
> > > 4. Observe whether one IS/OOS combination stands out as having the most normally distributed values.
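
Step 3 above (average and standard deviation of the OOS CAR for each IS/OOS combination) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summarize_oos_car` and all CAR values are hypothetical, not results from this thread, and "most consistent" is reduced here to "smallest standard deviation", which does not test for normality.

```python
from statistics import mean, stdev

def summarize_oos_car(results):
    """Map each IS/OOS combination to (mean CAR, sample std of CAR)."""
    return {combo: (mean(cars), stdev(cars)) for combo, cars in results.items()}

# Hypothetical OOS CAR values in percent, one value per walk-forward step.
example = {
    "IS2yr+OOS6mth": [10.0, 12.0, 14.0],
    "IS1yr+OOS3mth": [30.0, -10.0, 16.0],
}

summary = summarize_oos_car(example)

# Crude consistency criterion (an assumption): the smallest standard deviation wins.
most_consistent = min(summary, key=lambda combo: summary[combo][1])

for combo, (avg, sd) in summary.items():
    print(f"{combo}: mean CAR = {avg:.2f}%, std = {sd:.2f}%")
print("most consistent:", most_consistent)
```

Both made-up combinations have the same average CAR, so the average alone cannot separate them; the standard deviation can.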
> > >
> > > Naturally, there is a limit to how many IS/OOS combinations we can try before we have curve fit our results. This is where I find Pardo's ratios to be helpful. By keeping within the suggested range, we are leaving untested many alternative combinations.
> > >
> > > Mike
> > >
> > > --- In amibroker@xxxxxxxxxps.com, "Mike" <sfclimbers@> wrote:
> > > >
> > > > Ton,
> > > >
> > > > 1. Pardo disagrees with Aronson (and Bandy). Pardo suggests that an OOS to IS ratio of 25% - 35% is best, but that a good rule of thumb for empirical testing is 1/8 to 1/3.
> > > >
> > > > 2. Yes, I suspect that each strategy will have its own best values for IS/OOS and that other values will appear as useless. It is up to us to try and find the best values.
> > > >
> > > > With respect to your comment: "I am getting results that show a random pattern", my question remains: What are you measuring? In other words, what values appear random - your fitness value? CAR? Something else?
> > > >
> > > > 3. I have done very much as you ask, except that I also varied my IS period. I mostly kept my ratios within Pardo's suggested 1/8 to 1/3, but went as low as 1/12 and as high as 1/2 just to be sure.
> > > >
> > > > For example IS=1 year, IS=2 years, IS=3 years giving
> > > >
> > > > IS1yr+OOS6mth, IS1yr+OOS3mth, IS1yr+OOS1mth
> > > > IS2yr+OOS12mth, IS2yr+OOS6mth, IS2yr+OOS3mth
> > > > IS3yr+OOS18mth, IS3yr+OOS12mth, IS3yr+OOS6mth
> > > >
> > > > IS2yr+OOS6mth produced the most consistent CAR, even though a weighted UPI was used as the fitness function for the actual walk forward.
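
The OOS/IS ratios of the nine combinations listed above can be checked mechanically. The following Python sketch computes each ratio as an exact fraction and flags whether it falls inside the 1/8 to 1/3 rule of thumb quoted in this message. The function names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
from fractions import Fraction

# Rule-of-thumb bounds quoted in the message (treated as inclusive here).
PARDO_LOW, PARDO_HIGH = Fraction(1, 8), Fraction(1, 3)

# (IS months, OOS months) pairs taken from the list above.
COMBOS = [(12, 6), (12, 3), (12, 1),
          (24, 12), (24, 6), (24, 3),
          (36, 18), (36, 12), (36, 6)]

def oos_to_is_ratio(is_months, oos_months):
    """OOS length divided by IS length, as an exact fraction."""
    return Fraction(oos_months, is_months)

def within_pardo_range(is_months, oos_months):
    """True when the OOS/IS ratio lies between 1/8 and 1/3 inclusive."""
    return PARDO_LOW <= oos_to_is_ratio(is_months, oos_months) <= PARDO_HIGH

for is_m, oos_m in COMBOS:
    ratio = oos_to_is_ratio(is_m, oos_m)
    flag = "inside" if within_pardo_range(is_m, oos_m) else "outside"
    print(f"IS {is_m:2d}m + OOS {oos_m:2d}m -> OOS/IS = {ratio} ({flag} 1/8..1/3)")
```

For IS2yr+OOS6mth the ratio is 1/4, i.e. the 25% referred to in the next paragraph of the message.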
> > > >
> > > > I do not have a strong opinion as to whether or not there really is a relationship between IS and OOS sizes. I found that Pardo's rule of thumb was as good a starting place as any. I was happy that my values (25%) coincided with what he advised. But, had my studies suggested a ratio outside of Pardo's range, I would have still gone with what my results suggested, despite Pardo's advice.
> > > >
> > > > Mike
> > > >
> > > > --- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@> wrote:
> > > > >
> > > > > Hi Mike,
> > > > >
> > > > > What I am saying is:
> > > > >
> > > > > 1. That according to David Aronson "There is no theory that suggests what fraction of the data should be assigned to training ( IS ) and testing ( OOS )." and that "Results can be very sensitive to these choices ... ". I assume that he knows what he is talking about ...
> > > > >
> > > > > 2. That when I am doing WalkForward tests following the advice of Howard Bandy, Robert Pardo AND Van Tharp, I am getting results that show a random pattern when changing the OOS and IS periods. So my conclusion is that WalkForward is a subjective test ...
> > > > >
> > > > > Therefore I have serious problems using WalkForward tests. If you can help me to get things done in an objective way then I will be delighted to know how you want to do that. But for sure Van Tharp did not help me ...
> > > > >
> > > > > Please do a simple WF test with OOS=1year and IS=1month...12months. So creating WF results for OOS1y+IS1m, OOS1y+IS2m etc. And see what you are getting. This is purely random. The result says nothing to me ...
> > > > >
> > > > > Regards, Ton.
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Mike
> > > > > To: amibroker@xxxxxxxxxps.com
> > > > > Sent: Monday, October 05, 2009 9:29 AM
> > > > > Subject: [amibroker] Re: Is the Walk forward study useful?
> > > > >
> > > > >
> > > > > Ton,
> > > > >
> > > > > Are you saying that you have not found an IS/OOS pair that works well? What measure are you using to judge "stability" of the walk forward process (i.e. what measure are you using to judge the process as random)?
> > > > >
> > > > > After testing with multiple IS periods, and with multiple OOS periods, I was able to identify "fixed" window lengths that proved more consistent than the others tested.
> > > > >
> > > > > I reached this conclusion by charting a distribution curve of CAR for the OOS results. My fitness function is currently based on UPI, and thus my walk forward is driven by that value. However, ultimately my interest is in how consistent CAR would be, which is why I used that for evaluating the goodness of fit for the IS/OOS period lengths.
> > > > >
> > > > > In my case, over a 13 year period, a 2 year IS and 6 month OOS (for a total of 26 OOS data points) produced the most normal looking distribution of CAR results (i.e. central peak, smallest standard deviation). Excluding the results from all of 1999 and the first half of 2000 (during which results were abnormally strong), the distribution curve looks even better.
> > > > >
> > > > > Also, have you tried working with different fitness functions? Perhaps your fitness function doesn't adequately identify the "signal" and thus misguides the walk forward, regardless of IS/OOS window lengths.
> > > > >
> > > > > I am in the process of running a new walk forward over the last 7.5 years using Van Tharp's System Quality Number (SQN) as my fitness function. I have kept the same 2 year IS/6 months OOS for a total of 15 OOS data points. My system strives to generate a minimum average of 2 trades per day, so each IS period generally has 1000 or more trades from which to calculate the fitness.
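
For readers unfamiliar with SQN, here is a minimal Python sketch of the formula as it is commonly stated: the square root of the number of trades times the mean trade result, divided by the standard deviation of the trade results. Whether the original uses the sample or population standard deviation, and whether results are expressed as R-multiples, are assumptions here; the function name `sqn` and the sample numbers are hypothetical, and this is not the fitness code used in the message above.

```python
from math import sqrt
from statistics import mean, stdev

def sqn(trade_results):
    """System Quality Number: sqrt(N) * mean / std of per-trade results.

    Assumes `trade_results` holds one number per closed trade (for example
    R-multiples) and uses the sample standard deviation.
    """
    n = len(trade_results)
    if n < 2:
        raise ValueError("need at least two trades")
    spread = stdev(trade_results)
    if spread == 0:
        raise ValueError("all trades are identical; SQN is undefined")
    return sqrt(n) * mean(trade_results) / spread

# Hypothetical per-trade results in units of initial risk (R).
print(round(sqn([1.5, -1.0, 0.5, 2.0, -1.0, 1.0]), 3))
```

Because of the sqrt(N) factor, the same average and spread score higher with more trades, which is one reason an IS period with 1000 or more trades gives the fitness function more to work with.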
> > > > >
> > > > > It has not run to completion yet. But, for the periods that have produced results, the results look promising (at least with respect to the SQN of the OOS relative to the SQN of the IS; I have not yet created the distribution of CAR for OOS).
> > > > >
> > > > > Assuming that the remainder of the results are equally strong, I will walk forward further back in history to get the full 26 data points to compare against the results produced using my UPI fitness. If the CAR distribution is more normal using SQN as fitness, then I will officially start using SQN for generating optimal values for my next live OOS.
> > > > >
> > > > > If you are willing to share, I would be curious to hear if SQN as a fitness function was able to produce a more stable walk forward for you, and what measure you are using to judge "stable".
> > > > >
> > > > > Mike
> > > > >
> > > > > --- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@> wrote:
> > > > > >
> > > > > > Hi Howard,
> > > > > >
> > > > > > I am still struggling with the following sentence from David Aronson: "The decision about how to apportion the data between the IS and OOS subsets is arbitrary. There is no theory that suggests what fraction of the data should be assigned to training ( IS ) and testing ( OOS ). Results can be very sensitive to these choices ... ". Because this is exactly what I am seeing. WalkForward results are more than sensitive to the IS/OOS relation and in many cases purely random. I am getting more and more the feeling that WalkForward is not the correct or better objective way to test trading systems. With all respect to Robert Pardo's ideas about this topic and what you are writing in QTS ...
> > > > > >
> > > > > > Regards, Ton.
> > > > > >
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > From: Howard B
> > > > > > To: amibroker@xxxxxxxxxps.com
> > > > > > Sent: Monday, October 05, 2009 12:48 AM
> > > > > > Subject: Re: [amibroker] Re: Is the Walk forward study useful?
> > > > > >
> > > > > > Greetings all --
> > > > > >
> > > > > > My point of view on the length of the in-sample and out-of-sample may be a little different.
> > > > > >
> > > > > > The logic of the code has been designed to recognize some pattern or characteristic of the data. The length of the in-sample period is however long it takes to keep the model (the logic) in synchronization with the data. There is no one answer to what that length is. When the pattern changes, the model fits it less well. When the pattern changes significantly, the model must be re-synchronized. The only person who can say whether the length is correct or should be longer or shorter is the person running the tests.
> > > > > >
> > > > > > The length of the out-of-sample period is however long the model and the data remain in sync. That must be some length of time beyond the in-sample period in order to make profitable trades. It could be a long time, in which case there is no need to modify the model at all during that period. There is no general relationship between the length of the in-sample period and the length of the out-of-sample period -- none. There is no general relationship between the performance in-sample and the performance out-of-sample. The greater the difference between the two, the better the system has been fit to the data over the in-sample period. But that does not necessarily mean that the out-of-sample results are less meaningful.
> > > > > >
> > > > > > You can perform some experiments to see what the best in-sample length is. And then to see what the typical out-of-sample length is. Knowing these two, set up a walk forward run using those lengths. After the run is over, ignore the in-sample results. They have no value in estimating the future performance of the system. It is the out-of-sample results that can give you some idea of how the system might act when traded with real money.
> > > > > >
> > > > > > It is nice to have a lot of closed trades in the out-of-sample period, but you can run statistics on as few as 5 or 6. Having fewer trades means that it will be more difficult to achieve statistical significance. The number 30 is not magic -- it is just conventional.
> > > > > >
> > > > > > I think it helps to distinguish between the in-sample and out-of-sample periods this way -- in-sample is seeing how well the model can be made to fit the older data, out-of-sample is seeing how well it might fit future data.
> > > > > >
> > > > > > Ignore the television ads where person after person exclaims "backtesting!" as though that is the key to system development. It is not. Backtesting by itself, without going on to walk forward testing, will give the trading system developer the impression that the system is good. In-sample results are always good. We do not stop fooling with the system until they are good. But in-sample results have no value in predicting future performance -- none.
> > > > > >
> > > > > > There are some general characteristics of trading systems that make them easier to validate. Those begin with having a positive expectancy -- no system can be profitable in the long term unless it has a positive expectancy. They go on to include: trade frequently, hold a short time, minimize losses. Of course, there have been profitable systems that trade infrequently, hold a long time, and suffer deep drawdowns. It is much harder to show that those were profitable because they were good rather than lucky.
> > > > > >
> > > > > > There is more information about in-sample, out-of-sample, walk forward testing, statistical validation, objective functions, and so forth in my book, "Quantitative Trading Systems."
> > > > > > http://www.quantitativetradingsystems.com/
> > > > > >
> > > > > > Thanks for listening,
> > > > > > Howard
> > > > > >
> > > > > > On Sun, Oct 4, 2009 at 10:56 AM, Bisto <bistoman73@> wrote:
> > > > > >
> > > > > > Yes, I believe that you should increase the IS period.
> > > > > >
> > > > > > As a general rule, "the shorter the better" (trying to catch every market change) is not true, because an IS period that is too short may produce too few trades for statistical robustness --> you will find parameters that are more likely to fail in OS.
> > > > > >
> > > > > > Try a longer IS period and let's see what happens.
> > > > > >
> > > > > > I read an interesting book on this issue: "The evaluation and optimization of trading strategies" by Pardo. Maybe he repeated the same concepts too many times; nevertheless I liked it.
> > > > > >
> > > > > > If anyone could suggest a better book about this issue, it would be much appreciated.
> > > > > >
> > > > > > Bisto
> > > > > >
> > > > > > --- In amibroker@xxxxxxxxxps.com, "Gonzaga" <gonzagags@> wrote:
> > > > > > >
> > > > > > > Oh, sorry, I am lost in translation ... ;-)
> > > > > > > Yes, I meant the trades of my IS period.
> > > > > > > I've got about 70 trades in my IS period, three months.
> > > > > > > BUT, I buy stocks in a multiposition way. This means that my whole capital is divided among several stocks purchased simultaneously.
> > > > > > > So, in my statistics, I usually average my trades. When I use maxopenpositions=7, I average my results every 7 trades.
> > > > > > > Considering that, my trades in three months are not 70, but fewer (not exactly 70/7, but fewer than 70).
> > > > > > >
> > > > > > > If I use maxopenpositions=1, that is, invest all my capital in every trade, in three months I would have about 29 trades.
> > > > > > > So I suppose I have to increase the IS period, don't I?
> > > > > > >
> > > > > > > --- In amibroker@xxxxxxxxxps.com, "Bisto" <bistoman73@> wrote:
> > > > > > > >
> > > > > > > > What do you mean by "I don't have many buyings and sellings"?
> > > > > > > >
> > > > > > > > If you have fewer than 30 trades in an IS period, IMHO, you are using too short a period for statistical robustness --> WFA is misleading, try a longer IS period
> > > > > > > >
> > > > > > > > Bisto
> > > > > > > >
> > > > > > > > --- In amibroker@xxxxxxxxxps.com, "Gonzaga" <gonzagags@> wrote:
> > > > > > > > >
> > > > > > > > > Thanks for the answers
> > > > > > > > > To Keith McCombs:
> > > > > > > > >
> > > > > > > > > I use a 3 month IS test and a 1 month step, that is, a 1 month OS test. My system is an end-of-day system, so I don't have many buyings and sellings..
> > > > > > > > > Perhaps I should make the IS period bigger?
> > > > > > > > >
> > > > > > > > > Anyway, my parameter behaves well in any period. Of course it is an optimized variable, but it doesn't fail in ten years, in none of those ten years, over 500 stocks.. a very long period..
> > > > > > > > > So, couldn't it be better, in the long run, than the parameters optimized with the WF study?
> > > > > > > > > (In fact, I am using it now, the optimized variable)
> > > > > > > > > That's my real question..
> > > > > > > > >
> > > > > > > > > To dloyer123:
> > > > > > > > > I haven't understood the meaning of the Walk Forward Efficiency, and it seems interesting.
> > > > > > > > > Can you explain it better, please..?
> > > > > > > > >
> > > > > > > > > --- In amibroker@xxxxxxxxxps.com, "dloyer123" <dloyer123@> wrote:
> > > > > > > > > >
> > > > > > > > > > I have had similar experiences. I like to use WFT to estimate what Pardo calls his "Walk Forward Efficiency", or the ratio of the out of sample WF profits to the profits from just optimizing over the entire time period.
> > > > > > > > > >
> > > > > > > > > > A good system should have as high a WFE as possible. Systems with a poor WFE tend to do poorly in live trading.
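
As a rough illustration of the ratio described in this message, the Python sketch below divides the summed out-of-sample walk-forward profit by the profit obtained from a single optimization over the whole period. It follows only the wording of the message; Pardo's own definition may differ in detail (for example by annualizing returns), and the function name and all numbers are hypothetical.

```python
def walk_forward_efficiency(oos_profits, full_period_profit):
    """Ratio of total OOS walk-forward profit to full-period optimized profit.

    Assumes both figures cover the same time span and use the same units.
    """
    if full_period_profit <= 0:
        raise ValueError("full-period optimized profit must be positive")
    return sum(oos_profits) / full_period_profit

# Hypothetical figures: three OOS segments versus one full-period optimization.
wfe = walk_forward_efficiency([5.0, -1.0, 4.0], 16.0)
print(f"WFE = {wfe:.0%}")
```

A value near 1 would mean the walk-forward segments captured most of what hindsight optimization found; a value near 0 or below suggests the optimized parameters did not carry over out of sample.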
> > > > > > > > > >
> > > > > > > > > > If you have a parm set that works well over a long period of live trading, then you are doing well!
> > > > > > > > > >