Hi Ton,
I agree that the rule of thumb is subjective. So far, I've been willing to
live with it.

It appears that you and I have different expectations of IS/OOS window sizes.
I treat the calculation of walk forward window sizes as a second pass
optimization, similar to a simple moving average (SMA) crossover system.

- There are two variables (e.g. IS length/OOS length vs. fast SMA/slow SMA)
- An optimal combination is desired
- We use a fitness function to measure optimal (e.g. OOS:IS ratio vs. CAR/MDD)

This is how I try to satisfy your Aronson quote "Each strategy will have its
own best values for IS/OOS periods".

Upon finding an optimal CAR/MDD using fast SMA/slow SMA, we should
theoretically be able to trade that same optimal combination of fast SMA/slow
SMA over different time periods and expect to get a somewhat stable CAR/MDD
(subject to changing market conditions).

I would not expect combinations of fast SMA/slow SMA to be stable relative to
each other. Looking at a 3-D graph for this crossover system will reveal peaks
and valleys. Taking a single slice of that graph (i.e. holding slow SMA
constant and varying only fast SMA) will reveal a rising and falling wave.

So, I would expect exactly the same in the IS/OOS experiment you describe. You
are simply taking a slice of the 2 variable optimization graph (holding IS
constant and varying OOS). I would expect a rising and falling wave
representing the peaks and valleys that would appear on the full 3-D graph.
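To make the slice idea concrete, here is a minimal sketch in Python. The surface is entirely made up (a sine times a cosine) and only stands in for whatever fitness a real two variable optimization would produce; `toy_fitness`, `slice_at` and `count_turning_points` are illustrative names, not part of AmiBroker or any library.

```python
import math


def toy_fitness(x, y):
    # Hypothetical bumpy surface with peaks and valleys.
    # This is NOT real backtest output, just a stand-in.
    return math.sin(x / 3.0) * math.cos(y / 5.0)


def slice_at(fixed_x, ys):
    # Hold the first variable constant and vary only the second.
    return [toy_fitness(fixed_x, y) for y in ys]


def count_turning_points(values):
    # Count the places where the curve switches between rising and falling.
    turns = 0
    for a, b, c in zip(values, values[1:], values[2:]):
        if (b - a) * (c - b) < 0:
            turns += 1
    return turns


wave = slice_at(fixed_x=4, ys=range(0, 61))
print(count_turning_points(wave))
```

Holding the first variable fixed and sweeping the second gives a curve that changes direction more than once, which is the rising and falling wave I mean.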
If I optimize the ratio of OOS:IS using IS length/OOS length, then I expect to
get a somewhat consistent OOS:IS ratio (subject to market changes) when using
that same optimal IS length/OOS length over different data ranges. I don't
expect to get a stable OOS:IS ratio using a fixed IS length and variable OOS
length.
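The second pass itself can be sketched roughly as below, again in Python and only as an illustration. `run_step` is a hypothetical callback that you would have to supply: it optimizes on the IS bars, applies the winning parameters to the OOS bars, and returns the two fitness values. Nothing here is AmiBroker's own walk forward code.

```python
from statistics import mean


def walk_forward_windows(n_bars, is_len, oos_len):
    # Yield (is_start, is_end, oos_end) bar indices. The window steps
    # forward by one OOS length each time (rolling walk forward).
    start = 0
    while start + is_len + oos_len <= n_bars:
        yield start, start + is_len, start + is_len + oos_len
        start += oos_len


def mean_oos_is_ratio(n_bars, is_len, oos_len, run_step):
    # run_step(is_start, is_end, oos_end) is a caller-supplied, hypothetical
    # function returning (is_fitness, oos_fitness) for one walk forward step.
    ratios = []
    for is_start, is_end, oos_end in walk_forward_windows(n_bars, is_len, oos_len):
        is_fit, oos_fit = run_step(is_start, is_end, oos_end)
        if is_fit != 0:  # skip steps where the ratio is undefined
            ratios.append(oos_fit / is_fit)
    return mean(ratios) if ratios else None


def second_pass(n_bars, is_lengths, oos_lengths, run_step):
    # Grid over both window lengths, the same way one would grid over
    # fast SMA x slow SMA. Returns {(is_len, oos_len): mean OOS:IS ratio}.
    results = {}
    for is_len in is_lengths:
        for oos_len in oos_lengths:
            ratio = mean_oos_is_ratio(n_bars, is_len, oos_len, run_step)
            if ratio is not None:
                results[(is_len, oos_len)] = ratio
    return results
```

Whichever pair you judge best from the returned dict is then re-run over a different data range to see whether the OOS:IS ratio holds up, just as you would re-test the chosen fast SMA/slow SMA pair.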
Mike
--- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@...> wrote:
>
> Thanks for your patience Mike -)
>
> 1. I know Pardo disagrees with Aronson. And yes, I am also using Pardo's rule of thumb. But a rule of thumb without a scientific explanation is still a rule of thumb and therefore subjective. The result of this is that when taking 1/8 instead of 1/3, I am getting completely different results. That's what Aronson tells me. So I do not understand why Pardo disagrees with Aronson ... Of course I should ask him. And I will ...
>
> 2. Here you are telling me what Aronson says: "Each strategy will have its own best values for IS/OOS periods". But trying to find the best values is empirical, and therefore, without a 'good theory' of why you are getting these values, highly subjective. Pardo is not giving me this good theory and Aronson tells me this good theory does not exist ...
>
> 3. With regard to our topic, it's not so important which objective function (OF) you are using for the WalkForward. In general I use CAR/MDD. But whichever OF you use, you get the same random WalkForward results. Of course, by definition you should use a return/risk related OF ...
>
> 4. The way I am analyzing the WalkForward result is simple. I calculate the differences between the IS and OOS results as percentages of the OOS result. Then I take the average and standard deviation of all these percentages. This gives me an idea of the average IS/OOS error as well as the spread around this average. For the same AFL using the same symbol, you should do the WalkForward in the way I mentioned in my previous email and calculate the above average/stdev relation. For a WalkForward result to be stable, i.e. independent of the IS/OOS ratio, the average/stdev relation should be more or less stable. It's not. It's highly dependent on the IS/OOS ratio you are using ...
>
> BTW ... To get things straight, I am not throwing WalkForward out of the window. I am just trying to believe in what I am using. And it's getting more and more difficult for me ...
>
> Regards, Ton.
>
> ----- Original Message -----
> From: Mike
> To: amibroker@xxxxxxxxxps.com
> Sent: Monday, October 05, 2009 11:09 AM
> Subject: [amibroker] Re: Is the Walk forward study useful?
>
> Ton,
>
> 1. Pardo disagrees with Aronson (and Bandy). Pardo suggests that an OOS to IS ratio of 25% - 35% is best, but that a good rule of thumb for empirical testing is 1/8 to 1/3.
>
> 2. Yes, I suspect that each strategy will have its own best values for IS/OOS and that other values will appear as useless. It is up to us to try and find the best values.
>
> With respect to your comment: "I am getting results that show a random pattern", my question remains: what are you measuring? In other words, what values appear random - your fitness value? CAR? Something else?
>
> 3. I have done very much as you ask, except that I also varied my IS period. I mostly kept my ratios within Pardo's suggested 1/8 to 1/3, but went as low as 1/12 and as high as 1/2 just to be sure.
>
> For example IS=1 year, IS=2 years, IS=3 years giving
>
> IS1yr+OOS6mth, IS1yr+OOS3mth, IS1yr+OOS1mth
> IS2yr+OOS12mth, IS2yr+OOS6mth, IS2yr+OOS3mth
> IS3yr+OOS18mth, IS3yr+OOS12mth, IS3yr+OOS6mth
>
> IS2yr+OOS6mth produced the most consistent CAR, even though a weighted UPI was used as the fitness function for the actual walk forward.
>
> I do not have a strong opinion as to whether or not there really is a relationship between IS and OOS sizes. I found that Pardo's rule of thumb was as good a starting place as any. I was happy that my values (25%) coincided with what he advised. But, had my studies suggested a ratio outside of Pardo's range, I would have still gone with what my results suggested, despite Pardo's advice.
>
> Mike
>
> --- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@> wrote:
> >
> > Hi Mike,
> >
> > What I am saying is :
> >
> > 1. That according to David Aronson "There is no theory that suggests what fraction of the data should be assigned to training ( IS ) and testing ( OOS )." and that "Results can be very sensitive to these choices ... ". I assume that he knows what he is talking about ...
> >
> > 2. That when I am doing WalkForward tests following the advice of Howard Bandy, Robert Pardo AND Van Tharp, I am getting results that show a random pattern when changing the OOS and IS periods. So my conclusion is that WalkForward is a subjective test ...
> >
> > Therefore I have serious problems using WalkForward tests. If you can help me to get things done in an objective way then I will be delighted to know how you want to do that. But for sure Van Tharp did not help me ...
> >
> > Please do a simple WF test with OOS=1year and IS=1month...12months. So creating WF results for OOS1y+IS1m, OOS1y+IS2m etc. And see what you are getting. This is purely random. The result says nothing to me ...
> >
> > Regards, Ton.
> >
> >
> > ----- Original Message -----
> > From: Mike
> > To: amibroker@xxxxxxxxxps.com
> > Sent: Monday, October 05, 2009 9:29 AM
> > Subject: [amibroker] Re: Is the Walk forward study useful?
> >
> > Ton,
> >
> > Are you saying that you have not found an IS/OOS pair that works well? What measure are you using to judge "stability" of the walk forward process (i.e. what measure are you using to judge the process as random)?
> >
> > After testing with multiple IS periods, and with multiple OOS periods, I was able to identify "fixed" window lengths that proved more consistent than the others tested.
> >
> > I reached this conclusion by charting a distribution curve of CAR for the OOS results. My fitness function is currently based on UPI, and thus my walk forward is driven by that value. However, ultimately my interest is in how consistent CAR would be, which is why I used that for evaluating the goodness of fit for the IS/OOS period lengths.
> >
> > In my case, over a 13 year period, a 2 year IS and 6 month OOS (for a total of 26 OOS data points) produced the most normal looking distribution of CAR results (i.e. central peak, smallest standard deviation). Excluding the results from all of 1999 and the first half of 2000 (during which results were abnormally strong), the distribution curve looks even better.
> >
> > Also, have you tried working with different fitness functions? Perhaps your fitness function doesn't adequately identify the "signal" and thus misguides the walk forward, regardless of IS/OOS window lengths.
> >
> > I am in the process of running a new walk forward over the last 7.5 years using Van Tharp's System Quality Number (SQN) as my fitness function. I have kept the same 2 year IS/6 months OOS for a total of 15 OOS data points. My system strives to generate a minimum average of 2 trades per day, so each IS period generally has 1000 or more trades from which to calculate the fitness.
> >
> > It has not run to completion yet. But, for the periods that have produced results, the results look promising (at least with respect to the SQN of the OOS relative to the SQN of the IS; I have not yet created the distribution of CAR for OOS).
> >
> > Assuming that the remainder of the results are equally strong, I will walk forward further back in history to get the full 26 data points to compare against the results produced using my UPI fitness. If the CAR distribution is more normal using SQN as fitness, then I will officially start using SQN for generating optimal values for my next live OOS.
> >
> > If you are willing to share, I would be curious to hear if SQN as a fitness function was able to produce a more stable walk forward for you, and what measure you are using to judge "stable".
> >
> > Mike
> >
> > --- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@> wrote:
> > >
> > > Hi Howard,
> > >
> > > I am still struggling with the following sentence from David Aronson: "The decision about how to apportion the data between the IS and OOS subsets is arbitrary. There is no theory that suggests what fraction of the data should be assigned to training ( IS ) and testing ( OOS ). Results can be very sensitive to these choices ... ". Because this is exactly what I am seeing. WalkForward results are more than sensitive to the IS/OOS relation and in many cases a pure random story. I am getting more and more the feeling that WalkForward is not the correct or better objective way to test trading systems. With all respect to Robert Pardo's ideas about this topic and what you are writing in QTS ...
> > >
> > > Regards, Ton.
> > >
> > >
> > > ----- Original Message -----
> > > From: Howard B
> > > To: amibroker@xxxxxxxxxps.com
> > > Sent: Monday, October 05, 2009 12:48 AM
> > > Subject: Re: [amibroker] Re: Is the Walk forward study useful?
> > >
> > > Greetings all --
> > >
> > > My point of view on the length of the in-sample and out-of-sample may be a little different.
> > >
> > > The logic of the code has been designed to recognize some pattern or characteristic of the data. The length of the in-sample period is however long it takes to keep the model (the logic) in synchronization with the data. There is no one answer to what that length is. When the pattern changes, the model fits it less well. When the pattern changes significantly, the model must be re-synchronized. The only person who can say whether the length is correct or should be longer or shorter is the person running the tests.
> > >
> > > The length of the out-of-sample period is however long the model and the data remain in sync. That must be some length of time beyond the in-sample period in order to make profitable trades. It could be a long time, in which case there is no need to modify the model at all during that period. There is no general relationship between the length of the in-sample period and the length of the out-of-sample period -- none. There is no general relationship between the performance in-sample and the performance out-of-sample. The greater the difference between the two, the better the system has been fit to the data over the in-sample period. But that does not necessarily mean that the out-of-sample results are less meaningful.
> > >
> > > You can perform some experiments to see what the best in-sample length is. And then to see what the typical out-of-sample length is. Knowing these two, set up a walk forward run using those lengths. After the run is over, ignore the in-sample results. They have no value in estimating the future performance of the system. It is the out-of-sample results that can give you some idea of how the system might act when traded with real money.
> > >
> > > It is nice to have a lot of closed trades in the out-of-sample period, but you can run statistics on as few as 5 or 6. Having fewer trades means that it will be more difficult to achieve statistical significance. The number 30 is not magic -- it is just conventional.
> > >
> > > I think it helps to distinguish between the in-sample and out-of-sample periods this way -- in-sample is seeing how well the model can be made to fit the older data, out-of-sample is seeing how well it might fit future data.
> > >
> > > Ignore the television ads where person after person exclaims "backtesting!" as though that is the key to system development. It is not. Backtesting by itself, without going on to walk forward testing, will give the trading system developer the impression that the system is good. In-sample results are always good. We do not stop fooling with the system until they are good. But in-sample results have no value in predicting future performance -- none.
> > >
> > > There are some general characteristics of trading systems that make them easier to validate. Those begin with having a positive expectancy -- no system can be profitable in the long term unless it has a positive expectancy. Then going on to include: trade frequently, hold a short time, minimize losses. Of course, there have been profitable systems that trade infrequently, hold a long time, and suffer deep drawdowns. It is much harder to show that those were profitable because they were good rather than lucky.
> > >
> > > There is more information about in-sample, out-of-sample, walk forward testing, statistical validation, objective functions, and so forth in my book, "Quantitative Trading Systems."
> > > http://www.quantitativetradingsystems.com/
> > >
> > > Thanks for listening,
> > > Howard
> > >
> > >
> > >
> > > On Sun, Oct 4, 2009 at 10:56 AM, Bisto <bistoman73@> wrote:
> > >
> > >
> > > Yes, I believe that you should increase the IS period.
> > >
> > > As a general rule, "the shorter the better" (trying to catch every market change) is not true, because a too short IS period can produce too few trades for statistical robustness --> you will find parameters that are more likely to fail in OS.
> > >
> > > Try a longer IS period and let's see what will happen.
> > >
> > > I read an interesting book on this issue: "The Evaluation and Optimization of Trading Strategies" by Pardo. Maybe he repeated the same concepts too many times; nevertheless I liked it.
> > >
> > > If anyone could suggest a better book about this issue it would be very appreciated.
> > >
> > >
> > >
> > > Bisto
> > >
> > > --- In amibroker@xxxxxxxxxps.com, "Gonzaga" <gonzagags@> wrote:
> > > >
> > > > Oh, sorry, I am lost in translation ... ;-)
> > > > Yes, I meant the trades of my IS period.
> > > > I've got about 70 trades in my IS period of three months.
> > > > BUT, I buy stocks in a multiposition way. This means that my whole capital is divided among several stocks purchased simultaneously.
> > > > So, in my statistics, I usually average my trades. When I use maxopenpositions=7, I average my results every 7 trades.
> > > > Considering that, my trades in three months are not 70, but fewer (not exactly 70/7, but fewer than 70).
> > > >
> > > > If I use maxopenpositions=1, that is, invest all my capital in every trade, in three months I would have about 29 trades.
> > > > So I suppose I have to increase the IS period, don't I?
> > > >
> > > > --- In amibroker@xxxxxxxxxps.com, "Bisto" <bistoman73@> wrote:
> > > > >
> > > > > What do you mean by "I don't have many buyings and sellings"?
> > > > >
> > > > > If you have fewer than 30 trades in an IS period, IMHO, you are using too short a period for statistical robustness --> WFA is misleading, try a longer IS period
> > > > >
> > > > > Bisto
> > > > >
> > > > > --- In amibroker@xxxxxxxxxps.com, "Gonzaga" <gonzagags@> wrote:
> > > > > >
> > > > > > Thanks for the answers
> > > > > > To Keith McCombs:
> > > > > >
> > > > > > I use a 3 month IS test and a 1 month step, that is, a 1 month OS test. My system is an end-of-day system, so I don't have many buyings and sellings..
> > > > > > Perhaps I should make the IS period bigger?
> > > > > >
> > > > > > Anyway, my parameter behaves well in any period. Of course it is an optimized variable, but it hasn't failed in ten years, not in any one of those ten years, over 500 stocks.. a very long period..
> > > > > > So, couldn't it be better, in the long run, than the parameters optimized with the WF study?
> > > > > > (In fact, I am using it now, the optimized variable)
> > > > > > That's my real question..
> > > > > >
> > > > > > To dloyer123:
> > > > > > I haven't understood the meaning of the Walk Forward Efficiency, and it seems interesting.
> > > > > > Can you explain it better, please..?
> > > > > >
> > > > > > --- In amibroker@xxxxxxxxxps.com, "dloyer123" <dloyer123@> wrote:
> > > > > > >
> > > > > > > I have had similar experiences. I like to use WFT to estimate what Pardo calls his "Walk Forward Efficiency" (WFE), or the ratio of the out of sample WF profits to the profits from just optimizing over the entire time period.
> > > > > > >
> > > > > > > A good system should have as high a WFE as possible. Systems with a poor WFE tend to do poorly in live trading.
> > > > > > >
> > > > > > > If you have a parm set that works well over a long period of live trading, then you are doing well!
> > > > > > >
> > > > >
>
> > > > >
> > > >
> >
>
> >
>