Ton,
Are you saying that you have not found an IS/OOS pair that
works well? What measure are you using to judge "stability" of the walk
forward process (i.e. what measure are you using to judge the process as
random)?
After testing with multiple IS periods, and with multiple OOS
periods, I was able to identify "fixed" window lengths that proved more
consistent than the others tested.
I reached this conclusion by
charting a distribution curve of CAR for the OOS results. My fitness function
is currently based on UPI, and thus my walk forward is driven by that value.
However, ultimately my interest is in how consistent CAR would be, which is why
I used that for evaluating the goodness of fit for the IS/OOS period
lengths.
In my case, over a 13 year period, a 2 year IS and 6 month OOS
(for a total of 26 OOS data points) produced the most normal looking
distribution of CAR results (i.e. central peak, smallest standard deviation).
Excluding the results from all of 1999 and the first half of 2000 (during
which results were abnormally strong), the distribution curve looks even
better.
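A minimal Python sketch of that comparison, for anyone who wants to repeat it outside AmiBroker. The function names are hypothetical, and it assumes the 13 years counts only the OOS span (so 15 years of data once the first 2 year IS is included) and that the OOS CAR values have already been exported per segment. It ranks configurations on standard deviation only; the "central peak" part I judged by eye from the chart.

```python
from statistics import mean, stdev


def walk_forward_windows(total_months, is_months, oos_months):
    """Return (is_start, oos_start, oos_end) month offsets for a rolling walk forward.

    Each step slides forward by the OOS length, so OOS segments do not overlap.
    """
    windows = []
    start = 0
    while start + is_months + oos_months <= total_months:
        oos_start = start + is_months
        windows.append((start, oos_start, oos_start + oos_months))
        start += oos_months
    return windows


def summarize_oos_car(car_values):
    """Mean and sample standard deviation of CAR across OOS segments."""
    return {"n": len(car_values), "mean": mean(car_values), "stdev": stdev(car_values)}


def most_consistent(car_by_config):
    """Return the IS/OOS configuration whose OOS CAR values vary the least."""
    return min(car_by_config, key=lambda cfg: stdev(car_by_config[cfg]))
```

With 180 months of data, a 24 month IS and a 6 month OOS, `walk_forward_windows(180, 24, 6)` yields the 26 OOS segments mentioned above.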
Also, have you tried working with different fitness functions?
Perhaps your fitness function doesn't adequately identify the "signal" and
thus misguides the walk forward, regardless of IS/OOS window lengths.
I am in the process of running a new walk forward over the last 7.5 years using
Van Tharp's System Quality Number (SQN) as my fitness function. I have kept
the same 2 year IS/6 months OOS for a total of 15 OOS data points. My system
strives to generate a minimum average of 2 trades per day, so each IS period
generally has 1000 or more trades from which to calculate the
fitness.
It has not run to completion yet. But, for the periods that
have produced results, the results look promising (at least with respect to
the SQN of the OOS relative to the SQN of the IS; I have not yet created the
distribution of CAR for OOS).
Assuming that the remainder of the
results are equally strong, I will walk forward further back in history to get
the full 26 data points to compare against the results produced using my UPI
fitness. If the CAR distribution is more normal using SQN as fitness, then I
will officially start using SQN for generating optimal values for my next live
OOS.
If you are willing to share, I would be curious to hear if SQN as
a fitness function was able to produce a more stable walk forward for you, and
what measure you are using to judge "stable".
Mike
--- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@...> wrote:
>
> Hi Howard,
>
> I am still struggling with the following sentence from David Aronson:
> "The decision about how to apportion the data between the IS and OOS
> subsets is arbitrary. There is no theory that suggests what fraction of
> the data should be assigned to training (IS) and testing (OOS). Results
> can be very sensitive to these choices ...". This is exactly what I am
> seeing. Walk forward results are more than sensitive to the IS/OOS
> relation, and in many cases they look purely random. I increasingly feel
> that walk forward is not the correct, or a more objective, way to test
> trading systems. With all respect to Robert Pardo's ideas on this topic
> and to what you are writing in QTS ...
>
> Regards,
> Ton.
>
>
> ----- Original Message -----
> From: Howard B
> To: amibroker@xxxxxxxxxps.com
> Sent: Monday, October 05, 2009 12:48 AM
> Subject: Re: [amibroker] Re: Is the Walk forward study useful?
>
> Greetings all --
>
> My point of view on the length of the in-sample and out-of-sample periods
> may be a little different.
>
> The logic of the code has been designed to recognize some pattern or
> characteristic of the data. The length of the in-sample period is however
> long it takes to keep the model (the logic) in synchronization with the
> data. There is no one answer to what that length is. When the pattern
> changes, the model fits it less well. When the pattern changes
> significantly, the model must be re-synchronized. The only person who can
> say whether the length is correct or should be longer or shorter is the
> person running the tests.
>
> The length of the out-of-sample period is however long the model and the
> data remain in sync. That must be some length of time beyond the in-sample
> period in order to make profitable trades. It could be a long time, in
> which case there is no need to modify the model at all during that period.
> There is no general relationship between the length of the in-sample
> period and the length of the out-of-sample period -- none. There is no
> general relationship between the performance in-sample and the performance
> out-of-sample. The greater the difference between the two, the better the
> system has been fit to the data over the in-sample period. But that does
> not necessarily mean that the out-of-sample results are less meaningful.
>
> You can perform some experiments to see what the best in-sample length is.
> And then to see what the typical out-of-sample length is. Knowing these
> two, set up a walk forward run using those lengths. After the run is over,
> ignore the in-sample results. They have no value in estimating the future
> performance of the system. It is the out-of-sample results that can give
> you some idea of how the system might act when traded with real money.
>
> It is nice to have a lot of closed trades in the out-of-sample period, but
> you can run statistics on as few as 5 or 6. Having fewer trades means that
> it will be more difficult to achieve statistical significance. The number
> 30 is not magic -- it is just conventional.
>
> I think it helps to distinguish between the in-sample and out-of-sample
> periods this way -- in-sample is seeing how well the model can be made to
> fit the older data, out-of-sample is seeing how well it might fit future
> data.
>
> Ignore the television ads where person after person exclaims
> "backtesting!" as though that is the key to system development. It is not.
> Backtesting by itself, without going on to walk forward testing, will give
> the trading system developer the impression that the system is good.
> In-sample results are always good. We do not stop fooling with the system
> until they are good. But in-sample results have no value in predicting
> future performance -- none.
>
> There are some general characteristics of trading systems that make them
> easier to validate. Those begin with having a positive expectancy -- no
> system can be profitable in the long term unless it has a positive
> expectancy. They go on to include trading frequently, holding for a short
> time, and minimizing losses. Of course, there have been profitable systems
> that trade infrequently, hold a long time, and suffer deep drawdowns. It is
> much harder to show that those were profitable because they were good
> rather than lucky.
>
> There is more information about in-sample, out-of-sample, walk forward
> testing, statistical validation, objective functions, and so forth in my
> book, "Quantitative Trading Systems."
> http://www.quantitativetradingsystems.com/
>
> Thanks for listening,
> Howard
>
>
>
> On Sun, Oct 4, 2009 at 10:56 AM, Bisto <bistoman73@...> wrote:
>
>
> Yes, I believe that you should increase the IS period.
>
> As a general rule, "the shorter the better" is not true when trying to
> catch every market change, because a too-short IS period may produce too
> few trades for statistical robustness --> you will find parameters that
> are more likely to fail in OS.
>
> Try a longer IS period and let's see what happens.
>
> I read an interesting book on this issue: "The Evaluation and Optimization
> of Trading Strategies" by Pardo. He perhaps repeats the same concepts too
> many times; nevertheless, I liked it.
>
> If anyone could suggest a better book on this issue, it would be much
> appreciated.
>
>
>
> Bisto
>
> --- In amibroker@xxxxxxxxxps.com, "Gonzaga" <gonzagags@> wrote:
> >
> > Oh, sorry, I am lost in translation ... ;-)
> > Yes, I meant the trades in my IS period.
> > I've got about 70 trades in my IS period of three months.
> > BUT I buy stocks in a multi-position way. This means that my whole
> > capital is divided among several stocks purchased simultaneously.
> > So, in my statistics, I average my trades. When I use
> > maxopenpositions=7, I average my results every 7 trades.
> > Considering that, my trades in three months are not 70, but fewer (not
> > exactly 70/7, but fewer than 70).
> >
> > If I use maxopenposition=1, that is, invest all my capital in every
> > trade, I would have about 29 trades in three months.
> > So I suppose I have to increase the IS period, don't I?
> >
> >
> > --- In amibroker@xxxxxxxxxps.com, "Bisto" <bistoman73@> wrote:
> > >
> > > What do you mean by "I don't have many buyings and sellings"?
> > >
> > > If you have fewer than 30 trades in an IS period, IMHO you are using
> > > too short a period for statistical robustness --> WFA is misleading;
> > > try a longer IS period.
> > >
> > > Bisto
> > >
> > > --- In amibroker@xxxxxxxxxps.com, "Gonzaga" <gonzagags@> wrote:
> > > >
> > > > Thanks for the answers.
> > > >
> > > > To Keith McCombs:
> > > >
> > > > I use a 3-month IS test and a 1-month step, that is, a 1-month OS
> > > > test. My system is an end-of-day system, so I don't have many
> > > > buyings and sellings.
> > > > Perhaps I should make the IS period bigger?
> > > >
> > > > Anyway, my parameter behaves well in any period. Of course it is an
> > > > optimized variable, but it has not failed in any of ten years, over
> > > > 500 stocks ... a very long period.
> > > > So, couldn't it be better, in the long run, than the parameters
> > > > optimized with the WF study?
> > > > (In fact, I am using it now, the optimized variable.)
> > > > That's my real question.
> > > >
> > > > To dloyer123:
> > > > I haven't understood the meaning of the Walk Forward Efficiency, and
> > > > it seems interesting.
> > > > Can you explain it better, please?
> > > >
> > > > --- In amibroker@xxxxxxxxxps.com, "dloyer123" <dloyer123@> wrote:
> > > > >
> > > > > > I have had similar experiences. I like to use WFT to estimate
> > > > > > what Pardo calls his "Walk Forward Efficiency", or the ratio of
> > > > > > the out-of-sample WF profits to the profits from just optimizing
> > > > > > over the entire time period.
> > > > > >
> > > > > > A good system should have as high a WFE as possible. Systems
> > > > > > with a poor WFE tend to do poorly in live trading.
> > > > > >
> > > > > > If you have a parameter set that works well over a long period
> > > > > > of live trading, then you are doing well!
> > > > >
> > > >
> >
>
> >
>
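The Walk Forward Efficiency ratio that dloyer123 describes in the quoted thread can be sketched in a few lines of Python. This follows the wording in the thread (summed OOS walk forward profit divided by the profit from a single optimization over the whole period); other write-ups annualize the figures and compare against in-sample profit instead, so treat the function name and the exact definition as assumptions.

```python
def walk_forward_efficiency(oos_profits, full_period_profit):
    """Ratio of total OOS walk forward profit to full-period optimized profit.

    oos_profits: profit of each OOS segment, in the same units as
    full_period_profit (both hypothetical inputs exported from the backtester).
    """
    if full_period_profit <= 0:
        raise ValueError("full-period optimized profit must be positive")
    return sum(oos_profits) / full_period_profit
```

A value near 1 would mean the walk forward captured most of what hindsight optimization shows; a value near 0 or negative would mean the OOS segments gave most of it back.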