Gary-
What software do you use to do your walk forward (out of sample) testing?
TradeStation doesn't, of course, natively handle this type of testing.
Are you testing a larger portfolio, or a small number of markets?
-Mike
-----Original Message-----
From: Gary Fritz [mailto:fritz@xxxxxxxx]
Sent: Saturday, May 05, 2001 3:15 PM
To: omega-list@xxxxxxxxxx
Subject: Re: Walk BACKWARD TESTING
> Aside from the curve-fitting aspect of walk forward testing,
> and the seductive confidence it provides, I'd like to raise
> still another objection: it is conceptually flawed.
I don't think it has to be. I have a very strong system that does
very well, and is still doing passably well with parameters I tuned
on it 2.5 years ago. But it does benefit from an occasional retune.
I've got an adaptive version that basically adjusts itself to market
conditions, but I find I get slightly better results with the non-
adaptive version and a periodic tuneup -- maybe every 6 months or so.
Walk-forward testing would let me test how a periodic retune like
that would actually have behaved if I had been doing it two years
ago, and would assure me that my periodic-retune process is actually
doing what I want.
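The periodic-retune idea above can be sketched in a few lines of
Python. This is only an illustration, not anyone's actual tools: the
helpers `tune` (finds parameters on a window of bars) and
`run_system` (returns per-bar P&L for given parameters) are
hypothetical stand-ins for whatever optimizer and system you use.

```python
def walk_forward(bars, tune, run_system, train_len, retune_every):
    """Simulate a periodic retune: every `retune_every` bars,
    re-optimize on the trailing `train_len` bars, then trade the
    *next* block out-of-sample.  The concatenated OOS P&L shows how
    the retune schedule would actually have behaved."""
    oos_pnl = []
    start = train_len
    while start < len(bars):
        # Fit on past data only -- nothing after `start` is seen.
        params = tune(bars[start - train_len:start])
        # Trade the next, unseen block with those parameters.
        block = bars[start:start + retune_every]
        oos_pnl.extend(run_system(block, params))
        start += retune_every
    return oos_pnl
```

The point is that every traded bar is out-of-sample with respect to
the parameters used on it, which is exactly what a real periodic
retune produces.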
Walk-forward testing is no substitute for a good system. I.e. you
can't take a worthless system, tune it every week, and expect it to
become a good system. I imagine some systems fall "out of tune"
sooner than others, and it's possible that retuning a system like
that every week could result in a decent system. But you're a lot
better off if you work on developing a system on sound principles
rather than trying to make an unsound system hold together for a few
days until the next tuning.
> Aside from data-fitting, another cause of system degradation
> in real-time, as well as discretionary traders finding the
> markets more difficult, is due to the fact that markets
> evolve. Today's markets are not your father's markets.
> Thus, what I often do in my system development, is to
> create systems that work on the most RECENT 1 - 2 years.
I also prefer this. Although frankly, the markets of today are
nothing like the markets of 2 years ago. Compare last April's
volatility spike with the market of only 6 months earlier.
> If I'm satisfied with the system behavior, as well as the various
> stats, then I can test it on prior years' data. If it even
> breaks even over the prior years, I may trade it.
Here you lost me. Why would you consider a backward test to be a
more valid measure of the system's performance than a forward test?
My assumption is that markets change, and that there may be a
unidirectional change in overall behavior -- increasing volatility,
for example. Thus a system tuned in one market might behave well in
a less-volatile market but get creamed in a more-volatile one, so I
prefer to see how the system behaves when its out-of-sample period
is a later, more "evolved" market.
What I generally do is tune for a long enough period to get 100-150
trades, ending the test at an early enough point that I still have
time *after* the test period for an out-of-sample test that will
generate at least 50-75 and preferably >100 trades. That gives me an
idea how the system handles a "newer" market in its OOS test. Then,
assuming it does well in that test, I re-tune the system on ALL my
data (or possibly just more recent data, enough for >100 trades) for
real-time trading.
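The split described above is driven by trade counts rather than by
calendar dates. A minimal sketch, assuming a hypothetical helper
`count_trades` that reports how many trades the system would take
over a slice of bars:

```python
def split_for_oos(bars, count_trades, min_oos=50):
    """Walk the split point back from the end of the data until the
    held-out tail would generate at least `min_oos` trades.  The head
    is the tuning period; the tail is the out-of-sample test on the
    "newer" market.  (After a passing OOS run, you would re-tune on
    the full history for real-time trading.)"""
    split = len(bars)
    while split > 0 and count_trades(bars[split:]) < min_oos:
        split -= 1
    return bars[:split], bars[split:]
```

Sizing the windows by trade count rather than by time keeps the
statistics comparable across fast and slow systems.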
Last year was a real challenge. The volatility spike, the turn in
market direction, etc., made March through about August fairly
unusual. I want to make sure any system can *handle* that market
appropriately, but I don't want to include it in my testing period if
I can avoid it. Many borderline systems looked *great* in the first
3-6 months of 2000 but fell apart in 1999 or after 9/2000. If
you tune on that unusual period, you might get great results with a
system that would fail in a more "normal" market. I want a system
that behaves well in "normal" times AND does well when the market
goes berserk.
> Contrast this to walk forward, where you test on old data,
> then "blindly" test on the most recent periods. Yes, some
> concepts may work equally over extended periods, but if the
> goal is at all to be more in sync with current conditions,
> you will probably not achieve that, as all your development
> is on prior, out of sync, periods.
That's why I re-tune for more recent data if the system passes its
OOS test.
> Thus, I'm proposing that walk BACKWARD testing is far more
> sound conceptually, and is likely to lead to better results
> than the walk forward methodology.
If you re-tune on recent data before trading, would you still feel
that walk-backward is better? I feel walk-backward is conceptually
unsound because you're testing something -- trading on older data --
that you're not going to do in actual use.
Gary