
Re: Walk BACKWARD TESTING



> Aside from the curve-fitting aspect of walk forward testing,
> and the seductive confidence it provides, I'd like to raise
> still another objection:  it is conceptually flawed.

I don't think it has to be.  I have a very strong system that does 
very well, and is still doing passably well with parameters I tuned 
on it 2.5 years ago.  But it does benefit from an occasional retune.  
I've got an adaptive version that basically adjusts itself to market 
conditions, but I find I get slightly better results with the non-
adaptive version and a periodic tuneup -- maybe every 6 months or so.

Walk-forward testing would let me see how a periodic retune like 
that would actually have behaved if I had been doing it two years 
ago, and would assure me that my periodic-retune process is actually 
doing what I want.
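
In rough Python, that check looks something like the sketch below -- 
tune() and run_backtest() are hypothetical stand-ins for whatever 
optimizer and backtester you actually use, and the bar counts are 
only illustrative:

RETUNE_EVERY = 126  # roughly 6 months of daily bars -- illustrative only

def walk_forward(bars, first_oos_bar, tune, run_backtest):
    """Re-tune on everything before each window, then trade that window
    out-of-sample, the same way I'd have done it in real time."""
    results = []
    start = first_oos_bar
    while start < len(bars):
        end = min(start + RETUNE_EVERY, len(bars))
        params = tune(bars[:start])                   # fit on the past only
        results.append(run_backtest(bars[start:end], params))
        start = end                                   # step forward ~6 months
    return results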

Walk-forward testing is no substitute for a good system.  I.e., you 
can't take a worthless system, tune it every week, and expect it to 
become a good system.  I imagine some systems fall "out of tune" 
sooner than others, and it's possible that retuning such a system 
every week could result in a decent system.  But you're a lot better 
off if you work on developing a system on sound principles rather 
than trying to make an unsound system hold together for a few days 
until the next tuning.

> Aside from data-fitting, another cause of system degradation
> in real-time, as well as of discretionary traders finding the
> markets more difficult, is the fact that markets evolve.
> Today's markets are not your father's markets.  Thus, what I
> often do in my system development is to create systems that
> work on the most RECENT 1-2 years.

I also prefer this, although frankly the markets of today are 
nothing like the markets of 2 years ago.  Compare last April's 
volatility spike with the market of only 6 months earlier.

> If I'm satisfied with the system behavior, as well as the various
> stats, then I can test it on prior years' data.  If it even
> breaks even over the prior years, I may trade it.

Here you lost me.  Why would you consider a backward test to be a 
more valid measure of the system's performance than a forward test?

My assumption is that markets change, and that there may be a 
unidirectional change in overall behavior -- increasing volatility, 
for example.  Thus a system tuned in one market might behave well in 
a less-volatile market but get creamed in a more-volatile one, so I 
prefer to see how the system behaves when its out-of-sample period is 
a later, more "evolved" market.

What I generally do is tune over a long enough period to get 100-150 
trades, ending the test at an early enough point that I still have 
data *after* the test period for an out-of-sample test that will 
generate at least 50-75 and preferably >100 trades.  That gives me an 
idea of how the system handles a "newer" market in its OOS test.  
Then, assuming it does well in that test, I re-tune the system on ALL 
my data (or possibly just more recent data, enough for >100 trades) 
for real-time trading.
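
In rough pseudo-Python, that procedure is something like the sketch 
below -- count_trades(), tune(), run_backtest(), and accept() are 
hypothetical placeholders for your own trade counter, optimizer, 
backtester, and pass/fail criteria:

def split_by_trade_count(bars, count_trades, min_is=100, min_oos=50):
    """Latest split point that still leaves at least min_is in-sample
    trades before it and min_oos out-of-sample trades after it."""
    for split in range(len(bars) - 1, 0, -1):
        if (count_trades(bars[:split]) >= min_is and
                count_trades(bars[split:]) >= min_oos):
            return split
    raise ValueError("not enough data for both samples")

def develop(bars, count_trades, tune, run_backtest, accept):
    split = split_by_trade_count(bars, count_trades)
    params = tune(bars[:split])                # tune on the older data
    oos = run_backtest(bars[split:], params)   # test on the newer data
    if accept(oos):                            # your own pass/fail criteria
        return tune(bars)                      # re-tune on ALL data for live use
    return None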

Last year was a real challenge.  The volatility spike, the turn in 
market direction, etc., made March through about August fairly 
unusual.  I want to make sure any system can *handle* that market 
appropriately, but I don't want to include it in my tuning period if 
I can avoid it.  Many borderline systems looked *great* in the first 
3-6 months of 2000, but fell apart in 1999 or after 9/2000.  If you 
tune on that unusual period, you might get great results with a 
system that would fail in a more "normal" market.  I want a system 
that behaves well in "normal" times AND does well when the market 
goes berserk.

> Contrast this to walk forward, where you test on old data,
> then "blindly" test on the most recent periods.  Yes, some
> concepts may work equally over extended periods, but if the
> goal is at all to be more in sync with current conditions,
> you will probably not achieve that, as all your development
> is on prior, out of sync, periods.

That's why I re-tune on more recent data if the system passes its 
OOS test.

> Thus, I'm proposing that walk BACKWARD testing is far more
> sound conceptually, and is likely to lead to better results
> than the walk forward methodology.

If you re-tune on recent data before trading, would you still feel 
that walk-backward is better?  I feel walk-backward is conceptually 
unsound because you're testing something -- trading on older data -- 
that you're not going to do in actual use.  
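
In sketch form (same hypothetical tune()/run_backtest() stand-ins as 
above), the only difference is which side of the split you fit on and 
which side you validate on:

def walk_forward_check(bars, split, tune, run_backtest):
    # fit on the older data, validate on the newer data you'll actually face
    return run_backtest(bars[split:], tune(bars[:split]))

def walk_backward_check(bars, split, tune, run_backtest):
    # fit on the newer data, validate on older data you'll never trade again
    return run_backtest(bars[:split], tune(bars[split:]))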

Gary