Re: [amibroker] What is a valid number of Back test results to Optimize?, AmiBroker Email List Archive

Subject: Re: [amibroker] What is a valid number of Back test results to Optimize?

From: "Howard B" <howardbandy@xxxxxxxxx>

Date: Wed, 6 Feb 2008 11:03:33 -0700

Greetings --

I'll respond to several of the points made by different posters (who all made good points) all in this one reply.

First -- My comment about looking at the equity curve. The objective function used in the walk forward tests must, of course, be a single-valued function that can be computed. I recommend that it be designed and defined early in the system development process, and that it be tailored to the person or organization who will be trading. Some of the metrics that I favor are those that incorporate both the slope and smoothness of the equity curve. K-ratio, RRR, CAR/MDD, and RAR/MDD are good ones to start with. Since AmiBroker easily accommodates custom metrics of any degree of complexity, you can combine as many features as are important to you into that one objective function. I recommend beginning by generating and printing out several equity curves (keeping track of how they were produced), then sorting them into rank order by your preference. I think that a visual inspection of the curve itself gives a better feel for the acceptability of the system, but others may prefer to read the columns of statistics associated with each curve. At any rate, once you have chosen your own personal objective function, it will be used to rank all alternatives as you perform the walk forward steps. If, at any point, you find that you prefer the results from an alternative that was not ranked first, then the objective function needs adjusting. This is quite important, because when you move on to automated walk forward testing, AmiBroker will use the parameter values of the top-ranked alternative to perform the out-of-sample test -- you will not have an opportunity to second guess it.
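As an illustration of a single-valued objective function that rewards slope and penalizes roughness, here is a minimal Python sketch of CAR/MDD computed from a list of equity values. It is not AmiBroker code and not necessarily how AmiBroker computes its built-in metric; the function name `car_mdd` and the `periods_per_year` parameter are assumptions made for the example.

```python
def car_mdd(equity, periods_per_year=252):
    """Compound annual return divided by maximum drawdown (both as fractions).

    `equity` is a sequence of account values, one per bar.
    `periods_per_year` is an assumed bar count per year (252 suits daily bars).
    """
    if len(equity) < 2:
        raise ValueError("need at least two equity points")

    # Slope: compound annual return over the whole curve.
    years = (len(equity) - 1) / periods_per_year
    car = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0

    # Smoothness: largest peak-to-valley decline, as a fraction of the peak.
    peak = equity[0]
    max_dd = 0.0
    for value in equity:
        peak = max(peak, value)
        max_dd = max(max_dd, (peak - value) / peak)

    if max_dd == 0.0:
        # No drawdown at all: treat a rising curve as unbounded, otherwise zero.
        return float("inf") if car > 0 else 0.0
    return car / max_dd


# Two curves with the same start and end value; the smoother one ranks higher.
rough = [100.0, 110.0, 99.0, 121.0]
smooth = [100.0, 105.0, 103.0, 121.0]
ranked = sorted([rough, smooth], key=lambda e: car_mdd(e, periods_per_year=3), reverse=True)
```

If the ranking produced by a function like this disagrees with the order you chose by eye, that is the signal to adjust the function.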

Second -- The length of the in-sample period is up to you. The arguments in favor of a longer period are that the resulting system will have a better opportunity to learn the signal portion of the data and be less likely to be curve-fit to the noise portion. The arguments in favor of a shorter period are that the system will be more easily kept in synchronization with the market it is meant to represent. Different markets and different trading models result in different systems (where a system is a combination of a model and a market), and in different optimal in-sample lengths. Even given a single data series, different trading methods will have different "best" in-sample lengths. You will need to determine for yourself what the best length is for your system. My recommendation is to make it as short as is consistent with having the model learn the signal and not become curve-fit.

Third -- The length of the out-of-sample period is in no way related to the length of the in-sample period. It is entirely dependent on the length of time that the model stays in sync with the underlying market. It is entirely possible that your in-sample period is three months and the model-market relationship stays stable for an additional three months. The way to tell is to run the optimization over the in-sample data, then extend the "to" date in Automatic Analysis (leaving the "from" date as it was) and perform a Backtest. Plot and look at the equity curve. Put the vertical line cursor at the date that marks the end of the in-sample period. Evaluate the period that follows -- first visually, and then by reading the statistics. As you perform the walk forward steps several times, each time looking at how long the out-of-sample performance remains good, you will get a feeling for the useful life of any particular system. Some may fail in a few days, others may last for many months -- only your experiments will tell you. Your out-of-sample results will almost always be poorer than the in-sample results and only very rarely better, but they may regularly be almost as good. You can gather up all the statistics from all the out-of-sample tests and use them as controls by which to measure the live trades. Since there is no way you can predict what the conditions will be in the future, the best you can do is study how your system has reacted in the past and gain some confidence about its "expected" performance. There are no guarantees. Tomorrow is always out-of-sample.
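The walk forward stepping described above can be sketched in Python as a generator of bar-index windows. This is only an illustration under stated assumptions: the name `walk_forward_windows`, the half-open index convention, and the choice to advance by the out-of-sample length are all made up for the example and do not describe AmiBroker's implementation. The in-sample and out-of-sample lengths are separate arguments because, as noted, neither determines the other.

```python
def walk_forward_windows(n_bars, in_sample, out_of_sample):
    """Yield (is_start, is_end, oos_start, oos_end) as half-open bar indices.

    Each out-of-sample window starts where its in-sample window ends, and the
    whole pair then slides forward by the out-of-sample length.
    """
    if in_sample <= 0 or out_of_sample <= 0:
        raise ValueError("window lengths must be positive")

    start = 0
    while start + in_sample + out_of_sample <= n_bars:
        is_end = start + in_sample
        oos_end = is_end + out_of_sample
        yield (start, is_end, is_end, oos_end)
        start += out_of_sample


# Ten bars, optimize on four, test on the next two, then step forward.
windows = list(walk_forward_windows(10, in_sample=4, out_of_sample=2))
for is_start, is_end, oos_start, oos_end in windows:
    print(f"optimize on bars [{is_start}, {is_end}), test on bars [{oos_start}, {oos_end})")
```

For each window you would optimize on the in-sample bars, take the top-ranked parameters, and record the statistics from the out-of-sample bars only.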

A very big caution!! Beware of making changes to your model based on information you obtain by studying the out-of-sample performance. It takes very little to transform previously out-of-sample data into newly incorporated in-sample data. Of course, if you see that your system fails badly under some conditions, and you can modify your AFL code to take those conditions into account, then you should. But the "out-of-sampleness" of the data used to make those changes has now been compromised, and new out-of-sample data is required.

Thanks for listening,
Howard

On Feb 6, 2008 6:18 AM, wavemechanic <timesarrow@xxxxxxxxxxxxx> wrote:

I would add one thing - the out of sample test should be the same type of market as that used for the in sample test. I have seen, for example, some good neural net models that take this into account, resulting in models for up, down, and flat markets, as opposed to one model fits all. Of course, defining up, down, and flat can be a bit subjective but performance definitely improves when one takes a shot at it. In any case, the equity curve (bottom line) will keep one generally on the right track without serious derailments. There was much discussion of this topic on this board several years ago with reference to the many books and articles on the subject which all tend to come down in the same place.

Bill

----- Original Message -----

From: Howard B

To: amibroker@xxxxxxxxxxxxxxx

Sent: Tuesday, February 05, 2008 10:38 AM

Subject: Re: [amibroker] What is a valid number of Back test results to Optimize?

Hi Chris --

You can do anything you want to in your search for a good trading system. The data period you work with during that search is the in-sample period. The results you achieve over the in-sample period have no value in predicting what the future performance will be. In order to estimate the future performance, you need to test the program on a set of data that follows the in-sample period and has not been used at all in the development of the system. That data is called the out-of-sample data. You can perform statistical tests on the out-of-sample results, but the quickest way to evaluate it is to look at the out-of-sample equity curve.

Be careful to avoid the following procedure: optimize in-sample, evaluate out-of-sample, modify the system based on the out-of-sample results, retest out-of-sample. At that point the previously out-of-sample data period has become part of an expanded in-sample data set, and a new out-of-sample test is required in order to estimate future performance.

There is a lot more to system development, testing, and validation than those two paragraphs. I am presenting a two-day workshop in Las Vegas February 21 and 22 devoted to that subject.
http://www.ftmonitor.com/lv08/lv08intro.html

And I have written a book devoted to that subject.
http://www.quantitativetradingsystems.com/

Thanks,
Howard

On Feb 5, 2008 4:34 AM, ChrisB <kris45mar@xxxxxxxxxxxx> wrote:

What is a valid or reasonable number of backtest results to subject to
Optimization?

For general statistics a minimum of 30 or so is needed to start getting
valid StdDevs etc.

If I run a backtest on hourly currency data over three months I get
around 16 - 20 tradeable signals per currency.
This gives a nice smooth plateau on 3D optimization.

If I test over two months of data I get around 10 - 12 trades

If I test over only 1 month I get only 5 or 6 trades.

These shorter time periods still give visually acceptable 3D plateaus
but I am wondering if there is enough data to be statistically significant.
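
One rough way to see how the trade count affects significance is to put an interval around the mean trade result. The Python sketch below uses a normal approximation, which is an assumption and is optimistic for samples as small as 5 or 6 trades (a t-based interval would be wider). The function name `mean_trade_interval` and the sample numbers are hypothetical.

```python
import math
import statistics


def mean_trade_interval(trade_returns, z=1.96):
    """Return (low, high), an approximate 95% interval for the mean trade.

    Uses mean +/- z * stdev / sqrt(n); the width shrinks roughly as 1/sqrt(n).
    """
    n = len(trade_returns)
    if n < 2:
        raise ValueError("need at least two trades")
    mean = statistics.mean(trade_returns)
    std_err = statistics.stdev(trade_returns) / math.sqrt(n)
    return (mean - z * std_err, mean + z * std_err)


# Hypothetical per-trade results in percent: 4 trades versus 16 similar trades.
few = [1.0, -1.0, 2.0, 0.0]
many = few * 4
low_few, high_few = mean_trade_interval(few)
low_many, high_many = mean_trade_interval(many)
```

With only a handful of trades the interval typically straddles zero, so a smooth-looking plateau alone does not show that the edge is real.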

I am trying to get a handle on how close I can get to current
fluctuations in the market without hitting noise. The idea being to redo
the Optimization every x time frame and shift the entry and exit
parameters to stay in the middle of the plateau.

Of course I can backtest over longer time frames, say 6 months of data,
shifting the starting date forward by one month at a time, but this
would seem to introduce more "lag" into my selection of best parameters
to trade.

Does anyone have any thoughts/references on this?

--
Regards

ChrisB
