
[amibroker] Re: Margin of Error




Fred,


I put forward the following proposition, not as a method, but rather 
as an interesting discussion point.

My proposition is that OOS testing after optimisation is not actually 
OOS testing at all.

Break some historical data into three equal segments and conduct 
initial testing/optimisation on the first segment.
This is not testing as such; the data used can be considered 
optimisation or design data.
After the optimisation, the top model can be tested on segment two.
I claim that this is not OOS but actually the first test of the top 
model.
No harm is done if the top model is disappointing in that test and the 
second-best model is then tested on the second segment, or even if we 
go back and re-optimise on the first segment to obtain some new top 
models.
The data doesn't know we have optimised on it.
It doesn't suddenly become data non grata just because our computer 
software has tip-toed over it once, twice or even a thousand times.

Once an optimised model has been correctly tested on out-of-
optimisation data, it can be statistically evaluated and then traded 
with confidence, provided it is part of a balanced freelance trading 
portfolio.

If the system is then tested on the third set of data, that would 
constitute an OOS test, but no one ever does that, do they?
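
To make the three-segment workflow concrete, here is a rough sketch in 
Python of how I mean the segments to be used. It is only an 
illustration, not AmiBroker/IO code; optimise() and backtest() are 
hypothetical placeholders for whatever tools you actually use:

def three_way_split(bars):
    # Split the bar history into three equal, chronological segments.
    n = len(bars) // 3
    return bars[:n], bars[n:2 * n], bars[2 * n:]

def develop(bars, optimise, backtest):
    design, test, oos = three_way_split(bars)

    # Segment 1: optimisation/design data - revisit as often as you like.
    candidates = optimise(design)          # models ranked best-first

    # Segment 2: the *first real test* of the top model (not OOS here).
    chosen = None
    for model in candidates:
        result = backtest(model, test)
        if result["profit_factor"] > 1.5:  # arbitrary acceptance threshold
            chosen = model
            break
    if chosen is None:
        return None                        # back to segment 1, no harm done

    # Segment 3: touched exactly once - the only true OOS test.
    return chosen, result, backtest(chosen, oos)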

Further to that, the equity curve obtained from the OOS test will 
*always* be within the range predicted by the test profile of a 
correctly analysed system, and yes, sometimes it might not look so 
pretty.
The exception there is the occasional equity-curve outlier, but hey, 
nothing's perfect.
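
By "within the range predicted by the test profile" I mean something 
you can check mechanically. A minimal sketch, assuming you have the 
per-trade returns from the test segment and from the OOS run (the 
bootstrap details are illustrative, not gospel):

import numpy as np

def equity_envelope(test_returns, n_trades, n_paths=10000, level=0.99):
    # Resample test-segment trade returns to build a Monte-Carlo equity band.
    rng = np.random.default_rng(0)
    paths = rng.choice(test_returns, size=(n_paths, n_trades), replace=True)
    curves = np.cumsum(paths, axis=1)
    lo = np.quantile(curves, (1 - level) / 2, axis=0)
    hi = np.quantile(curves, 1 - (1 - level) / 2, axis=0)
    return lo, hi

def oos_within_band(oos_returns, test_returns):
    # True if the OOS equity curve stays inside the predicted envelope.
    oos_curve = np.cumsum(oos_returns)
    lo, hi = equity_envelope(test_returns, len(oos_returns))
    return bool(np.all((oos_curve >= lo) & (oos_curve <= hi)))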

My second outlandish proposition is that the dangers of over-fitting 
during optimisation are over-emphasised.
If the outcome of the trading system test is a dataset with an 
adequate number of samples, then it is a true test and definitely not 
a result of over-fitting.
The corollary is that if a system has so many rules that, after 
back-testing 10 x 250 daily bars for 2000 symbols (5 million 
datapoints), it only produces 5 signals, it is obvious that something 
is wrong.
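
To put a number on "adequate", here is a back-of-envelope check of how 
wide the uncertainty on a win rate is for a given trade count (normal 
approximation, purely illustrative):

import math

def win_rate_interval(wins, trades, z=1.96):
    # Rough 95% confidence interval on the win rate (normal approximation).
    p = wins / trades
    half = z * math.sqrt(p * (1 - p) / trades)
    return max(0.0, p - half), min(1.0, p + half)

print(win_rate_interval(3, 5))      # ~(0.17, 1.00): 5 signals tell you nothing
print(win_rate_interval(300, 500))  # ~(0.56, 0.64): a sample you can evaluate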

Genuinely significant but rare events require massive amounts of data 
to produce a reasonable number of signals, so the data becomes suspect 
anyway, and besides, it is no fun to trade a system when you have to 
wait for a leap year to get an entry.

Most of the time, over-fitting simply takes the blame for incorrect 
testing and evaluation.

BrianB2.




--- In amibroker@xxxxxxxxxxxxxxx, "Fred" <ftonetti@xxx> wrote:
>
> OOS and/or WF Testing is not a concept invented in or for IO so that 
> particular piece of software is not really the issue per se, except 
> that IO has facilitated making it considerably easier to perform.
> 
> "If one OOS test had a 50% drawdown it doesn't say that much about 
> the system.  It only says something about that one single OOS test 
> of however many samples."
> 
> It doesn't ? ... It speaks volumes to me ... From my perspective, 
> decent in sample performance, whether or not one applies MCS after 
> the fact, is not the end of system testing, it is only a milestone 
> along the way to developing a system that MIGHT be tradable ...
>



