[amibroker] Re: Expectancy - and related--specifically K-rato

To: amibroker@xxxxxxxxxxxxxxx
Subject: [amibroker] Re: Expectancy - and related--specifically K-rato
From: "brian_z111" <brian_z111@xxxxxxxxx>
Date: Sun, 10 May 2009 09:38:17 -0000
Keith,

You sure set me some tough homework there.

It was around 4 am here when I answered you so the fairy dust was floating around in my head ... some further opinion:

1) Walk-forward efficiency (WFE), the ratio of the OOS metric to the combined IS + OOS metric:

This is a form of significance testing but it is indirect and complex and therefore more prone to error than other significance testing methods.

It attempts to compare the variance in the metric from sample to sample. However, some of that variance is due to sample error, and the proportion varies from metric to metric and from sample to sample (the inherent sample error propagates differently depending on the maths operations performed in each metric).

It could be useful in the hands of a very skilled practitioner who understands and allows for all of the above.
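
To illustrate how much of the ratio's movement can be pure sample error, here is a minimal Python sketch. It is not Pardo's procedure or anything from my own files: it assumes the metric is the mean return per trade, that the IS and OOS trades come from the same normal distribution (so the true edge never changes), and the names wfe_ratio and simulate_ratios and all of the numbers are made up for the example. The 0.65 cut-off is the one quoted further down the thread.

```python
import random
import statistics

def wfe_ratio(is_trades, oos_trades):
    """OOS metric divided by the metric over the combined IS + OOS sample.

    The metric is the mean return per trade (an assumption for this sketch).
    """
    combined_mean = statistics.fmean(is_trades + oos_trades)
    if combined_mean <= 0:
        return None  # the ratio is not meaningful with a denominator <= 0
    return statistics.fmean(oos_trades) / combined_mean

def simulate_ratios(n_runs=2000, n_is=200, n_oos=50, edge=0.5, sd=5.0, seed=1):
    """Draw IS and OOS trades from the SAME distribution and collect the ratios."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_runs):
        is_trades = [rng.gauss(edge, sd) for _ in range(n_is)]
        oos_trades = [rng.gauss(edge, sd) for _ in range(n_oos)]
        ratio = wfe_ratio(is_trades, oos_trades)
        if ratio is not None:
            ratios.append(ratio)
    return ratios

ratios = simulate_ratios()
below = sum(r < 0.65 for r in ratios) / len(ratios)
print(f"usable runs: {len(ratios)}")
print(f"median ratio: {statistics.median(ratios):.2f}")
print(f"share of runs with ratio < 0.65: {below:.0%}")
```

Any run that lands below the cut-off does so by chance alone, because the simulated system has exactly the same edge in both samples.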

2) Significance testing.

That is all we have.
We really don't have anything else.
Yes, estimating the significance of the OOS objective function is the only way to validate the system (short of actually trading it).

How should we carry out that test?

Using a fair coin toss, or any other random event, as the case for the null hypothesis isn't valid because markets have a slight bias (I am thinking about stockmarkets ... say the US S&P500).

Therefore we have to quantify the null hypothesis metrics for the market/period we are testing (so benchmarking is actually part of significance testing).

Off the 'top of my head' the buy & hold isn't the correct benchmark to use.

Using the combined IS and OOS sample to quantify the benchmark metrics is valid because the bias is inherent in the market: it was there before we carried out any testing and is not created by optimising, or by any of our efforts for that matter. In fact, measuring the benchmark metric over the longest possible period is the preferred method.
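
One possible way to quantify the null hypothesis for a market is to take random entries over the whole history and measure the metric on those. The Python sketch below is only an assumption about how that could be done, not a method given anywhere in this thread: the price series is synthetic with a small upward drift standing in for the market bias, the metric is the win rate of long trades held for a fixed number of bars, and synthetic_prices and random_entry_win_rate are hypothetical names.

```python
import random

def synthetic_prices(n_bars=5000, drift=0.0003, vol=0.01, seed=7):
    """Made-up price series with a slight upward drift (stand-in for real data)."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n_bars):
        prices.append(prices[-1] * (1.0 + rng.gauss(drift, vol)))
    return prices

def random_entry_win_rate(prices, hold=10, n_trades=5000, seed=11):
    """Win rate of long trades entered at random bars and held for `hold` bars."""
    rng = random.Random(seed)
    last_entry = len(prices) - hold - 1  # latest bar that still allows a full hold
    wins = 0
    for _ in range(n_trades):
        entry = rng.randint(0, last_entry)
        if prices[entry + hold] > prices[entry]:
            wins += 1
    return wins / n_trades

prices = synthetic_prices()
benchmark = random_entry_win_rate(prices)
print(f"benchmark win rate from random entries: {benchmark:.3f}")
```

With real data you would replace synthetic_prices with the actual closes for the market and period being tested, and the resulting figure, rather than 0.5, becomes the win rate to beat.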

I have been viewing all of the discussion on evaluation through the rose coloured Core Metric Evaluation glasses.

I specialise in Core Metric evaluation because:

- it is simple
- it is direct
- it minimizes (propagation of sample) error
- it gives the same answer as more complex significance tests
- I can do the sig tests in my head 'on the fly' if necessary.
- it syncs nicely with a binomial model of the market which can be utilised in system evaluation via Binomial Simulation modelling. 
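
As a rough illustration of the binomial view in the last point, the Python sketch below computes an exact one-sided binomial p-value for a win count against a benchmark win rate. It is a plain textbook binomial test, not BiSim itself, and the trade counts and the 0.53 benchmark are invented for the example; binomial_upper_tail is a hypothetical helper name.

```python
from math import comb

def binomial_upper_tail(wins, n_trades, p_null):
    """P(X >= wins) for X ~ Binomial(n_trades, p_null)."""
    return sum(
        comb(n_trades, k) * p_null**k * (1.0 - p_null) ** (n_trades - k)
        for k in range(wins, n_trades + 1)
    )

# Invented numbers: 62 winners out of 100 OOS trades, tested against a
# benchmark win rate of 0.53 measured on the market itself.
p_value = binomial_upper_tail(wins=62, n_trades=100, p_null=0.53)
print(f"one-sided p-value: {p_value:.4f}")
```

The point of passing p_null explicitly is that the null hypothesis is the market's own win rate, not a fair coin.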

BiSim and Core Metric evaluation are my contribution to the trading community.

I have talked about it in this forum (from time to time) and a summary of some of the things I have talked about, plus some learning aids, are at the Zboard.

I doubt if I will ever present the subject formally.

I don't have the motivation for promoting it and what is at the Zboard is about as much as I have the energy to say.

I will post some Excel files there that introduce people to the idea of sample error propagation.

I have been reluctant to post them because they are very dirty and incomplete, and the effort of figuring out the maths expressions that trace propagation isn't to my taste (that was the point of my lab testing of sample error in the first place).

However, without some explanation of sample error CME isn't complete.
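
For anyone who would rather lab test sample error than trace the maths, a small Python sketch along these lines might be a starting point. It is not a translation of the Excel files: it assumes normally distributed trade returns with invented parameters, takes the W/L ratio to mean average win divided by average loss (one possible definition), and uses hypothetical names (metrics, relative_spread, compare).

```python
import random
import statistics

def metrics(trades):
    """Return (win rate, average win / average loss), or None if undefined."""
    wins = [t for t in trades if t > 0]
    losses = [-t for t in trades if t < 0]
    if not wins or not losses:
        return None
    win_rate = len(wins) / len(trades)
    payoff = statistics.fmean(wins) / statistics.fmean(losses)
    return win_rate, payoff

def relative_spread(values):
    """Standard deviation as a fraction of the mean (coefficient of variation)."""
    return statistics.stdev(values) / statistics.fmean(values)

def compare(n_samples=2000, n_trades=50, seed=3):
    """Resample the SAME system many times and measure each metric's spread."""
    rng = random.Random(seed)
    win_rates, payoffs = [], []
    for _ in range(n_samples):
        trades = [rng.gauss(0.3, 5.0) for _ in range(n_trades)]
        result = metrics(trades)
        if result is not None:
            win_rates.append(result[0])
            payoffs.append(result[1])
    return relative_spread(win_rates), relative_spread(payoffs)

win_rate_spread, payoff_spread = compare()
print(f"relative spread of win rate:  {win_rate_spread:.3f}")
print(f"relative spread of W/L ratio: {payoff_spread:.3f}")
```

Both metrics are computed from the same simulated trades, so any difference between the two spreads comes only from the maths operations in each metric.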

BTW I think I kept my promise to you and Ton :-)


--- In amibroker@xxxxxxxxxxxxxxx, "brian_z111" <brian_z111@xxx> wrote:
>
> Very interesting questions and they are not easy ones to find a definitive answer for.
> 
> I had a quick read of Howard's QTS book to refresh my memory on the basics. 
> 
> Using Howard's recommendations as the benchmark (nothing personal but since Howard is an AB user and many of us have the book etc):
> 
> - we select our objective function before we start the testing process and this is our measure of goodness ... staying with the W/L ratio, from my previous post, as the example ... we need to define a value, or range of values, for the metric's goodness or badness.
> 
> How to do that later.
> 
> - at the end of the IS testing we choose the system with the highest quotient of goodness (the best system) ... we have to since we chose the function and managed the test process (we will assume we did that part correctly).
> 
> How do we know if the IS W/L ratio is good?
> 
> Presumably it is a personal decision..... for some metrics the goodness is predefined e.g. for the K-ratio there is a minimum level recommended by its creator. Where goodness is not defined then we need to mark out the boundaries for ourselves.
> 
> At this step we could apply some objective means to establish goodness, or lack of it (for this example):
> 
> - we could test the market, or the portion of it we want to run our testing on, and establish a W/L ratio benchmark (the one to beat!) ... Howard's book, and the forum, contain examples and discussion on benchmarking for system development.
> - we could apply statistical tests to establish significant levels.
> - others?
> 
> Once the best system is chosen, from the IS candidates, we can proceed to OOS testing with it (once again we will assume we did this correctly).
> 
> Once the OOS data is in we need to validate the OOS result.
> 
> Howard tells us there are two ways to do this:
> 
> - prediction
> - statistical validation
> 
> What does it actually mean if we say that the system has good predictive ability .... I assume it means that if the IS W/L ratio was good and the OOS W/L ratio is good that the system has good predictive value OR simply that if the OOS W/L ratio is good then the system has good predictive value (that is what you are suggesting isn't it)?
> 
> Once again statistical validation comes down to significance tests, and rejection of the null hypothesis, but there are difficulties there ... trade series produced by systems don't necessarily have a normal distribution and significance is only stated in probabilities ... no stats test can produce a definite measure of goodness e.g. there is a 5/100 chance that a high W/L ratio could have been produced by a break even system (a fair coin toss) and a 0.05*0.05 (2.5/1000) chance that a break even system could produce two trials in a row with significant W/L ratios.
> 
> Also - I don't think significance tests for non-normal dists are as accurate as those for normal dists?
> 
> Relating your questions to the above:
> 
> - If the OOS sample is validated then it validates the IS sample (we won't split hairs over how the samples were collected at this stage).
> The anchored IS sample then has value ... it is a larger sample than the OOS test. I haven't stipulated what it is useful for ... it might have more than one use.... give me a while and I might think of something!
> 
> - Is the objective function metric (the W/L ratio in this case), as measured by the IS test, relevant once the OOS sample has been validated?
> 
> Howard states that the OOS is often lower than the IS (remember that he is an optimizer so his viewpoint is biased towards typical opt outcomes).
> 
> Not all systems are designed using IS optimization, or even IS testing for that matter.
> 
> IMO there is room for further thought on that subject.
> 
> - Is the ratio of the OOS metric to the combined IS + OOS metric relevant?
> 
> This was Mr Pardo's recommendation, not mine.
> 
> In my previous post I explained why the ratio was unlikely to be useful.
> The numbers I used in the example aren't correct .. they are actually conservative but even then they prove that a good deal of caution is necessary when using significance for validation purposes.
> 
> - If the OOS sample is good what else do I need?
> 
> Once again you have to measure goodness and then validate it.
> 
> Assuming that the IS sample is worthless .... is it realistic to suggest that you are going to do that without reference to the IS sample? No need to answer that question, however, because it doesn't make much difference in the long run.
> 
> So far we only have two measures of goodness .. beat the market and statistical significance. 
> 
> Significance testing on an OOS sample, with or without regard to the IS test, is problematic (as discussed) .... if the IS counts then you could be experiencing a 2.5/1000 probability event ... if it doesn't then you could be experiencing a 5/100 event and that doesn't change the probability of a worthless system producing a significantly good result when traded live.
> 
> Beat-the-market metrics are also problematic ... if we select a statistically valid number of trades, randomly, from some OOS data we will beat the market, on average, 50% of the time (I haven't tried this ... it is a prediction on my part). That then leaves us in the position of having to decide if the amount we beat the market by, in our OOS sample, is statistically significant.
> 
> --- In amibroker@xxxxxxxxxxxxxxx, Keith McCombs <kmccombs@> wrote:
> >
> > I've been told that no question is a "dumb question".  So here goes:
> > If I have a system with good OOS performance, why should I care what the 
> > IS performance is?  And similarly, why should I care what the OOS/IS 
> > ratio is?
> > 
> > Couldn't it be more important that I have a high OOS/BH (buy and hold) 
> > ratio, so that I don't "confuse brains with a bull market"?  Or at least 
> > something that gives me confidence that I haven't just accidentally 
> > stumbled on a once in a lifetime event, that, of course, will disappear 
> > the minute I start trading real money?  Does OOS/IS ratio somehow help?
> > -- Keith
> > 
> > brian_z111 wrote:
> > >
> > >
> > > > I also create a t-test of the ave returns.
> > >
> > > How do you do that?
> > >
> > > --- In amibroker@xxxxxxxxxxxxxxx <mailto:amibroker%40yahoogroups.com>, 
> > > Rajiv Arya <rajivarya87@> wrote:
> > > >
> > > >
> > > > I also create a t-test of the ave returns.
> > > >
> > > > The in-sample is almost always significant
> > > >
> > > > And try to have the out of sample t-test greater than 1.64, which 
> > > happens for about 50% for the out-of sample results.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > To: amibroker@xxxxxxxxxxxxxxx <mailto:amibroker%40yahoogroups.com>
> > > > From: dloyer123@
> > > > Date: Sat, 9 May 2009 03:03:16 +0000
> > > > Subject: [amibroker] Re: Expectancy - and related--specifically K-rato
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --- In amibroker@xxxxxxxxxxxxxxx 
> > > <mailto:amibroker%40yahoogroups.com>, Rajiv Arya <rajivarya87@> wrote:
> > > > >
> > > > >
> > > > > I like to compute a ratio of the out-sample metric and divide it 
> > > by the in-sample metric.
> > > > >
> > > > > And I like to look for multiple runs of out-sample/in-sample ratio 
> > > to be above 0.5 and with little fluctuation.
> > > > >
> > > >
> > > > That is similar to Pardo's WFE (Walk forward efficiency), or a 
> > > measure of how much curve fitting inflated test results. Pardo 
> > > suggests taking the concatenated out of sample returns and divide by 
> > > the result treating the entire combined data set as in sample. 
> > > Anything below 0.65 will probably not trade well live. The higher, the 
> > > better.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > __________________________________________________________
> > > > Hotmail® has a new way to see what's up with your friends.
> > > > 
> > > http://windowslive.com/Tutorial/Hotmail/WhatsNew?ocid=TXT_TAGLM_WL_HM_Tutorial_WhatsNew1_052009 
> > > <http://windowslive.com/Tutorial/Hotmail/WhatsNew?ocid=TXT_TAGLM_WL_HM_Tutorial_WhatsNew1_052009>
> > > >
> > >
> > >
> >
>




Follow-Ups:
- Re: [amibroker] Re: Expectancy - and related--specifically K-rato
  - From: Howard B
References:
- [amibroker] Re: Expectancy - and related--specifically K-rato
  - From: brian_z111