In this series of posts, I am going to talk about my experience in handling data snooping bias (also known as curve fitting bias and data mining bias), in the hope that we system developers learn more about this subject and, as a result, benefit our trading careers. I welcome readers to discuss their opinions and share their experience.

The Problem

You have developed a new trading system, backtested it on recent data, and you were impressed with the 45-degree equity curve. What is the next step? Certainly not to trade it live until you have proved you have an edge.

One of the biggest challenges of building mechanical systems is knowing whether the system you built has an edge or is simply curve fitted.

Here are some tools you can use:

1- test it on out of sample data

2- forward test it on unseen data

3- robust statistical tests on the backtest performance to identify the robustness of the system

In part 1, I am going to talk about methods (1) and (2); in the next post I will talk about the rather delicate topic of robust statistics.

The Experiment

In this post, I am going to show a trading system that I developed and walk you through the steps to identify whether the system is worth trading or not. In other words, whether the system is curve fitted or has a real edge.

The strategy is trend following on the EUR/USD currency pair.

Backtest Result

The following backtest results cover two years (Jan 2010 till Nov 2011).

The total return was 177%, with an annual return of 98%, a max drawdown of 12%, and a Sharpe ratio of 2.5.

At first glance, who wouldn't trade this system? The 45-degree angle of the equity curve is certainly appealing.
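For readers who want to reproduce this kind of summary on their own backtests, here is a minimal sketch of how the four figures can be computed from an equity curve. It is not the code used for the backtest above. The function name `performance_summary`, the zero risk-free rate, and the compounded annualization are all assumptions made for illustration, so the numbers it gives may follow different conventions from the ones quoted in this post.

```python
import math
from statistics import mean, stdev


def performance_summary(equity, periods_per_year=252):
    """Summarise an equity curve given as account values, one per period.

    Hypothetical helper: assumes a zero risk-free rate and compounded
    annualization.
    """
    returns = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]
    total_return = equity[-1] / equity[0] - 1
    years = len(returns) / periods_per_year
    # Compounded (geometric) annual return.
    annual_return = (1 + total_return) ** (1 / years) - 1
    # Max drawdown: largest fall from a running peak, as a fraction of that peak.
    peak = equity[0]
    max_drawdown = 0.0
    for value in equity:
        peak = max(peak, value)
        max_drawdown = max(max_drawdown, (peak - value) / peak)
    # Annualized Sharpe ratio of per-period returns (risk-free rate assumed 0).
    sharpe = mean(returns) / stdev(returns) * math.sqrt(periods_per_year)
    return {
        "total_return": total_return,
        "annual_return": annual_return,
        "max_drawdown": max_drawdown,
        "sharpe": sharpe,
    }


# Tiny made-up equity curve, three periods treated as one "year".
print(performance_summary([100, 110, 99, 120], periods_per_year=3))
```

Keep in mind that a good-looking summary says nothing by itself about whether the system is curve fitted.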

Out of Sample Test (OOS)

In reality, an OOS test is often not out of sample at all. It is in-sample (IS) testing unless you are extremely disciplined and organized. Why? Because of hindsight.

Say you tested your strategy in IS, then in OOS, and it failed in the latter. You went ahead and changed the rules, then tested again in IS and then in OOS, until the OOS results looked good.

Here is an example of this process:

1. You have in-sample data set A and out-of-sample data set B.

2. Find a trading system that fits data set A and test it on data set B. It doesn't work.

3. Find another system, or tweak the first one, to fit data set A and test it on data set B. It doesn't work.

4. Repeat step 3 N times, and stop when you have found a system that works on both data sets A and B.

5. Et voila, you believe you have found a system that tested well on OOS.

This process is essentially curve fitting. In practice, data set B is not OOS anymore; it is just an extension of data set A. You fooled yourself into believing your best strategy works in OOS.

You can test your strategy on OOS data one time only. After that it becomes IS: you now have hindsight, and hence you cannot claim later that your strategy worked in OOS.
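The five steps above can be simulated with a toy experiment. The sketch below is illustrative only: the market is pure noise, every "tweak" is a set of random long/short positions (so no rule has an edge by construction), and the names `segment_pnl`, `snoop_once` and `run_experiment` are made up for this example. It keeps tweaking until a rule is profitable on both A and B, then checks that rule on a third segment C it has never seen.

```python
import random


def segment_pnl(positions, returns):
    """Profit of holding position p (+1 long, -1 short) over each return r."""
    return sum(p * r for p, r in zip(positions, returns))


def snoop_once(rng, n_days=100, max_tries=1000):
    """Tweak until a rule works on A and B, then report its profit on C."""
    # Three segments of pure-noise market returns: nothing here is predictable.
    a, b, c = ([rng.gauss(0.0, 0.01) for _ in range(n_days)] for _ in range(3))
    for tries in range(1, max_tries + 1):
        # Each "tweak" is a fresh set of random long/short positions.
        pos = [rng.choice((-1, 1)) for _ in range(3 * n_days)]
        works_on_a = segment_pnl(pos[:n_days], a) > 0
        works_on_b = segment_pnl(pos[n_days:2 * n_days], b) > 0
        if works_on_a and works_on_b:
            return tries, segment_pnl(pos[2 * n_days:], c)
    return max_tries, None  # nothing passed; extremely unlikely


def run_experiment(n_runs=200, seed=0):
    """Fraction of 'A and B approved' rules that also make money on unseen C."""
    rng = random.Random(seed)
    results = [snoop_once(rng) for _ in range(n_runs)]
    pnl_c = [pnl for _, pnl in results if pnl is not None]
    return sum(1 for pnl in pnl_c if pnl > 0) / len(pnl_c)


print(f"profitable on C: {run_experiment():.0%}")
```

Every rule that survives the loop "worked" on B, yet on C the share of profitable rules should hover around one half, which is what you would expect from a coin flip. Passing B after repeated tweaking tells you nothing.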

Use OOS once, and only once, for a particular strategy. Even then, you still have to go through forward testing before putting any money in it.

It is psychologically difficult to stick to the rules of IS / OOS. Here is one way to get over this problem:

1. Divide your data into 3 parts: data sets A, B and C.

2. Develop your strategies using data set A until you have found a 60-degree equity curve.

3. Test the strategy on data set B. If you still get a 60-degree equity curve, or 30 degrees, or whatever, prepare yourself for live trading the next day.

4. Do a dress rehearsal: run a final test on data set C as preparation for going live. If you get a curve of 0 degrees or less, go back to step (1) above.
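One way to make the discipline mechanical rather than psychological is to let code enforce it. The following is a rough sketch under my own assumptions: the 60/20/20 split and the names `split_three` and `OneShotSegment` are hypothetical, not part of any library. It splits chronologically ordered bars into A, B and C, and wraps B and C so that each can be handed out exactly once.

```python
def split_three(bars, frac_a=0.6, frac_b=0.2):
    """Split chronologically ordered bars into A (develop), B (OOS), C (final)."""
    if frac_a <= 0 or frac_b <= 0 or frac_a + frac_b >= 1:
        raise ValueError("fractions must be positive and leave room for C")
    n = len(bars)
    end_a = int(n * frac_a)
    end_b = int(n * (frac_a + frac_b))
    # No shuffling: A is the oldest data, C the most recent.
    return bars[:end_a], bars[end_a:end_b], bars[end_b:]


class OneShotSegment:
    """Hands out a data segment exactly once, then refuses."""

    def __init__(self, name, data):
        self.name = name
        self._data = data
        self._used = False

    def take(self):
        if self._used:
            raise RuntimeError(
                f"data set {self.name} was already used; it is no longer unseen"
            )
        self._used = True
        return self._data


bars = list(range(1000))  # stand-in for real price bars
a, b, c = split_three(bars)
oos = OneShotSegment("B", b)
final = OneShotSegment("C", c)

b_data = oos.take()  # first and only look at data set B
# oos.take()         # a second call would raise RuntimeError
```

A guard like this cannot stop you from rebuilding the objects and peeking again, but it does make each peek a deliberate act rather than a habit.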

Forward Test

Forward testing is the mother of all tests. It is robust and guaranteed to work. Moreover, it addresses all sorts of biases:

- curve fitting bias

- data mining bias

- look ahead bias

- survivorship bias

- software bugs

Experience suggests that a forward test of 2 months, or ~300 trades, is a must in order to address these biases (in addition to the curve fitting bias).

This robustness comes at a cost, both in time and effort. If you were to forward test every strategy you develop for a couple of months, it would take years and years before you found one profitable strategy.

Also, it is easy to get carried away by other tasks and ignore the forward test system. You need to pay as close attention to it as you would to real-life trading.

Back to our example above. Here is a 3-month forward test result of that strategy:

The strategy went sideways for a couple of months, and then slid in the last 4 months.

In the next part of this post, I will discuss robust statistical tests that save you the hassle of forward testing.

If you have any thoughts on tackling this issue; please share them with us.

Nice post, I think a simulated live forward test is a good idea. As in run the strategy with your broker but in a paper trading account.

I had a strategy that back tested exceptionally well on EURUSD and XUDUSD, however when it came to live paper trading the strategy was destroyed by bid ask spread (strategy was back tested with only 5min high low open close data).

Gekko, absolutely; forward test is the final check for go live.

Most software bugs, and other issues such as the bid/ask spread, will be identified during forward testing.

Thank you for the post.

But I still can't fully understand what you mean by "forward test". Is it a test on a paper account for at least two months?

Or is "forward test" just step 4, the final test on data set C?

Alex.

Essentially, a forward test is a test on unseen data. You can achieve that by dividing your data into A, B and C segments.

As long as the test is done on unseen data, and is done only once in the lifetime of that strategy, it qualifies as a forward test.

Thanks for your reply.

Just to clarify. I test and optimize a system on A segment, find the best one and apply it to B segment. If the results are still good or even better, the system is almost ready for real life. But to be 100% sure, I apply the same "best" system from A segment on C segment for a final confirmation. And if on C the system still does good, I'm ready to trade it from tomorrow.

Am I correct?

Alex.

Anon,

that is correct.

alpha.

Thank you. Looking forward to Part 2. Good luck

Hi.

I was reading your comments (http://www.automated-trading-system.com/bootstrap-take-2-data-mining-bias-code-and-using-geometric-mean/#comments) on that blog, and I found something curious.

You said that the Equity Curve scramble method did manage to differentiate between system A and system B, and that the classical Bootstrap test of returns over some out-of-sample performance did not.

Could you please explain the rationale for thinking that this is not an isolated case? Maybe if you tested more than two systems, for example 1000 systems, you would see which method is better?

What is the logic behind the Equity Curve permutation test that would imply we should use it?

Thank you very much.

A good read! I'll be waiting for the next part.



Can you explain the statistical test used to check the robustness of the trading system?
