After a long, heated debate with 2 fellow traders, I am curious to hear the opinion of others here on FF.
People underestimate the power of today's computers. Datamining software like Expert Advisor Studio, StrategyQuant, etc. tries out thousands of random strategies per hour to hunt for good-looking equity curves. Some people feel that a nice equity curve must mean the underlying strategy has some edge. They don't believe this can be just random chance.
To prove a point, I created an expert advisor for MT5 that enters and exits the market at random times in random directions. It does not have an edge; it trades purely at random! It has only one relevant input parameter: the random seed. Changing the seed will generate a different series of random numbers. I have attached the code for you to play with. (Note for the coders out there: it generates random numbers based on the current bar time to ensure that a later rerun of the backtest with the same random seed will show the same behavior during that date range even if we changed the start date of the backtest.)
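For the coders: here is a rough Python sketch of that bar-time idea. It is not the attached MQL5 code. The hashing scheme, the names `bar_random` and `signals`, and the `trade_prob` parameter are all assumptions I made up for illustration. The point is only that when each random number depends on the seed and the bar's timestamp alone, moving the backtest start date cannot change what happens on any given bar.

```python
import hashlib

BAR_SECONDS = 60  # M1 bars, as in the experiment


def bar_random(seed: int, bar_time: int) -> float:
    """Map (seed, bar open time in Unix seconds) to a float in [0, 1).

    Hypothetical scheme: hash the pair and keep the top 53 bits.
    """
    digest = hashlib.sha256(f"{seed}:{bar_time}".encode()).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def signals(seed: int, start: int, end: int, trade_prob: float = 0.01) -> list:
    """Return one decision per bar in [start, end): 'buy', 'sell' or 'none'."""
    out = []
    for t in range(start, end, BAR_SECONDS):
        u = bar_random(seed, t)
        if u < trade_prob / 2:
            out.append("buy")
        elif u < trade_prob:
            out.append("sell")
        else:
            out.append("none")
    return out


jan_1 = 1609459200  # 2021-01-01 00:00:00 UTC
full = signals(703, jan_1, jan_1 + 1000 * BAR_SECONDS)
late = signals(703, jan_1 + 400 * BAR_SECONDS, jan_1 + 1000 * BAR_SECONDS)
print(full[400:] == late)  # True: same bars, same decisions
```

A generator that is seeded once at the start of the run and then advanced bar by bar would not have this property, because a later start date would shift the whole sequence.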
I let the MT5 strategy tester "optimize" the random seed for EURUSD,M1 for the Sharpe ratio. Effectively this means MT5 is merely hunting for profitable random strategies. I let it try random seeds 1 to 1000 over the year 2021. The best strategies have impressive performance numbers:
Here is the equity curve of the best strategy found (with random seed 703). Look at this beauty:
Below is the distribution of the profits of all 1000 strategies found. As expected, roughly half of the random strategies lose money and the other half make a profit:
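If you don't have MT5 at hand, you can get a feel for this with a toy simulation. This is only a sketch under my own assumptions: each "strategy" is just 250 trades with zero-mean Gaussian profit and loss (no spread, no commission), and the Sharpe ratio is computed per trade without annualizing. It does not reproduce my MT5 numbers; the names `simulate_strategy` and `sharpe` are made up for the example.

```python
import random
import statistics


def simulate_strategy(seed: int, n_trades: int = 250) -> list:
    """Per-trade P/L of a zero-edge strategy: Gaussian with mean 0."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_trades)]


def sharpe(returns: list) -> float:
    """Per-trade Sharpe ratio (mean / standard deviation), not annualized."""
    return statistics.mean(returns) / statistics.stdev(returns)


# "Optimize" the seed from 1 to 1000, like the strategy tester run.
results = {seed: simulate_strategy(seed) for seed in range(1, 1001)}
profits = {seed: sum(r) for seed, r in results.items()}
sharpes = {seed: sharpe(r) for seed, r in results.items()}

share_profitable = sum(p > 0 for p in profits.values()) / len(profits)
best_seed = max(sharpes, key=sharpes.get)

print(f"profitable share: {share_profitable:.2f}")
print(f"best seed: {best_seed}, Sharpe per trade: {sharpes[best_seed]:.3f}")
```

The profitable share should land near one half, while the best seed out of 1000 still looks far better than "no edge" would suggest, simply because it is the maximum of many draws.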
To mimic the typical datamining workflow of my fellow traders, I selected the top 100 strategies from the list. To verify that these strategies really have an "edge", I tested them on the next 6 months. The top strategies are again showing amazing results. Indeed, we found gold! ;) Below is the distribution of their profits:
Moving on, we now select the top 10 strategies and run them over the next 3 months as a last verification. Indeed, our hard work brought us a very good strategy! The winner is random seed 975, which survived all three filter stages:
Let's bet the farm and get rich! Needless to say, I no longer own a farm...
Obviously, doing this on a random strategy is silly, but the same reasoning applies to strategies mined by EA Studio or StrategyQuant. There, too, you will find millions of "random" strategies and need to find those that actually have an edge.
"But you are doing it wrong!", said my fellow traders. And this is where the heated debate started. "You must optimize over more historical data, do a Monte Carlo test, do a multicurrency test, Walk Forward Analysis, forward test on a demo account, etc. That will weed out those strategies that don't really have an edge!" I am not convinced that actually helps.
My experiment illustrated that even a fully random strategy has a roughly 50/50 chance of being profitable. If you keep only the profitable ones and test them again on another date range, you again get a 50/50 chance. You can keep on filtering and throw more tests at the strategy, each time thinning the herd, but since today's computers can generate millions of strategies with ease, you will still end up with many strategies that survived all tests and yet have no edge at all. Monte Carlo or WFA is not going to help you, in my opinion.
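To put rough numbers on that: if every test is an independent coin flip for a strategy without an edge (that is my assumption here, and real tests may be correlated or stricter than 50/50), then starting from N strategies you expect N × 0.5^k of them to pass k tests in a row. A small sketch, with made-up function names, that checks the arithmetic against a simulation:

```python
import random


def expected_survivors(n_strategies: int, n_tests: int, pass_prob: float = 0.5) -> float:
    """Expected count of zero-edge strategies passing every test by luck."""
    return n_strategies * pass_prob ** n_tests


def simulated_survivors(n_strategies: int, n_tests: int,
                        pass_prob: float = 0.5, seed: int = 1) -> int:
    """Count strategies that pass all tests when each test is a coin flip."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(n_strategies):
        # all() stops at the first failed test, like dropping the strategy.
        if all(rng.random() < pass_prob for _ in range(n_tests)):
            survivors += 1
    return survivors


print(expected_survivors(1_000_000, 10))   # 976.5625
print(expected_survivors(100_000, 5))      # 3125.0
print(simulated_survivors(100_000, 5))     # close to 3125, varies with seed
```

So a million mined strategies put through ten such tests still leave close to a thousand lucky survivors. A stricter test lowers `pass_prob` and thins the herd faster, but under this assumption it never tells you which survivors, if any, have a real edge.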
What do you think? Am I wrong? Or is there another (better) way to actually find successful strategies using datamining?
Attached file: yzRandom.mq5 (3 KB)