EDIT - this post was from two years ago originally, I've added an update at the end of the thread...
I'm working late tonight and have some free time between tasks, so here goes a bunch of rambling. I'm sure my coworker will enjoy this.
As I've begun to learn about trading, a logical train of if/then type statements and assumptions has formed, with what seems to be a natural conclusion I'm sure many others have reached.
(1) If the market is utterly unpredictable, 50/50 in the long run, then not even the best money management scheme will ultimately prevail.
Contrary to what some math whiz gamblers have concluded, I do not believe there is any way to consistently win in a negative expectancy game (broker's cut figured in). All you can do IMO is reorganize your wins and losses into various clumping structures.
I assume that the market is not utterly unpredictable, as there do seem to be honest people managing to consistently make a profit. Not the big floor players that don't rely on price as much as strategy, but regular retail investors playing by the charts and news.
(2) If the market has some degree of predictability, then it can be codified.
Humans are remarkable at pattern recognition, and perhaps the ability to consistently recognize winning conditions is well beyond the simplistic trading rules forming the various systems. After all, a human can recognize a face almost instantly in almost any condition while powerful computers need good lighting, good angles, and the latest software to do a decent job. However, difficult is not impossible, and computers have demonstrated an ability for many years to find patterns in data that humans are oblivious to.
I assume that it is not impossible to codify market predictability, if it exists, due to the presumed existence of traders using fairly simple rules to generate consistent earnings.
(3) If human emotion and psychology are the enemies of good money management and system execution, then eliminating them from decision making should improve consistency and long term profitability.
I've been reading posts and articles about the psychology of trading, and both fear and greed appear to be serious obstacles. Not only that, but simple human error can prevent solid execution of otherwise sound trading rules (like a blackjack player making basic strategy mistakes). Cumulatively, a system that may have on paper had a positive expectancy could turn negative due to error and mindset, or money management could empty an account before it had a chance to succeed.
I assume that if the market is predictable and can be codified, mechanical execution of that code would yield the most favorable results. The obvious "what ifs" include the fundamental external events... earthquakes, profit statements, wars, little country strikes oil, etc. In theory these are simply more variables that can be codified, but their infrequency and variability might make that difficult or impossible in many cases. Even so, since there seem to be systematic rules that generate positive returns during times of relatively "news free" trading, a simple solution is to stop using mechanized trading during major news events and return to it once the market has stabilized. A more robust solution is to add frequently occurring events to the rule variables, such as profit statements and earnings forecasts, prime interest changes, etc. As these events occurred numerous times in any market data used for backtesting, a robust system should be able to handle these "routine" events.
Now on to the interesting rambling. What is the ideal form of implementation for a trading system? Not the practical ideal, which would include the ability to actually connect to brokers and execute trades, but rather the ideal theoretical platform on which to develop rules? Here I make a few more assumptions. First, you would want the rule to change as the market behavior changed over time. Second, you would want the system to generate all decisions including entry and exit events and position size.
It would seem that Genetic Algorithms are one solution that could fulfill all the requirements. My programming ability is limited to simplistic scripting, and even then it's shaky, but I have pondered what such a GA system's structure would look like. I'll take a stab at it below...
There would be three primary parts: (1) a supervisor program that reads data, accepts user input, executes "organism" subroutines, evaluates organism fitness, modifies the organisms, and executes trade events based on the "winning" organism, (2) a file containing a list of all organism subroutines, and (3) a config file that controls parameters such as mutation rate, types of mutations, organism building blocks, maximum organism complexity, number of children born each generation, chart timeframes the organisms evaluate, maximum risk allowed to be taken, weighting for fitness evaluation, etc.
The supervisor program should be able to function in both real time and on historical data. i.e., it should accept input from the user to include current data in the updating of organisms or only run with the current winner. It should also have the flexibility to use virtual internal money accounts and trades to evaluate fitness when backtesting historical data.
The main difficulty is the fitness criteria. This could be judged in many ways: prediction that the market turned up or down at a specified time, prediction of a swing over X size, prediction of continuance of a current trend, etc. The problem, if I have any understanding at all of trading, is that knowing the market is about to go up or down doesn't guarantee profitability if you can't generate good entry and exit points. There is a simple and natural fitness solution that encompasses entry and exit points: profitability. The organism that is the most fit is the one that generates the most profit, however that might be accomplished. It does raise the issue of what timeframe to judge profitability over... the last day, week, or month for daytrading systems? The past year? The life of the organism? That question could probably be handled by the inclusion of drawdown limits in the fitness criteria. Technically one might want the organism that could produce the most profit by year's end, but practically if it depleted an account in the meantime it is useless.
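A minimal sketch of that profitability-plus-drawdown-limit idea, assuming each organism's simulated trades have already been rolled up into an equity curve (a list of account balances over time). All names here are illustrative, not a real implementation:

```python
def max_drawdown(equity):
    """Largest peak-to-trough drop seen along the equity curve."""
    peak = equity[0]
    worst = 0.0
    for balance in equity:
        peak = max(peak, balance)
        worst = max(worst, peak - balance)
    return worst

def fitness(equity, drawdown_limit):
    """Net profit, with organisms that breach the drawdown limit
    disqualified outright rather than merely penalized."""
    if max_drawdown(equity) > drawdown_limit:
        return float("-inf")
    return equity[-1] - equity[0]
```

Disqualifying over-drawdown organisms (rather than just docking points) is one design choice; a softer penalty weighting is the other obvious option.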
The subroutine organisms would be interesting. The little reading I've done on GA used for economic prediction in the past has used them to find "optimal" parameters for a fixed function. i.e., fine tune variables in an equation or set of equations already established as working. However, evolution has demonstrated an ability to produce organisms capable of tremendous pattern recognition abilities (human eye, for example), and I see no reason why a GA aimed at economic markets should not have at least a reasonable chance of finding patterns where they might exist.
Since established systems seem to be based on a set of conditional rules (IF a > b, AND IF a > avg(b,c,d)... THEN...), I think it would be wise to retain this feature of market prediction. I don't believe a set of piecewise functions and conditional statements can be summarized in one grand equation that says IF xyz^abc/def-ghi=lmnop THEN trade.
The actions an organism could take in a simple market are buy, sell, and position size for each. In bidirectional markets you have short/long scenarios, so functionally you have enter short, enter long, exit, and position size. So the basic structure of an organism subroutine would likely be four sets of conditionals, each having an action when the conditions are met (enter short, enter long, exit short, and exit long), and a routine for calculating position size for each action. The position routine should look to the supervisor program for information on current account size, money already in play, and user limits to use in addition to market data. Money in play might be differentiated into total money in play in the market and money already in play in that specific market (leading to increasing or decreasing position size). This position routine should be allowed to produce different results for short/long and enter/exit actions, so it might as well be part of each conditional statement - a calculation piggybacking on the IF conditions.
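The four-condition-set structure with the position-size calculation piggybacking on each set might be laid out like this. The rule encoding and every name below are hypothetical, just to make the shape concrete:

```python
from dataclasses import dataclass, field

@dataclass
class ConditionSet:
    action: str                 # "enter_long", "enter_short", "exit_long", "exit_short"
    rules: list = field(default_factory=list)  # each rule: market_state -> bool
    size: object = None         # account_state -> position size for this action

    def triggered(self, market):
        # All IF / AND IF lines must hold for the action to fire.
        return all(rule(market) for rule in self.rules)

@dataclass
class Organism:
    condition_sets: list        # typically the four sets above

    def decide(self, market, account):
        """Return (action, size) pairs for every condition set that fires."""
        return [(cs.action, cs.size(account))
                for cs in self.condition_sets if cs.triggered(market)]
```

The `account` argument is where the supervisor would hand in current balance, money already in play, and user limits, as described above.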
The configuration file is fairly self explanatory but implementing it would take some thought. The first hurdle is controlling the types of variables the organisms are fed, the second is determining the types of data manipulations the organisms can conduct, and the third is determining the mutation rate.
There are already a lot of calculations performed on market data as well as "milemarkers" that perhaps the GA organism subroutines should be "force fed," such as previous period highs, lows, opens, closes, averages, or calculations such as average true range, moving averages, etc. Or you could let the algorithm operate on raw data without forcing it to look at specific times (opens, closes) or functions (maximums, minimums, averages, etc.). I think the best approach for GA diversity is to use a mixture... some organisms only use current market indices, some use raw price and volume data, others use a mixture. Configuration settings could assure that a certain percent of each generation always remain pure in each direction, if desired.
Mutation is another complicated problem. Configuration settings that control how many organisms are retained from each generation (top % of performers and/or all within a certain spread from the winner), how many mutated copies are made of each, and the degree of mutation are fairly straightforward to outline. Types of mutation allowed is more involved. If each line of a conditional statement involves an algebraic comparison, the question is what form are the algebraic expressions allowed to take? I think it is obvious that every conditional line should have access to all of the available variables, so the question is limited to the functions that are available for the organisms to use. Standard functions would be addition and multiplication with the ability to invert and negate terms to produce subtraction and division. Should array building and matrix operations be allowed? Probably so. What about logs, exponents, derivatives, integrals, etc.? Should the organisms be allowed to "discover" calculus (approximating derivatives with finite differences, etc.) or should these math tools be encoded explicitly? If the coding platform can handle it, I think the more operations available the more diversity produced, and ultimately the greater the chance of finding good system rules.
Once these building blocks are in place, you need to define the types of mutations allowed. Cross-breeding of successful programs is a good idea, as is adding brute mutations. But how many levels? First level would be swapping condition sets (enter short, enter long, etc.) both within an organism and between organisms. Second level would be swapping, adding, and/or removing individual lines within a conditional set, among conditional sets within an organism, and between organisms. Third level would be mutating an individual line by swapping expressions from other lines. Fourth would be mutating a line by swapping variables in an expression. Fifth would be mutating a line by swapping, adding, and/or removing functions and variables. Settings would be needed to control how many children are born at each level, and what degree of swapping/adding/removing is allowed at each level (how many new lines can be added to a condition set in one generation... how many new variables or functions added to a line... etc.). Conservative settings might produce only a few children at each level, each containing only one change; aggressive settings might produce up to a hundred children (or more) at each level, covering the spread from one change to only one thing left unchanged. And of course identical copies of all survivors of any generation.
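Two of those mutation levels, sketched roughly. Organisms are modeled minimally as lists of condition lines; the helper names, the building-block list, and the complexity cap are all placeholders:

```python
import random

def swap_lines(parent_a, parent_b, rng):
    """Second-level style mutation: one line traded between two parents."""
    child_a, child_b = list(parent_a), list(parent_b)
    i = rng.randrange(len(child_a))
    j = rng.randrange(len(child_b))
    child_a[i], child_b[j] = child_b[j], child_a[i]
    return child_a, child_b

def add_line(parent, building_blocks, rng, max_lines):
    """Add one new random line, respecting the complexity hard limit."""
    child = list(parent)
    if len(child) < max_lines:
        child.insert(rng.randrange(len(child) + 1), rng.choice(building_blocks))
    return child
```

The `max_lines` cap is the "can't add lines past a certain size" hard limit from the config file; the same pattern would apply at the expression and variable levels.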
The configuration file would also have to deal with hard limits on organism complexity (can't add functions or lines past a certain size...), limits on risk allowed, and perhaps variables to steer some organisms toward a desired timeframe (from scalping to long term swing trading). You would also need to control fitness weighting. For example, if the goal of a program is to maximize profits and minimize drawdown, then you need to rank the relative priority of each. Further, since each surviving organism grew through success in previous generations, you need to weight the importance of success and failure as a function of time (newer successes being most important, of course, but how much so?... even nearly perfect systems will make bad trades sometimes and you don't want to kill a proven winner). These rules could become quite elaborate, with, for example, exemptions from elimination for any outright winner of a single generation for the next X generations, exemption for any top 5% placer for the next Y generations, and taking the top half of the remainders based on a performance scale where results from the last generation count A%, the generation before that B%, two generations back C%, etc. Interestingly, since the organisms adapt to behavioral changes in data it would be most effective to start the backtesting with the earliest data available and let the organisms crank through it to the present day.
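The generation-weighted scoring could be as simple as a decaying weighted sum, with the weights supplied by the config file. The weight values below stand in for the A%/B%/C% placeholders above and are not recommendations:

```python
def weighted_score(per_generation_profit, weights):
    """Score an organism's history with recent generations counting more.
    per_generation_profit[0] is the most recent generation; weights decay
    going back in time (e.g. [0.5, 0.3, 0.2])."""
    return sum(p * w for p, w in zip(per_generation_profit, weights))
```

Survivor selection would then rank organisms by this score instead of raw last-generation profit, so one bad trade doesn't kill a proven winner.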
What would an organism look like? Well, take one of james16's rules for a crude example, double bar high close lower. In the appropriate condition section (short, right?) you might see something like:
IF:
"1:00 to 2:00 high" - "12:00 to 1:00 high" =< 2 pips
AND IF:
"1:00 close - 2:00 close" => +XX pips
AND IF:
.............
AND IF:
................
THEN:
enter
You can see even in this pseudocode example plenty of room for simple mutations. Most will fail, a few might do as well or better than your initial rules.
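The pseudocode rule above, sketched as a runnable check. The bar fields, the XX threshold, and the use of an absolute difference for "equal highs" are all my assumptions; pips are converted from raw price units via a `pip` size:

```python
def double_bar_high_close_lower(prev_bar, curr_bar, pip, xx_pips=5):
    """True when two consecutive bars make roughly equal highs (within
    2 pips) and the second closes at least xx_pips below the first --
    the enter-short condition from the pseudocode."""
    equal_highs = abs(curr_bar["high"] - prev_bar["high"]) <= 2 * pip
    close_lower = (prev_bar["close"] - curr_bar["close"]) >= xx_pips * pip
    return equal_highs and close_lower
```

A mutation at the variable level might swap `high` for `close` in the first line, or nudge the 2-pip tolerance; most such children would score worse, a few might not.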
How long it takes to breed a generation depends on the number of organisms, the complexity they are allowed to have, efficiency of code, and of course the computing power at hand. What is really interesting is the possibility that as the GA program receives realtime data and makes trading decisions, it continues to breed generations and improve. That is exactly what it seems technical traders want... system rules that don't go extinct but adapt to market changes and remain profitable. The idea would be that all organisms have access to real time data and "simulate" trades during each generation, with the supervisor program keeping track of how each of them has done cumulatively whether they were trading for real or not. While all organisms think they are making trades, and operate on the same data, only the chosen one would deal for real if the system were live. A decision would have to be made as to whether the rules used to determine survivors (and therefore produce a ranking of fitness) are the same as the ones used to decide which single survivor gets to make the "real" trading decisions (if actually using the system live) for the current generation. Do you want your last outright winner, who has a lineage of at least good performers, or do you want your time averaged and proven money maker at the helm? I think that question could be answered by simulating the growth of the colony over long periods of data with various rules for survival and "trading king."
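A bird's-eye sketch of that loop, with the survivor rule deliberately kept separate from the "trading king" rule so the two questions can be tested independently. Every function passed in is assumed to exist elsewhere; nothing here is a real trading system:

```python
def run_generation(organisms, data, evaluate, breed, survivor_rule, king_rule):
    """One generation: score everyone on the same data, pick survivors,
    choose who trades for real, breed the next population."""
    scores = {org: evaluate(org, data) for org in organisms}
    survivors = survivor_rule(scores)            # e.g. top % of performers
    trading_king = king_rule(scores, survivors)  # may differ from top scorer
    next_generation = breed(survivors)           # survivors + mutated copies
    return next_generation, trading_king
```

Running this over long historical stretches with different `survivor_rule`/`king_rule` pairs is exactly the colony-growth simulation described above.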
A lingering issue is how to handle world events. Profit forecasts and earnings reports, interest rates, inflation, etc. could be entered into the config file each day as additional variables the organisms can use. Perhaps a stock of "special situation" organisms could be retained that proved especially well adapted for specific circumstances, like during a crash, a large natural disaster, a war, etc. The level of potential complexity of the colony once established is limited only by the user.
Once in place, the door opens to even more intriguing questions. What about two organisms that are actually in competition with each other? Instead of each simulating that it had full control over the account resources for a given market and the most fit actually having control, what if two or more of the most fit actually had to fight for their cut of the allowed risk? Would they develop strategies to control that market segment that were more harmful than beneficial? Would certain groups of organisms working together in various markets be more profitable cumulatively than the sum of each operating oblivious to the other players? That is the difference between the position size calculation taking into account what other market organisms are doing at that moment or only looking at config settings. Hard max for each transaction, or float it depending on the money already in play?
Enough thoughts for tonight. Computer just spit out results... back to work.