Quote:
Google Colab Notebook Version 1.84
It seems that a few people had trouble getting the script to work so my only suggestion is to run the script in Colab.
- Remove the .txt extension
- In My Drive create a folder called TradingData
- In the TradingData folder:
- Upload all the asset CSV files
- Upload or create the .env file with your Gemini Key
- Upload the End_To_End_Advanced_ML_Trading_Framework_PRO_V184_Colab_Adapted.ipynb (as a back up)
- Run Google Colab
- Run the notebook from either the TradingData folder or just upload from your computer
The Notebook...
End_To_End_Advanced_ML_Trading_Framework_PRO_V185_API_Fallback.py
I know I said a few versions ago that I wasn't going to add any more major updates. However, I wanted to ensure that at least two features that most people use are added and working, and that is where I will leave it. If I find any major bugs I'll update again.
The script only produces a report once there has been a complete run. The real learning begins when it creates the following files in the Results folder:
- champion.json
- historical_runs.jsonl
- framework_directives.json
These .json and .jsonl files will be used by the AI to learn from.
Over time the hybrid strategies will be developed, so once you see these in the testing you know that you're making progress.
You can add as many assets as you want into your folder; often the more the better. I had 4 pairs originally, now I have 8 because I wanted negatively correlated pairs.
If you think that the strategies are very similar after a while, change the prompt. E.g. if you direct the AI to just 'make a profit', then it will take huge risks and most likely have large drawdowns.
The trick is to find the right balance in the prompts to ensure that robust testing takes place for all market conditions.
If you are looking for a specific market edge, by all means load up the prompts to just find that 'edge'.
CRITICAL - !! MODEL QUALITY GATE FAILED !!
This is part of the learning system; it is not an error. You will see many of these.
The Quality Gate acts as a disciplined supervisor. It ensures that only models with a verifiable minimum standard of performance are tested, and it implements a strict, intelligent retry-or-abandon policy to avoid wasting time on hopeless configurations and to maintain the statistical validity of the overall backtest.
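The retry-or-abandon policy can be pictured with a small sketch. The names MODEL_QUALITY_THRESHOLD and MAX_TRAINING_RETRIES_PER_CYCLE come from the step-by-step explanation further down this post; their values here, and the train_fn / suggest_fn callables, are hypothetical stand-ins and not the script's actual code.

```python
from typing import Callable, Optional

MODEL_QUALITY_THRESHOLD = 0.55       # hypothetical value, for illustration only
MAX_TRAINING_RETRIES_PER_CYCLE = 2   # hypothetical value, for illustration only


def train_with_quality_gate(
    train_fn: Callable[[dict], float],
    suggest_fn: Callable[[dict, float], dict],
    params: dict,
) -> Optional[dict]:
    """Return the params that passed the gate, or None if the cycle is abandoned.

    train_fn stands in for model training and returns a validation score.
    suggest_fn stands in for asking the AI for new parameters after a failure.
    """
    for attempt in range(MAX_TRAINING_RETRIES_PER_CYCLE + 1):
        score = train_fn(params)
        if score >= MODEL_QUALITY_THRESHOLD:
            return params  # gate passed, this model may be backtested
        if attempt < MAX_TRAINING_RETRIES_PER_CYCLE:
            params = suggest_fn(params, score)  # retry with new parameters
    return None  # abandon: the cycle records no trades
```

Returning None rather than a weak model is what keeps a failed configuration from trading at all in that cycle.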
Results too good to be true (mainly over-fitting)?
Is it a data leak?
Are the calculations correct?
Are you just using the same sample data over and over again? (this shouldn't happen with the script as it chooses random starting points each cycle)
Not seeing results or changes?
My main suggestion, if you have tried most other things, including many runs of the script, is to ensure that you have many different strategies in the strategy_playbook.json in the Results folder.
I personally just scrape the internet and then have an LLM (Google Gemini) 'convert these into strategies so they are compatible with the script' <<< That's the prompt hint (you do need to include the script in the prompt)
Loading more strategies into the playbook allows the AI to have a broader selection to train with and then create hybrid strategies later.
Example: Starting model training using strategy: 'A hybrid that trades high-volatility breakouts (ATR, Bollinger) but only in the direction of the long-term daily trend.'
Drawdown in the script is currently set at a maximum of 15% daily. The circuit breaker will trip (stop the trading) during the training if that happens. (The DD% maximum is dynamic; you can hard-code the max% or change the 15%.)
That is not always a bad thing, as over time the AI will incorporate this into its risk management learning and integrate those learnings into a strategy that stops trades from blowing out.
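As a rough illustration of how a drawdown circuit breaker like this can work, here is a minimal sketch. It assumes drawdown is measured from the highest equity seen so far in the monitored period; the script may measure it differently, and the function name is made up for this example.

```python
def run_with_circuit_breaker(equity_points, max_dd=0.15):
    """Walk an equity series and stop once drawdown from the peak exceeds max_dd.

    Returns (index_where_processing_stopped, tripped).
    Assumption: drawdown = (peak equity - current equity) / peak equity.
    """
    if not equity_points:
        raise ValueError("equity_points must not be empty")
    peak = equity_points[0]
    for i, equity in enumerate(equity_points):
        peak = max(peak, equity)
        drawdown = (peak - equity) / peak
        if drawdown > max_dd:
            return i, True  # breaker trips: stop trading here
    return len(equity_points) - 1, False


# Peak is 10500; 8900 is a 15.2% drop from it, so the breaker trips at index 3.
print(run_with_circuit_breaker([10000, 10500, 9500, 8900, 9200]))  # (3, True)
```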
Seeing lots of Partial Take Profit in the tests?
Again, not a bad thing, as it will learn over time to move to breakeven at the right time to minimise losses.
The script is designed to also change its training if it sees that it has had too many of the same strategy.
The log output: WARNING - ! STRATEGIC INTERVENTION !: Current strategy has failed repeatedly. Engaging AI to select a new strategy.
Cycle Changes:
As a note, when the script starts it will analyse the data (CSVs) and then select what it will use to train with.
This selection is then matched up with a cycle frequency in days, from 90D to 7D. Overall it doesn't really affect the training quality. However, if it is 90D then you will have 13 cycles; if it is 7D, then you have to wait for 158 cycles.
Remember that a full 'Run' has not been completed until all the cycles have finished, the reports have been produced, and the findings and learnings have been exported as .json and .jsonl. (This is why it takes a while to test and debug.)
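The relationship between cycle length and cycle count is simple division. The sketch below assumes a forward-testing span of 1106 days, a hypothetical figure chosen only because it reproduces both counts quoted above; the real span depends on your data, and the assumption that a partial final cycle still counts is mine.

```python
def cycle_count(forward_days: int, cycle_days: int) -> int:
    """Number of walk-forward cycles, counting a partial final cycle as one."""
    return -(-forward_days // cycle_days)  # integer ceiling division


FORWARD_DAYS = 1106  # hypothetical span, not taken from the script

print(cycle_count(FORWARD_DAYS, 90))  # 13
print(cycle_count(FORWARD_DAYS, 7))   # 158
```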
Additional Ideas To Train With: (I won't be adding these into the script)
Slippage
News events
Calendar events
Social Media
Other asset classes
Development and the API:
I use Google Gemini 2.5 Pro Preview for development (both gemini.google.com and aistudio.google.com), and in the script the Flash 2.0 API with 1.5 as a backup. Yes, you can change this API to whatever you want.
I designed the script's calls around the daily rate limits of the free Flash 2.0 API; 5 minutes between each call is good for staying reasonably within those limits.
The script accounts for this by setting the minimum interval between calls to 300 seconds (5 minutes). Before it calls the API again, it calculates the time since the last call and waits for the remainder if needed.
The three types of API messages you'll see are:
Time since last API call: 1.5 seconds.
Waiting for 298.5 seconds to respect the 300s interval...
or
Time since last API call (1136.9s) exceeds interval. No wait needed.
During busy times you'll see '503 errors', this just means the server is overloaded.
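The 300-second spacing between calls described above can be sketched as follows. The function and constant names are hypothetical and this is not the script's actual code, just one way to implement "wait out the remainder of the interval".

```python
import time

API_INTERVAL_SECONDS = 300  # 5 minutes, as described in the post


def seconds_to_wait(last_call_ts, now, interval=API_INTERVAL_SECONDS):
    """How long to sleep so that at least `interval` seconds separate two calls."""
    if last_call_ts is None:  # no previous call, go straight ahead
        return 0.0
    elapsed = now - last_call_ts
    return max(0.0, interval - elapsed)


def wait_for_api_slot(last_call_ts, interval=API_INTERVAL_SECONDS,
                      sleep=time.sleep, clock=time.monotonic):
    """Block until the interval has passed, then return the new call timestamp."""
    wait = seconds_to_wait(last_call_ts, clock(), interval)
    if wait > 0:
        sleep(wait)
    return clock()


print(seconds_to_wait(100.0, 101.5))   # 298.5
print(seconds_to_wait(0.0, 1136.9))    # 0.0
```

The two printed values correspond to the "waiting for 298.5 seconds" and "no wait needed" cases in the log messages above.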
Attachment:
End_To_End_Advanced_ML_Trading_Framework_PRO_V185_API_Fallback.py.txt (just remove the .txt)
Detailed Step-by-Step Explanation
The script's execution is orchestrated by the main() function and primarily carried out within the run_single_instance function.
1. Initialization and Setup (The main function)
The script begins by preparing its environment.
- Load Persistent Data: Before a run starts, the main() function loads crucial historical data and configurations that persist between runs.
- framework_history: Using load_memory, it reads historical_runs.jsonl (a log of every past run) and champion.json (the configuration and result of the single best-performing run to date).
- playbook: Using initialize_playbook, it loads strategy_playbook.json. This is a vital file containing a dictionary of pre-defined trading strategies (e.g., "TrendPullback", "RangeBound"), their default features, and characteristics. This gives the AI a menu of options to choose from.
- nickname_ledger: It loads a simple JSON file to remember the cool codenames it generates for new script versions.
- Start Run Loop: The script enters a while loop, allowing it to run continuously as a daemon if configured. For a single execution, this loop runs once.
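To make the persistent-data step concrete, here is a minimal sketch of reading the two memory files. The function name and the returned dictionary keys are hypothetical; the script's own load_memory may be structured differently.

```python
import json
from pathlib import Path


def load_memory_sketch(results_dir):
    """Read historical_runs.jsonl (one JSON object per line) and champion.json.

    Missing files are treated as 'no history yet', which is what a first run sees.
    """
    results = Path(results_dir)

    history = []
    runs_file = results / "historical_runs.jsonl"
    if runs_file.exists():
        with runs_file.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:  # skip blank lines
                    history.append(json.loads(line))

    champion = None
    champion_file = results / "champion.json"
    if champion_file.exists():
        champion = json.loads(champion_file.read_text(encoding="utf-8"))

    return {"historical_runs": history, "champion": champion}
```

The .jsonl format is convenient here because each finished run can be appended as a single line without rewriting the whole file.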
2. Data Preparation and Initial AI Analysis (Start of run_single_instance)
This phase is about understanding the market and getting the initial strategy from the AI.
- Load and Process Data (DataLoader): The DataLoader class scans the BASE_PATH for all market data files (e.g., EURUSD_H1.csv, GBPUSD_H4.csv). It parses them, standardizes column names, and combines them into separate DataFrames for each timeframe (H1, H4, D1, etc.).
- Feature Engineering (FeatureEngineer): This is a critical step where raw data is turned into potentially predictive signals. The FeatureEngineer class:
- Calculates dozens of technical indicators for the base timeframe (e.g., RSI, ADX, Bollinger Bands).
- Calculates "contextual" features from higher timeframes (e.g., the trend direction on the Daily chart) and merges them with the base data. This gives the model a multi-dimensional view of the market.
- Generates anomaly scores using an IsolationForest model to flag unusual market conditions.
- Initial AI Consultation (GeminiAnalyzer): The framework now consults the AI to decide how to proceed.
- It compiles a detailed prompt containing the market data summary, the framework's performance history (framework_history), and the available strategies (playbook).
- It calls the get_initial_run_setup method of the GeminiAnalyzer class. This method sends the prompt to the Gemini API.
- The AI's task is to analyze all this information and return a JSON object with the strategy_name it thinks is best for the current conditions, along with a complete set of starting parameters (selected_features, MAX_DD_PER_CYCLE, etc.).
- Finalize Configuration (ConfigModel): The AI's suggestions are used to create an instance of the ConfigModel. This Pydantic model validates, sanitizes, and holds all configuration parameters for the entire run, including file paths for saving results.
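The idea of validating the AI's JSON reply before using it can be sketched like this. The post says ConfigModel is a Pydantic model; the sketch below swaps in a plain standard-library dataclass so it runs without extra packages. The field names strategy_name, selected_features and MAX_DD_PER_CYCLE come from the text above, while the class name, function name and the clamping bounds are assumptions for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class RunSetup:
    strategy_name: str
    selected_features: list
    MAX_DD_PER_CYCLE: float


def parse_ai_setup(raw_json: str, playbook: dict) -> RunSetup:
    """Parse the AI's JSON reply and reject or sanitize unusable values."""
    data = json.loads(raw_json)

    name = data.get("strategy_name")
    if name not in playbook:
        raise ValueError(f"Unknown strategy: {name!r}")

    features = data.get("selected_features") or []
    if not features or not all(isinstance(f, str) for f in features):
        raise ValueError("selected_features must be a non-empty list of strings")

    # Hypothetical sanitizing rule: keep the drawdown limit between 1% and 15%.
    max_dd = float(data.get("MAX_DD_PER_CYCLE", 0.15))
    max_dd = min(max(max_dd, 0.01), 0.15)

    return RunSetup(name, list(features), max_dd)
```

For example, a reply asking for a 50% drawdown limit would be accepted but clamped to 0.15 under these assumed bounds, and a strategy name that is not in the playbook would be rejected outright.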
3. The Walk-Forward Loop
This is the core iterative process where the model is repeatedly trained and tested.
- Pre-Cycle Regime Analysis (V184): At the start of each new cycle, the framework performs a quick analysis of the most recent market data (e.g., last 30 days). It asks the AI via propose_regime_based_strategy_switch if the current strategy is still optimal or if a switch (e.g., from a trending to a ranging strategy) is warranted due to a change in the market's personality.
- Data Slicing: The historical data is split into a TRAINING_WINDOW (e.g., the last 365 days) and a forward testing period (e.g., the next 90 days).
- Model Training (ModelTrainer):
- The train method is called. It uses the powerful Optuna library to perform hyperparameter optimization, running many trials to find the best model settings.
- Quality Gate Check: After optimization, the model's performance on a validation set is checked against a MODEL_QUALITY_THRESHOLD.
- Training Retry Logic (V184): If the model fails the quality gate, the framework doesn't immediately give up. It tells the AI that training failed and asks for a new set of parameters to try again. This retry can happen up to MAX_TRAINING_RETRIES_PER_CYCLE times. If it still fails, the cycle is abandoned with a $0 PNL, preventing a bad model from trading.
- SHAP Value Generation: If training is successful, it calculates SHAP values to understand which features were most important to the model's predictions.
- Backtesting (Backtester):
- The newly trained model is used to make predictions on the unseen forward testing data.
- The run_backtest_chunk method simulates the trading process candle by candle. It manages an equity curve, opens positions, calculates risk, and closes trades based on Stop Loss or Take Profit levels.
- Circuit Breaker: It constantly monitors for a maximum drawdown (MAX_DD_PER_CYCLE). If the equity drops by more than this percentage during the cycle, all trades are closed, and the cycle ends prematurely. This is a critical risk management feature.
- Post-Cycle AI Analysis (GeminiAnalyzer):
- After each cycle, the AI is consulted again via analyze_cycle_and_suggest_changes. The results of the just-completed cycle are provided.
- If the cycle was successful, the AI might suggest minor tweaks to improve performance.
- If the circuit breaker was tripped, the strategy is put on "probation." The AI is instructed to propose changes that reduce risk. If it fails again while on probation, it gets "quarantined," and the AI is forced to pick a completely different strategy from the playbook using propose_strategic_intervention.
- Loop Continuation: The framework saves the results of the cycle and begins the next one, moving the training and testing windows forward in time.
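The way the training and testing windows move forward through time can be sketched as below. The 365-day and 90-day figures are the examples given in the text; the function name, the rolling (fixed-length) training window, and the half-open date ranges are assumptions for this sketch.

```python
from datetime import date, timedelta


def walk_forward_windows(data_start, data_end, train_days=365, test_days=90):
    """Return (train_start, test_start, test_end) tuples for each cycle.

    Training covers [train_start, test_start); testing covers [test_start, test_end).
    Training data always ends where the unseen test data begins.
    """
    windows = []
    test_start = data_start + timedelta(days=train_days)
    while test_start < data_end:
        train_start = test_start - timedelta(days=train_days)
        test_end = min(test_start + timedelta(days=test_days), data_end)
        windows.append((train_start, test_start, test_end))
        test_start = test_end  # slide both windows forward by one cycle
    return windows


start = date(2020, 1, 1)
for w in walk_forward_windows(start, start + timedelta(days=565)):
    print(w)
```

Because each test period starts exactly where its training period ends, the model is never evaluated on candles it was trained on, which is the point of the walk-forward design.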
4. Reporting and Memory
Once all walk-forward cycles are complete, the final phase begins.
- Aggregate and Report (PerformanceAnalyzer):
- The PerformanceAnalyzer class gathers the trade data from all cycles.
- It calculates a comprehensive list of over 30 performance metrics (e.g., Sharpe Ratio, Profit Factor, Max Drawdown).
- It generates an equity curve plot, a feature importance plot (from SHAP), and a detailed text-based report comparing the current run against the previous run and the all-time champion run.
- Save to Memory (save_run_to_memory):
- The complete summary of the run, including all parameters and final metrics, is appended to historical_runs.jsonl.
- The run's performance (specifically its MAR Ratio) is compared to the current champion. If it's better, this run's summary overwrites the champion.json file, crowning a new champion.
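The save-and-compare step can be sketched as follows. The MAR ratio is commonly defined as compound annual growth rate divided by maximum drawdown; the sketch assumes the run summary already carries it under a hypothetical "mar_ratio" key, and the function name is also made up here.

```python
import json
from pathlib import Path


def save_run_sketch(results_dir, summary):
    """Append the run to historical_runs.jsonl and update champion.json if it wins.

    Returns True when this run became the new champion.
    """
    results = Path(results_dir)
    results.mkdir(parents=True, exist_ok=True)

    # Every run is logged, win or lose.
    with (results / "historical_runs.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(summary) + "\n")

    champion_file = results / "champion.json"
    champion = None
    if champion_file.exists():
        champion = json.loads(champion_file.read_text(encoding="utf-8"))

    is_new_champion = champion is None or summary["mar_ratio"] > champion["mar_ratio"]
    if is_new_champion:
        champion_file.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return is_new_champion
```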
This entire process makes the framework highly adaptive and robust, using AI not just for prediction but for high-level strategy, risk management, and self-correction.