{quote} Are you trying the first iterations that I shared, Gemini Commander? Let me know and I'll post what I have. I'm not developing that further until I finish the ML Framework.
- #82
- Jun 21, 2025 9:21pm
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
The framework (script) is All-In-One.
@amvt
You'll find the backtester code block at line 1753 in V209. I have broken up the framework into modular sections to make it easy to read and find things.
Inserted Code
# =============================================================================
# 7. BACKTESTER & 8. PERFORMANCE ANALYZER
# =============================================================================
# Imports this excerpt relies on; in the full script they sit at the top of the file.
# `logger` and `ConfigModel` are defined earlier in the framework and are not shown here.
import json
import os
import random
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple, Union

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import yfinance as yf


class Backtester:
    def __init__(self,config:ConfigModel):
        self.config=config
        self.is_meta_model = False
        self.is_transformer_model = False
        self.use_tp_ladder = self.config.USE_TP_LADDER
        if self.use_tp_ladder:
            if len(self.config.TP_LADDER_LEVELS_PCT) != len(self.config.TP_LADDER_RISK_MULTIPLIERS):
                logger.error("TP Ladder config error: 'TP_LADDER_LEVELS_PCT' and 'TP_LADDER_RISK_MULTIPLIERS' must have the same length. Disabling ladder.")
                self.use_tp_ladder = False
            elif not np.isclose(sum(self.config.TP_LADDER_LEVELS_PCT), 1.0):
                logger.error(f"TP Ladder config error: 'TP_LADDER_LEVELS_PCT' sum ({sum(self.config.TP_LADDER_LEVELS_PCT)}) is not 1.0. Disabling ladder.")
                self.use_tp_ladder = False
            else:
                logger.info("Take-Profit Ladder is ENABLED. Standard partial profit logic will be skipped.")

    def _get_tiered_risk_params(self, equity: float) -> Tuple[float, int]:
        """Looks up risk percentage and max trades from the tiered config."""
        sorted_tiers = sorted(self.config.TIERED_RISK_CONFIG.keys())
        for tier_cap in sorted_tiers:
            if equity <= tier_cap:
                tier_settings = self.config.TIERED_RISK_CONFIG[tier_cap]
                profile_settings = tier_settings.get(self.config.RISK_PROFILE, tier_settings['Medium'])
                return profile_settings['risk_pct'], profile_settings['pairs']
        highest_tier_cap = sorted_tiers[-1]
        tier_settings = self.config.TIERED_RISK_CONFIG[highest_tier_cap]
        profile_settings = tier_settings.get(self.config.RISK_PROFILE, tier_settings['Medium'])
        return profile_settings['risk_pct'], profile_settings['pairs']

    def _calculate_realistic_costs(self, candle: Dict, on_exit: bool = False) -> Tuple[float, float]:
        """Calculates dynamic spread and variable slippage."""
        symbol = candle['Symbol']
        point_size = 0.0001 if 'JPY' not in symbol and candle.get('Open', 1) < 50 else 0.01
        # Spread cost is only applied on entry
        spread_cost = 0
        if not on_exit:
            # --- FIX: Robustly get spread info to prevent KeyError ---
            if symbol in self.config.SPREAD_CONFIG:
                spread_info = self.config.SPREAD_CONFIG[symbol]
            else:
                # Fallback to 'default' key, with a hardcoded ultimate fallback
                spread_info = self.config.SPREAD_CONFIG.get('default', {'normal_pips': 1.8, 'volatile_pips': 5.5})
            # --- END FIX ---
            vol_rank = candle.get('market_volatility_index', 0.5)
            spread_pips = spread_info.get('volatile_pips', 5.5) if vol_rank > 0.8 else spread_info.get('normal_pips', 1.8)
            spread_cost = spread_pips * point_size
        slippage_cost = 0
        if self.config.USE_VARIABLE_SLIPPAGE:
            atr = candle.get('ATR', 0)
            vol_rank = candle.get('market_volatility_index', 0.5)
            # Slippage can be higher on panicked stop-loss exits
            random_factor = random.uniform(0.1, 1.2 if on_exit else 1.0) * self.config.SLIPPAGE_VOLATILITY_FACTOR
            slippage_cost = atr * vol_rank * random_factor
        return spread_cost, slippage_cost

    def run_backtest_chunk(self, df_chunk_in: pd.DataFrame, confidence_threshold: float, initial_equity: float, strategy_details: Dict) -> Tuple[pd.DataFrame, pd.Series, bool, Optional[Dict], Dict]:
        if df_chunk_in.empty:
            return pd.DataFrame(), pd.Series([initial_equity]), False, None, {}
        df_chunk = df_chunk_in.copy()
        self.is_meta_model = strategy_details.get("requires_meta_labeling", False)
        self.is_transformer_model = strategy_details.get("requires_transformer", False)
        trades, equity, equity_curve, open_positions = [], initial_equity, [initial_equity], {}
        chunk_peak_equity = initial_equity
        circuit_breaker_tripped = False
        breaker_context = None
        candles = df_chunk.reset_index().to_dict('records')
        daily_dd_report = {}
        current_day = None
        day_start_equity = initial_equity
        day_peak_equity = initial_equity

        def finalize_day_metrics(day_to_finalize, equity_at_close):
            if day_to_finalize is None: return
            daily_pnl = equity_at_close - day_start_equity
            daily_dd_pct = ((day_peak_equity - equity_at_close) / day_peak_equity) * 100 if day_peak_equity > 0 else 0
            daily_dd_report[day_to_finalize.isoformat()] = {'pnl': round(daily_pnl, 2), 'drawdown_pct': round(daily_dd_pct, 2)}

        def close_trade(pos_to_close, exit_price, exit_reason, candle_info):
            nonlocal equity
            pnl = (exit_price - pos_to_close['entry_price']) * pos_to_close['direction'] * pos_to_close['lot_size'] * self.config.CONTRACT_SIZE
            commission_cost = self.config.COMMISSION_PER_LOT * pos_to_close['lot_size'] * 2 # Entry and Exit
            net_pnl = pnl - commission_cost
            equity += net_pnl
            mae = abs(pos_to_close['mae_price'] - pos_to_close['entry_price'])
            mfe = abs(pos_to_close['mfe_price'] - pos_to_close['entry_price'])
            trade_record = {
                'ExecTime': candle_info['Timestamp'], 'Symbol': pos_to_close['symbol'], 'PNL': net_pnl,
                'Equity': equity, 'Confidence': pos_to_close['confidence'],
                'Direction': pos_to_close['direction'], 'ExitReason': exit_reason,
                'MAE': round(mae, 5), 'MFE': round(mfe, 5)
            }
            trades.append(trade_record)
            equity_curve.append(equity)
            return net_pnl

        for i in range(1, len(candles)):
            current_candle = candles[i]
            prev_candle = candles[i-1]
            candle_date = current_candle['Timestamp'].date()
            if candle_date != current_day:
                finalize_day_metrics(current_day, equity)
                current_day, day_start_equity, day_peak_equity = candle_date, equity, equity
            if not circuit_breaker_tripped:
                day_peak_equity = max(day_peak_equity, equity)
                chunk_peak_equity = max(chunk_peak_equity, equity)
                if equity > 0 and chunk_peak_equity > 0 and (chunk_peak_equity - equity) / chunk_peak_equity > self.config.MAX_DD_PER_CYCLE:
                    logger.warning(f" - CYCLE CIRCUIT BREAKER TRIPPED! Drawdown exceeded {self.config.MAX_DD_PER_CYCLE:.0%} for this cycle. Closing all positions.")
                    circuit_breaker_tripped = True
                    trade_df = pd.DataFrame(trades)
                    breaker_context = {"num_trades_before_trip": len(trade_df), "pnl_before_trip": round(trade_df['PNL'].sum(), 2), "last_5_trades_pnl": [round(p, 2) for p in trade_df['PNL'].tail(5).tolist()]} if not trade_df.empty else {}
                    # Close all open positions at current candle's open
                    for sym, pos in list(open_positions.items()):
                        close_trade(pos, current_candle['Open'], "Circuit Breaker", current_candle)
                        del open_positions[sym]
                    continue # Skip new trade checks for this candle
            if equity <= 0:
                logger.critical(" - ACCOUNT BLOWN!")
                break
            # Update MFE/MAE for open positions
            for symbol, pos in open_positions.items():
                if pos['direction'] == 1:
                    pos['mfe_price'] = max(pos['mfe_price'], current_candle['High'])
                    pos['mae_price'] = min(pos['mae_price'], current_candle['Low'])
                else: # Short
                    pos['mfe_price'] = min(pos['mfe_price'], current_candle['Low'])
                    pos['mae_price'] = max(pos['mae_price'], current_candle['High'])
            symbols_to_close = []
            for symbol, pos in open_positions.items():
                exit_price, exit_reason = None, None
                candle_low, candle_high = current_candle['Low'], current_candle['High']
                # Pessimistic exit check: If both SL and TP are hit in the same bar, assume SL is hit first
                sl_hit = (pos['direction'] == 1 and candle_low <= pos['sl']) or \
                         (pos['direction'] == -1 and candle_high >= pos['sl'])
                tp_hit = (pos['direction'] == 1 and candle_high >= pos['tp']) or \
                         (pos['direction'] == -1 and candle_low <= pos['tp'])
                if sl_hit and tp_hit:
                    exit_reason = "Stop Loss (Pessimistic)"
                    _, sl_slippage = self._calculate_realistic_costs(current_candle, on_exit=True)
                    exit_price = pos['sl'] - (sl_slippage * pos['direction'])
                elif sl_hit:
                    exit_reason = "Stop Loss"
                    _, sl_slippage = self._calculate_realistic_costs(current_candle, on_exit=True)
                    exit_price = pos['sl'] - (sl_slippage * pos['direction'])
                elif tp_hit:
                    exit_reason = "Take Profit"
                    exit_price = pos['tp'] # Assume no slippage on limit orders
                # Standard TP/SL exit logic
                if exit_price is not None:
                    close_trade(pos, exit_price, exit_reason, current_candle)
                    symbols_to_close.append(symbol)
                    if equity <= 0: continue
            for symbol in set(symbols_to_close):
                if symbol in open_positions:
                    del open_positions[symbol]
            # --- New Trade Entry Logic ---
            symbol = prev_candle['Symbol']
            if self.config.USE_TIERED_RISK:
                base_risk_pct, max_concurrent_trades = self._get_tiered_risk_params(equity)
            else:
                base_risk_pct, max_concurrent_trades = self.config.BASE_RISK_PER_TRADE_PCT, self.config.MAX_CONCURRENT_TRADES
            if not circuit_breaker_tripped and symbol not in open_positions and len(open_positions) < max_concurrent_trades:
                if prev_candle.get('anomaly_score') == -1: continue
                vol_idx = prev_candle.get('market_volatility_index', 0.5)
                if not (self.config.MIN_VOLATILITY_RANK <= vol_idx <= self.config.MAX_VOLATILITY_RANK): continue
                direction, confidence = 0, 0
                # Meta-model or Standard Model Signal
                if not self.is_transformer_model:
                    if self.is_meta_model:
                        prob_take_trade = prev_candle.get('prob_1', 0)
                        primary_signal = prev_candle.get('primary_model_signal', 0)
                        if prob_take_trade > confidence_threshold and primary_signal != 0:
                            direction, confidence = int(np.sign(primary_signal)), prob_take_trade
                    else: # Standard model
                        if 'prob_short' in prev_candle:
                            probs=np.array([prev_candle['prob_short'],prev_candle['prob_hold'],prev_candle['prob_long']])
                            max_confidence=np.max(probs)
                            if max_confidence >= confidence_threshold:
                                pred_class=np.argmax(probs)
                                direction=1 if pred_class==2 else -1 if pred_class==0 else 0
                                confidence = max_confidence
                if direction != 0:
                    atr = prev_candle.get('ATR',0)
                    if pd.isna(atr) or atr<=1e-9: continue
                    # Determine risk tier
                    tier_name = 'standard'
                    if confidence >= self.config.CONFIDENCE_TIERS['ultra_high']['min']: tier_name = 'ultra_high'
                    elif confidence >= self.config.CONFIDENCE_TIERS['high']['min']: tier_name = 'high'
                    tier = self.config.CONFIDENCE_TIERS[tier_name]
                    # Calculate position size
                    sl_dist = atr * 1.5
                    if sl_dist <= 0: continue
                    risk_per_trade_usd = equity * base_risk_pct * tier['risk_mult']
                    risk_per_trade_usd = min(risk_per_trade_usd, self.config.RISK_CAP_PER_TRADE_USD)
                    # Calculate lot size based on monetary risk.
                    # sl_dist is a price distance, so the loss per lot at the stop is
                    # sl_dist * CONTRACT_SIZE, which matches the PNL formula in close_trade().
                    risk_per_lot = sl_dist * self.config.CONTRACT_SIZE
                    if risk_per_lot <= 0: continue
                    lots = risk_per_trade_usd / risk_per_lot
                    lots = max(self.config.MIN_LOT_SIZE, round(lots / self.config.LOT_STEP) * self.config.LOT_STEP)
                    if lots < self.config.MIN_LOT_SIZE: continue
                    margin_required = (lots * self.config.CONTRACT_SIZE * current_candle['Open']) / self.config.LEVERAGE
                    used_margin = sum(p.get('margin_used', 0) for p in open_positions.values())
                    if (equity - used_margin) < margin_required: continue
                    # Calculate entry and exit prices
                    entry_price_base = current_candle['Open']
                    spread_cost, slippage_cost = self._calculate_realistic_costs(prev_candle)
                    entry_price = entry_price_base + ((spread_cost + slippage_cost) * direction)
                    sl_price = entry_price - sl_dist * direction
                    tp_price = entry_price + (sl_dist * tier['rr']) * direction
                    open_positions[symbol] = {
                        'symbol': symbol, 'direction': direction, 'entry_price': entry_price,
                        'sl': sl_price, 'tp': tp_price, 'confidence': confidence, 'lot_size': lots,
                        'margin_used': margin_required, 'mfe_price': entry_price, 'mae_price': entry_price
                    }
            day_peak_equity = max(day_peak_equity, equity)
        finalize_day_metrics(current_day, equity)
        return pd.DataFrame(trades), pd.Series(equity_curve), circuit_breaker_tripped, breaker_context, daily_dd_report


class PerformanceAnalyzer:
    def __init__(self,config:ConfigModel):
        self.config=config

    def generate_full_report(self,trades_df:Optional[pd.DataFrame],equity_curve:Optional[pd.Series],cycle_metrics:List[Dict],aggregated_shap:Optional[pd.DataFrame]=None, framework_memory:Optional[Dict]=None, aggregated_daily_dd:Optional[List[Dict]]=None) -> Dict[str, Any]:
        logger.info("-> Stage 4: Generating Final Performance Report...")
        if equity_curve is not None and len(equity_curve) > 1: self.plot_equity_curve(equity_curve)
        if aggregated_shap is not None: self.plot_shap_summary(aggregated_shap)
        metrics = self._calculate_metrics(trades_df, equity_curve) if trades_df is not None and not trades_df.empty else {}
        self.generate_text_report(metrics, cycle_metrics, aggregated_shap, framework_memory, aggregated_daily_dd)
        logger.info(f"[SUCCESS] Final report generated and saved to: {self.config.REPORT_SAVE_PATH}")
        return metrics

    def plot_equity_curve(self,equity_curve:pd.Series):
        plt.style.use('seaborn-v0_8-darkgrid')
        plt.figure(figsize=(16,8))
        plt.plot(equity_curve.values,color='dodgerblue',linewidth=2)
        plt.title(f"{self.config.nickname or self.config.REPORT_LABEL} - Walk-Forward Equity Curve",fontsize=16,weight='bold')
        plt.xlabel("Trade Event Number (including partial closes)",fontsize=12)
        plt.ylabel("Equity ($)",fontsize=12)
        plt.grid(True,which='both',linestyle=':')
        try:
            plt.savefig(self.config.PLOT_SAVE_PATH)
            plt.close()
            logger.info(f" - Equity curve plot saved to: {self.config.PLOT_SAVE_PATH}")
        except Exception as e:
            logger.error(f" - Failed to save equity curve plot: {e}")

    def plot_shap_summary(self,shap_summary:pd.DataFrame):
        plt.style.use('seaborn-v0_8-darkgrid')
        plt.figure(figsize=(12,10))
        shap_summary.head(20).sort_values(by='SHAP_Importance').plot(kind='barh',legend=False,color='mediumseagreen')
        title_str = f"{self.config.nickname or self.config.REPORT_LABEL} ({self.config.strategy_name}) - Aggregated Feature Importance"
        plt.title(title_str,fontsize=16,weight='bold')
        plt.xlabel("Mean Absolute SHAP Value",fontsize=12)
        plt.ylabel("Feature",fontsize=12)
        plt.tight_layout()
        try:
            plt.savefig(self.config.SHAP_PLOT_PATH)
            plt.close()
            logger.info(f" - SHAP summary plot saved to: {self.config.SHAP_PLOT_PATH}")
        except Exception as e:
            logger.error(f" - Failed to save SHAP plot: {e}")

    def _calculate_metrics(self,trades_df:pd.DataFrame,equity_curve:pd.Series)->Dict[str,Any]:
        m={}
        m['initial_capital']=self.config.INITIAL_CAPITAL
        m['ending_capital']=equity_curve.iloc[-1]
        m['total_net_profit']=m['ending_capital']-m['initial_capital']
        m['net_profit_pct']=(m['total_net_profit']/m['initial_capital']) if m['initial_capital']>0 else 0
        returns=trades_df['PNL']/m['initial_capital']
        wins=trades_df[trades_df['PNL']>0]
        losses=trades_df[trades_df['PNL']<0]
        m['gross_profit']=wins['PNL'].sum()
        m['gross_loss']=abs(losses['PNL'].sum())
        m['profit_factor']=m['gross_profit']/m['gross_loss'] if m['gross_loss']>0 else np.inf
        m['total_trade_events']=len(trades_df)
        final_exits_df = trades_df[trades_df['ExitReason'].str.contains("Stop Loss|Take Profit", na=False)]
        m['total_trades'] = len(final_exits_df)
        m['winning_trades']=len(final_exits_df[final_exits_df['PNL'] > 0])
        m['losing_trades']=len(final_exits_df[final_exits_df['PNL'] < 0])
        m['win_rate']=m['winning_trades']/m['total_trades'] if m['total_trades']>0 else 0
        m['avg_win_amount']=wins['PNL'].mean() if len(wins)>0 else 0
        m['avg_loss_amount']=abs(losses['PNL'].mean()) if len(losses)>0 else 0
        avg_full_win = final_exits_df[final_exits_df['PNL'] > 0]['PNL'].mean() if len(final_exits_df[final_exits_df['PNL'] > 0]) > 0 else 0
        avg_full_loss = abs(final_exits_df[final_exits_df['PNL'] < 0]['PNL'].mean()) if len(final_exits_df[final_exits_df['PNL'] < 0]) > 0 else 0
        m['payoff_ratio']=avg_full_win/avg_full_loss if avg_full_loss > 0 else np.inf
        m['expected_payoff']=(m['win_rate']*avg_full_win)-((1-m['win_rate'])*avg_full_loss) if m['total_trades']>0 else 0
        running_max=equity_curve.cummax()
        drawdown_abs=running_max-equity_curve
        m['max_drawdown_abs']=drawdown_abs.max() if not drawdown_abs.empty else 0
        m['max_drawdown_pct']=((drawdown_abs/running_max).replace([np.inf,-np.inf],0).max())*100
        exec_times=pd.to_datetime(trades_df['ExecTime']).dt.tz_localize(None)
        years=((exec_times.max()-exec_times.min()).days/365.25) if not trades_df.empty else 1
        years = max(years, 1/365.25)
        m['cagr']=(((m['ending_capital']/m['initial_capital'])**(1/years))-1) if years>0 and m['initial_capital']>0 else 0
        pnl_std=returns.std()
        m['sharpe_ratio']=(returns.mean()/pnl_std)*np.sqrt(252*24*4) if pnl_std>0 else 0
        downside_returns=returns[returns<0]
        downside_std=downside_returns.std()
        m['sortino_ratio']=(returns.mean()/downside_std)*np.sqrt(252*24*4) if downside_std>0 else np.inf
        m['calmar_ratio']=m['cagr']/(m['max_drawdown_pct']/100) if m['max_drawdown_pct']>0 else np.inf
        m['mar_ratio']=m['calmar_ratio']
        m['recovery_factor']=m['total_net_profit']/m['max_drawdown_abs'] if m['max_drawdown_abs']>0 else np.inf
        pnl_series = final_exits_df['PNL']
        win_streaks = (pnl_series > 0).astype(int).groupby((pnl_series <= 0).cumsum()).cumsum()
        loss_streaks = (pnl_series < 0).astype(int).groupby((pnl_series >= 0).cumsum()).cumsum()
        m['longest_win_streak'] = win_streaks.max() if not win_streaks.empty else 0
        m['longest_loss_streak'] = loss_streaks.max() if not loss_streaks.empty else 0
        return m

    def _get_comparison_block(self, metrics: Dict, memory: Dict, ledger: Dict, width: int) -> str:
        champion = memory.get('champion_config')
        historical_runs = memory.get('historical_runs', [])
        previous_run = historical_runs[-1] if historical_runs else None

        def get_data(source: Optional[Dict], key: str, is_percent: bool = False) -> str:
            if not source: return "N/A"
            val = source.get(key) if isinstance(source, dict) and key in source else source.get("final_metrics", {}).get(key) if isinstance(source, dict) else None
            if val is None or not isinstance(val, (int, float)): return "N/A"
            return f"{val:.2f}%" if is_percent else f"{val:.2f}"

        def get_info(source: Optional[Union[Dict, ConfigModel]], key: str) -> str:
            if not source: return "N/A"
            if hasattr(source, key):
                return str(getattr(source, key, 'N/A'))
            elif isinstance(source, dict):
                return str(source.get(key, 'N/A'))
            return "N/A"

        def get_nickname(source: Optional[Union[Dict, ConfigModel]]) -> str:
            if not source: return "N/A"
            version_key = 'REPORT_LABEL' if hasattr(source, 'REPORT_LABEL') else 'script_version'
            version = get_info(source, version_key)
            return ledger.get(version, "N/A")

        c_nick, p_nick, champ_nick = get_nickname(self.config), get_nickname(previous_run), get_nickname(champion)
        c_strat, p_strat, champ_strat = get_info(self.config, 'strategy_name'), get_info(previous_run, 'strategy_name'), get_info(champion, 'strategy_name')
        c_mar, p_mar, champ_mar = get_data(metrics, 'mar_ratio'), get_data(previous_run, 'mar_ratio'), get_data(champion, 'mar_ratio')
        c_mdd, p_mdd, champ_mdd = get_data(metrics, 'max_drawdown_pct', True), get_data(previous_run, 'max_drawdown_pct', True), get_data(champion, 'max_drawdown_pct', True)
        c_pf, p_pf, champ_pf = get_data(metrics, 'profit_factor'), get_data(previous_run, 'profit_factor'), get_data(champion, 'profit_factor')
        col_w = (width - 5) // 4
        header = f"| {'Metric'.ljust(col_w-1)}|{'Current Run'.center(col_w)}|{'Previous Run'.center(col_w)}|{'All-Time Champion'.center(col_w)}|"
        sep = f"+{'-'*(col_w)}+{'-'*(col_w)}+{'-'*(col_w)}+{'-'*(col_w)}+"
        rows = [
            f"| {'Run Nickname'.ljust(col_w-1)}|{c_nick.center(col_w)}|{p_nick.center(col_w)}|{champ_nick.center(col_w)}|",
            f"| {'Strategy'.ljust(col_w-1)}|{c_strat.center(col_w)}|{p_strat.center(col_w)}|{champ_strat.center(col_w)}|",
            f"| {'MAR Ratio'.ljust(col_w-1)}|{c_mar.center(col_w)}|{p_mar.center(col_w)}|{champ_mar.center(col_w)}|",
            f"| {'Max Drawdown'.ljust(col_w-1)}|{c_mdd.center(col_w)}|{p_mdd.center(col_w)}|{champ_mdd.center(col_w)}|",
            f"| {'Profit Factor'.ljust(col_w-1)}|{c_pf.center(col_w)}|{p_pf.center(col_w)}|{champ_pf.center(col_w)}|"
        ]
        return "\n".join([header, sep] + rows)

    def generate_text_report(self, m: Dict[str, Any], cycle_metrics: List[Dict], aggregated_shap: Optional[pd.DataFrame] = None, framework_memory: Optional[Dict] = None, aggregated_daily_dd: Optional[List[Dict]] = None):
        WIDTH = 90
        def _box_top(w): return f"+{'-' * (w-2)}+"
        def _box_mid(w): return f"+{'-' * (w-2)}+"
        def _box_bot(w): return f"+{'-' * (w-2)}+"
        def _box_line(text, w):
            padding = w - 4 - len(text)
            return f"| {text}{' ' * padding} |" if padding >= 0 else f"| {text[:w-5]}... |"
        def _box_title(title, w): return f"| {title.center(w-4)} |"
        def _box_text_kv(key, val, w):
            val_str = str(val)
            key_len = len(key)
            padding = w - 4 - key_len - len(val_str)
            return f"| {key}{' ' * padding}{val_str} |"

        ledger = {}
        if self.config.NICKNAME_LEDGER_PATH and os.path.exists(self.config.NICKNAME_LEDGER_PATH):
            try:
                with open(self.config.NICKNAME_LEDGER_PATH, 'r') as f: ledger = json.load(f)
            except (json.JSONDecodeError, IOError): logger.warning("Could not load nickname ledger for reporting.")
        report = [_box_top(WIDTH)]
        report.append(_box_title('ADAPTIVE WALK-FORWARD PERFORMANCE REPORT', WIDTH))
        report.append(_box_mid(WIDTH))
        report.append(_box_line(f"Nickname: {self.config.nickname or 'N/A'} ({self.config.strategy_name})", WIDTH))
        report.append(_box_line(f"Version: {self.config.REPORT_LABEL}", WIDTH))
        report.append(_box_line(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", WIDTH))
        if self.config.analysis_notes:
            report.append(_box_line(f"AI Notes: {self.config.analysis_notes}", WIDTH))
        if framework_memory:
            report.append(_box_mid(WIDTH))
            report.append(_box_title('I. PERFORMANCE vs. HISTORY', WIDTH))
            report.append(_box_mid(WIDTH))
            report.append(self._get_comparison_block(m, framework_memory, ledger, WIDTH))
        # Ratios are dimensionless, so only monetary values carry a "$" prefix.
        sections = {
            "II. EXECUTIVE SUMMARY": [
                (f"Initial Capital:", f"${m.get('initial_capital', 0):>15,.2f}"),
                (f"Ending Capital:", f"${m.get('ending_capital', 0):>15,.2f}"),
                (f"Total Net Profit:", f"${m.get('total_net_profit', 0):>15,.2f} ({m.get('net_profit_pct', 0):.2%})"),
                (f"Profit Factor:", f"{m.get('profit_factor', 0):>15.2f}"),
                (f"Win Rate (Full Trades):", f"{m.get('win_rate', 0):>15.2%}"),
                (f"Expected Payoff:", f"${m.get('expected_payoff', 0):>15.2f}")
            ],
            "III. CORE PERFORMANCE METRICS": [
                (f"Annual Return (CAGR):", f"{m.get('cagr', 0):>15.2%}"),
                (f"Sharpe Ratio (annual):", f"{m.get('sharpe_ratio', 0):>15.2f}"),
                (f"Sortino Ratio (annual):", f"{m.get('sortino_ratio', 0):>15.2f}"),
                (f"Calmar Ratio / MAR:", f"{m.get('mar_ratio', 0):>15.2f}")
            ],
            "IV. RISK & DRAWDOWN ANALYSIS": [
                (f"Max Drawdown (Cycle):", f"{m.get('max_drawdown_pct', 0):>15.2f}% (${m.get('max_drawdown_abs', 0):,.2f})"),
                (f"Recovery Factor:", f"{m.get('recovery_factor', 0):>15.2f}"),
                (f"Longest Losing Streak:", f"{m.get('longest_loss_streak', 0):>15} trades")
            ],
            "V. TRADE-LEVEL STATISTICS": [
                (f"Total Unique Trades:", f"{m.get('total_trades', 0):>15}"),
                (f"Total Trade Events (incl. partials):", f"{m.get('total_trade_events', 0):>15}"),
                (f"Average Win Event:", f"${m.get('avg_win_amount', 0):>15,.2f}"),
                (f"Average Loss Event:", f"${m.get('avg_loss_amount', 0):>15,.2f}"),
                (f"Payoff Ratio (Full Trades):", f"{m.get('payoff_ratio', 0):>15.2f}")
            ]
        }
        for title, data in sections.items():
            if not m: continue
            report.append(_box_mid(WIDTH))
            report.append(_box_title(title, WIDTH))
            report.append(_box_mid(WIDTH))
            for key, val in data: report.append(_box_text_kv(key, val, WIDTH))
        report.append(_box_mid(WIDTH))
        report.append(_box_title('VI. WALK-FORWARD CYCLE BREAKDOWN', WIDTH))
        report.append(_box_mid(WIDTH))
        cycle_df = pd.DataFrame(cycle_metrics)
        if not cycle_df.empty:
            if 'BreakerContext' in cycle_df.columns:
                # An empty context dict (breaker tripped before any trade) renders as ""
                # instead of trying to format the 'N/A' fallback with :.2f.
                cycle_df['BreakerContext'] = cycle_df['BreakerContext'].apply(
                    lambda x: f"Trades: {x.get('num_trades_before_trip', 'N/A')}, PNL: {x.get('pnl_before_trip', 0.0):.2f}" if isinstance(x, dict) and x else ""
                ).fillna('')
            if 'trade_summary' in cycle_df.columns:
                cycle_df['MAE/MFE (Losses)'] = cycle_df['trade_summary'].apply(
                    lambda s: f"${s.get('avg_mae_loss',0):.2f}/${s.get('avg_mfe_loss',0):.2f}" if isinstance(s, dict) else "N/A"
                )
                cycle_df.drop(columns=['trade_summary'], inplace=True)
            cycle_df_str = cycle_df.to_string(index=False)
        else:
            cycle_df_str = "No trades were executed."
        for line in cycle_df_str.split('\n'): report.append(_box_line(line, WIDTH))
        report.append(_box_mid(WIDTH))
        report.append(_box_title('VII. MODEL FEATURE IMPORTANCE (TOP 15)', WIDTH))
        report.append(_box_mid(WIDTH))
        shap_str = aggregated_shap.head(15).to_string() if aggregated_shap is not None else "SHAP summary was not generated."
        for line in shap_str.split('\n'): report.append(_box_line(line, WIDTH))
        if aggregated_daily_dd:
            report.append(_box_mid(WIDTH))
            report.append(_box_title('VIII. HIGH DAILY DRAWDOWN EVENTS (>15%)', WIDTH))
            report.append(_box_mid(WIDTH))
            high_dd_events = []
            for cycle_idx, cycle_dd_report in enumerate(aggregated_daily_dd):
                for day, data in cycle_dd_report.items():
                    if data['drawdown_pct'] > 15.0:
                        high_dd_events.append(f"Cycle {cycle_idx+1} | {day} | DD: {data['drawdown_pct']:.2f}% | PNL: ${data['pnl']:,.2f}")
            if high_dd_events:
                for event in high_dd_events:
                    report.append(_box_line(event, WIDTH))
            else:
                report.append(_box_line("No days with drawdown greater than 15% were recorded.", WIDTH))
        report.append(_box_bot(WIDTH))
        final_report = "\n".join(report)
        logger.info("\n" + final_report)
        try:
            with open(self.config.REPORT_SAVE_PATH,'w',encoding='utf-8') as f: f.write(final_report)
        except IOError as e: logger.error(f" - Failed to save text report: {e}",exc_info=True)


def get_macro_context_data() -> Dict[str, Any]:
    """
    Fetches the latest data for key macroeconomic indicators (VIX, DXY, US10Y),
    with robust error handling for data structure and content.
    """
    logger.info("-> Fetching external macroeconomic context data (VIX, DXY, US10Y)...")
    macro_context = {}
    tickers = {
        "VIX": "^VIX",
        "DXY": "DX-Y.NYB",
        "US10Y_YIELD": "^TNX"
    }
    for name, ticker in tickers.items():
        try:
            data = yf.download(ticker, period="2wk", progress=False)
            if not data.empty and len(data) > 5:
                close = data['Close']
                if isinstance(close, pd.DataFrame):
                    close = close.iloc[:, 0]
                latest_level = close.iloc[-1]
                one_week_ago_level = close.iloc[-6]
                if hasattr(one_week_ago_level, "item"): one_week_ago_level = one_week_ago_level.item()
                if hasattr(latest_level, "item"): latest_level = latest_level.item()
                if one_week_ago_level != 0:
                    week_change_pct = ((latest_level - one_week_ago_level) / one_week_ago_level) * 100
                else:
                    week_change_pct = 0.0
                macro_context[name] = {"level": round(latest_level, 2), "1_week_change_pct": round(week_change_pct, 2)}
            else:
                logger.warning(f" - Not enough data returned for {name} ({ticker}) to calculate 1-week change.")
                macro_context[name] = {"error": "Insufficient data"}
        except Exception as e:
            logger.error(f" - Failed to download or process macro data for {name} ({ticker}): {e}")
            macro_context[name] = {"error": str(e)}
    logger.info(f" - Macro context generated: {macro_context}")
    return macro_context

Summary of Changes: V208 vs. V209
The fundamental structure of executing walk-forward cycles and generating a final report remains the same in both versions. The key developments in V209 focus on making this process more efficient and autonomous by introducing feature caching and a significantly more intelligent system for handling failures mid-run.
Efficiency Enhancements
Feature Caching: V209 adds the ability to save the results of the computationally intensive feature engineering process to a .parquet file.
Cache Validation: On subsequent runs, V209 checks if the input data or key feature generation parameters have changed. If they have not, it loads the data from the cache, bypassing the need to re-calculate all features. This dramatically speeds up testing and is controlled by the USE_FEATURE_CACHING parameter. V208 lacked this caching ability and would re-calculate features on every run.
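As a rough illustration of the cache validation described above, the sketch below fingerprints the input file together with the feature parameters and compares the result with a fingerprint stored next to the cache. The helper names (cache_fingerprint, cache_is_valid, save_cache_meta), the JSON metadata file and the parameter names are assumptions made for this example, not names taken from V209, and the actual reading and writing of the .parquet file is left out.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_fingerprint(data_path: Path, feature_params: dict) -> str:
    """Hash the raw input bytes plus the feature parameters (hypothetical helper)."""
    digest = hashlib.sha256()
    digest.update(data_path.read_bytes())
    # sort_keys makes the hash independent of dict insertion order
    digest.update(json.dumps(feature_params, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def cache_is_valid(meta_path: Path, fingerprint: str) -> bool:
    """True only when a stored fingerprint exists and matches the current one."""
    if not meta_path.exists():
        return False
    try:
        stored = json.loads(meta_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False
    return stored.get("fingerprint") == fingerprint


def save_cache_meta(meta_path: Path, fingerprint: str) -> None:
    """Record the fingerprint once the features have been written to the cache."""
    meta_path.write_text(json.dumps({"fingerprint": fingerprint}), encoding="utf-8")


def demo() -> tuple:
    """Walk through a first run, an unchanged rerun, and a parameter change."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "prices.csv"
        meta = Path(tmp) / "features.meta.json"
        data.write_text("Timestamp,Close\n2025-01-01,1.1\n", encoding="utf-8")
        params = {"atr_period": 14, "rsi_period": 14}  # illustrative parameters

        fingerprint = cache_fingerprint(data, params)
        first_run = cache_is_valid(meta, fingerprint)  # no metadata yet
        save_cache_meta(meta, fingerprint)
        rerun = cache_is_valid(meta, cache_fingerprint(data, params))
        changed_params = {**params, "atr_period": 21}
        changed = cache_is_valid(meta, cache_fingerprint(data, changed_params))
        return first_run, rerun, changed
```

Hashing the parameters along with the data means that changing either one invalidates the cache, which is the behaviour the post describes for USE_FEATURE_CACHING.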
Improved Failure Handling & Autonomy
In-Cycle Retries: When a model fails to meet the minimum quality standard during a training cycle, V209 now attempts to salvage it. It engages the Gemini AI to suggest immediate parameter tweaks and retries the training up to three times within the same cycle. V208 would simply proceed with a failed or non-existent model for that cycle.
Strategic Intervention & Pivoting: If a strategy continues to fail even with AI-assisted retries, V209 implements a "Strategic Intervention".
The failing strategy is "quarantined" for the remainder of the run.
The AI is then tasked to select a completely new strategy from the playbook to continue the walk-forward analysis.
This allows V209 to dynamically pivot away from a failing approach mid-run, a capability V208 did not have.
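The retry and pivot flow above can be sketched roughly as follows. This is not the V209 implementation: the train and suggest_tweaks callables stand in for the model training step and the Gemini call, pick_next_strategy stands in for the AI's choice from the playbook, and every name here is hypothetical. Only the limit of three retries comes from the post.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

MAX_RETRIES = 3  # the post describes up to three retries within a cycle


def train_with_retries(
    train: Callable[[Dict], Optional[float]],
    suggest_tweaks: Callable[[Dict, Optional[float]], Dict],
    params: Dict,
    min_score: float,
) -> Optional[Dict]:
    """Return parameters that pass the quality gate, or None if every attempt fails."""
    score = train(params)
    attempts_left = MAX_RETRIES
    while (score is None or score < min_score) and attempts_left > 0:
        params = suggest_tweaks(params, score)  # stand-in for the AI suggestion
        score = train(params)
        attempts_left -= 1
    return params if score is not None and score >= min_score else None


def pick_next_strategy(playbook: List[str], quarantined: Set[str]) -> Optional[str]:
    """Stand-in for the AI's selection: first playbook entry that is not quarantined."""
    for name in playbook:
        if name not in quarantined:
            return name
    return None


def run_cycle(
    strategy: str,
    playbook: List[str],
    quarantined: Set[str],
    train: Callable[[Dict], Optional[float]],
    suggest_tweaks: Callable[[Dict, Optional[float]], Dict],
    params: Dict,
    min_score: float,
) -> Tuple[Optional[str], Optional[Dict]]:
    """One cycle: keep the strategy if training passes, otherwise quarantine and pivot."""
    result = train_with_retries(train, suggest_tweaks, params, min_score)
    if result is not None:
        return strategy, result
    quarantined.add(strategy)  # excluded for the rest of the run
    return pick_next_strategy(playbook, quarantined), None
```

Keeping the quarantine set outside run_cycle lets it persist across cycles, so a strategy that failed once is not selected again later in the same run.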
However...
The AI, in its quest for the highest possible risk-adjusted return (like the Calmar Ratio), has determined that the most rational action is to not trade at all. This is a classic machine learning outcome often called "learning to be flat."
This is similar to behaviour seen in an earlier update, when the AI was given the objective of essentially making as much profit as possible: it exploited a calculation weakness, and the account made 3000% PNL but suffered a 66% drawdown.
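One way to see why staying flat can look optimal: with the convention used in the posted _calculate_metrics, the Calmar ratio is infinite whenever the maximum drawdown is zero, and a run with no trades has no drawdown. The objective the optimizer actually uses is not shown in the post, so the guard below is only a sketch of a common countermeasure; min_trades, penalty and max_score are assumed values, not settings from the framework.

```python
import math


def calmar(cagr: float, max_dd_pct: float) -> float:
    """Same convention as the posted _calculate_metrics: inf when there is no drawdown."""
    return cagr / (max_dd_pct / 100) if max_dd_pct > 0 else math.inf


def guarded_objective(
    cagr: float,
    max_dd_pct: float,
    num_trades: int,
    min_trades: int = 30,     # assumed minimum activity for a valid trial
    penalty: float = -1.0,    # assumed score for trials that barely trade
    max_score: float = 10.0,  # assumed cap so a zero-drawdown run cannot score inf
) -> float:
    """Score a trial so that not trading is never the best possible outcome."""
    if num_trades < min_trades:
        return penalty
    return min(calmar(cagr, max_dd_pct), max_score)
```

With this guard a flat run scores the penalty, while a run with 50 trades, 20% CAGR and a 10% drawdown scores about 2.0, so the optimizer has a reason to prefer the second.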
I have decided to remove LSTM as I don't need it and there were callback conflicts.
I introduced Phases in V209, and in my current testing version (V210) I have built in Groups.
The "Phases" and the "Group Updates" are two distinct but highly synergistic sets of enhancements that transform the system from a static, pre-configured engine into a dynamic and adaptive trading partner.
Here is an explanation of how each component improves the framework and how they work together.
The "Phases": Building Contextual Intelligence (The 'Brain')
The primary goal of the Phased updates was to give the framework a deep, upfront understanding of the market environment before making decisions. It's the strategic, analytical component.
- Phase 1: Contextual Intelligence Upgrade
- What it does: This phase implemented granular, data-driven regime detection and AI-powered dynamic feature selection. The framework now clusters historical data to identify distinct market personalities (e.g., "high-volatility trending" or "low-volatility ranging") and tasks the AI with selecting the most appropriate strategy and the specific features for that exact regime.
- How it improves: This is a monumental leap from using a single, hard-coded set of features for a strategy. It allows the framework to tailor its approach with high precision. Instead of just picking a "mean reversion" strategy, it can now reason that "For this specific low-volatility, mean-reverting regime, the best features to use are RSI_zscore and bollinger_bandwidth."
- Phase 2: Advanced Optimization & Decision-Making
- What it does: This phase upgraded the model training process by introducing Multi-Objective Hyperparameter Optimization. Instead of just finding parameters that maximize profit, the system now finds a set of optimal models that represent the best trade-offs between competing objectives, such as maximizing risk-adjusted returns (Calmar Ratio) while minimizing the number of trades. An AI-powered selection function then chooses the best model from this set based on the run's overall risk profile.
- How it improves: This produces far more robust and realistic models. A strategy with the highest possible profit might be too erratic or expensive to trade in the real world. By balancing objectives, the framework can select a model that is more stable and cost-effective, even if it has slightly lower theoretical returns.
- Phase 3: Generative Strategy Creation
- What it does: This is the creative leap. This phase gives the framework the ability to evolve its own playbook. It can be tasked to invent entirely new hybrid strategies by combining elements of existing ones or use genetic algorithms to discover novel trading logic from scratch.
- How it improves: The framework is no longer limited by the strategies a human has pre-programmed into its playbook.json file. It can autonomously expand its own knowledge base, creating new tools to tackle market conditions it has not seen before.
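To make the Phase 2 idea concrete, here is a minimal sketch of picking the "best trade-off" set (the Pareto front) from a list of trial results. It assumes each trial is a dict with calmar (to maximise) and trades (to minimise); those field names and the helper functions are illustrative, not taken from the framework.

```python
def dominates(a, b):
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a["calmar"] >= b["calmar"] and a["trades"] <= b["trades"]
    better = a["calmar"] > b["calmar"] or a["trades"] < b["trades"]
    return no_worse and better


def pareto_front(trials):
    """Keep only the trials that no other trial dominates."""
    return [t for t in trials if not any(dominates(o, t) for o in trials)]
```

The AI-powered selection step described above would then choose one model from the returned list according to the run's risk profile.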
The "Group Updates": Building Dynamic Execution (The 'Reflexes')
If the Phases are the "brain" that does the strategic planning, the Group Updates are the "nervous system" that provides real-time reflexes. This enhancement moves the framework from having a "single, static personality to one that can dynamically change its 'mood'" based on its own live performance.
- What it does: It introduces four distinct operating states that govern the framework's execution and risk-taking behaviour in real-time.
- Conservative Baseline: The default, capital-preserving state used to establish a stable performance baseline. Risk is low, and trade selection is highly filtered.
- Aggressive Expansion: When the system is performing well (e.g., hitting a new equity high or a streak of wins), it "earns the right to become aggressive". Risk is increased to maximize profits during favourable conditions.
- Drawdown Control: If the system starts losing money or hits a drawdown limit, this defensive state is triggered to "halt losses, minimize further damage". Risk is immediately cut to emergency-low levels, and a temporary trading halt can be engaged to prevent "revenge trading".
- Strategic Pivot: The ultimate fail-safe. If a strategy proves to be fundamentally broken (e.g., by remaining in Drawdown Control for two consecutive cycles), the framework "must be able to fire its current strategy and hire a new one". It quarantines the failing strategy and forces the AI to select a new, simpler one to re-establish a stable baseline.
- How it improves: This creates a critical performance-based feedback loop within a walk-forward cycle. The framework no longer has to wait until the next retraining period to react to what's happening. It can protect capital the moment a drawdown begins and press its advantage the moment a winning streak emerges. This is a profound enhancement to risk management and profit capture.
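The four operating states behave like a small state machine. The sketch below is an illustration under assumptions, not the framework's implementation: the next_state function and its arguments are hypothetical, and falling back from Drawdown Control to the baseline once the drawdown clears is my own assumption.

```python
from enum import Enum


class State(Enum):
    CONSERVATIVE_BASELINE = "Conservative Baseline"
    AGGRESSIVE_EXPANSION = "Aggressive Expansion"
    DRAWDOWN_CONTROL = "Drawdown Control"
    STRATEGIC_PIVOT = "Strategic Pivot"


def next_state(state, new_equity_high, drawdown_breached, cycles_in_drawdown):
    """Return the operating state for the next cycle.

    cycles_in_drawdown counts consecutive cycles already spent in Drawdown Control.
    """
    if state is State.STRATEGIC_PIVOT:
        # After a pivot the new strategy starts from the baseline again
        return State.CONSERVATIVE_BASELINE
    if drawdown_breached:
        if state is State.DRAWDOWN_CONTROL and cycles_in_drawdown >= 2:
            return State.STRATEGIC_PIVOT
        return State.DRAWDOWN_CONTROL
    if new_equity_high:
        return State.AGGRESSIVE_EXPANSION
    if state is State.DRAWDOWN_CONTROL:
        # Assumption: once the drawdown clears, fall back to the baseline
        return State.CONSERVATIVE_BASELINE
    return state
```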
Synergy: How The Brain and Reflexes Work Together
The true power of these updates lies in how they combine. The Phases provide the intelligence, and the Group Updates provide the disciplined, adaptive execution.
Here is how a complete cycle works with the combined enhancements:
- Strategic Planning (Phase 1): At the start of a run, the framework uses its contextual intelligence to diagnose the market regime. The AI analyzes this regime and selects what it believes is the best strategy and feature set from the playbook.
- Cautious Start (Group 1 + Phase 2): The framework enters its first cycle in the CONSERVATIVE_BASELINE state. The model it uses was optimized by Phase 2's multi-objective engine with a conservative goal: Maximize (Calmar Ratio) and Minimize (Average Trade Count). It takes small, careful trades to validate the AI's strategic choice.
- Performance Reaction (Group 2 & 3 + Phase 2):
- Success: The strategy works well, hitting a new equity high. The system's reflexes kick in, and it transitions to the AGGRESSIVE_EXPANSION state. For the next retraining cycle, the AI's optimization objective (from Phase 2) will automatically switch to Maximize (Net Profit) and Maximize (Average Trade Count), actively searching for more aggressive parameters to capitalize on the proven edge.
- Failure: The strategy starts losing, tripping a drawdown circuit breaker. The framework immediately shifts to DRAWDOWN_CONTROL, cutting risk and locking new trades for the day. For the next cycle, the optimization objective reverts to the conservative "Maximize Calmar" goal to seek stability.
- Systemic Failure & Adaptation (Group 4 + Phase 3):
- The strategy continues to fail, remaining in DRAWDOWN_CONTROL for a second consecutive cycle. This triggers a STRATEGIC_PIVOT.
- The failing strategy is "fired" and put on a quarantine list. The AI (using its Phase 3 generative capabilities) is now called to either select a new, robust strategy from the playbook or, if necessary, invent a new one to replace the broken logic.
- The framework then resets to the CONSERVATIVE_BASELINE state with this brand new strategy, beginning the entire process of validation and adaptation anew.
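The link between operating state and optimization objective in the steps above can be written as a simple lookup. This is only a sketch: the metric keys and the objectives_for helper are hypothetical names, and pairing DRAWDOWN_CONTROL with "minimize trade count" is an assumption (the description only says it reverts to the conservative "Maximize Calmar" goal).

```python
# (metric, direction) pairs handed to the multi-objective optimizer
OBJECTIVES_BY_STATE = {
    "CONSERVATIVE_BASELINE": [("calmar_ratio", "maximize"), ("avg_trade_count", "minimize")],
    "AGGRESSIVE_EXPANSION": [("net_profit", "maximize"), ("avg_trade_count", "maximize")],
    # Assumption: drawdown control reuses the conservative pair
    "DRAWDOWN_CONTROL": [("calmar_ratio", "maximize"), ("avg_trade_count", "minimize")],
}


def objectives_for(state):
    """Unknown states (e.g. right after a pivot) fall back to the conservative goal."""
    return OBJECTIVES_BY_STATE.get(state, OBJECTIVES_BY_STATE["CONSERVATIVE_BASELINE"])
```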
In conclusion, the Phases give the framework the intelligence to create a sound strategic plan, while the Group Updates give it the reflexes and discipline to execute that plan with real-time adjustments, and a fail-safe mechanism to completely change the plan when it's proven wrong. This two-layered approach creates a robust, self-correcting system that can adapt to changing markets on both a strategic and a tactical level.
- #83
- Jun 21, 2025 9:22pm
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
Quote: BlackPenguin!!! Really nice thread you have opened there!!! I have tried with an AI to develop an EA that processes indicator data, sends it to an AI API and returns the best choice to take... it was going well (and no, I'm not a coder). I have not coded further, because I only used the free AI and there are many restrictions and limitations...
To test your project completely, how do we have to start? Do you have a HowTo for the project? I'm really curious how far you get, and it gives me inspiration for some...
https://github.com/sam-minns/ML-Trading-Framework-PRO
If you need detailed help, best to message me directly.
- #84
- Jun 22, 2025 12:20am
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
Quote: The framework (script) is All-In-One. @amvt You'll find the backtester code block Line 1753 in V209. I have broken up the framework into modular sections to make it easy to read and find things. # ============================================================================= # 7. BACKTESTER & 8. PERFORMANCE ANALYZER # ============================================================================= class Backtester: def __init__(self,config:ConfigModel): self.config=config self.is_meta_model = False self.is_transformer_model...
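I have just realised I didn't read the Google documentation correctly and thought that the applied Grounded Search Tool was 'on'. The searches were in fact being simulated...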
The Tool Configuration ("mode": "AUTO"): This was the crucial fix. This was on 'ANY' previously.
Here’s a breakdown of what the framework will now do, addressing some of the points:
- Get Real Spreads, Not the Default:
- The prompt explicitly instructs the AI: "Action: Use Google Search to find typical trading costs...".
- Because the tool calling now works correctly, the AI will perform a live web search to find current, typical spreads for the specific assets you are trading.
- The values it finds will be used to populate the SPREAD_CONFIG in the final JSON response, overriding the hardcoded defaults in your script.
- ECN-Specific Spreads:
- The prompt is very specific: "...on a retail ECN/Raw Spread account."
- The AI will tailor its search to find data from brokers that offer this account type, leading to more realistic and lower-spread values than a standard or fixed-spread account.
- Holidays and Other Events:
- This is handled by the second part of the grounded search in the prompt: "Grounded Calendar Check: Search the economic calendar for the next 5 trading days."
- A live search of an economic calendar will inherently include upcoming market holidays (e.g., UK Bank Holidays, US Thanksgiving, etc.), high-impact news releases (NFP, CPI), and central bank meetings.
- The AI will then synthesize this information. For example, if it sees a major bank holiday for the British Pound (GBP), it may choose a less aggressive strategy or suggest avoiding GBP pairs around that day due to expected low liquidity.
In short, the fix enables the framework to move from simulating a search based on its training data to actually performing a live search and using that real-time, specific data to configure the backtest environment and select a strategy.
Still testing:
- End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py.txt
This version now incorporates the "Group Update".
- #85
- Jun 22, 2025 12:47am
Hi
I was getting too many retries on yfinance, so I asked the LLM to fix this and here's the resultant code:
Inserted Code
import logging
import time
from typing import Any, Dict

import pandas as pd
import yfinance as yf

# The framework defines its own logger; this stand-in keeps the snippet self-contained.
logger = logging.getLogger("ML_Trading_Framework")


def get_macro_context_data() -> Dict[str, Any]:
    """
    Fetches the latest data for key macroeconomic indicators (VIX, DXY, US10Y),
    with robust error handling, rate limiting, and retry logic.
    """
    logger.info("-> Fetching external macroeconomic context data (VIX, DXY, US10Y)...")
    macro_context = {}
    tickers = {
        "VIX": "^VIX",
        "DXY": "DX-Y.NYB",
        "US10Y_YIELD": "^TNX"
    }
    # Add delays between requests to avoid rate limiting
    request_delay = 2.0  # seconds between requests
    max_retries = 3
    retry_delay = 5.0  # seconds between retries
    for i, (name, ticker) in enumerate(tickers.items()):
        # Add delay between requests (except for the first one)
        if i > 0:
            logger.info(f" - Waiting {request_delay}s before next request to avoid rate limits...")
            time.sleep(request_delay)
        success = False
        for attempt in range(max_retries):
            try:
                logger.info(f" - Fetching {name} ({ticker}) - Attempt {attempt + 1}/{max_retries}")
                # Download with explicit parameters to avoid warnings and rate limits
                data = yf.download(
                    ticker,
                    period="1mo",  # was "2wk", which is not in yfinance's documented period list
                    progress=False,
                    auto_adjust=True,  # Explicitly set to avoid warning
                    prepost=False,  # Don't include pre/post market data
                    threads=False,  # Single threaded to be gentler on API
                )
                if not data.empty and len(data) > 5:
                    close = data['Close']
                    # Newer yfinance versions can return a one-column DataFrame here
                    if isinstance(close, pd.DataFrame):
                        close = close.iloc[:, 0]
                    # Ensure we have enough data points
                    if len(close) < 6:
                        logger.warning(f" - {name}: Insufficient data points ({len(close)}) for calculation")
                        macro_context[name] = {"error": "Insufficient data points"}
                        success = True  # Don't retry for data issues
                        break
                    latest_level = close.iloc[-1]
                    one_week_ago_level = close.iloc[-6]  # Approximately 1 week ago (assuming daily data)
                    # Handle pandas scalar conversion
                    if hasattr(one_week_ago_level, "item"):
                        one_week_ago_level = one_week_ago_level.item()
                    if hasattr(latest_level, "item"):
                        latest_level = latest_level.item()
                    if one_week_ago_level != 0:
                        week_change_pct = ((latest_level - one_week_ago_level) / one_week_ago_level) * 100
                    else:
                        week_change_pct = 0.0
                    macro_context[name] = {
                        "level": round(float(latest_level), 2),
                        "1_week_change_pct": round(float(week_change_pct), 2)
                    }
                    logger.info(f" - {name}: Level={latest_level:.2f}, 1wk change={week_change_pct:.2f}%")
                    success = True
                    break
                else:
                    logger.warning(f" - {name}: No data returned or insufficient data")
                    if attempt < max_retries - 1:
                        logger.info(f" - Retrying in {retry_delay}s...")
                        time.sleep(retry_delay)
                    else:
                        macro_context[name] = {"error": "No data available"}
                        success = True  # Stop retrying
            except Exception as e:
                error_msg = str(e)
                logger.warning(f" - {name} attempt {attempt + 1} failed: {error_msg}")
                # Handle specific error types
                if "Rate limit" in error_msg or "Too Many Requests" in error_msg:
                    if attempt < max_retries - 1:
                        backoff_delay = retry_delay * (2 ** attempt)  # Exponential backoff
                        logger.info(f" - Rate limited. Backing off for {backoff_delay}s...")
                        time.sleep(backoff_delay)
                    else:
                        macro_context[name] = {"error": "Rate limited - max retries exceeded"}
                        success = True  # Stop retrying
                elif "404" in error_msg or "Not Found" in error_msg:
                    logger.error(f" - {name}: Ticker symbol not found")
                    macro_context[name] = {"error": f"Ticker {ticker} not found"}
                    success = True  # Don't retry for 404s
                    break
                else:
                    if attempt < max_retries - 1:
                        logger.info(f" - Retrying in {retry_delay}s...")
                        time.sleep(retry_delay)
                    else:
                        macro_context[name] = {"error": error_msg}
                        success = True  # Stop retrying
        if not success:
            macro_context[name] = {"error": "Failed after all retry attempts"}
    # Add a summary log
    successful_fetches = sum(1 for v in macro_context.values() if "error" not in v)
    total_fetches = len(macro_context)
    logger.info(f" - Macro context fetch complete: {successful_fetches}/{total_fetches} successful")
    # If all fetches failed, create a minimal fallback context
    if successful_fetches == 0:
        logger.warning(" - All macro data fetches failed. Using fallback context.")
        macro_context = {
            "VIX": {"level": 20.0, "1_week_change_pct": 0.0, "note": "fallback_data"},
            "DXY": {"level": 103.0, "1_week_change_pct": 0.0, "note": "fallback_data"},
            "US10Y_YIELD": {"level": 4.5, "1_week_change_pct": 0.0, "note": "fallback_data"}
        }
    return macro_context
- #86
- Jun 22, 2025 9:36am
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
Quote: Hi, I was getting too many retries on yfinance, so I asked the LLM to fix this and here's the resultant code: def get_macro_context_data() -> Dict[str, Any]: """ Fetches the latest data for key macroeconomic indicators (VIX, DXY, US10Y), with robust error handling, rate limiting, and retry logic. """ logger.info("-> Fetching external macroeconomic context data (VIX, DXY, US10Y)...") macro_context = {} tickers = { "VIX": "^VIX", "DXY": "DX-Y.NYB", "US10Y_YIELD": "^TNX" } # Add delays between requests to avoid rate limiting request_delay = 2.0...
Your system will have different issues and it's always best to explore fixes that work for you.
I have just fixed an issue which was a KeyError. Even though I had a number of perfectly good cycles the error occurred because macroeconomic indicators like VIX and DXY were suggested as features by the AI but were never actually added as columns to the main training dataframe. This revealed a fundamental gap between the AI's context and the model's feature set.
Here is how I resolved it and enhanced the framework:
- 1. Bug Fix: Integrating Macro Data as Features
- The primary bug was fixed by re-engineering the get_macro_context_data function to fetch a full time-series history for the macro indicators.
- Crucially, this macro dataframe is now merged directly into the main full_df using pd.merge_asof. This ensures that every trading candle in the dataset has the corresponding daily macro values (like the VIX or DXY level) attached to it.
- This action directly resolves the KeyError by making the macro indicators available as features for the model to train on.
- 2. Improvement: AI-Driven Dynamic Ticker Selection
- I enhanced the framework by making the selection of macro indicators dynamic. Instead of using a hardcoded list, a new function, select_relevant_macro_tickers, was added to the GeminiAnalyzer.
- This function prompts the AI to choose the most relevant macro tickers from a master list based on the specific assets the user is trading (e.g., selecting German bond yields for Euro pairs). This makes the feature set significantly more intelligent and context-aware.
- 3. Improvement: Data Conversion for AI Prompts
- To resolve a subsequent TypeError, I ensured that when macro data is fetched for use in an AI prompt (which requires a text format), the pandas DataFrame is converted into a JSON-serializable list of dictionaries using .to_dict(orient='records').
- 4. Improvement (Proposed): Macro Data Caching
- I wanted to cache the fetched macroeconomic data, similar to the existing Feature Engineering cache. This would prevent redundant downloads from yfinance on repeated runs and improve overall efficiency.
In essence, I transformed the macro data handling from a simple, prompt-only context piece into a fully integrated, AI-driven feature engineering pipeline. The framework now correctly merges macro data, intelligently selects which data to use aligned with the assets in your directory, and is designed to cache the results for speed.
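For anyone who wants to see the merge in isolation, here is a minimal sketch of the pd.merge_asof step and the JSON conversion. The column names and every number are made up for illustration; they are not real market data and not the framework's actual schema.

```python
import json

import pandas as pd

# Intraday candles (illustrative values only)
candles = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2025-06-02 09:00", "2025-06-02 10:00", "2025-06-03 09:00"]),
    "Close": [1.1412, 1.1420, 1.1385],
})
# Daily macro series (illustrative values only)
macro = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2025-05-30", "2025-06-02", "2025-06-03"]),
    "VIX": [18.6, 18.4, 17.7],
})

# Each candle receives the most recent macro row at or before its timestamp.
# Both frames must be sorted on the key column.
full_df = pd.merge_asof(
    candles.sort_values("Timestamp"),
    macro.sort_values("Timestamp"),
    on="Timestamp",
    direction="backward",
)

# For AI prompts: Timestamps are not JSON-serializable, so format them as text first
records = macro.assign(Timestamp=macro["Timestamp"].dt.strftime("%Y-%m-%d")).to_dict(orient="records")
payload = json.dumps(records)
```

One thing to watch: if the daily value is a closing level stamped at midnight of the same day, a backward merge hands intraday candles a value they could not have known yet, so shifting the macro dates forward by one day may be needed.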
- #87
- Jun 22, 2025 5:58pm
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
Quote: I have just realised I didn't read the Google documentation correctly and thought that the applied Grounded Search Tool was 'on'. They were in fact being simulated... The Tool Configuration ("mode": "AUTO"): This was the crucial fix. This was on 'ANY' previously. Here’s a breakdown of what the framework will now do, addressing some of the points: Get Real Spreads, Not the Default: The prompt explicitly instructs the AI: "Action: Use Google Search to find typical trading costs...". Because the tool calling now works correctly, the AI will...
The Analysis Paralysis Issue...
While V210 (not the one I posted) has now overcome the F1 (gate score) issue, the next issue I am trying to solve is why the AI won't trade.
This is the classic 'analysis paralysis' issue, whereby the AI has an objective and is smart enough to work out that, to avoid losing any money, it is better not to trade at all. This is caused in part by the AI's analysis and in part by the objective, so I am trying to figure out how to make the AI trade.
I have now implemented a 'mission plan' to first establish a baseline by forcing the AI into a tactical, short-sighted loop. Having a plan it can refer to allows it to make decisions that might seem suboptimal in the short term (like accepting a less profitable model) but are correct for the long-term goal (establishing a baseline).
- Phase 1: Baseline Establishment
- Trigger: The run has not yet had 2 consecutive cycles pass the model quality gate.
- AI Directive: Your primary goal is to find a model that can pass the F1-score quality gate and execute trades. Prioritize suggestions that make the model easier to train (simpler features, easier labelling) over financial performance. A trading model is better than no model.
- Phase 2: Performance Optimization
- Trigger: The baseline has been established (2+ consecutive successful training cycles).
- AI Directive: A stable baseline model is trading. Your primary goal is now to improve profitability and risk-adjusted returns. Focus on refining features based on SHAP, tuning risk parameters, and maximizing financial metrics like Sortino or MAR ratio.
- Phase 3: Drawdown Control & Adaptation
- Trigger: The system is in the DRAWDOWN_CONTROL state (due to losses or circuit breakers).
- AI Directive: The system is in a drawdown. Your absolute priority is capital preservation. Aggressively reduce risk. Suggest switching to "low complexity" strategies. Your goal is to stop the losses and re-establish a stable baseline.
This strategic_directive string will then be passed as context to the other AI calls (analyze_cycle_and_suggest_changes and select_best_tradeoff) within that cycle, ensuring every decision is aligned with the current phase of the plan.
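As a rough illustration of how the strategic_directive string could be chosen each cycle, here is a minimal sketch. The function name, its arguments and the rule that drawdown takes priority over the other two phases are assumptions; the directive texts are shortened from the descriptions above.

```python
def strategic_directive(consecutive_passes, operating_state):
    """Pick the mission-plan phase for this cycle and return its directive text."""
    if operating_state == "DRAWDOWN_CONTROL":
        # Assumption: capital preservation overrides the other two phases
        return ("PHASE 3 (DRAWDOWN CONTROL & ADAPTATION): Your absolute priority is "
                "capital preservation. Aggressively reduce risk and re-establish a stable baseline.")
    if consecutive_passes >= 2:
        return ("PHASE 2 (PERFORMANCE OPTIMIZATION): A stable baseline model is trading. "
                "Improve profitability and risk-adjusted returns.")
    return ("PHASE 1 (BASELINE ESTABLISHMENT): Find a model that can pass the F1-score "
            "quality gate and execute trades. A trading model is better than no model.")
```

The returned string would then be prepended to the prompts for the other AI calls in that cycle.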
Example Log
2025-06-23 02:46:00,012 - INFO - - AI suggestions applied for next cycle. Notes: The primary directive is to establish a baseline, which means finding a configuration that trades and passes the F1-score gate. The current configuration resulted in zero trades despite an F1 score of 1.0. This suggests a potential issue with the trading logic or parameter settings that prevent trade execution. To address this and make the model easier to train and execute trades, I suggest simplifying the feature set and focusing on the core components of the Bollinger RSI strategy. By reducing the number of features, we reduce the complexity and computational burden on the model to increase the model's ability to be trained effectively. Further, there is the possibility that using too many features leads to overfitting, so this might also improve the model's robustness.
2025-06-23 02:46:00,012 - INFO - --- Cycle complete. PNL: $0.00 | Final Equity: $1,000.00 | Time: 1489.63s --- <<< analysis paralysis
Still testing all the code. I seem to be unable to get a completed run, and it is very slow, but one question I have is:
What amount of data should we ingest for each pair? Is it really relevant to dump tickers from 2020, or should we focus on more recent data? Additionally, does it affect the speed?
For now I am backtesting one pair, EURUSD, with the H1, Daily and M15 timeframes. With version V208 it ran non-stop for four days and couldn't deliver any output.
- #89
- Jun 23, 2025 3:32am
Quote: Hi, I was getting too many retries on yfinance, so I asked the LLM to fix this and here's the resultant code: def get_macro_context_data() -> Dict[str, Any]: """ Fetches the latest data for key macroeconomic indicators (VIX, DXY, US10Y), with robust error handling, rate limiting, and retry logic. """ logger.info("-> Fetching external macroeconomic context data (VIX, DXY, US10Y)...") macro_context = {} tickers = { "VIX": "^VIX", "DXY": "DX-Y.NYB", "US10Y_YIELD": "^TNX" } # Add delays between requests to avoid rate limiting request_delay = 2.0...
There are alternatives, like those from Google etc., but free is free and will always have issues with pulling data.
It's better to either automate the OHLCV process directly through MT5/MT4 by linking it to Excel, or set up an RPA.
— Developed by MMQ
- #90
- Jun 23, 2025 3:54am | Edited 11:55am
- Joined Jun 2025 | Status: Trader | 14 Posts
Btw, I am getting an error after three hours of execution:
Inserted Code
2025-06-23 12:40:17,609 - ML_Trading_Framework - INFO - Successfully extracted JSON object using JSONDecoder.raw_decode.
2025-06-23 12:40:17,609 - ML_Trading_Framework - INFO - - API call to 'analyze_cycle_and_suggest_changes' complete.
2025-06-23 12:40:17,610 - ML_Trading_Framework - INFO - - AI suggestions applied for next cycle. Notes: The portfolio is healthy with 0.00% drawdown. However, the strategy has been mostly failing training cycles after the initial conservative baseline. Given the repeated training failures in the aggressive expansion phase, it's clear the current feature set or model configuration isn't robust enough. Since the overall drawdown is minimal, we can afford to experiment with slightly more risk to improve performance, but we need to ensure training success. To improve training success, I will reduce the available features and make sure I include features that are already used by the `ClassicBollingerRSI` Strategy. I will include basic technical indicators and volatility measures, and will re-introduce some basic features to the set to ensure a good outcome during the next training cycles.
2025-06-23 12:40:17,610 - ML_Trading_Framework - INFO - --- Cycle complete. PNL: $0.00 | Final Equity: $1,000.00 | Time: 649.82s ---
2025-06-23 12:40:17,611 - ML_Trading_Framework - INFO - -> Applying rules for Operating State: 'Aggressive Expansion'
2025-06-23 12:40:17,611 - ML_Trading_Framework - INFO - - Set MAX_DD_PER_CYCLE to 30%
2025-06-23 12:40:17,611 - ML_Trading_Framework - INFO - - Set BASE_RISK_PER_TRADE_PCT to 1.500%
2025-06-23 12:40:17,611 - ML_Trading_Framework - INFO - - Set MAX_CONCURRENT_TRADES to 5
2025-06-23 12:40:17,611 - ML_Trading_Framework - INFO -
--- Starting Cycle [11/13] in state 'Aggressive Expansion' ---
2025-06-23 12:40:17,611 - ML_Trading_Framework - INFO - Using valid frequency alias from AI: '90D'
2025-06-23 12:40:17,616 - ML_Trading_Framework - INFO - -> Stage 3: Generating Trade Labels ('standard')...
2025-06-23 12:40:18,313 - ML_Trading_Framework - INFO - - Label Sanity Check Passed. Distribution: Longs=40.90%, Shorts=42.64%
2025-06-23 12:40:18,313 - ML_Trading_Framework - INFO - - Starting model training using strategy: '[RANGING] A traditional mean-reversion strategy entering at the outer bands, filtered by low trend strength. Ideal for ranging markets. Example features: `bollinger_bandwidth`, `RSI`, `ADX`, `market_regime`.'
2025-06-23 12:40:18,314 - ML_Trading_Framework - CRITICAL - A critical, unhandled error occurred during run 1: "['VIX'] not in index"
Traceback (most recent call last):
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 4492, in main
    run_single_instance(fallback_config, framework_history, playbook, nickname_ledger, directives, api_interval_seconds)
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 4290, in run_single_instance
    train_result = trainer.train(df_train_labeled, config.selected_features, strategy_details)
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 2512, in train
    X = df_train[feature_list].copy().fillna(0)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/frame.py", line 3899, in __getitem__
    indexer = self.columns._get_indexer_strict(key, "columns")[1]
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 6115, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 6179, in _raise_if_missing
    raise KeyError(f"{not_found} not in index")
KeyError: "['VIX'] not in index"
Why VIX is missing
- VIX is only fetched inside the helper get_macro_context_data() (see ~line 3485 ff.). That function builds a dictionary for reporting to the language model, but the series is never merged into the price/feature DataFrame that you later pass into train().
- When the AI later proposes a new strategy it blindly inserts "VIX" into selected_features, so the config that reaches the trainer contains a column name that the data pipeline never created.
It feels like an underlying bug from YFinance?
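A cheap guard against this class of crash is to check the AI's feature list against the DataFrame before training. A minimal sketch, assuming a hypothetical helper called split_features (it is not part of the framework):

```python
import pandas as pd


def split_features(df, requested):
    """Split the requested feature names into those present in df and those missing."""
    available = [f for f in requested if f in df.columns]
    missing = [f for f in requested if f not in df.columns]
    return available, missing


# Toy frame standing in for the labelled training data
df_train = pd.DataFrame({"RSI": [55.0, 48.2], "ADX": [21.3, 19.8]})
available, missing = split_features(df_train, ["RSI", "ADX", "VIX"])
if missing:
    print(f"Dropping features not in the dataframe: {missing}")
X = df_train[available].copy().fillna(0)
```

This only hides the symptom, of course. The missing columns still need to be created upstream if the model is meant to use them.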
- #91
- Jun 23, 2025 4:42pm
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
Hello all.
I have been testing (and still am) V210, to which I have made a number of changes that should improve the overall framework experience.
The main issue I am facing is still 'Analysis Paralysis', so one of the improvements, the caching, relates to a response below.
(I may have answered some questions already but I'm just going through each that I think I haven't)
Here we go...
@amvt
Your interpretation is fundamentally correct, with the following clarifications:
The framework's architecture utilises the Gemini Large Language Model (LLM) as an external, adaptive control unit for several discrete, high-level functions within the lifecycle. These are executed via the GeminiAnalyzer class:
The reason I suggest AI Studio is that there are no API limits. I do use the Gemini Web App; however, that will sometimes limit you. Just find a workflow that works for you. I keep it simple by using Notepad++ for the coding and the terminal / PowerShell to execute the scripts. I do have an add-on in Notepad++ that lets me execute scripts, but that's another story.
@preptraditio / @amvt
In relation to yfinance
1. How V210 Pulls Data from this Source
The framework deliberately avoids using yfinance for its primary, high-frequency OHLCV (Open, High, Low, Close, Volume) data. The only place yfinance is used is within the get_macro_context_data function.
For about two weeks I have not completed a full run yet as I am fixing bugs and finding ways to improve the framework experience.
Question / Simple Answer
- Q: How does it test? A: It uses a "Sliding Window" to test on unseen future data, one chunk at a time.
- Q: How does it train? A: It uses only a recent slice of past data to train a model for the next test period.
- Q: What's the 5-fold split for? A: It's a series of "mini-exams" inside the training data to find the best model settings.
- Q: Why doesn't the start date matter? A: The model only ever sees the recent past, so it's always adapting to current conditions.
- Q: How much data is best? A: At least 3-5 years, to allow for many robust test cycles.
- Q: Does old data affect speed? A: Yes. It makes the first run much slower and the total backtest longer (which is good!).
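To show why more history means more test cycles, here is a minimal sketch of how sliding walk-forward windows can be generated. The function name and the 365-day training window are assumptions for illustration; the 90-day test window mirrors the '90D' retraining frequency seen in the logs.

```python
from datetime import date, timedelta


def walk_forward_windows(start, end, train_days=365, test_days=90):
    """Return (train_start, test_start, test_end) tuples that slide forward in time.

    Each model trains on [train_start, test_start) and is then tested on the
    unseen period [test_start, test_end).
    """
    train_len = timedelta(days=train_days)
    test_len = timedelta(days=test_days)
    windows = []
    test_start = start + train_len
    while test_start + test_len <= end:
        windows.append((test_start - train_len, test_start, test_start + test_len))
        test_start += test_len  # slide forward by one test period
    return windows


for train_start, test_start, test_end in walk_forward_windows(date(2020, 1, 1), date(2021, 12, 31)):
    print(f"train {train_start} -> {test_start} | test {test_start} -> {test_end}")
```

With two years of data this yields only four cycles, which is why several years of history are suggested.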
To address part of the speed issue, I have introduced caching where possible, as this reduces the time from the initial start to a cycle. The main focus was Feature Engineering, which I'm now using as a template for any other repeatable use cases in the framework. I'll explain below.
A Simple Analogy: The Gourmet Chef's Prep Station
Imagine you're a gourmet chef preparing a complex 10-course meal. The most time-consuming part is the "mise en place" — chopping all the vegetables, making the sauces, simmering the broths, and preparing the proteins. This takes hours.
Now, imagine if after doing all that prep work, you could perfectly preserve it in a special container. The next day, if someone asks for the exact same 10-course meal, you wouldn't spend hours chopping vegetables again. You would simply open your container and start cooking, turning a multi-hour process into a matter of minutes.
The Feature Engineering Cache in V210 is that special container for the "data prep work".
The Problem: Why a Cache is Necessary
The FeatureEngineer in the V210 script is incredibly powerful, but also computationally expensive. For every single candle in your dataset (which can be millions of rows), it calculates a large number of complex features:
In the previous buggy version, macro data was fetched but kept separate from the main features. As a result, the main feature dataframe (full_df) did not include important macro indicator columns like 'VIX' or 'DXY'. This created a data integration gap: the AI was instructed to use data that the pipeline never actually provided. This led to a KeyError crash whenever the trainer tried to access a missing column such as 'VIX'.
In contrast, the V210 corrected version fetches the macro data and immediately merges it into the main feature dataframe using a pd.merge_asof operation. Now, full_df contains columns for all the fetched macro indicators (including 'VIX', 'DXY', etc.). This change closes the data integration gap, allowing the trainer to successfully access the required columns and proceed as intended.
The bug was not in yfinance's data but in the framework's failure to properly integrate that data into the main pipeline.
The V210 Fix: A New Data Integration Step
The V210 code addresses this bug by adding one crucial step: explicitly merging the macro data into the main feature dataframe.
The corrected data flow in run_single_instance:
I have been testing (and still am) V210, to which I have made a number of changes that should improve the overall framework experience.
The main issue I am facing is still 'Analysis Paralysis' so one of the improvements does relate to a response below which is the caching.
(I may have answered some questions already but I'm just going through each that I think I haven't)
Here we go...
@amvt
Quote: I will like to take a few moments of your time to discuss the potential of what you have created and my own inspirational ideas.
So, from what I understand you are leveraging AI, specifically Gemini to do the following:
1. Create Strategies - Simple/Advanced
2. Back-Test Strategies - Refinement and Optimization (This is where Gemini comes in)
If my interpretation is wrong please correct me, otherwise, I'll assume this is your current architecture.
To be very honest with you, this is truly innovative. Leveraging A.I to accelerate...
Your interpretation is fundamentally correct, with the following clarifications:
The framework's architecture utilises the Gemini Large Language Model (LLM) as an external, adaptive control unit for several discrete, high-level functions within the lifecycle. These are executed via the GeminiAnalyzer class:
- Initial Configuration (get_initial_run_setup): The LLM synthesizes market context, historical performance data, and my internal strategy playbook to generate the complete initial run configuration, including dynamic feature selection.
- Adaptive Refinement (analyze_cycle_and_suggest_changes): After each walk-forward cycle, the LLM analyzes the performance, PNL, and SHAP feature importance history to propose targeted modifications to my parameters for the next cycle. This is a core feedback loop.
- Generative Strategy Formulation (propose_new_playbook_strategy, define_gene_pool): When a strategy is quarantined for chronic failure or a generative strategy is selected, the LLM is tasked with inventing a novel strategy definition or defining the parameter space (gene pool) for the GeneticProgrammer subroutine.
- Model Selection (select_best_tradeoff): During hyperparameter optimisation, when my ModelTrainer identifies multiple non-dominated solutions (a Pareto front), the LLM is queried to select the single best trial that aligns with the current OperatingState directive (e.g., CONSERVATIVE_BASELINE, DRAWDOWN_CONTROL).
Contribution Proposal?
I am happy for you to contribute; however, I feel that it will be more of a time-intensive process. I am already taking a lot of time just working things out for myself in this development stage, and that's why I haven't had time to reply in detail to more questions.
Your focus on stress-testing, bias elimination, and overall robustness aligns with my design principles.
The components responsible for back-testing, where your contribution could focus, are the following classes and functions within my source code:
- Backtester class: This is the primary engine for executing trades. It contains the core event loop that iterates through market data, manages positions, and calculates PNL. The method _calculate_realistic_costs is a specific point of interest for stress-testing, as it models slippage and spread—parameters that could be enhanced with more sophisticated, regime-aware logic.
- PerformanceAnalyzer class: This class ingests the raw trade log and equity curve from the Backtester to compute all final performance metrics (MAR Ratio, Sharpe, Sortino, Max Drawdown, etc.). Enhancing this with additional statistical tests for bias, as proposed by Timothy Masters, would be a logical extension.
- run_monte_carlo_simulation function: This is a supplementary function currently used for forecasting. It could be repurposed or expanded to perform Monte Carlo-based stress tests on the final equity curve to assess its fragility.
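As a rough illustration of the Monte Carlo stress-test idea in the last bullet, here is a minimal sketch. It is not the framework's run_monte_carlo_simulation; the function names, the sample trade returns, and the choice of resampling trades with replacement are all assumptions made for the example.

```python
import random

def max_drawdown(equity):
    """Largest peak-to-trough decline, expressed as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def monte_carlo_drawdowns(trade_returns, n_paths=1000, start_equity=10_000.0, seed=42):
    """Resample per-trade returns with replacement; return sorted max drawdowns."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        equity = [start_equity]
        for r in rng.choices(trade_returns, k=len(trade_returns)):
            equity.append(equity[-1] * (1.0 + r))
        results.append(max_drawdown(equity))
    return sorted(results)

# Hypothetical per-trade returns taken from a backtest trade log.
trade_returns = [0.012, -0.008, 0.005, -0.015, 0.02, -0.004, 0.009, -0.011]
drawdowns = monte_carlo_drawdowns(trade_returns)
p95 = drawdowns[int(0.95 * (len(drawdowns) - 1))]
print(f"95th percentile max drawdown: {p95:.2%}")
```

If the 95th percentile drawdown is far worse than the one seen in the single historical backtest, the equity curve is probably more fragile than the headline metrics suggest.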
Your ABM:
Regarding Agent-Based Models (ABMs), the distinction you draw between the correlational discovery of my existing models ('The What') and the causal exploration of ABMs ('The Why') is sound and identifies a logical path for architectural enhancement.
Integrating an ABM into my framework can be conceptualised via two primary pathways:
- As a Synthetic Data Generator for Stress-Testing: An ABM could be calibrated to generate artificial market data that exhibits specific, rare properties (e.g., liquidity crises, cascading stop-loss events, herd behaviour). The framework could then ingest this synthetic data and execute the existing strategies against it. This would provide a robust stress test, evaluating my strategies' performance under market conditions not present in the available historical data. This directly aligns with your objective of improving robustness.
- As a Causal Feature Discovery Engine: This is a more deeply integrated approach. An ABM could be run to identify which micro-level agent behaviours (e.g., a shift from value to momentum agents) lead to observable, macro-level market phenomena (e.g., a volatility spike). The output of the ABM would not be the raw price data, but the proxy indicators of that hidden behaviour. These new "causal proxy features" could be engineered and fed into the FeatureEngineer pipeline. The ModelTrainer would then assess their predictive power via SHAP analysis alongside my existing technical and statistical features. This would effectively bridge the gap between correlation and causation, allowing predictive models to learn from the "Why" you described.
This could be a significant expansion of the framework's current capabilities. It would necessitate a new primary class (e.g., ABMEngine) and a new strategy type in the playbook with the flag requires_abm: True, analogous to the existing requires_gp: True for genetic programming.
I have been working on the idea of an AI enhanced MT4/MT5 trading bot for a couple of years now but only recently (mainly due to the LLM updates) have been able to bring a lot of the missing pieces together.
Your ABM is a very good example of where I am with my framework project that just started as a passion to develop an idea not just on paper.
In the way of reading, no, I haven't looked into anything more apart from the Google documentation and the modules / libraries etc. when I have issues.
@idiamond
Quote: No, I just tried to create simple EA for starters. I was using the Build menu item in the AI Studio. Since my post I was successful generating mq5 code using just the Chat menu item. Although the Build item in AI Studio seems more powerful?? It that correct?
The reason I suggest AI Studio is that there are no API limits. I do use the Gemini Web App, however that will sometimes limit you. Just find a workflow that works for you. I keep it simple by using Notepad++ for the coding and the terminal / PowerShell to execute the scripts. I do have an add-on in Notepad++ that lets me execute scripts, but that's another story.
@preptraditio / @amvt
In relation to yfinance
1. How V210 Pulls Data from this Source
The framework deliberately avoids using yfinance for its primary, high-frequency OHLCV (Open, High, Low, Close, Volume) data. The only place yfinance is used is within the get_macro_context_data function.
- Source: yfinance API (yf.download).
- Data Pulled: It exclusively fetches supplementary, low-frequency macroeconomic data (e.g., VIX, DXY, 10-Year Treasury Yield), not the core trading instrument data. This is a strategic design choice to limit exposure to yfinance's potential unreliability. The primary trading data is loaded locally (see analysis for Statement 3).
2. Built-in Steps/Actions for Reliable Repeated Use
The framework understands the potential issues with yfinance and has built a robust caching and updating mechanism to mitigate them.
- Action 1: Intelligent Caching: The get_macro_context_data function first checks for a local cache file (macro_data.parquet). If a valid cache exists, it avoids calling the API altogether.
- Action 2: Incremental Updates: If the cache is found but is stale (i.e., the last entry is older than yesterday), the script does not re-download the entire 10-year history. Instead, it calculates the missing date range and calls yf.download only for the new, incremental data. This minimizes API usage, making calls faster and far less likely to be rate-limited or fail.
- Action 3: Metadata Validation: The cache's validity is checked against a metadata file (macro_cache_metadata.json). This file stores the list of tickers and the last date cached. If the required tickers change, the framework knows the cache is invalid and triggers a full rebuild.
- Action 4: Graceful Fallback: If the cache is corrupted or missing, the try-except block ensures the script doesn't crash. It simply logs the issue and proceeds to perform a full, one-time download to rebuild the cache from scratch.
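The decision logic behind Actions 1 to 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: plan_macro_update is a hypothetical helper, the metadata keys ("tickers", "last_date") are guesses at what macro_cache_metadata.json holds, and the ticker labels are placeholders. No yfinance call is made here; the function only decides what would need to be downloaded.

```python
from datetime import date, timedelta

def plan_macro_update(metadata, required_tickers, today, history_years=10):
    """Return (action, start, end): use the cache, fetch missing days, or rebuild."""
    full_start = today - timedelta(days=365 * history_years)
    if metadata is None:
        # Cache missing or unreadable: fall back to a one-time full download.
        return ("full_rebuild", full_start, today)
    if sorted(metadata["tickers"]) != sorted(required_tickers):
        # Required tickers changed, so the cached data no longer matches.
        return ("full_rebuild", full_start, today)
    last_cached = date.fromisoformat(metadata["last_date"])
    if last_cached >= today - timedelta(days=1):
        # Last entry is yesterday or newer: no API call needed.
        return ("use_cache", None, None)
    # Stale cache: request only the missing date range.
    return ("incremental", last_cached + timedelta(days=1), today)

today = date(2025, 6, 24)
meta = {"tickers": ["VIX", "DXY"], "last_date": "2025-06-20"}
plan = plan_macro_update(meta, ["DXY", "VIX"], today)
print(plan)
```

Keeping the decision separate from the download makes it easy to test the cache rules without touching the network.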
V210 is designed with the unreliability of yfinance in mind. It minimises dependency by using it only for non-critical macro data and wraps the calls in a robust caching layer that reduces API interaction by over 99% on subsequent runs.
"there are alternatives like those of google etc, but free is free and will always have issues with pulling data?"
1. How V210 Pulls Data from this Source
The framework does not use the Gemini API for "pulling data" in the traditional sense (like OHLCV). Instead, it uses this powerful "alternative" for complex, high-level tasks that a simple data API cannot perform.
- Source: Google Gemini API via requests.post.
- Data/Information Pulled:
- Strategic Configuration: Generating initial run parameters (get_initial_run_setup).
- Analysis: Interpreting cycle results to suggest changes (analyze_cycle_and_suggest_changes).
- Classification: Identifying asset classes from ticker symbols (classify_asset_symbols).
- Reasoning: Selecting the best model from a set of options based on a strategic goal (select_best_tradeoff).
2. Built-in Steps/Actions for Reliable Repeated Use
The framework anticipates that this external API can be a point of failure and builds in multiple layers of redundancy and error handling.
- Action 1: Automated Model Fallback: In the _call_gemini method, it defines both a primary_model ("gemini-2.0-flash") and a backup_model ("gemini-1.5-flash"). If the call to the primary model fails for any reason, it automatically retries the same request with the backup model.
- Action 2: Exponential Backoff & Retries: For each model, it attempts the API call multiple times with increasing delays ([5, 15, 30] seconds) if it fails. This handles transient network issues or temporary API outages gracefully.
- Action 3: Robust Response Parsing: The script does not assume the API will return perfect JSON. The _extract_json_from_response method uses JSONDecoder.raw_decode to scan the entire text response from the AI, finding and parsing the first valid JSON object it encounters. This prevents failures caused by the AI adding explanatory text before or after the JSON block.
- Action 4: Graceful Degradation & Validation: Every function that uses the Gemini API checks if the response is valid (e.g., is it a dictionary? does it contain the required keys?). If the API call fails after all retries or returns an invalid structure, the script logs an error and falls back to a default, safe behavior instead of crashing. For example, select_relevant_macro_tickers falls back to a core set of tickers, and get_initial_run_setup returns an empty dictionary, which aborts the run cleanly.
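A minimal sketch of the retry-then-fallback pattern from Actions 1 and 2 might look like this. It is not the actual _call_gemini method: call_with_fallback and the send callable are hypothetical, and I am assuming one attempt per delay value, with the sleep injected so the demo does not really wait.

```python
import time

def call_with_fallback(send, models, delays=(5, 15, 30), sleep=time.sleep):
    """Try each model in order, retrying with growing delays before moving on."""
    last_error = None
    for model in models:
        for delay in delays:
            try:
                return send(model)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                sleep(delay)
    raise RuntimeError(f"All models failed; last error: {last_error}")

# Demo with a fake sender and a recording "sleep" so nothing actually waits.
waits = []

def fake_send(model):
    if model == "gemini-2.0-flash":
        raise ConnectionError("primary unavailable")
    return f"ok from {model}"

result = call_with_fallback(
    fake_send, ["gemini-2.0-flash", "gemini-1.5-flash"], sleep=waits.append
)
print(result, waits)
```

In a real run, send would wrap the requests.post call, and the caller would catch the final RuntimeError to apply the safe default described in Action 4.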
V210 uses the Google API not as a "free data pull" but as a powerful analytical engine. It protects against its potential issues with a sophisticated, multi-layered reliability system that includes model fallbacks, request retries, robust response parsing, and graceful degradation.
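To illustrate the response parsing described in Action 3, here is a small sketch built on json.JSONDecoder.raw_decode from the standard library. The helper name extract_first_json_object and the sample reply are made up for the example; the real _extract_json_from_response may differ in detail.

```python
import json

def extract_first_json_object(text):
    """Scan free-form text and return the first valid JSON object, or None."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            obj, _end = decoder.raw_decode(text, pos)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON at this brace, keep scanning
        pos = text.find("{", pos + 1)
    return None

reply = 'Sure! Here is the config:\n{"strategy": "TrendFollower", "risk": 0.01}\nLet me know.'
parsed = extract_first_json_object(reply)
print(parsed)
```

Returning None rather than raising lets the caller decide on the fallback behaviour, which matches the graceful degradation described in Action 4.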
"It's better to either automate the OHLCV process directly through MT5/MT4 by linking it to excel or setting up an RPA?"
1. How V210 Pulls Data from this Source
The script's primary data source is local files, namely the CSV exports that you should download from MT4/MT5 first, not a web API.
- Source: Local .csv or .txt files in the BASE_PATH (the script's directory).
- Data Pulled: The DataLoader.load_and_parse_data function scans the directory for files matching a typical trading platform export format (e.g., EURUSD_H1.csv, XAUUSD_M15.csv). It reads these files to get the core OHLCV data for the entire backtest. This design presupposes that you have already performed the recommended step: exporting data from a reliable source like MT4/MT5.
2. Built-in Steps/Actions for Reliable Repeated Use
Since the source is local, reliability depends on the ability to consistently parse potentially inconsistent file formats. V210 has several features for this.
- Action 1: Flexible File Parsing: The _parse_single_file method is designed to handle common variations in data exports.
- Action 2: Automated Delimiter Detection: It checks whether the file uses tabs (\t) or commas (,) as delimiters.
- Action 3: Intelligent Column Discovery: It finds the date and time columns by searching for keywords ('DATE', 'TIME') rather than assuming fixed column names.
- Action 4: Column Name Normalization: It automatically cleans and standardizes the column headers (Open, High, Low, Close, RealVolume) to ensure consistency for the rest of the framework.
- Action 5: Graceful Error Handling: The entire file parsing process is wrapped in a try-except block. If one data file is corrupted or in an unreadable format, the framework logs the error, skips that specific file, and continues to load the others, preventing a total failure.
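The parsing steps in Actions 2 to 4 can be sketched with the standard csv module. This is a simplified stand-in for _parse_single_file, not the real method: parse_export, the COLUMN_NAMES mapping, and the sample rows (an assumed MT5-style export with angle-bracket headers) are illustrative only.

```python
import csv
import io

# Assumed mapping from export headers to the normalized names mentioned above.
COLUMN_NAMES = {"OPEN": "Open", "HIGH": "High", "LOW": "Low",
                "CLOSE": "Close", "VOL": "RealVolume", "VOLUME": "RealVolume"}

def parse_export(text):
    """Parse a tab- or comma-delimited OHLCV export into a list of row dicts."""
    first_line = text.splitlines()[0]
    delimiter = "\t" if first_line.count("\t") > first_line.count(",") else ","
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = [h.strip().strip("<>").upper() for h in next(reader)]

    # Locate date/time columns by keyword rather than by fixed position.
    date_col = next(i for i, h in enumerate(header) if "DATE" in h)
    time_col = next((i for i, h in enumerate(header)
                     if "TIME" in h and i != date_col), None)

    rows = []
    for record in reader:
        if not record:
            continue
        stamp = record[date_col]
        if time_col is not None:
            stamp += " " + record[time_col]
        row = {"Timestamp": stamp}
        for i, name in enumerate(header):
            if name in COLUMN_NAMES:
                row[COLUMN_NAMES[name]] = float(record[i])
        rows.append(row)
    return rows

sample = ("<DATE>\t<TIME>\t<OPEN>\t<HIGH>\t<LOW>\t<CLOSE>\t<VOL>\n"
          "2025.06.20\t10:00:00\t1.0850\t1.0860\t1.0841\t1.0853\t1200\n")
rows = parse_export(sample)
print(rows[0])
```

Wrapping each call to parse_export in a try-except at the file level would give the skip-and-continue behaviour described in Action 5.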
This is exactly the methodology V210 employs for its core data. The framework is architected to consume locally exported files from a source like MT4/MT5, completely avoiding the reliability issues of web APIs for its critical trading data. Its built-in actions ensure this local process is robust and repeatable.
@preptraditio
Quote: Still testing all the code, I seem to be unable to get a completed run, it is very slow but one question I wonder is:
What is the amount of data we should ingest for each pair? Is it really relevant to dumpt tickers from 2020 or should we focus on more recent data? And additionally, does it affect the speed?
For now I am backtesting one pair, EURUSD and added H1, Daily and M15 timeframes. With version v208 it was four days running non-stop and couldn't deliver any output.
I have not completed a full run in about two weeks, as I am fixing bugs and finding ways to improve the framework experience.
Question / Simple Answer
How does it test? / It uses a "Sliding Window" to test on unseen future data, one chunk at a time.
How does it train? / It uses only a recent slice of past data to train a model for the next test period.
What's the 5-fold split for? / It's a series of "mini-exams" inside the training data to find the best model settings.
Why doesn't the start date matter? / The model only ever sees the recent past, so it's always adapting to current conditions.
How much data is best? / At least 3-5 years, to allow for many robust test cycles.
Does old data affect speed? / Yes. It makes the first run much slower and the total backtest longer (which is good!).
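To make the "Sliding Window" idea concrete, here is a minimal sketch of how walk-forward cycles can be laid out over a dataset. The function name and the window sizes are assumptions for illustration, not the framework's actual settings.

```python
def walk_forward_windows(n_rows, train_size, test_size):
    """Yield (train_start, train_end, test_start, test_end), end-exclusive."""
    start = 0
    while start + train_size + test_size <= n_rows:
        train_end = start + train_size
        # The test chunk always sits immediately after the training slice.
        yield (start, train_end, train_end, train_end + test_size)
        start += test_size  # slide forward by one test chunk

windows = list(walk_forward_windows(n_rows=1000, train_size=500, test_size=100))
for w in windows:
    print(w)
```

With 1000 rows this gives five cycles. More history means more cycles, which is why a longer dataset lengthens the backtest but also makes the result more robust.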
To address part of the speed issue, I have introduced caching where possible, as this reduces the time from the initial start to a cycle. The main focus was the Feature Engineering cache, which I'm now using as a template for any other repeatable use cases in the framework. I'll explain below.
A Simple Analogy: The Gourmet Chef's Prep Station
Imagine you're a gourmet chef preparing a complex 10-course meal. The most time-consuming part is the "mise en place" — chopping all the vegetables, making the sauces, simmering the broths, and preparing the proteins. This takes hours.
Now, imagine if after doing all that prep work, you could perfectly preserve it in a special container. The next day, if someone asks for the exact same 10-course meal, you wouldn't spend hours chopping vegetables again. You would simply open your container and start cooking, turning a multi-hour process into a matter of minutes.
The Feature Engineering Cache in V210 is that special container for the "data prep work".
The Problem: Why a Cache is Necessary
The FeatureEngineer in the V210 script is incredibly powerful, but also computationally expensive. For every single candle in your dataset (which can be millions of rows), it calculates a large number of complex features:
- Standard Indicators: RSI, MACD, Bollinger Bands, ATR, ADX, etc.
- Statistical Features: Skew, Kurtosis, Rolling Betas, Quantiles.
- Advanced Signal Processing: Hilbert Transforms (for cycle analysis), Fourier Transforms, and Wavelet Transforms (pywt).
- Econometric Models: GARCH volatility models (arch).
- Fractal Analysis: The Hurst Exponent (hurst).
- Contextual Features: Merging data from higher timeframes (e.g., adding Daily context to an H1 chart).
- Dimensionality Reduction: Potentially running PCA on a subset of features.
Running this entire process from scratch on a large dataset can take a significant amount of time—from several minutes to even hours, depending on the data size and CPU power. If you want to run the framework multiple times to test different AI prompts or risk parameters, waiting for this step every single time would be a major bottleneck.
The V210 Solution: How the Cache Works
The caching mechanism is designed to be smart and automatic. It doesn't just save the data; it validates if the saved data is still relevant. Here's the step-by-step process:
Step 1: The First Run (A "Cache Miss")
- Full Process: The first time you run the script (or after you've changed the input data), the framework detects there is no valid cache. It executes the entire FeatureEngineer.create_feature_stack() process.
- Save the Results: Once the final, feature-rich dataframe (full_df) is created, the script saves it to the Cache/ directory as a highly efficient binary file: feature_cache.parquet.
- Create a "Receipt": Crucially, it also saves a "receipt" of this process in feature_cache_metadata.json. This metadata file, generated by _generate_cache_metadata(), contains a snapshot of the exact conditions under which the cache was created.
Step 2: The "Receipt" (Metadata Validation)
The metadata "receipt" stores two critical pieces of information:
- A Fingerprint of the Input Files: It records the name, file size, and last modification time of every single raw data file (.csv, .txt) used.
- A Fingerprint of the Engineering Parameters: It records the key settings in the configuration that would change the output of the feature engineering process (e.g., BOLLINGER_PERIOD, USE_PCA_REDUCTION, HAWKES_KAPPA, timeframe roles, etc.).
Step 3: Subsequent Runs (A "Cache Hit")
On every subsequent run, the framework performs a check before starting the expensive engineering:
- Generate Current Metadata: It creates a new metadata dictionary based on the current state of the data files and configuration.
- Compare Receipts: It compares this new, in-memory metadata with the feature_cache_metadata.json file saved on disk.
- Decision Time:
- IF they are identical: This is a "cache hit". The framework knows with certainty that neither the input data nor the calculation methods have changed. It completely skips the entire create_feature_stack process and instead loads the feature_cache.parquet file directly into memory.
- IF they are different: This is a "cache stale" condition. The framework knows something has changed (e.g., you added new historical data, or the AI changed the BOLLINGER_PERIOD). It discards the old cache as invalid and proceeds to re-run the full feature engineering process, creating a new cache and metadata file at the end.
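The "receipt" comparison in Steps 1 to 3 can be sketched as follows. This is a simplified stand-in, not the framework's _generate_cache_metadata(): the function names, the metadata layout, and the two example parameters are assumptions chosen to mirror the description above.

```python
import json
import os
import tempfile

def generate_cache_metadata(data_files, params):
    """Fingerprint the input files (name, size, mtime) and the key parameters."""
    files = {}
    for path in sorted(data_files):
        stat = os.stat(path)
        files[os.path.basename(path)] = {"size": stat.st_size, "mtime": stat.st_mtime}
    return {"files": files, "params": params}

def cache_is_valid(metadata_path, current):
    """Cache hit only if the saved receipt matches the current one exactly."""
    try:
        with open(metadata_path, "r", encoding="utf-8") as fh:
            return json.load(fh) == current
    except (OSError, json.JSONDecodeError):
        return False  # missing or corrupted receipt counts as a cache miss

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "EURUSD_H1.csv")
    with open(data_path, "w", encoding="utf-8") as fh:
        fh.write("Date,Open,High,Low,Close\n")
    meta_path = os.path.join(tmp, "feature_cache_metadata.json")

    params = {"BOLLINGER_PERIOD": 20, "USE_PCA_REDUCTION": False}
    with open(meta_path, "w", encoding="utf-8") as fh:
        json.dump(generate_cache_metadata([data_path], params), fh)

    unchanged = cache_is_valid(meta_path, generate_cache_metadata([data_path], params))
    changed = cache_is_valid(
        meta_path,
        generate_cache_metadata([data_path], {**params, "BOLLINGER_PERIOD": 30}),
    )

print(unchanged, changed)  # True False
```

Size and modification time are a cheap fingerprint. Hashing the file contents would be stricter but slower on large exports.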
Why This Helps with Processing and Run Times
The benefits of this caching system are enormous for iterative development and testing:
- Drastic Reduction in Run Time: This is the primary benefit. A process that might take 15 minutes on the first run will take less than 5 seconds on all subsequent runs (as long as the inputs don't change). It transforms the workflow from being painfully slow to nearly instantaneous for repeated tests.
- Faster Iteration and Experimentation: You can now tweak things that happen after feature engineering without penalty. For example, you can:
- Test different AI prompts and analysis_notes.
- Change risk parameters like MAX_DD_PER_CYCLE.
- Experiment with the number of OPTUNA_TRIALS.
- Try different LABELING_METHODs.
All these changes will trigger near-instant runs because the feature engineering step is skipped, allowing for rapid experimentation.
- Reduced CPU and Memory Load: Feature engineering is CPU-intensive. By loading a pre-computed file, the script uses significantly fewer system resources on subsequent runs, making it friendlier to less powerful machines or multi-tasking environments.
- Consistency and Reproducibility: By using a cache, you guarantee that the feature set is identical across multiple runs that share the same inputs. This is scientifically valuable because it removes a source of potential randomness, making it easier to isolate the impact of the changes you are making (e.g., in the model training or backtesting logic).
Quote: Why VIX is missing
- VIX is only fetched inside the helper get_macro_context_data() (see ~line 3485 ff.). That function builds a dictionary for reporting to the language-model, but the series is never merged into the price/feature DataFrame that you later pass into train().
- When the AI later proposes a new strategy it blindly inserts "VIX" into selected_features, so the config that reaches the trainer contains a column name that the data pipeline never created.
It feels like an underlying bug from YFinance?
In the previous buggy version, macro data was fetched but kept separate from the main features. As a result, the main feature dataframe (full_df) did not include important macro indicator columns like 'VIX' or 'DXY'. This created a data integration gap: the AI was instructed to use data that the pipeline never actually provided. This led to a KeyError crash whenever the trainer tried to access a missing column such as 'VIX'.
In contrast, the V210 corrected version fetches the macro data and immediately merges it into the main feature dataframe using a pd.merge_asof operation. Now, full_df contains columns for all the fetched macro indicators (including 'VIX', 'DXY', etc.). This change closes the data integration gap, allowing the trainer to successfully access the required columns and proceed as intended.
The bug was not in yfinance's data but in the framework's failure to properly integrate that data into the main pipeline.
The V210 Fix: A New Data Integration Step
The V210 code addresses this bug by adding one crucial step: explicitly merging the macro data into the main feature dataframe.
The corrected data flow in run_single_instance:
- Fetch Macro Data (Same as before):
The function is called and returns the macro_df dataframe containing VIX and other tickers.
Inserted Code
    # V210 - run_single_instance()
    logger.info("-> Integrating macroeconomic data as features...")
    macro_df = get_macro_context_data(tickers=ai_selected_tickers, ...)
- THE FIX - The Merge Operation (This is the new, critical code):
Immediately after fetching the macro_df, the script now checks if the dataframe is not empty and then performs a time-series-aware merge into full_df.
Inserted Code
    # V210 - run_single_instance()
    if not macro_df.empty:
        full_df.reset_index(inplace=True)
        # This is the line that fixes the bug.
        full_df = pd.merge_asof(full_df.sort_values('Timestamp'), macro_df.sort_values('Timestamp'), on='Timestamp', direction='backward')
        full_df.set_index('Timestamp', inplace=True)
    else:
        logger.warning(" - No macro data to merge. Macro features will be unavailable.")
- What pd.merge_asof does: It intelligently joins the two dataframes on the Timestamp column. For each row in full_df, it looks backward in time in macro_df and finds the most recent macro data available, joining it as new columns (like 'VIX', 'DXY', etc.). This ensures every candle in the main dataset now has the corresponding macro context as a feature.
- Consequence:
The full_df dataframe, which is the single source of truth for all subsequent training and testing, now actually contains a 'VIX' column.
- AI Suggestion (Now Harmless):
When the AI suggests using "VIX" in the selected_features, it is now a completely valid request because the feature exists in the data pipeline.
- Successful Training (No Crash):
When the ModelTrainer receives the config and slices the dataframe (df_train[feature_list]), it successfully finds the 'VIX' column, and the program continues without a KeyError.
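For anyone unsure what a backward as-of join actually does, here is a pure-Python illustration of the same matching rule using bisect. It is not pandas and not the framework's code; the helper name and the sample values are made up, and it only mirrors the direction='backward' behaviour (latest macro row at or before each candle).

```python
from bisect import bisect_right

def merge_asof_backward(candles, macro, value_key):
    """Attach to each candle the latest macro value at or before its timestamp."""
    macro_times = [t for t, _ in macro]  # must already be sorted ascending
    merged = []
    for ts, close in candles:
        i = bisect_right(macro_times, ts) - 1
        value = macro[i][1] if i >= 0 else None  # None if no earlier macro row exists
        merged.append({"Timestamp": ts, "Close": close, value_key: value})
    return merged

# Illustrative values only; ISO-style strings sort correctly as text.
macro = [("2025-06-19 00:00", 20.1), ("2025-06-20 00:00", 18.7), ("2025-06-23 00:00", 19.4)]
candles = [("2025-06-18 12:00", 1.0790), ("2025-06-20 10:00", 1.0853), ("2025-06-23 00:00", 1.0871)]

merged = merge_asof_backward(candles, macro, "VIX")
for row in merged:
    print(row)
```

The first candle gets no VIX value because no macro row exists yet at that time, which is the same situation that produces NaN in the pandas result and is worth handling before training.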
In summary:
- The previous version failed due to missing integration of macro data, causing crashes.
- The corrected version merges macro data properly, preventing errors and enabling smooth operation.
- #92
- Jun 23, 2025 5:06pm | Edited 5:53pm
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
The latest updated V210 (24/06/2025)
- End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py.txt
I am still testing so you may find bugs.
V210 Update: The Proactive & Creative Framework
The upgrade from V209 to V210 marks a fundamental shift in the framework's intelligence. Where V209 reacted to problems, V210 proactively adapts its entire personality based on performance. This is achieved through a major architectural evolution, shifting from a reactive intervention model to a proactive, state-driven framework.
The key theme is the introduction of a Dynamic Operating State Machine, which allows the system to fundamentally alter its risk posture and objectives based on its performance. This new system replaces the more simplistic EarlyInterventionConfig of V209.
Furthermore, V210 has been given a creative spark, introducing entirely new, advanced strategy generation paradigms. This includes the integration of Genetic Programming and MiniRocket (from sktime), greatly expanding the framework's analytical capabilities.
1. A Smarter, More Adaptable Brain: The New Operating State System
This is the most significant enhancement, moving from a simple alarm system to an intelligent, adaptive core that changes its behaviour based on results.
- What Changed? The framework now operates in one of three distinct "moods" or "states". This is managed by a new OperatingState Enum class:
- CONSERVATIVE_BASELINE: The default starting state. Its goal is to be cautious, protect capital, and find a stable, low-risk way to trade.
- AGGRESSIVE_EXPANSION: Activated after the system has been consistently profitable. It becomes more confident, taking on more risk to maximize profits.
- DRAWDOWN_CONTROL: Triggered automatically after a period of losses. It becomes extremely defensive, cutting risk and trade size to protect the account.
A new _apply_operating_state_rules function is called each cycle to enforce the rules for the current state.
- Why the Change? The old system was reactive; it only tried to fix things after several failures. The new system is proactive, constantly adjusting its behaviour to match the current performance "temperature." This leads to smarter risk management—preserving capital during rough patches and capitalizing on winning streaks.
- What's the Difference? The framework is no longer a one-size-fits-all tool. It automatically changes key risk parameters like the maximum allowed drawdown (max_dd_per_cycle) and the percentage of capital risked per trade (base_risk_pct) using a new STATE_BASED_CONFIG dictionary. The AI is also aware of these states via a new establish_strategic_directive method, making its strategic advice far more relevant to whether the goal is caution, aggression, or defence.
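A stripped-down sketch of the state machine idea is shown below. The names OperatingState, STATE_BASED_CONFIG, max_dd_per_cycle and base_risk_pct come from the framework, but every numeric value and the transition rule in next_state are hypothetical placeholders, and apply_operating_state_rules is a simplified stand-in for the real function.

```python
from enum import Enum

class OperatingState(Enum):
    CONSERVATIVE_BASELINE = "Conservative Baseline"
    AGGRESSIVE_EXPANSION = "Aggressive Expansion"
    DRAWDOWN_CONTROL = "Drawdown Control"

# Hypothetical values for illustration only.
STATE_BASED_CONFIG = {
    OperatingState.CONSERVATIVE_BASELINE: {"max_dd_per_cycle": 0.15, "base_risk_pct": 0.0075},
    OperatingState.AGGRESSIVE_EXPANSION: {"max_dd_per_cycle": 0.25, "base_risk_pct": 0.0150},
    OperatingState.DRAWDOWN_CONTROL: {"max_dd_per_cycle": 0.10, "base_risk_pct": 0.0025},
}

def next_state(cycle_pnls):
    """Assumed rule: 3 winning cycles -> aggressive, 2 losing cycles -> defensive."""
    if len(cycle_pnls) >= 3 and all(p > 0 for p in cycle_pnls[-3:]):
        return OperatingState.AGGRESSIVE_EXPANSION
    if len(cycle_pnls) >= 2 and all(p < 0 for p in cycle_pnls[-2:]):
        return OperatingState.DRAWDOWN_CONTROL
    return OperatingState.CONSERVATIVE_BASELINE

def apply_operating_state_rules(config, state):
    """Return a copy of the config with the state's risk parameters applied."""
    return {**config, **STATE_BASED_CONFIG[state]}

state = next_state([120.0, -40.0, -75.0])
config = apply_operating_state_rules({"OPTUNA_TRIALS": 50}, state)
print(state.name, config)
```

Keeping the risk parameters in one dictionary keyed by state means a state change touches a single lookup rather than scattered if-statements.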
2. Creative Strategy Generation: Evolving and Discovering New Ideas
The framework is no longer limited to its pre-defined playbook. It can now invent and discover new ways to trade.
- What Changed? Two powerful, cutting-edge strategy methods have been added:
- Genetic Programming: The framework can now invent and evolve its own trading rules from scratch. A new GeneticProgrammer class runs a survival-of-the-fittest simulation, and the AI method define_gene_pool provides the basic building blocks (like "RSI", "ADX", "is greater than") to start the process.
- MiniRocket Integration: A highly advanced technique from the sktime library has been integrated. The MiniRocket transform is exceptionally good at detecting complex patterns in market data that traditional indicators might miss.
- Why the Change? This was done to prevent the system from getting "stuck" using the same strategies that may no longer work in new market conditions. These additions give the framework a creative engine to find novel sources of profit.
- What's the Difference? The framework can now generate entirely new strategies that are not in its original playbook, such as GeneticTrendFollower or MiniRocketVolatility. If it gets into a rut, it can now try to invent its way out, leading to a more resilient and adaptive trading process over the long term.
3. Better Decision-Making: Prioritizing What Matters Most
The way the system optimizes its models has been completely overhauled to find a better balance between risk and reward.
- What Changed? Instead of optimizing for a single goal, the training process in the _optimize_hyperparameters function now balances two objectives at once. For example, it can now be told to find a model that maximizes the F1-score (which balances precision and recall) while also maximizing the number of trades it takes. This is known as multi-objective optimization.
- Why the Change? Real-world trading success is never about just one metric. A strategy that is highly profitable but also extremely risky is often undesirable. This new approach allows the framework to find models that represent a much healthier and more realistic balance.
- What's the Difference? The AI's role has been elevated. When the optimization process finds several good "trade-off" models, a new AI method called select_best_tradeoff is used. The AI analyses the options and picks the single best one that aligns with the framework's current OperatingState. This ensures the final model chosen for trading is perfectly aligned with the strategic goal of the moment.
- Clarification on Performance Metrics (Sharpe vs. Calmar) Calmar Ratio is a key metric. The framework uses both Sharpe and Calmar ratios, but for different purposes:
- Final Reporting: The PerformanceAnalyzer calculates both the Sharpe Ratio and the Calmar Ratio as standard metrics to evaluate the overall performance of a completed run. Both are included in the final text report.
- Model Optimization: The Calmar Ratio can also be used as a primary training objective. The framework's dynamic optimization system can be set to maximize_calmar during the model training phase, depending on the active OperatingState.
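The "trade-off" models mentioned above are usually the ones on the Pareto front: candidates that no other candidate beats on both objectives at once. A minimal sketch, assuming the two objectives are F1 and trade count; the Trial type and the sample numbers are made up for illustration, and the AI-driven select_best_tradeoff step is not reproduced here.

```python
from typing import List, NamedTuple


class Trial(NamedTuple):
    name: str
    f1: float
    n_trades: int


def dominates(a: Trial, b: Trial) -> bool:
    """a dominates b if it is at least as good on both objectives and better on one."""
    at_least_as_good = a.f1 >= b.f1 and a.n_trades >= b.n_trades
    strictly_better = a.f1 > b.f1 or a.n_trades > b.n_trades
    return at_least_as_good and strictly_better


def pareto_front(trials: List[Trial]) -> List[Trial]:
    """Keep only the trials that no other trial dominates."""
    return [t for t in trials if not any(dominates(o, t) for o in trials)]


trials = [
    Trial("A", f1=0.62, n_trades=12),
    Trial("B", f1=0.55, n_trades=40),
    Trial("C", f1=0.50, n_trades=35),  # worse than B on both objectives
    Trial("D", f1=0.45, n_trades=90),
]
front = pareto_front(trials)
```

Here A, B and D survive and C is dropped; choosing among the survivors is the judgment call that the post hands to the AI.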
4. Streamlining & Spring Cleaning
To make way for the new systems, outdated and overly complex parts of the code were removed.
- What Changed?
- The old, simplistic EarlyInterventionConfig system from V209 was completely removed.
- All code and dependencies related to the experimental LSTM model (tensorflow, keras) have been deleted.
- The psutil library and the complex IncrementalPCA logic for handling very large datasets were removed and simplified.
- Why the Change? The old intervention system was made obsolete by the much more advanced OperatingState machine. Removing unused code makes the framework faster, more reliable, and easier to maintain.
- What's the Difference? The framework is now leaner, more focused, and more efficient. By clearing out the old components, the new, more intelligent systems can operate without being burdened by legacy code.
***EDITED UPDATE***
Smaller update included but not mentioned above
One of the most significant advancements in V210 is the new AI self-diagnostic logic.
The Core Idea: An "AI Doctor" for the Trading Bot
Imagine the trading framework is a high-performance athlete. Sometimes, despite training hard, the athlete gets sick or injured.
- The Old Way: You might just tell the athlete to "try harder" or randomly change their diet, hoping something works. This is inefficient.
- The New V210 Way: The framework now has an "AI Doctor" on call. When the athlete (the framework) starts to fail, it doesn't just guess. It stops, runs a full diagnostic, and sends the "patient chart" to the AI Doctor for an expert opinion and a specific prescription.
This "AI Doctor" is the new self-diagnostic tool. It's not a single function, but a system of intelligent, targeted interventions where the framework and the AI collaborate to solve problems.
The 3-Step Diagnostic Loop
The entire logic follows a simple but powerful three-step loop:
- The Framework Detects a Specific Problem: The Python code is the "medical monitor." It's programmed to recognize specific, known failure patterns (e.g., "the model quality is too low," or "the strategy has stopped trading").
- The Framework Gathers Evidence (The "Patient Chart"): It doesn't just tell the AI "it's broken." It collects all the relevant data about the failure. This includes recent performance metrics, model training statistics, the current configuration, and what the market looks like.
- The AI Diagnoses and Prescribes a Solution (The "Prescription Pad"): The framework sends this "patient chart" to the Gemini AI with a highly structured prompt. Crucially, it asks the AI to choose from a limited menu of specific, actionable solutions. The AI acts as the specialist, using its vast knowledge to analyse the evidence and choose the most logical prescription.
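The three steps can be sketched as below. This is an offline illustration, not the framework's code: the function names, the 0.40 quality gate and the fake_ai stand-in are assumptions, and the action names are the ones listed under Scenario 1.

```python
import json
from typing import Callable, Optional

ALLOWED_ACTIONS = (
    "ADJUST_LABELING_DIFFICULTY",
    "REFINE_FEATURE_SET",
    "CHANGE_LABELING_METHOD",
    "SWITCH_STRATEGY",
)


def detect_problem(f1_score: float, n_trades: int, min_f1: float = 0.40) -> Optional[str]:
    """Step 1: recognise a known failure pattern (the threshold is hypothetical)."""
    if f1_score < min_f1:
        return "MODEL_QUALITY_TOO_LOW"
    if n_trades == 0:
        return "STRATEGY_NOT_TRADING"
    return None


def build_patient_chart(problem: str, metrics: dict, config: dict) -> str:
    """Step 2: bundle the evidence and the menu of allowed actions into one prompt."""
    chart = {
        "problem": problem,
        "metrics": metrics,
        "config": config,
        "allowed_actions": list(ALLOWED_ACTIONS),
    }
    return "Choose exactly one allowed action. Reply as JSON.\n" + json.dumps(chart, indent=2)


def prescribe(prompt: str, ask_ai: Callable[[str], str]) -> dict:
    """Step 3: ask the model, then reject anything outside the menu."""
    reply = json.loads(ask_ai(prompt))
    if reply.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"AI proposed an unsupported action: {reply.get('action')!r}")
    return reply


def fake_ai(prompt: str) -> str:
    # Stand-in for the Gemini call so the sketch runs without network access.
    return json.dumps({"action": "ADJUST_LABELING_DIFFICULTY", "tp_sl_ratio": 1.5})


problem = detect_problem(f1_score=0.31, n_trades=4)
prompt = build_patient_chart(problem, {"f1": 0.31, "trades": 4}, {"tp_sl_ratio": 3.0})
decision = prescribe(prompt, fake_ai)
```

Validating the reply against the menu is what keeps a free-text model from steering the run into an action the framework cannot execute.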
Key Scenarios Where the AI Doctor is Called
Here are the most important real-world examples of this diagnostic logic in action within V210:
Scenario 1: The Mid-Cycle Training Failure
- The Trigger: The ModelTrainer tries to build a new model for the next cycle, but after 2 attempts, the model is still not good enough (its F1 Score is below the quality gate).
- The "Patient Chart" (Evidence Sent to AI):
- Heuristic Pre-Analysis: A summary from the framework saying, "I think the problem is X." For example: "Heuristic Pre-Analysis suggests a Fundamental Labelling Issue: The R:R ratio of 3:1 is too hard and only 1% of data is being labelled as a trade."
- Raw Failure Data: A log of the failed training attempts.
- Current Configuration: The exact parameters that are failing.
- The "Prescription Pad" (AI's Options): The AI is asked to choose one of these actions:
- ADJUST_LABELING_DIFFICULTY: "Make the task easier. Suggest a more achievable R:R ratio (e.g., 1.5:1)."
- REFINE_FEATURE_SET: "The features are too noisy. Suggest a new, cleaner list of features."
- CHANGE_LABELING_METHOD: "The 'standard' way of defining trades isn't working. Switch to a more adaptive method."
- SWITCH_STRATEGY: "This strategy is fundamentally broken for this market. Pick a new, simpler one from the playbook."
- The Outcome: The framework receives the AI's choice (e.g., ADJUST_LABELING_DIFFICULTY with new parameters) and immediately applies it, then retries the training.
Scenario 2: The Baseline Establishment Failure
- The Trigger: The framework is in its initial CONSERVATIVE_BASELINE state, but for 2+ cycles, it has either failed to train a model or the model it trained executed zero trades. This is a critical failure—the system is stuck.
- The "Patient Chart": Similar to the above, but the pre-analysis highlights the "failure to trade" as the primary symptom. This tells the AI that the model is being too conservative.
- The "Prescription Pad": The same options as above, but now the AI is heavily guided to choose an option that encourages more trading, like making the labelling easier or switching to a less complex strategy.
- The Outcome: A major change is applied to the configuration to "un-stick" the framework and get it trading.
Scenario 3: The Chronic Strategy Failure ("Quarantine")
This is a higher-level diagnostic for when a strategy has proven to be a consistent loser over time.
- The Trigger: A strategy has failed so many times that the framework's internal logic adds it to a quarantine_list.
- The "Patient Chart": The AI is shown the history of failures, the list of quarantined strategies, and the playbook of available, healthy strategies it can choose from. It's also shown the "personal best" configuration from the current run, which acts as a safe anchor point.
- The "Prescription Pad":
- Revert: "Play it safe. Go back to the best-performing strategy from this run."
- Explore: "The current approach is flawed. Pick a new, simpler strategy from the available playbook."
- Invent: (A special, advanced option) "This is a chance for innovation. Invent a brand new hybrid strategy from scratch."
- The Outcome: The framework makes a major strategic pivot for the next cycle, either resetting to a known good state or attempting a radical new approach to break the losing streak.
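A minimal sketch of the bookkeeping behind a quarantine_list. The failure threshold and the StrategyHealth class are assumptions, since the post only says a strategy is quarantined after failing "so many times"; the two strategy names are the examples mentioned earlier in the thread.

```python
from collections import Counter
from typing import List

MAX_FAILURES = 3  # hypothetical threshold


class StrategyHealth:
    def __init__(self, playbook: List[str]):
        self.playbook = list(playbook)
        self.failures = Counter()
        self.quarantine_list: List[str] = []

    def record_failure(self, strategy: str) -> None:
        """Count a failed cycle and quarantine the strategy once it hits the limit."""
        self.failures[strategy] += 1
        if self.failures[strategy] >= MAX_FAILURES and strategy not in self.quarantine_list:
            self.quarantine_list.append(strategy)

    def healthy_strategies(self) -> List[str]:
        """Strategies the AI may still pick when asked to Explore."""
        return [s for s in self.playbook if s not in self.quarantine_list]


health = StrategyHealth(["GeneticTrendFollower", "MiniRocketVolatility"])
for _ in range(3):
    health.record_failure("MiniRocketVolatility")
health.record_failure("GeneticTrendFollower")
```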
Why This Self-Diagnostic Logic is Powerful
- Moves from Brute-Force to Intelligence: Instead of randomly trying new parameters, the framework makes targeted, evidence-based changes.
- Solves Real Bottlenecks: It directly addresses the most common and frustrating problems in automated strategy development, like models that won't train or strategies that never trade.
- Dynamic and Adaptive: The "prescriptions" are not hard-coded. The AI's decision is based on the current market data and performance history, making the solution relevant to the present moment.
- Creates a Learning Loop: By analysing its own failures and taking corrective action, the framework learns over time what kind of strategies and parameters work, improving its long-term resilience and performance.
- #93
- Jun 24, 2025 12:36am
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
The latest updated V210 (24/06/2025) End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py.txt I am still testing so you may find bugs. V210 Update: The Proactive & Creative Framework The upgrade from V209 to V210 marks a fundamental shift in the framework's intelligence. Where V209 reacted to problems, V210 proactively adapts its entire personality based on performance. This is achieved through a major architectural evolution, shifting from a reactive intervention model to a proactive, state-driven framework. The key theme is the introduction...
Where I am currently...(V210)
After hours of 'testing' and some rest, I think I have a solution for the 'no trades' issue. I am testing this theory as I write, so this is just thinking out loud.
The current framework has created a "never-ending carrot and stick" scenario.
Here is a breakdown of why this analogy is so fitting:
- The "Carrot" (A High F1 Score): The _find_best_threshold function is programmed to be a perfectionist. Its only goal is to find the confidence level that produces the highest possible F1-score (a measure of accuracy) on the validation data.
- The "Stick" (An Unrealistic Gate): To achieve that "perfect" F1-score, the function looks at the model's predictions and realises that the only way to be highly accurate is to ignore most signals and only consider the handful of predictions where the model was exceptionally confident (e.g., >90%). It therefore raises the confidence gate to this extreme level.
- The Result (The Stick Hits): The framework passes the training phase with a theoretically high-quality model. However, when it enters the forward-test with live data, it never finds a new signal that meets this impossibly high confidence bar. The result is zero trades.
The cycle then repeats, constantly chasing the "carrot" of a perfect F1-score, only to be punished by the "stick" of a confidence gate that prevents any real-world action.
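A toy illustration of that cycle, assuming the threshold search simply scans candidate gates and keeps the one with the best validation F1. The real _find_best_threshold is not shown in the thread, so the function names and the sample probabilities here are made up.

```python
from typing import List, Sequence, Tuple


def f1_at_threshold(probs: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 of the positive class when a signal fires only at prob >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def find_best_threshold(probs: Sequence[float], labels: Sequence[int],
                        candidates: List[float]) -> Tuple[float, float]:
    """Return (threshold, f1) for the candidate with the highest validation F1."""
    scored = [(f1_at_threshold(probs, labels, t), t) for t in candidates]
    best_f1, best_t = max(scored)
    return best_t, best_f1


# Validation set: the only winners are the few very confident predictions.
val_probs = [0.95, 0.93, 0.91, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.50]
val_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
best_t, best_f1 = find_best_threshold(val_probs, val_labels, [0.5, 0.6, 0.7, 0.8, 0.9])

# Forward test: decent but not exceptional confidence, so nothing clears the gate.
forward_probs = [0.68, 0.72, 0.66, 0.81, 0.59]
forward_trades = sum(1 for p in forward_probs if p >= best_t)
```

On this data the search settles on a 0.9 gate with a perfect validation F1, and the forward run then takes zero trades.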
The Solution
The proposal is to decouple the gate from the threshold search and use a static, hardcoded value as a way to break this cycle and diagnose the true issue.
By setting a static gate (e.g., 70%), we are no longer asking the framework to be perfect. Instead, we are forcing it to answer a much more direct and useful question:
"Forgetting about a perfect F1-score, is your model fundamentally capable of producing any predictions that clear a reasonable, consistent hurdle of 70% confidence?"
This will give us a clear answer:
- If it starts trading, we know the model itself is viable and the dynamic threshold-finding logic was the component causing the paralysis.
- If it still doesn't trade, we know the problem is deeper—the model is truly incapable of generating any signals with reasonable confidence, and we need to look at the features or strategy again.
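The two outcomes can be sketched as a small check on the forward-test probabilities. The diagnose_gate helper is hypothetical; 0.70 is the example gate from the post.

```python
from typing import Sequence

STATIC_CONFIDENCE_GATE = 0.70  # example value from the post


def diagnose_gate(forward_probs: Sequence[float], gate: float = STATIC_CONFIDENCE_GATE) -> str:
    """Classify a forward-test run under a fixed confidence gate."""
    n_signals = sum(1 for p in forward_probs if p >= gate)
    if n_signals > 0:
        return f"{n_signals} signal(s) cleared {gate:.0%}: model viable, dynamic threshold was the blocker"
    return f"no signal cleared {gate:.0%}: revisit features or strategy"


trading_case = diagnose_gate([0.72, 0.65, 0.81])
stuck_case = diagnose_gate([0.65, 0.68])
```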
Updates to follow.
I am getting this error when running with EURUSD datafiles from 2020 exported from MT5.
Inserted Code
2025-06-24 16:17:35,858 - INFO - Attempting to call Gemini API with model: gemini-2.0-flash
2025-06-24 16:17:36,852 - INFO - Successfully received and extracted text response from model: gemini-2.0-flash
2025-06-24 16:17:36,853 - INFO - Successfully extracted JSON object using JSONDecoder.raw_decode.
2025-06-24 16:17:36,853 - INFO - - AI selected 4 relevant tickers.
2025-06-24 16:17:36,853 - INFO - -> Feature Caching is ENABLED. Checking for a valid cache...
2025-06-24 16:17:36,854 - INFO - - No valid cache found. Engineering features...
2025-06-24 16:17:36,854 - INFO - -> Stage 2: Engineering Features...
2025-06-24 16:17:36,859 - INFO - - [1/1] Processing all features for symbol: EURUSD
2025-06-24 16:20:12,403 - CRITICAL - A critical, unhandled error occurred during run 1: must be real number, not tuple
Traceback (most recent call last):
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 4805, in main
    run_single_instance(fallback_config, framework_history, playbook, nickname_ledger, directives, api_interval_seconds)
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 4315, in run_single_instance
    full_df = fe.create_feature_stack(data_by_tf)
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 2012, in create_feature_stack
    processed_symbol_df = self._process_single_symbol_stack(symbol_specific_data)
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 1867, in _process_single_symbol_stack
    df = self._calculate_hurst_exponent(df) # This now creates 'hurst_exponent' and 'hurst_intercept'
  File "/home/geminiBot/End_To_End_Advanced_ML_Trading_Framework_PRO_V210_Linux.py", line 1757, in _calculate_hurst_exponent
    results = df['Close'].rolling(window=window).apply(apply_hurst, raw=False)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 2043, in apply
    return super().apply(
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 1503, in apply
    return self._apply(
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 617, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 470, in _apply_blockwise
    return self._apply_series(homogeneous_func, name)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 454, in _apply_series
    result = homogeneous_func(values)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 612, in homogeneous_func
    result = calc(values)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 609, in calc
    return func(x, start, end, min_periods, *numba_args)
  File "/home/geminiBot/.venv/lib/python3.10/site-packages/pandas/core/window/rolling.py", line 1530, in apply_func
    return window_func(values, begin, end, min_periods)
  File "aggregations.pyx", line 1423, in pandas._libs.window.aggregations.roll_apply
TypeError: must be real number, not tuple
Update, this can be fixed by replacing _calculate_hurst_exponent function with this one:
Inserted Code
def _calculate_hurst_exponent(self, df: pd.DataFrame, window: int = 100) -> pd.DataFrame:
    """
    Calculates the Hurst Exponent (H) and the intercept (c) on a rolling basis.
    H indicates the tendency of a time series (trending vs. mean-reverting).
    c (the intercept) can be a supplementary feature.
    """
    # Relies on module-level names: np (numpy), pd (pandas), compute_Hc and HURST_AVAILABLE.
    if not HURST_AVAILABLE:
        return df
    # Initialize columns
    df['hurst_exponent'] = np.nan
    df['hurst_intercept'] = np.nan
    # Get the Close prices
    close_prices = df['Close'].values
    # Only process if we have enough data
    if len(close_prices) < window:
        return df
    # Look up the column positions once instead of on every iteration
    h_col = df.columns.get_loc('hurst_exponent')
    c_col = df.columns.get_loc('hurst_intercept')
    # Calculate Hurst for each window
    for i in range(window, len(close_prices) + 1):
        try:
            window_data = close_prices[i-window:i]
            H, c, _ = compute_Hc(window_data, kind='price', simplified=True)
            df.iloc[i-1, h_col] = H
            df.iloc[i-1, c_col] = c
        except Exception:
            # Skip this window if there's an error
            continue
    # Forward fill any NaN values (plain assignment; fillna(method=..., inplace=True) is deprecated)
    df['hurst_exponent'] = df['hurst_exponent'].ffill()
    df['hurst_intercept'] = df['hurst_intercept'].ffill()
    return df
- #95
- Jun 24, 2025 4:08pm
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
I am getting this error when running with EURUSD datafiles from 2020 exported from MT5. 2025-06-24 16:17:35,858 - INFO - Attempting to call Gemini API with model: gemini-2.0-flash 2025-06-24 16:17:36,852 - INFO - Successfully received and extracted text response from model: gemini-2.0-flash 2025-06-24 16:17:36,853 - INFO - Successfully extracted JSON object using JSONDecoder.raw_decode. 2025-06-24 16:17:36,853 - INFO - - AI selected 4 relevant tickers. 2025-06-24 16:17:36,853 - INFO - -> Feature Caching is ENABLED. Checking for a valid cache......
The Hurst calculation was one of the issues I had initially too, and it is one to keep an eye on as you further develop the Feature Engineer. Other things you may or may not find are float32 and tuple errors, but in my testing I haven't had any of these since V208.
Unfortunately the framework (the version I am testing) has over 5000 lines of code, and I miss a few things every now and then because I am still trying to improve the signal quality so that it at least trades after training.
{quote} The Hurst calculation was one of the issues I had too initially and to keep an eye on as you further develop the Feature Engineer. Other things you may or may not find are float32 and tuple errors but from my testing I haven't had any of these since V208. Unfortunately the framework (the version I am testing) has over 5000 lines of code and I miss a few things every now and then as I am still trying to improve the signal quality to at least trade after the training.
If you haven't come across it yet, maybe replace plot_shap_summary with:
Inserted Code
def plot_shap_summary(self, shap_summary: pd.DataFrame):
    plt.style.use('seaborn-v0_8-darkgrid')
    plt.figure(figsize=(12, 10))
    # Check if shap_summary is empty or has no data
    if shap_summary.empty or len(shap_summary) == 0:
        plt.text(0.5, 0.5, 'No SHAP data available\n(No trades were executed)',
                 ha='center', va='center', transform=plt.gca().transAxes,
                 fontsize=14, color='gray')
        plt.axis('off')
    else:
        # Only plot if we have data
        shap_summary.head(20).sort_values(by='SHAP_Importance').plot(kind='barh', legend=False, color='mediumseagreen')
        plt.xlabel("Mean Absolute SHAP Value", fontsize=12)
        plt.ylabel("Feature", fontsize=12)
    title_str = f"{self.config.nickname or self.config.REPORT_LABEL} ({self.config.strategy_name}) - Aggregated Feature Importance"
    plt.title(title_str, fontsize=16, weight='bold')
    plt.tight_layout()
    try:
        plt.savefig(self.config.SHAP_PLOT_PATH)
        plt.close()
        logger.info(f" - SHAP summary plot saved to: {self.config.SHAP_PLOT_PATH}")
    except Exception as e:
        logger.error(f" - Failed to save SHAP plot: {e}")
And then replace:
Inserted Code
pd.DataFrame.from_dict(shap_history, orient='index').mean(axis=1).sort_values(ascending=False).to_frame('SHAP_Importance')
With:
Inserted Code
pd.DataFrame.from_dict(shap_history, orient='index').mean(axis=1).sort_values(ascending=False).to_frame('SHAP_Importance') if shap_history else pd.DataFrame(columns=['SHAP_Importance'])
Will rerun it. Also, have you considered that, given the number of models it runs, the answer may not be to speed things up by lowering the threshold in the carrot-and-stick scenario but to add computing power? This probably needs something like 32 CPUs to run quickly.
We could use a Spot VM on Google Cloud, for example, to execute the script, download the output and pay only for the time it is running. That way we could easily set up a VM with 32-64 CPUs and 64 GB of RAM. Maybe with that it actually manages to generate an output?
- #97
- Jun 25, 2025 1:18am | Edited 3:22am
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
{quote} Where I am currently...(V210) After hours of 'testing' and some rest I think I have a solution for the 'no trades' issue, obviously I am testing this theory as I write this - just thinking out loud. The current framework has created a "never-ending carrot and stick scenario,". Here is a breakdown of why this analogy is so fitting: The "Carrot" (A High F1 Score): The _find_best_threshold function is programmed to be a perfectionist. Its only goal is to find the confidence level that produces the highest possible F1-score (a measure of accuracy)...
This probably should be V211 but it is what it is.
I've been rubbish with the naming convention. What happened was that I was testing on my Windows PC and then switched to my Linux VPS, hence the 'Linux' in the filename, but the .py is actually cross-platform.
There are a few new imports I've been testing, which is why I haven't posted an update. My VPS only has 8 GB of RAM, so the whole project is based on building as much as I can into a small system with efficiency in mind, so that most people can run this framework without hitting OOM (out-of-memory) errors.
- End_To_End_Advanced_ML_Trading_Framework_PRO_V210.py.txt
This V210 release further refines and expands upon that foundation, focusing on deeper market understanding, more robust model training, increased operational intelligence, and enhanced realism.
I. Enhanced Adaptability & Intelligence
The framework's ability to adapt to market conditions and internal states has been significantly upgraded:
- Advanced Operating States:
- What's Different: The OperatingState Enum has been expanded beyond CONSERVATIVE_BASELINE, AGGRESSIVE_EXPANSION, and DRAWDOWN_CONTROL. Two new states have been added:
- OPPORTUNISTIC_SURGE: To capitalize on sudden market volatility spikes.
- MAINTENANCE_DORMANCY: To pause trading during predictable low-liquidity periods (e.g., weekends, year-end holidays) or for system maintenance.
- How It Affects: The framework can now enter more nuanced behavioral modes. For example, it can temporarily increase risk and trade frequency during a detected volatility surge or automatically pause operations during unfavorable market hours.
- Why Made: To allow more granular control over the framework's risk-taking and activity levels, aligning its behavior more closely with real-world market dynamics and operational best practices.
- What to Expect: More intelligent risk management, potential for capturing short-term opportunities in volatile markets, and increased operational stability by avoiding trading in known adverse conditions.
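A minimal sketch of how the two new states might be selected. The five state names come from the post; the weekend rule, the surge_ratio of 2.0 and the pick_special_state helper are assumptions for illustration.

```python
from datetime import datetime
from enum import Enum, auto


class OperatingState(Enum):
    CONSERVATIVE_BASELINE = auto()
    AGGRESSIVE_EXPANSION = auto()
    DRAWDOWN_CONTROL = auto()
    OPPORTUNISTIC_SURGE = auto()
    MAINTENANCE_DORMANCY = auto()


def pick_special_state(now: datetime, current_vol: float, average_vol: float,
                       fallback: OperatingState, surge_ratio: float = 2.0) -> OperatingState:
    """Override the fallback state on weekends or during a volatility spike."""
    if now.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return OperatingState.MAINTENANCE_DORMANCY
    if average_vol > 0 and current_vol / average_vol >= surge_ratio:
        return OperatingState.OPPORTUNISTIC_SURGE
    return fallback


saturday = pick_special_state(datetime(2025, 6, 21, 12), 1.0, 1.0,
                              OperatingState.CONSERVATIVE_BASELINE)
spike = pick_special_state(datetime(2025, 6, 24, 12), 2.5, 1.0,
                           OperatingState.CONSERVATIVE_BASELINE)
```

Checking dormancy before the surge keeps a weekend gap from being mistaken for a tradable volatility spike.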
- "AI Doctor": Advanced Root-Cause Analysis for Training Failures:
- What's Different: The GeminiAnalyzer.propose_mid_cycle_intervention method has been overhauled. When training failures occur, the framework now:
- Conducts feature learnability tests (using Mutual Information) to see if selected features have predictive power for the current labels.
- Generates a label distribution report to identify severe class imbalances.
- Passes this detailed diagnostic summary to the AI.
- The AI's prompt is redesigned to act as an "AI Doctor," guiding it to diagnose the root cause and prescribe more targeted interventions (e.g., RUN_DIAGNOSTIC_ENSEMBLE to test baseline learnability, ADJUST_LABELING_DIFFICULTY, TEST_SHORT_HORIZON_LABELS).
- How It Affects: Instead of generic retries or simple parameter tweaks, the AI can now make more informed decisions based on data-driven diagnostics about why a model might be failing to train.
- Why Made: To significantly improve the framework's ability to self-heal and overcome persistent training issues, moving beyond simple trial-and-error.
- What to Expect: Fewer "stuck" runs where the model repeatedly fails to train. The AI is more likely to identify and suggest fixes for fundamental issues like poor feature relevance or problematic label definitions. This is a unique feature enhancing the framework's autonomy.
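The two diagnostics can be sketched with the standard library alone. This is a simplified stand-in that assumes the feature has already been binned into integers; how the framework itself computes Mutual Information is not shown in the thread.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def label_distribution(labels: Sequence[int]) -> Dict[int, float]:
    """Share of each class; a tiny share for the trade class signals severe imbalance."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in sorted(counts.items())}


def mutual_information(feature_bins: Sequence[int], labels: Sequence[int]) -> float:
    """Mutual information (in nats) between a discretised feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    px = Counter(feature_bins)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


labels = [0, 0, 1, 1]
informative = mutual_information([0, 0, 1, 1], labels)  # feature mirrors the label
useless = mutual_information([0, 1, 0, 1], labels)      # feature independent of the label
imbalance = label_distribution([0] * 99 + [1])
```

A feature set whose scores all sit near zero suggests the labels cannot be learned from those inputs, which is a different problem from a class that covers only 1% of the rows.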
- Dynamic Indicator Parameterization:
- What's Different: A new DYNAMIC_INDICATOR_PARAMS dictionary in ConfigModel allows core indicator parameters (like Bollinger Band period/std-dev, RSI period) to be automatically adjusted based on the prevailing market regime (a combination of volatility and trend, e.g., "HighVolatility_Trending", "LowVolatility_Ranging"). The FeatureEngineer now uses these dynamic parameters.
- How It Affects: Indicators like RSI and Bollinger Bands will behave differently in different market types, potentially making them more effective.
- Why Made: Standard fixed-parameter indicators often perform poorly across diverse market conditions. This change allows indicators to self-optimize their sensitivity.
- What to Expect: Features generated by these indicators should be more adaptive and potentially more predictive, as their lookback periods and sensitivity levels will better match the current market "personality".
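A minimal sketch of a regime-keyed parameter table. Only the two regime names are taken from the post; the parameter keys, the numbers, the "Default" entry and both helper functions are assumptions.

```python
from typing import Dict

# Hypothetical periods for illustration only.
DYNAMIC_INDICATOR_PARAMS: Dict[str, Dict[str, float]] = {
    "HighVolatility_Trending": {"bollinger_period": 15, "bollinger_std_dev": 2.5, "rsi_period": 10},
    "LowVolatility_Ranging": {"bollinger_period": 30, "bollinger_std_dev": 1.8, "rsi_period": 21},
    "Default": {"bollinger_period": 20, "bollinger_std_dev": 2.0, "rsi_period": 14},
}


def classify_regime(volatility_rank: float, trend_strength: float) -> str:
    """Combine a volatility percentile (0-1) and an ADX-style reading into a regime key."""
    vol = "HighVolatility" if volatility_rank >= 0.5 else "LowVolatility"
    trend = "Trending" if trend_strength >= 25 else "Ranging"
    return f"{vol}_{trend}"


def params_for(regime: str) -> Dict[str, float]:
    """Look up the regime, falling back to the default set for unlisted regimes."""
    return DYNAMIC_INDICATOR_PARAMS.get(regime, DYNAMIC_INDICATOR_PARAMS["Default"])


regime = classify_regime(volatility_rank=0.8, trend_strength=30)
rsi_period = params_for(regime)["rsi_period"]
```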
II. Sophisticated Feature Engineering & Selection
Major improvements have been made to how features are created, selected, and managed:
- New Microstructure & Advanced Volatility Features:
- What's Different: The FeatureEngineer now calculates a suite of new, more advanced features:
- Microstructure: Volatility Displacement (_calculate_displacement), Price Gaps (_calculate_gaps), Candle Info (_calculate_candle_info).
- Alternative Volatility Estimators: Parkinson Volatility, Yang-Zhang Volatility.
- Trend/Reversal Proxies: KAMA-based Trend (_calculate_kama_regime), Trend Pullbacks (_calculate_trend_pullback_features), Momentum Divergences (_calculate_divergence_features).
- How It Affects: The model has access to a much richer and more nuanced set of inputs, capturing subtle market dynamics beyond standard technical indicators.
- Why Made: To provide the ML models with deeper insights into market structure, short-term price action, and true volatility, aiming for more robust signal generation.
- What to Expect: Potentially improved model performance due to more informative features. The feature set is now significantly more comprehensive.
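For reference, the Parkinson estimator uses only the high-low range of each bar. A minimal sketch of the textbook per-bar formula; the function name is hypothetical and any annualisation or rolling window the framework applies is not shown.

```python
import math
from typing import Sequence


def parkinson_volatility(highs: Sequence[float], lows: Sequence[float]) -> float:
    """Per-bar Parkinson volatility: sqrt( sum(ln(H/L)^2) / (4 * n * ln 2) )."""
    if len(highs) != len(lows) or len(highs) == 0:
        raise ValueError("highs and lows must be non-empty and the same length")
    sum_sq = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))
    return math.sqrt(sum_sq / (4.0 * len(highs) * math.log(2.0)))


quiet = parkinson_volatility([1.1010, 1.1012, 1.1011], [1.1005, 1.1006, 1.1004])
busy = parkinson_volatility([1.1050, 1.1060, 1.1040], [1.0990, 1.0980, 1.0970])
```

Wider bars give a higher estimate, so busy comes out larger than quiet.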
- Kalman Filtering for Signal Smoothing:
- What's Different: Key indicators like RSI, ADX, and Stochastic %K are now smoothed using a Kalman Filter (_apply_kalman_filter in FeatureEngineer).
- How It Affects: This reduces noise in these indicators, potentially leading to more stable and reliable signals derived from them.
- Why Made: Raw indicators can be choppy and generate false signals. Kalman filtering provides an adaptive way to denoise them.
- What to Expect: Features derived from these smoothed indicators might be less prone to whipsaws and could improve model stability. This is a unique signal processing step integrated into the feature pipeline.
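A minimal sketch of a scalar Kalman filter used as a smoother, assuming a random-walk state model. The two variance settings are illustrative, and this is not necessarily what _apply_kalman_filter does internally.

```python
from typing import List, Sequence


def kalman_smooth(values: Sequence[float], process_var: float = 1e-3,
                  measurement_var: float = 1e-1) -> List[float]:
    """Scalar Kalman filter with a random-walk state model."""
    if len(values) == 0:
        return []
    estimate = float(values[0])
    error = 1.0
    smoothed = [estimate]
    for z in values[1:]:
        # Predict: the state is assumed unchanged, so only the uncertainty grows.
        error += process_var
        # Update: blend prediction and measurement according to the Kalman gain.
        gain = error / (error + measurement_var)
        estimate += gain * (z - estimate)
        error *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed


raw_rsi = [50.0, 70.0, 30.0, 70.0, 30.0, 70.0, 30.0]
smooth_rsi = kalman_smooth(raw_rsi)
```

A larger measurement_var relative to process_var makes the filter trust new readings less, which gives a smoother but slower indicator.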
- Advanced Feature Selection Methods:
- What's Different:
- A new FEATURE_SELECTION_METHOD configuration option allows choosing between 'trex' (TRexSelector algorithm) or 'mutual_info' (existing MI-based selection).
- TRexSelector (_select_features_with_trex in ModelTrainer) has been integrated, offering a sophisticated, FDR-controlled method for identifying relevant features.
- The Mutual Information based selection (_select_elite_features) has also been refined.
- How It Affects: The framework can now employ more advanced techniques to automatically select the most potent features for the model, reducing noise and dimensionality.
- Why Made: To improve model generalization, reduce overfitting, and potentially speed up training by focusing on the most impactful features.
- What to Expect: Models might be trained on smaller, more powerful feature sets, potentially leading to better out-of-sample performance. Users can experiment with different advanced selection algorithms.
- Reliable Feature Caching with Script Integrity Check:
- What's Different: The feature cache validation logic (_generate_cache_metadata) now includes:
- A SHA256 hash of the running script file.
- The DYNAMIC_INDICATOR_PARAMS configuration.
- How It Affects: The feature cache will be automatically invalidated not only if data files or key parameters change, but also if the underlying feature engineering code in the script itself is modified.
- Why Made: To prevent using stale cached features when the logic for generating them has changed, ensuring data integrity and preventing subtle bugs.
- What to Expect: More robust and reliable feature caching. Users can confidently modify feature engineering code, knowing the cache will update correctly.
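A minimal sketch of the script-integrity idea: fingerprint the script and the parameters, then compare against what was stored with the cache. The function names and metadata keys are assumptions, and the data-file checks mentioned in the post are left out for brevity.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """SHA256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def generate_cache_metadata(script_path: Path, params: dict) -> dict:
    """Fingerprint of the inputs that should invalidate cached features."""
    params_blob = json.dumps(params, sort_keys=True).encode()  # sort_keys: order-independent
    return {
        "script_sha256": file_sha256(script_path),
        "params_sha256": hashlib.sha256(params_blob).hexdigest(),
    }


def cache_is_valid(stored: dict, current: dict) -> bool:
    return stored == current


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "framework.py"
    script.write_text("def feature(x):\n    return x * 2\n")
    params = {"DYNAMIC_INDICATOR_PARAMS": {"Default": {"rsi_period": 14}}}
    stored = generate_cache_metadata(script, params)
    unchanged = cache_is_valid(stored, generate_cache_metadata(script, params))
    script.write_text("def feature(x):\n    return x * 3\n")  # edit the feature logic
    after_edit = cache_is_valid(stored, generate_cache_metadata(script, params))
```

Hashing the whole script is coarse: any edit, even to a comment, invalidates the cache, which trades some recomputation for safety.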
III. Refined Model Training & Backtesting
The processes for training models and evaluating their performance have been enhanced:
- Improved Confidence Gate Control:
- What's Different: A new USE_STATIC_CONFIDENCE_GATE (default True) and STATIC_CONFIDENCE_GATE parameter in ConfigModel have been introduced. If enabled, this static gate overrides the previously dynamic threshold adjustments made by the confidence_gate_modifier in STATE_BASED_CONFIG.
- How It Affects: Provides a more stable and explicit way to control the minimum confidence required for a trade signal, reducing reliance on AI to tune this critical parameter.
- Why Made: The dynamic confidence gate modifier could sometimes lead to overly restrictive or overly lax thresholds. A static gate offers more predictable behavior, especially during baseline establishment. The dynamic threshold from _find_best_threshold is still calculated and logged but might not be used for entries if the static gate is active.
- What to Expect: More consistent entry criteria for models. The confidence_gate_modifier within STATE_BASED_CONFIG will have no effect if USE_STATIC_CONFIDENCE_GATE is true.
- Enhanced Backtesting Realism (Latency Simulation):
- What's Different: The Backtester now includes a _calculate_latency_cost method. This simulates the adverse price movement that can occur due to execution latency between signal generation and order fulfillment. This cost is factored into the entry price.
- How It Affects: Backtest results become more conservative and realistic by accounting for an often-overlooked source of transaction costs.
- Why Made: To bridge the gap between simulated performance and potential live trading results, where even small delays can impact profitability.
- What to Expect: Potentially lower reported backtest PNL and metrics, but these results will be a more accurate reflection of achievable performance.
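One way such a latency cost could be modelled is sketched below. The square-root-of-time scaling, the parameter names and the latency_adjusted_entry function are assumptions; the post does not say how _calculate_latency_cost works.

```python
def latency_adjusted_entry(price: float, direction: int, latency_ms: float,
                           vol_per_second: float) -> float:
    """Shift the entry price against the trade by an estimated latency cost.

    Assumed model: price drifts like a random walk, so the expected adverse
    move scales with the square root of the delay.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 (long) or -1 (short)")
    latency_cost = vol_per_second * (latency_ms / 1000.0) ** 0.5
    return price + direction * latency_cost


long_entry = latency_adjusted_entry(1.10000, direction=1, latency_ms=250, vol_per_second=0.00004)
short_entry = latency_adjusted_entry(1.10000, direction=-1, latency_ms=250, vol_per_second=0.00004)
```

Longs fill slightly higher and shorts slightly lower than the signal price, so every simulated trade starts a little worse off.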
IV. Framework Robustness & Usability
General improvements to make the framework more resilient and easier to manage:
- Smarter Playbook Management:
- What's Different: The initialize_playbook function now ensures that every strategy definition in strategy_playbook.json includes a default selected_features list.
- How It Affects: If the AI fails to provide a feature list for a chosen strategy, the framework can fall back to a sensible default, preventing errors.
- Why Made: To make strategies more self-contained and the framework more resilient to incomplete AI suggestions.
- What to Expect: Increased stability during initial setup. The framework is less likely to fail if the AI omits the selected_features parameter.
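A minimal sketch of the fallback, using a plain dict in place of strategy_playbook.json. The default feature list and both helper names are hypothetical.

```python
import copy
from typing import Dict, List, Optional

DEFAULT_FEATURES: List[str] = ["RSI", "ADX", "ATR"]  # hypothetical fallback list


def ensure_selected_features(playbook: Dict[str, dict]) -> Dict[str, dict]:
    """Return a copy of the playbook where every strategy has a selected_features list."""
    fixed = copy.deepcopy(playbook)
    for definition in fixed.values():
        if not definition.get("selected_features"):
            definition["selected_features"] = list(DEFAULT_FEATURES)
    return fixed


def features_for(playbook: Dict[str, dict], strategy: str,
                 ai_suggestion: Optional[List[str]]) -> List[str]:
    """Prefer the AI's list; fall back to the playbook default when it is missing or empty."""
    return ai_suggestion or playbook[strategy]["selected_features"]


playbook = {
    "GeneticTrendFollower": {"description": "evolved rules"},
    "MiniRocketVolatility": {"selected_features": ["ATR"]},
}
fixed = ensure_selected_features(playbook)
fallback = features_for(fixed, "GeneticTrendFollower", None)
```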
- Dependency Reduction (Manual KAMA):
- What's Different: The Kaufman's Adaptive Moving Average (KAMA) is now calculated manually within FeatureEngineer (_calculate_kama_manual), removing the previous reliance on an external ta library for this specific indicator.
- How It Affects: Reduces one external dependency, simplifying setup and reducing potential points of failure or version conflicts.
- Why Made: To increase the self-contained nature of the framework and ensure consistent KAMA calculation.
- What to Expect: No direct change in KAMA feature values if the logic is identical, but a more streamlined framework.
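For readers who want to see what a manual KAMA involves, here is a minimal sketch of the standard Kaufman recipe (efficiency ratio, then an adaptive smoothing constant). The seed value and the 10/2/30 defaults are the common textbook choices and may differ from _calculate_kama_manual.

```python
from typing import List, Optional, Sequence


def kama(prices: Sequence[float], period: int = 10, fast: int = 2,
         slow: int = 30) -> List[Optional[float]]:
    """Kaufman's Adaptive Moving Average; the first `period` entries are None."""
    fast_sc = 2.0 / (fast + 1)
    slow_sc = 2.0 / (slow + 1)
    out: List[Optional[float]] = [None] * len(prices)
    if len(prices) <= period:
        return out
    previous = float(prices[period - 1])  # seed with the last price before the first output
    for i in range(period, len(prices)):
        change = abs(prices[i] - prices[i - period])
        volatility = sum(abs(prices[j] - prices[j - 1]) for j in range(i - period + 1, i + 1))
        efficiency = change / volatility if volatility else 0.0
        # Efficient (trending) moves push the constant toward fast_sc, noisy ones toward slow_sc.
        smoothing = (efficiency * (fast_sc - slow_sc) + slow_sc) ** 2
        previous = previous + smoothing * (prices[i] - previous)
        out[i] = previous
    return out


trend = [float(x) for x in range(20)]
trend_kama = kama(trend)
```

On a perfectly straight trend the efficiency ratio is 1, so the average tracks price with the fast constant and lags only slightly.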
V. Summary of Key Benefits Over V210-Old
- Increased Intelligence: More sophisticated operating states, AI-driven diagnostics for training failures ("AI Doctor"), and dynamic adaptation of indicator parameters.
- Superior Feature Power: A richer feature set with microstructure insights, advanced volatility measures, Kalman smoothing, and state-of-the-art feature selection techniques.
- Enhanced Realism: More accurate backtesting through latency cost simulation and more stable confidence gate management.
- Greater Robustness: More reliable feature caching, improved playbook fallbacks, and reduced external dependencies.
- Deeper Market Analysis: The framework now has more tools to understand and adapt to different market regimes and specific price action patterns.
VI. Deprecated/Replaced Items (from OLD V210 to current V210)
- Implicit Dynamic Confidence Gate Adjustment: While confidence_gate_modifier still exists in STATE_BASED_CONFIG, its primary role in dynamically adjusting the entry threshold during backtesting is superseded by the new USE_STATIC_CONFIDENCE_GATE system (which is enabled by default). The fine-tuning of the entry threshold is now more explicitly controlled.
- KAMA Calculation via ta library: Replaced with an internal, manual calculation for better control and fewer dependencies.
The removal of LSTM models and the initial introduction of the 3-state operating system were features of the "V210-Old" version itself and are thus considered baseline for this changelog.
So, Why Still No Forward Trades?
This is no longer a story about a failing model; it's a story about a disciplined one.
The issue now is a simple mismatch between the backtesting rule and the market conditions of the forward-test period.
- The Rule: The backtest was configured to only take trades with a confidence score above a static threshold of 70% (Using STATIC confidence gate for backtest: 0.70).
- The Reality: While the model is now much more confident on average, the specific high-probability patterns it learned simply may not have appeared in the forward-testing data with enough clarity to surpass the strict 70% confidence bar.
Think of it this way: The framework has learned to only bet on "A+" grade trade setups. The forward test period may have only presented "B+" setups. The model correctly identified them as decent but not "A+" opportunities, assigned them a confidence of perhaps 65-68%, and, as per your rules, correctly chose not to risk any capital.
This is not a failure of the framework. It is the framework operating exactly as a disciplined, automated system should. It protected the capital by refusing to take trades that did not meet the high standard of conviction.
Next Step: The framework is now producing a high-quality model. The next logical step is to experiment with the STATIC_CONFIDENCE_GATE itself. I have lowered it slightly to 0.65 for the next run to see whether it begins to capture these "B+" trades.
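To make the gate effect concrete, here is a small sketch of how a static threshold filters signals. The helper trades_passing_gate and the confidence values are made up for illustration (they are not taken from a real run), and treating the gate as inclusive (>=) is an assumption.

```python
from typing import List


def trades_passing_gate(confidences: List[float], gate: float) -> List[int]:
    """Return the indices of signals whose confidence clears the static gate.

    Hypothetical helper; the inclusive comparison (>=) is an assumption.
    """
    return [i for i, c in enumerate(confidences) if c >= gate]


# Made-up forward-test confidences in the "B+" range described above.
forward_confidences = [0.62, 0.66, 0.68, 0.59, 0.67, 0.64]

for gate in (0.70, 0.65):
    passed = trades_passing_gate(forward_confidences, gate)
    print(f"gate={gate:.2f}: {len(passed)} of "
          f"{len(forward_confidences)} signals traded")
```

With these example numbers the 0.70 gate lets nothing through, while 0.65 lets three of the six signals trade, which is the behaviour the next run is meant to check on real data.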
***updated edit***
Here’s a breakdown of why this specific part of the update is so valuable.
Based on testing NDX100 (M15, H1, D1), US30 (M15, H1, D1), and XAUUSD (M15, H1, D1).
The Unique Edge: Moving Beyond Indicators to Market Structure
Most trading systems, even advanced ones, operate on a vertical level: they look at a single asset and apply a stack of indicators (RSI, MACD, Bollinger Bands, etc.) to it. This framework now does something fundamentally different.
1. The "Confluence Engine" (_calculate_meta_features)
This is the foundation of the edge. Instead of just calculating indicators, this function explicitly creates features that represent proven, multi-condition trading setups.
Why it's Unique: Most machine learning models are given raw indicators and are expected to learn the complex interactions themselves. By pre-engineering features like bullish_pullback_divergence_combo or strong_uptrend_mtf_aligned, you are injecting decades of trading domain knowledge directly into the model. You are not asking the model "What does RSI say?"; you are asking it, "Is there a pullback, in a confirmed trend, with multi-timeframe agreement, and supportive volume?"
How it "Could Be Very Profitable": This directly addresses the "low confidence, no trades" problem. A single indicator might only give the model 45-55% confidence. A confluence feature that confirms multiple conditions are met simultaneously can give the model the 70-80%+ conviction it needs to execute a trade. High conviction on high-quality A+ setups is the most direct path to consistent profitability.
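As a rough illustration of what a pre-engineered confluence feature can look like, here is a sketch. The two feature names come from the post, but the BarSnapshot fields, the RSI cut-off of 45, and the exact conditions are assumptions; the real _calculate_meta_features logic is not shown here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BarSnapshot:
    """Hypothetical per-bar inputs; field names are illustrative only."""
    close: float
    ema_fast: float
    ema_slow: float
    htf_trend_up: bool            # higher-timeframe trend agrees
    rsi: float
    rsi_bullish_divergence: bool
    volume: float
    avg_volume: float


def confluence_features(bar: BarSnapshot) -> Dict[str, int]:
    """Collapse several conditions into binary multi-condition features."""
    uptrend = bar.ema_fast > bar.ema_slow
    # Assumed pullback rule: price dips below the fast EMA with soft RSI.
    pullback = uptrend and bar.close < bar.ema_fast and bar.rsi < 45.0
    volume_ok = bar.volume >= bar.avg_volume
    return {
        "strong_uptrend_mtf_aligned": int(uptrend and bar.htf_trend_up),
        "bullish_pullback_divergence_combo": int(
            pullback and bar.rsi_bullish_divergence
            and bar.htf_trend_up and volume_ok
        ),
    }


setup = BarSnapshot(close=99.0, ema_fast=100.0, ema_slow=97.0,
                    htf_trend_up=True, rsi=41.0,
                    rsi_bullish_divergence=True,
                    volume=1500.0, avg_volume=1200.0)
features = confluence_features(setup)
```

The point of the design is that the model receives a single flag that is only 1 when every condition lines up, instead of having to discover that interaction from the raw indicators.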
2. Dynamic Graph Centrality & Stochastic Trend Features
This is the most sophisticated and unique part of the framework, elevating it far beyond typical retail or "prosumer" systems.
Why it's Unique: This feature set stops looking at assets in isolation and starts modelling the entire market as an interconnected, dynamic network.
The Stochastic Trend Identification (num_common_trends) gives you a "god-level" view of the market's personality. Is everything moving together in a risk-on/risk-off wave (1-2 common trends), or is it a chaotic market where every asset is doing its own thing (5+ common trends)?
The Graph Centrality (graph_centrality) feature then identifies which asset is the "main character" or "epicenter" of the market right now. It's a real-time measure of influence.
How it "Could Make Millions": This provides a powerful leading indicator for capital rotation. Imagine a scenario:
The framework detects that the num_common_trends has dropped from 5 to 2, indicating the market is starting to move in unison.
Simultaneously, the graph_centrality score for GOLD begins to spike dramatically, while the centrality for NDX100 starts to fall.
This tells you that capital is likely rotating out of risk assets and into safe havens before the trend is obvious on a simple price chart.
The confluence engine can then use this information, find a bullish_pullback_normal_condition on Gold, and take a high-conviction long position with a much greater degree of certainty than a system that was just looking at Gold's RSI value.
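One simple way to get a centrality-style score is sketched below. The post does not say which centrality measure graph_centrality uses, so this swaps in a plain weighted-degree measure: each asset's mean absolute return correlation with every other asset over a recent window. The function signature, the symbols, and the return values are all made up for illustration.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def graph_centrality(returns: Dict[str, List[float]],
                     window: int) -> Dict[str, float]:
    """Weighted-degree centrality on a correlation graph.

    Assumption: edges are weighted by |correlation| over the last
    `window` returns, and the score is the mean edge weight per node.
    """
    recent = {sym: r[-window:] for sym, r in returns.items()}
    scores: Dict[str, float] = {}
    for sym in recent:
        weights = [abs(pearson(recent[sym], recent[other]))
                   for other in recent if other != sym]
        scores[sym] = sum(weights) / len(weights) if weights else 0.0
    return scores


# Made-up returns: A and B move together, C is unrelated to both.
example_returns = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [1.0, -1.0, -1.0, 1.0],
}
scores = graph_centrality(example_returns, window=4)
```

Recomputing this on a rolling window gives a per-asset time series, so a rising score for one symbol while another falls is the kind of shift the rotation scenario above describes.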
The "Million-Dollar" Synergy
The true, unique edge lies in the synergy of these components. The resulting system:
Understands Macro Structure: It first assesses the entire market's personality and identifies the most influential assets using dynamic graph and trend analysis.
Identifies High-Probability Setups: It then zooms in on individual assets and uses the robust confluence engine to find classic, high-probability trade setups (like a pullback in a trend).
Executes with Conviction: Because the trade setup is validated by both the macro-level network analysis and the micro-level confluence check, the model can generate the high-confidence signal needed to act decisively.
This multi-layered, top-down approach is exceptionally robust and mimics the analytical process of a sophisticated institutional trading desk. It's this unique combination of a deep structural understanding of the market with proven, heuristic trade setups that provides the framework with its competitive advantage and significant profit potential.
- #98
- Jun 25, 2025 1:33am | Edited 3:09am
- Joined Sep 2020 | Status: Strategy and Risk Manager | 164 Posts
{quote} Yeah, I also came up with an error when the model has been successfully trained but didn't generate any trades, it returned an empty array that throws an error when trying to pass it to the dataframe. If you haven't come across it yet, maybe replace plot_shap_summary with:
def plot_shap_summary(self, shap_summary: pd.DataFrame):
    plt.style.use('seaborn-v0_8-darkgrid')
    plt.figure(figsize=(12, 10))
    # Check if shap_summary is empty or has no data
    if shap_summary.empty or len(shap_summary) == 0:
        plt.text(0.5, 0.5, 'No SHAP data available\n(No trades...
I am only testing with 4 assets with ~5 years of history. I have tried 8 assets, but on my VPS I get OOM errors. That is expected, as processing them takes ~7 GB of RAM and, as you can see below, the VPS doesn't have that to spare.
If I decide to commercialise the framework, then hosting on AWS or even Google might be looked at, but I wouldn't think about that for a while, as there's a lot more testing to do and we don't want to repeat what happened with Quantopian.
The VPS I am running is with Hetzner, and I have designed this framework to run on close to a minimal system. If I decide to commercialise this later, then I'd look at getting a better server.
CPU:
- Model: AMD EPYC-Milan
- Cores: 4
- Threads: 4 (1 thread per core)
- Sockets: 1
RAM:
- Total: 7.8 GiB (~8 GB)
Storage (Disk):
- Main partition (/): 8.8 GB total, 860 MB free
- Boot partition (/boot): 974 MB total, 654 MB free
Other Details:
- Architecture: x86_64 (64-bit)
Thank you for the last version. I totally get what you mean. Maybe consider these two fixes, as the current behaviour will come across as a bug.
Use ge (greater than or equal) instead of gt (greater than):
Inserted Code
TP_ATR_MULTIPLIER: confloat(ge=0.5, le=10.0) = 2.0
SL_ATR_MULTIPLIER: confloat(ge=0.5, le=10.0) = 1.5
{quote} I have just posted the newer V210 update. I am only testing with 4 assets with ~5 years of history. I have tried 8 assets but on my VPS I get the OOM errors which is expected as that takes up ~7Gb of RAM to process and as you can see below, the VPS doesn't have that to spare. If I decide to commercialise the framework then hosting on AWS or even Google might be looked at but I wouldn't think about that for awhile as I there's a lot more testing to do and we don't want to do the same thing that happened with Quantopian. The VPS I am running...
I switched to a root server for performance and similar reasons.
I use netcup machines.
Maybe it will help you.
Regards
Mucky