Overfitting and Parameter Selection in Trading Strategies

From Overfitting to Robustness in Quant Trading

Overfitting is a serious risk in quantitative strategy development and can lead to significant losses. We have discussed it in previous issues of this newsletter; given its continued relevance, we revisit the topic in this edition.

In this issue:

Latest Posts

  • Volatility Risk Premium and Clustering: Intraday vs Overnight Dynamics (8 min)

  • Large Language Models in Trading: Models and Market Dynamics (9 min)

  • Evaluating Option-Based Strategies and Dollar-Cost Averaging (10 min)

  • Machine Learning for Derivative Pricing and Crash Prediction (12 min)

  • Do Options Exhibit Momentum? (10 min)

Formal Study of Overfitting in Trading System Design

A serious problem when designing a trading system is the overfitting phenomenon, wherein the system is excessively tuned to historical data. Overfitting occurs when a trading strategy performs exceptionally well on past data but fails to generalize to new, unseen data. This can lead to false positives and inflated expectations, as the system may appear profitable due to chance rather than true predictive power.

Reference [1] formally studied this issue, using analytical approximations for the in-sample and out-of-sample Sharpe ratios of portfolios.

Findings

  • The paper analyzes how the in-sample performance of trading strategies based on linear predictive models deteriorates out-of-sample due to overfitting.

  • It develops closed-form approximations for both in-sample and out-of-sample Sharpe ratios by modeling the means and variances of strategy PnLs.

  • The results show that strategies using a large number of assets and weak signals experience a significant decline in out-of-sample performance.

  • In contrast, strategies relying on fewer but stronger signals tend to exhibit more stable and replicable results.

  • Increasing the size of the training dataset improves the out-of-sample replication ratio and reduces overfitting risk.

  • Signals with low true Sharpe ratios are particularly prone to overfitting, leading to inflated in-sample performance that does not persist.

  • Simulation and empirical studies, including applications to commodity futures, confirm the magnitude and robustness of these effects.

  • The findings also show that incorporating more realistic signal dynamics does not materially alter the main conclusions.

  • The replication ratio is largely determined by the true out-of-sample Sharpe ratio rather than specific model assumptions.

  • Overall, the study suggests that controlling model complexity and maximizing data usage are key to mitigating overfitting in predictive trading strategies.
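The in-sample/out-of-sample gap the paper quantifies is easy to reproduce in a toy simulation. The sketch below uses hypothetical data and a deliberately over-parameterized setup (50 mostly spurious signals, one weak true signal), not the paper's exact model: a linear predictive model is fit on the first half of the sample and traded on both halves, and the in-sample Sharpe ratio is flattered by overfitting while the out-of-sample one collapses toward the (near-zero) true value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: T days, N weak signals predicting one asset's return.
T, N = 500, 50
true_beta = np.zeros(N)
true_beta[0] = 0.05                          # only one genuinely (weakly) predictive signal
X = rng.standard_normal((T, N))              # signal values
r = X @ true_beta + rng.standard_normal(T)   # next-day returns: mostly noise

def sharpe(pnl):
    """Annualised Sharpe ratio of a daily PnL series."""
    return pnl.mean() / pnl.std() * np.sqrt(252)

# Fit signal weights by OLS on the first half of the sample.
half = T // 2
beta_hat, *_ = np.linalg.lstsq(X[:half], r[:half], rcond=None)

# Trade the fitted model in-sample (same half) and out-of-sample (second half).
pnl_is = (X[:half] @ beta_hat) * r[:half]
pnl_oos = (X[half:] @ beta_hat) * r[half:]

print(f"in-sample Sharpe:     {sharpe(pnl_is):.2f}")
print(f"out-of-sample Sharpe: {sharpe(pnl_oos):.2f}")
```

Shrinking N or growing T narrows the gap, in line with the paper's recommendations to use fewer signals and longer samples.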

In summary, the paper formally demonstrated that to minimize the risk of overfitting, one should:

1. Keep models as simple as possible,

2. Use the longest sensible backtest period available,

3. Develop systems with high Sharpe ratios, and

4. Rely on fewer signals.

From our experience, we have reservations about points #3 and #4, while agreeing with points #1 and #2. What do you think?

Reference

[1] Antoine Jacquier, Johannes Muhle-Karbe, Joseph Mulligan, In-Sample and Out-of-Sample Sharpe Ratios for Linear Predictive Models, 2025, arXiv:2501.03938

Avoiding Overfitting: Searching for Parameter Plateau

To mitigate the risk of overfitting, system developers often employ techniques such as cross-validation and out-of-sample testing to ensure that their strategies remain robust across various market conditions and time periods.

Another technique to prevent overfitting involves selecting a parameter region, often referred to as a “plateau,” where the trading system maintains stable performance. Reference [2] introduced a method for quantifying this plateau and utilized particle-swarm optimization to search for it.

Findings

  • The study highlights that quantitative trading performance depends heavily on parameter selection and is vulnerable to overfitting.

  • It introduces the concept of a parameter plateau to identify stable and robust parameter regions rather than single optimal points.

  • A plateau score algorithm is developed to replace the conventional approach of selecting the best in-sample parameters.

  • The results show that parameters with high plateau scores exhibit more stable and consistent out-of-sample performance.

  • The approach helps avoid “parameter islands” that perform well in-sample but fail out-of-sample.

  • To improve search efficiency, the study applies particle swarm optimization instead of brute-force methods.

  • Particle swarm optimization enables faster exploration of high-dimensional parameter spaces.

  • Experiments demonstrate that the combined plateau and optimization approach improves both robustness and profitability.

  • The method remains effective as strategy complexity increases from low- to high-dimensional parameter settings.

  • The study also proposes suitable hyperparameter ranges for particle swarm optimization in this framework.
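The plateau idea can be illustrated with a minimal sketch. The scoring function below is a simple stand-in (neighborhood mean minus neighborhood dispersion), not the paper's exact algorithm, and the performance grid is synthetic: a narrow in-sample spike (a "parameter island") versus a broad stable region. The conventional argmax picks the spike; the plateau score picks inside the stable region.

```python
import numpy as np

# Synthetic in-sample Sharpe over a 1-D parameter grid (e.g. a lookback length):
# a narrow spike ("parameter island") and a broad stable region ("plateau").
grid = np.arange(5, 105, 5)
perf = np.full(grid.size, 0.3)
perf[3] = 2.5            # isolated spike at lookback = 20
perf[12:17] = 1.2        # broad plateau around lookbacks 65-85

def plateau_score(perf, i, width=2):
    """Neighbourhood mean minus neighbourhood std (illustrative score,
    not the paper's exact formula). High only where performance is both
    good and locally stable."""
    lo, hi = max(0, i - width), min(perf.size, i + width + 1)
    window = perf[lo:hi]
    return window.mean() - window.std()

scores = np.array([plateau_score(perf, i) for i in range(grid.size)])
best_point = grid[perf.argmax()]      # conventional pick: the spike
best_plateau = grid[scores.argmax()]  # plateau pick: inside the stable region
print(best_point, best_plateau)
```

The spike scores poorly because its neighborhood is dominated by the dispersion penalty, while the interior of the plateau scores highest.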

In short, the paper quantifies the stability of the parameter plateau and applies an efficient optimization algorithm to search for it. The out-of-sample test results are promising.
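For readers unfamiliar with particle swarm optimization, here is a minimal, self-contained sketch of the search procedure on a toy objective (a stand-in for a negative plateau score over two parameters). The hyperparameters are illustrative defaults, not the ranges the paper proposes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy objective standing in for a (negated) plateau score over two parameters;
# its global minimum sits at (3, -1).
def objective(p):
    return -np.exp(-((p[..., 0] - 3.0) ** 2 + (p[..., 1] + 1.0) ** 2))

# Minimal particle swarm optimisation (illustrative hyperparameters).
n, dim, iters = 30, 2, 200
w, c1, c2 = 0.7, 1.5, 1.5              # inertia, cognitive, social weights
pos = rng.uniform(-5, 5, (n, dim))     # particle positions
vel = np.zeros((n, dim))               # particle velocities
pbest = pos.copy()                     # per-particle best positions
pbest_val = objective(pos)
gbest = pbest[pbest_val.argmin()]      # swarm-wide best position

for _ in range(iters):
    r1, r2 = rng.random((2, n, dim))
    # Pull each particle toward its own best and the swarm's best.
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    val = objective(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[pbest_val.argmin()]

print(gbest)
```

In the paper's framework the objective would be the plateau score itself, letting the swarm search high-dimensional parameter spaces far more cheaply than a brute-force grid.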

Reference

[2] Jimmy Ming-Tai Wu, Wen-Yu Lin, Ko-Wei Huang, Mu-En Wu, On the design of searching algorithm for parameter plateau in quantitative trading strategies using particle swarm optimization, Knowledge-Based Systems, Volume 293, 7 June 2024, 111630

Closing Thoughts

Taken together, these studies highlight that both model design and parameter selection are key sources of fragility in quantitative strategies. Overfitting arises not only from using too many weak signals but also from selecting unstable parameter configurations that fail to generalize out-of-sample. Approaches such as reducing model complexity, increasing data, and focusing on stable parameter regions through the concept of parameter plateaus offer practical ways to improve robustness. Overall, the evidence suggests that consistent performance depends less on optimizing in-sample results and more on ensuring stability across regimes and datasets.

Additional Reading

For further discussion on the risk of overfitting, please refer to previous issues.

Educational Video

Bootstrapping for Overfitting Detection in Algorithmic Trading

This video discusses the use of bootstrapping as a statistical tool to improve the reliability of algorithmic trading strategies. The focus is on addressing overfitting, where strategies perform well in backtests but fail in live markets. Bootstrapping is presented as a method to resample historical data and test strategies across different scenarios, allowing practitioners to assess the stability of profitability, risk, and Sharpe ratios. The analysis shows that many seemingly strong strategies deteriorate significantly when subjected to resampling, indicating that part of their performance may be driven by luck rather than skill.

The video also highlights important limitations of bootstrapping in financial applications. In particular, it assumes independence in data, which is often violated in time series, and it cannot capture rare extreme events absent from historical data. Different bootstrap methods may produce varying results, and practical trading frictions such as transaction costs and market impact are not fully incorporated. Overall, the discussion emphasizes that while bootstrapping is a useful tool for detecting overfitting and validating strategy robustness, it is not sufficient on its own and requires further development and integration with real-world considerations.
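As a concrete sketch of the procedure the video describes, the snippet below applies a plain i.i.d. bootstrap to a synthetic return series: resample with replacement, recompute the Sharpe ratio each time, and inspect the resulting distribution. The data are made up, and i.i.d. resampling ignores the serial-dependence caveat discussed above; a block bootstrap would be more appropriate for autocorrelated returns.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical daily strategy returns: small positive drift plus noise.
returns = rng.normal(0.0005, 0.01, size=1000)

def sharpe(r):
    """Annualised Sharpe ratio of a daily return series."""
    return r.mean() / r.std() * np.sqrt(252)

# Plain i.i.d. bootstrap: resample with replacement, recompute the Sharpe.
boot = np.array([
    sharpe(rng.choice(returns, size=returns.size, replace=True))
    for _ in range(2000)
])

point = sharpe(returns)
lo, hi = np.percentile(boot, [2.5, 97.5])
frac_neg = (boot < 0).mean()
print(f"point Sharpe {point:.2f}, 95% CI [{lo:.2f}, {hi:.2f}], "
      f"P(SR < 0) ~ {frac_neg:.2f}")
```

A wide confidence interval, or a large fraction of resampled Sharpe ratios below zero, is the warning sign that an apparently strong backtest may owe much of its performance to luck.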

Volatility Weekly Recap

The figure below shows the term structures for the VIX futures (in colour) and the spot VIX (in grey).

Volatility was elevated during March and the first part of April. The figure below shows the short-term futures basis, calculated as the second-month VIX futures minus the front month. The basis was recently below zero (the red line), indicating that the term structure was in backwardation. It has since recovered and is now above zero, suggesting a return to contango.

In parallel, the roll yield was negative and has now turned positive. These developments indicate that VIX-related products are still recovering. While the S&P 500 has reached a new high, VXX (not shown) has not reached a new low, reflecting the earlier backwardation and negative roll yield.
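For readers who want to reproduce the basis calculation, here is a minimal sketch. The futures quotes are made up for illustration; the rule is the one used above: basis = second-month VIX futures minus front month, with a negative basis indicating backwardation and a positive one contango.

```python
# Illustrative (made-up) VIX futures quotes: front and second month.
quotes = {
    "stressed":  {"front": 24.0, "second": 22.5},
    "recovered": {"front": 17.5, "second": 18.6},
}

for label, q in quotes.items():
    basis = q["second"] - q["front"]   # second-month minus front-month
    state = "contango" if basis > 0 else "backwardation"
    print(f"{label}: basis = {basis:+.2f} -> {state}")
```

With real data one would substitute the actual settlement prices of the first two listed contracts on each date.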

Around the Quantosphere

  • Hedge Fund Collapse Sparks Global Hunt for Almost $600 Million (bloomberg)

  • The highly attractive hedge fund manager and the $11m breakup fee. How to use your CFA to get $5m while doing no work (efinancialcareers)

  • Man Group, Biggest Listed Hedge-Fund Firm, Tumbles After Client Yanks $6 Billion (wsj)

  • The Billionaire Math Geek Who Turned AI Into a Money-Printing Machine (wsj)

  • Wall Street Brings Sophisticated Quant Trading to the Masses (wsj)

  • The Hot Hedge Fund Strategy Triggering a Pay Bonanza for Traders (bloomberg)

  • Hedge Funds Are Becoming Colder, Darker, Less Forgiving (efinancialcareers)

  • Trading Places: Meet the Physicists Turned Analysts Driving Finance (physicsworld)

  • AI Agents Are Becoming Day Traders, But Gains Are Elusive (bloomberg)

  • Podcast: The Quiet AI Trade That’s Raking in Billions (bloomberg)

Disclaimer

This newsletter is not investment advice. It is provided solely for entertainment and educational purposes. Always consult a financial professional before making any investment decisions.

We are not responsible for any outcomes arising from the use of the content and code provided in the outbound links. By continuing to read this newsletter, you acknowledge and agree to this disclaimer.