A systematic futures trading system built on 14 peer-reviewed research papers spanning market microstructure, time-series momentum, volatility forecasting, and statistical validation.
Four non-negotiable principles that constrain every design decision in the system.
Price is the only ground truth. When any analytical output contradicts observed price behavior, the analysis is wrong. Informed by Bouchaud, Farmer & Lillo (2009): most market information comes from supply and demand dynamics, not external news.
The primary objective is surviving long enough for edge to compound. Every strategy must pass the Deflated Sharpe Ratio test (Bailey & López de Prado, 2014), which corrects for selection bias under multiple testing. Position sizing uses a conservative fraction of the Kelly Criterion—maintaining a safety margin against edge overestimation.
Every component is independently testable and replaceable. The processing interface remains invariant whether consuming historical data or live market feeds. Validated through CPCV (López de Prado, 2018), which demands performance be testable across combinatorial data splits.
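The combinatorial nature of CPCV can be sketched as follows. This is a minimal illustration, not the system's implementation: it only enumerates the train/test group combinations, and omits the purging and embargoing of overlapping samples that López de Prado's full procedure requires.

```python
from itertools import combinations

def cpcv_splits(n_groups: int, n_test: int):
    """Enumerate combinatorial train/test splits over contiguous data groups.

    Each split holds out `n_test` of the `n_groups` blocks for testing and
    trains on the rest, so every block appears in many distinct test sets.
    Purging/embargoing of overlapping samples is omitted in this sketch.
    """
    groups = list(range(n_groups))
    for test in combinations(groups, n_test):
        train = [g for g in groups if g not in test]
        yield train, list(test)

# 6 groups with 2 held out yields C(6,2) = 15 distinct splits
splits = list(cpcv_splits(6, 2))
```

Because each strategy is evaluated on many backtest paths rather than one, a single lucky historical sequence cannot carry a weak strategy through validation.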
A portion of each position takes a fixed exit for base-rate profitability. The remainder trails with a wider stop, capturing the full extent of trending moves. Consistent with Moskowitz, Ooi & Pedersen (2012): time-series momentum in futures persists for 1–12 months. The system’s job is to stay in the trade long enough to capture that persistence.
Every trade passes through a multi-layer pipeline. No single signal triggers execution—multiple independent conditions must converge before capital is deployed. The majority of candidate signals are rejected at the first layer.
Multiple higher timeframes must confirm directional alignment before the execution layer considers any entry. This gate draws from research on time-series momentum persistence in futures markets (Moskowitz, Ooi & Pedersen, 2012) and the empirical finding that order flow exhibits long memory across timescales (Bouchaud, Farmer & Lillo, 2009). The gate is the system’s most powerful filter—the majority of candidate signals never pass it.
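The gate's logic reduces to a unanimity check across timeframes. A minimal sketch, assuming each timeframe's trend has already been reduced to a directional reading (the function name and encoding are illustrative, not the system's API):

```python
def htf_alignment_gate(trend_by_timeframe: dict) -> int:
    """Return +1 or -1 only if every higher timeframe agrees on a non-zero
    direction; return 0 (no trade) otherwise.

    `trend_by_timeframe` maps a timeframe label to a trend reading:
    +1 (up), -1 (down), 0 (no clear trend).
    """
    readings = set(trend_by_timeframe.values())
    if readings == {1}:
        return 1
    if readings == {-1}:
        return -1
    return 0

# Any disagreement, or any flat timeframe, blocks the entry entirely
bias = htf_alignment_gate({"daily": 1, "4h": 1, "1h": 1})
```

A single dissenting timeframe is enough to reject the candidate, which is why this layer filters out the majority of signals.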
Once directional alignment is confirmed, the system evaluates structural price action patterns for entry quality. Each candidate trigger is scored against multiple criteria. Only patterns meeting the quality threshold advance to confluence evaluation.
Multiple independent signals must converge before capital is deployed. Each signal contributes to a weighted confluence assessment; the system requires sufficient agreement before acting. This multi-factor approach reduces the probability of acting on spurious signals (Gu, Kelly & Xiu, 2020).
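A weighted confluence assessment can be sketched as a weighted average of independent signal scores checked against a cutoff. The signal names, weights, and threshold below are illustrative placeholders, not the system's calibrated values:

```python
def confluence_score(signals: dict, weights: dict) -> float:
    """Weighted average of independent signal scores, each in [0, 1].

    Missing signals contribute 0, so absent evidence counts against entry.
    """
    total_w = sum(weights.values())
    return sum(weights[k] * signals.get(k, 0.0) for k in weights) / total_w

CONFLUENCE_THRESHOLD = 0.7  # illustrative cutoff, not the system's value

signals = {"sweep": 1.0, "ofi": 0.8, "structure": 0.5}
weights = {"sweep": 0.40, "ofi": 0.35, "structure": 0.25}
score = confluence_score(signals, weights)   # ~0.805 for these inputs
deploy = score >= CONFLUENCE_THRESHOLD
```

Requiring several partially independent signals to agree shrinks the probability that all of them are simultaneously spurious.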
Stop-loss placement and position sizing are calibrated to current market volatility, not fixed parameters. The system uses ATR-based calculations that adapt dynamically, ensuring consistent risk per trade in dollar terms regardless of market conditions (Corsi, 2009).
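The volatility-adjusted sizing described above can be sketched with a standard ATR calculation. The stop multiple and dollar-risk figures are illustrative assumptions:

```python
def atr(highs, lows, closes, period=14):
    """Average True Range over the trailing `period` bars."""
    trs = []
    for i in range(1, len(closes)):
        tr = max(highs[i] - lows[i],
                 abs(highs[i] - closes[i - 1]),
                 abs(lows[i] - closes[i - 1]))
        trs.append(tr)
    return sum(trs[-period:]) / min(period, len(trs))

def position_size(account_risk_dollars, atr_value, stop_mult=2.0, point_value=1.0):
    """Contracts sized so a stop of `stop_mult` * ATR risks a fixed dollar
    amount: wider (more volatile) stops automatically mean fewer contracts."""
    risk_per_contract = stop_mult * atr_value * point_value
    return int(account_risk_dollars // risk_per_contract)

a = atr([10, 11, 12], [9, 10, 11], [9.5, 10.5, 11.5])
contracts = position_size(500, a, stop_mult=2.0, point_value=50)
```

Because size scales inversely with the volatility-based stop distance, the dollar risk per trade stays roughly constant across regimes.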
The system analyzes price across a hierarchy of timeframes, from macro trend down to execution context. Higher timeframes carry more authority—the execution layer only acts when the broader hierarchy confirms directional bias.
The gate detects the observable footprint of institutional order splitting. Large parent orders from institutional participants are broken into thousands of child orders executed over extended periods. This creates a long-memory property in order flow (Bouchaud, Farmer & Lillo, 2009)—meaning directional bias on higher timeframes has genuine predictive power for the execution timeframe. Not because of pattern recurrence, but because of persistent supply-and-demand imbalance that takes time to fully absorb.
This is consistent with research showing time-series momentum persists for 1–12 months across dozens of futures markets (Moskowitz, Ooi & Pedersen, 2012). The system does not attempt to predict when momentum will end: it participates while the higher-timeframe hierarchy confirms the trend is intact, and exits when that confirmation breaks.
Within the decision pipeline, specific signal types are evaluated at the trigger and confluence layers. Each signal type is grounded in market microstructure research.
Price sweeps above known highs or below known lows trigger clusters of resting stop orders, providing counterparty liquidity for institutional entries. These events create a temporary dislocation between price and underlying order flow. The mechanics are consistent with Kyle’s (1985) model of informed trading and price impact, and with the Almgren-Chriss (2001) framework for temporary versus permanent impact decomposition. When a sweep exhausts available liquidity at a level, the resulting price movement carries information about the true supply-demand balance.
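The observable footprint of such a sweep-and-fail event can be sketched with a simple bar-level check (the function name and pattern definition are a simplified illustration, not the system's full detector):

```python
def is_bearish_sweep(prior_high, bar_high, bar_close):
    """Price trades above a known high (triggering resting buy stops) but
    closes back below it, suggesting the sweep consumed the available
    liquidity rather than starting a genuine breakout."""
    return bar_high > prior_high and bar_close < prior_high

def is_bullish_sweep(prior_low, bar_low, bar_close):
    """Mirror case: price sweeps below a known low, then closes back above."""
    return bar_low < prior_low and bar_close > prior_low
```

In the full system such a detection is only a candidate trigger; it still has to survive the confluence and risk layers before any order is placed.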
The linear relationship between order flow imbalance and contemporaneous price changes is one of the strongest regularities in market microstructure. Cont, Kukanov & Stoikov (2014), published in the Journal of Financial Econometrics, established that order flow imbalance dominates raw volume as a short-horizon price predictor. The system leverages this relationship to assess the directional conviction behind observed price movements.
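At the level of best-quote updates, the event-by-event OFI definition from Cont, Kukanov & Stoikov (2014) can be sketched directly: increases in bid-side depth or price add to the imbalance, increases on the ask side subtract from it. The quote-tuple format below is an assumption for illustration:

```python
def order_flow_imbalance(quotes):
    """Order flow imbalance over a sequence of best-quote updates.

    `quotes` is a list of (bid_price, bid_size, ask_price, ask_size).
    Per-event contribution follows Cont, Kukanov & Stoikov (2014):
    e = 1{Pb up or same} * qb_new - 1{Pb down or same} * qb_old
      - 1{Pa down or same} * qa_new + 1{Pa up or same} * qa_old
    """
    ofi = 0.0
    for prev, cur in zip(quotes, quotes[1:]):
        pb0, qb0, pa0, qa0 = prev
        pb1, qb1, pa1, qa1 = cur
        e = 0.0
        if pb1 >= pb0:   # bid improved or held: new bid depth adds pressure
            e += qb1
        if pb1 <= pb0:   # bid fell or held: old bid depth was consumed/cancelled
            e -= qb0
        if pa1 <= pa0:   # ask improved or held: new ask depth adds sell pressure
            e -= qa1
        if pa1 >= pa0:   # ask rose or held: old ask depth was consumed/cancelled
            e += qa0
        ofi += e
    return ofi
```

A positive OFI over a window indicates net buying pressure at the top of the book, which is precisely the quantity the cited research found to be linearly related to contemporaneous price changes.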
Large parent orders from institutional participants are split into thousands of child orders, creating a distinctive pattern of persistent directional flow. Lillo & Farmer (2004) documented this long-memory property, and Bouchaud, Farmer & Lillo (2009) showed that markets “slowly digest” these supply-and-demand changes over extended periods. The system identifies the observable signatures of this institutional splitting process across multiple timeframes.
Every position is divided into multiple contracts with distinct exit strategies, balancing the fundamental tension between reliability and magnitude.
The fixed-target contract reduces variance by locking in gains at a predefined, volatility-adjusted level. This establishes the system’s base-rate profitability—even if the trailing portion is stopped out at breakeven, the fixed exit has already captured value.
The trailing-runner contract captures the full extent of trending moves. Research on time-series momentum (Moskowitz, Ooi & Pedersen, 2012) shows that trend persistence in futures markets is both statistically significant and economically meaningful across dozens of instruments over decades. The runner’s wider, adaptive stop is designed to stay in the trade long enough to capture this persistence—it converts occasional large winners into the primary driver of portfolio returns.
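The two exit styles can be sketched together. The multipliers below are illustrative assumptions, not the system's calibrated parameters; the key property is that the runner's stop only ever ratchets in the trade's favor:

```python
def plan_exits(entry, atr_value, direction, target_mult=1.5, trail_mult=3.0):
    """Split a position's exits: a fixed, volatility-adjusted target for the
    first contract and a wider initial trailing stop for the runner.
    `direction` is +1 for long, -1 for short."""
    fixed_target = entry + direction * target_mult * atr_value
    initial_stop = entry - direction * trail_mult * atr_value
    return {"fixed_target": fixed_target, "runner_stop": initial_stop}

def update_trailing_stop(current_stop, last_close, atr_value, direction,
                         trail_mult=3.0):
    """Ratchet the runner's stop toward price; it tightens but never loosens."""
    candidate = last_close - direction * trail_mult * atr_value
    if direction == 1:
        return max(current_stop, candidate)
    return min(current_stop, candidate)

exits = plan_exits(100.0, 2.0, direction=1)      # target 103, stop 94
stop = update_trailing_stop(exits["runner_stop"], 110.0, 2.0, 1)  # trails to 104
stop = update_trailing_stop(stop, 100.0, 2.0, 1)  # pullback: stop holds at 104
```

The fixed exit supplies the base rate; the one-way ratchet on the runner is what lets occasional large winners dominate the return stream.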
Risk management is not a feature of this system—it is the system. Every profit-generating mechanism operates within hard constraints that cannot be overridden by signal strength, conviction, or any other factor.
The Kelly Criterion (Kelly, 1956) defines the theoretically optimal bet size for maximizing long-term geometric growth. However, Thorp (2008) demonstrated that even modest overestimation of edge at full Kelly produces catastrophic drawdowns. The system uses a conservative fraction of the Kelly-optimal size, deliberately sacrificing expected growth rate in exchange for materially lower variance and reduced probability of ruin. The Kelly fraction is re-estimated periodically from realized trade statistics.
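For the binary-outcome case, the Kelly formula and its conservative fraction can be sketched as follows. The 0.25 multiplier is an illustrative fractional-Kelly choice, not the system's actual setting:

```python
def kelly_fraction(win_rate, avg_win, avg_loss):
    """Full-Kelly fraction for a binary bet: f* = (p*b - q) / b,
    where p is the win rate, q = 1 - p, and b = avg_win / avg_loss
    is the payoff ratio."""
    b = avg_win / avg_loss
    return (win_rate * b - (1.0 - win_rate)) / b

def sized_fraction(win_rate, avg_win, avg_loss, kelly_mult=0.25):
    """Conservative fractional Kelly, floored at zero: a negative
    full-Kelly estimate means the edge does not justify any position."""
    return max(0.0, kelly_mult * kelly_fraction(win_rate, avg_win, avg_loss))

# 50% win rate with 2:1 payoff: full Kelly = 0.25, quarter Kelly = 0.0625
f = sized_fraction(0.5, 2.0, 1.0)
```

Because the full-Kelly curve is steep and asymmetric near its peak, overestimating the edge at full size is far more costly than underbetting; the fractional multiplier buys a margin of safety against exactly that estimation error.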
Per-trade, daily, and weekly drawdown limits form nested containment layers. Each layer operates independently—a breach at any level triggers automatic protective action regardless of what other layers indicate. When a daily limit is hit, all positions are closed. When a weekly limit is hit, the system enters observation-only mode. There is no override mechanism and no “one more trade” logic.
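The nested-containment logic can be sketched as an independent check at each layer, with the most restrictive outcome winning. Limit values and names here are illustrative:

```python
from enum import Enum

class Action(Enum):
    TRADE = "trade"               # all limits intact
    FLATTEN = "flatten_all"       # daily limit breached: close all positions
    OBSERVE = "observation_only"  # weekly limit breached: no new trades

def risk_gate(daily_pnl, weekly_pnl, daily_limit, weekly_limit):
    """Each containment layer is evaluated independently; the most
    restrictive action wins and there is no override path."""
    if weekly_pnl <= -abs(weekly_limit):
        return Action.OBSERVE
    if daily_pnl <= -abs(daily_limit):
        return Action.FLATTEN
    return Action.TRADE
```

Note that the gate takes no signal-strength input at all: by construction there is no argument through which conviction could override a breached limit.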
Stop-loss distances and position sizes adapt to current market volatility using the HAR-RV model (Corsi, 2009). This model captures volatility persistence across daily, weekly, and monthly horizons, producing more accurate forecasts than single-horizon approaches (Andersen, Bollerslev, Diebold & Labys, 2003). In high-volatility regimes, positions are smaller and stops are wider; in low-volatility regimes, the inverse applies. This ensures consistent risk per trade in dollar terms.
Market behavior is not stationary. The system classifies prevailing conditions along two axes—trend strength and volatility level—and adapts its parameters accordingly.
The HAR-RV model (Corsi, 2009) decomposes realized volatility into daily, weekly, and monthly components, capturing the multi-timescale structure of volatility clustering. This produces superior forecasts compared to single-horizon GARCH-family models, particularly during regime transitions. Research by Andersen, Bollerslev, Diebold & Labys (2003) established that realized volatility computed from high-frequency data provides a more accurate measure of true latent volatility than daily-close estimators.
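The HAR-RV regression itself is simple enough to sketch in a few lines: tomorrow's realized volatility is regressed on today's value and its trailing 5-day and 22-day averages. This is a minimal illustration fitted by ordinary least squares, not the system's estimation code:

```python
import numpy as np

def har_rv_features(rv):
    """Build HAR-RV regressors from a daily realized-volatility series:
    RV[t+1] = b0 + b_d*RV[t] + b_w*mean(RV[t-4:t]) + b_m*mean(RV[t-21:t]) + eps
    """
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for i in range(22, len(rv) - 1):   # need a 22-day lookback and 1-day-ahead target
        daily = rv[i]
        weekly = rv[i - 4:i + 1].mean()
        monthly = rv[i - 21:i + 1].mean()
        rows.append([1.0, daily, weekly, monthly])
        targets.append(rv[i + 1])
    return np.array(rows), np.array(targets)

def fit_har(rv):
    """OLS fit; returns [b0, b_daily, b_weekly, b_monthly]."""
    X, y = har_rv_features(rv)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

The three horizons act as a crude cascade model of volatility: short-term shocks decay quickly while the monthly component anchors the forecast, which is what gives HAR-RV its robustness through regime transitions.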
The system adapts stops, sizing, and entry thresholds based on which regime quadrant is detected. In trending, low-volatility environments, standard parameters apply. In ranging, high-volatility environments, the system stands aside entirely—the expected cost of whipsaw losses exceeds the expected benefit of attempted trades. The jump-diffusion decomposition (Andersen, Bollerslev & Diebold, 2007) further separates the continuous volatility component (which drives predictability) from the jump component (which does not).
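The two-axis classification can be sketched as a quadrant lookup. The thresholds are illustrative placeholders (e.g. an ADX-style trend reading and a volatility forecast relative to its long-run level), and only the two quadrant behaviors named above are stated by the system; the comments on the other two are hypothetical:

```python
def classify_regime(trend_strength, vol_forecast, trend_cut=25.0, vol_cut=1.5):
    """Label the regime quadrant from a trend-strength reading and a
    volatility forecast. Threshold values are illustrative."""
    trending = trend_strength >= trend_cut
    high_vol = vol_forecast >= vol_cut
    if trending and not high_vol:
        return "trend_low_vol"    # standard parameters
    if trending and high_vol:
        return "trend_high_vol"   # (hypothetical: reduced size, wider stops)
    if not trending and not high_vol:
        return "range_low_vol"    # (hypothetical: tighter entry thresholds)
    return "range_high_vol"       # stand aside entirely
```

Keeping the classifier this coarse is deliberate: two well-measured axes generalize better out of sample than a finely partitioned state space.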
A backtest is only as trustworthy as the assumptions embedded in it. Look-ahead bias is the most common source of inflated historical performance—and the hardest to detect. The system’s causal replay engine is designed to eliminate it structurally.
The causal replay engine builds higher-timeframe bars incrementally from raw data—a bar only “closes” when all its constituent data has been processed, exactly as it would in real time. At no point does the engine have access to a completed higher-timeframe bar before the underlying data has arrived. Signals generated at bar N produce entries at bar N+1; there is no same-bar execution.
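The core invariant, that a higher-timeframe bar exists only after its last constituent bar has arrived, can be sketched with a simple generator. The OHLC tuple format and fixed bucket size are simplifying assumptions for illustration:

```python
def aggregate_bars(lower_tf_bars, size):
    """Build higher-timeframe bars causally from lower-timeframe bars.

    `lower_tf_bars` is an iterable of (open, high, low, close) tuples and
    `size` is the number of constituent bars per higher-timeframe bar.
    A bar is yielded only once its final constituent has been processed;
    no partial or future bar is ever visible to downstream consumers.
    """
    bucket = []
    for bar in lower_tf_bars:
        bucket.append(bar)
        if len(bucket) == size:
            o = bucket[0][0]
            h = max(b[1] for b in bucket)
            l = min(b[2] for b in bucket)
            c = bucket[-1][3]
            yield (o, h, l, c)   # the bar "closes" only at this point
            bucket = []

minute_bars = [(1.0, 2.0, 0.0, 1.5), (1.5, 3.0, 1.0, 2.0), (2.0, 2.5, 1.8, 2.2)]
three_min = list(aggregate_bars(minute_bars, 3))   # one completed bar
```

Because signals are evaluated only on bars this generator has already emitted, and fills occur on the following bar, look-ahead is ruled out by construction rather than by convention.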
All fills assume adverse-direction slippage and a realistic commission structure, so backtest results represent a conservative estimate of achievable performance. This produces lower metrics than backtests run on pre-computed bar data, which is the expected and correct outcome once the artificial information advantage is removed.
Statistical significance in backtesting is necessary but not sufficient. The system must demonstrate robustness across multiple independent validation frameworks before live capital is deployed.
The following papers form the empirical foundation of this system. Each citation includes its relevance to the architecture described above.
Every claim on this site is backed by auditable data. These are the headline metrics from the causal backtest engine, validated across 38 months and 4,740+ trades on two timeframes with zero look-ahead bias.