Empirical Analysis of Stock Index Futures Arbitrage Strategies Using Deep Reinforcement Learning on High-Frequency Trading Data
Abstract
Stock index futures arbitrage is central to market efficiency but has become increasingly difficult in high-frequency trading environments characterized by non-stationary dynamics and execution frictions. Traditional econometric arbitrage models rely on static assumptions that limit their effectiveness under rapidly changing market conditions. To address this limitation, this study formulates index futures arbitrage as a sequential decision-making problem within a deep reinforcement learning framework. The arbitrage process is modeled as a Markov Decision Process and solved with a Proximal Policy Optimization algorithm combined with a Long Short-Term Memory network. The proposed agent incorporates Level 2 order book information and is trained in a high-fidelity virtual exchange that simulates latency, transaction costs, and multi-level order matching. Empirical results on high-frequency CSI 300 index futures data indicate that the proposed approach outperforms E-GARCH and VECM benchmarks in cumulative return, drawdown control, and risk-adjusted performance. Moreover, the agent exhibits liquidity-aware and defensive behavior during high-volatility regimes. These findings demonstrate that deep reinforcement learning offers a robust framework for index futures arbitrage under realistic market conditions.
