Chinese Journal of Intelligent Science and Technology ›› 2024, Vol. 6 ›› Issue (2): 220-231. doi: 10.11959/j.issn.2096-6652.202402

• Academic Papers •

Research on OAC model for quantitative trading of digital currency

Bo XU1,2, Yijun HE1,2, Xiangxia LI1,2

  1. School of Information Science, Guangdong University of Finance & Economics, Guangzhou 510320, China
  2. Guangdong Intelligent Business Engineering Technology Research Center, Guangzhou 510320, China
  • Received: 2023-10-21 Revised: 2024-02-05 Online: 2024-06-15 Published: 2024-07-31
  • Contact: Bo XU E-mail: xubo807127940@163.com
  • About the authors: Bo XU (1982- ), male, Ph.D., associate professor and master's supervisor at Guangdong University of Finance & Economics, CCF senior member. His main research interests are machine learning and quantitative trading.
    Yijun HE (1998- ), male, master's student at Guangdong University of Finance & Economics, CCF student member. His main research interests are machine learning and quantitative trading.
    Xiangxia LI (1988- ), female, Ph.D., lecturer and master's supervisor at Guangdong University of Finance & Economics. Her main research interests include machine learning and deep learning.
  • Supported by:
    Project of Philosophy and Social Science Planning of Guangdong (GD24CGL08); Special Projects in Key Fields of Colleges and Universities in Guangdong Province (2021ZDZX3006); Science and Technology Projects in Guangzhou (202201011651)

Abstract:

Quantitative trading of digital currencies involves a large number of complex factors and a high-dimensional factor state space, which makes it difficult for a trading model to balance the accuracy of its strategy against risk control. To address this problem, an improved optimistic actor-critic (OAC) model, OAC_LSTM_ATT, is proposed. The model uses long short-term memory (LSTM) and a multi-head attention mechanism to optimize the network structure of OAC, improving its ability to model time-series data and to generalize. With this combination, the agent can make more flexible and accurate trading decisions in the quantitative trading environment, further improving the quality and effectiveness of the trading strategy. Experimental results show that, in the Bitcoin market, the model achieved a cumulative return of 16.36%, a maximum drawdown of 9.08%, a Sharpe ratio of 0.014, and a volatility of 13.09%. The corresponding figures in the Ethereum market were 16.30%, 8.56%, 0.014, and 13.42%. Compared with models such as PPO, LSTM_PPO, and A2C, OAC_LSTM_ATT shows some advantage in effectiveness and stability, providing a useful reference for designing quantitative trading strategies.

Key words: quantitative trading, deep reinforcement learning, attention mechanism, long short-term memory, digital currency
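
The four evaluation metrics reported in the abstract (cumulative return, maximum drawdown, Sharpe ratio, volatility) can be illustrated with a minimal sketch. The abstract does not state the paper's exact definitions, so everything here is an assumption: the function names are hypothetical, returns are simple per-period returns of an account-value series, volatility is the sample standard deviation of those returns, and the Sharpe ratio is per-period and not annualized. The sample `equity` series is made up for illustration and is not data from the paper.

```python
from statistics import mean, stdev

def period_returns(equity):
    """Simple per-period returns of an account-value series."""
    return [equity[i] / equity[i - 1] - 1.0 for i in range(1, len(equity))]

def cumulative_return(equity):
    """Total return from the first to the last value."""
    return equity[-1] / equity[0] - 1.0

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def volatility(equity):
    """Sample standard deviation of per-period returns (assumed definition)."""
    return stdev(period_returns(equity))

def sharpe_ratio(equity, risk_free=0.0):
    """Mean excess return divided by its standard deviation (not annualized)."""
    excess = [r - risk_free for r in period_returns(equity)]
    spread = stdev(excess)
    return mean(excess) / spread if spread > 0 else 0.0

# Illustrative account values only; not results from the paper.
equity = [100.0, 110.0, 99.0, 120.0]
print(cumulative_return(equity), max_drawdown(equity))
print(volatility(equity), sharpe_ratio(equity))
```

Whether the values are annualized, and which risk-free rate is used, changes the Sharpe ratio and volatility considerably, so the reported figures should be read against the definitions given in the full paper.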
