Chinese Journal of Intelligent Science and Technology (智能科学与技术学报) ›› 2022, Vol. 4 ›› Issue (3): 426-444. doi: 10.11959/j.issn.2096-6652.202208

• Academic paper •

HVAC model-free optimal control method based on double-pools DQN

Shuai MA1,2,3, Qiming FU1,2,3, Jianping CHEN1,2,3, Fan FENG4, You LU1,2,3, Zhengwei LI5,6, Shunian QIU5,6

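    1 School of Electronics & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, Jiangsu, China
    2 Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou 215009, Jiangsu, China
    3 Suzhou Key Laboratory of Mobile Networking and Applied Technology, Suzhou 215009, Jiangsu, China
    4 Texas A&M University, TX 77843, USA
    5 School of Mechanical Engineering, Tongji University, Shanghai 200092, China
    6 Key Laboratory of Performance Evolution and Control for Engineering Structures of Ministry of Education, Tongji University, Shanghai 200092, China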
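  • Revised: 2021-08-28 Online: 2022-09-15 Published: 2022-09-01
  • About the authors: Shuai MA (1997- ), male, master's student at the School of Electronics & Information Engineering, Suzhou University of Science and Technology. His main research interest is the application of reinforcement learning in smart buildings.
    Qiming FU (1985- ), male, Ph.D., associate professor at the School of Electronics & Information Engineering, Suzhou University of Science and Technology. His main research interests are reinforcement learning, deep learning, and building energy efficiency.
    Jianping CHEN (1963- ), male, Ph.D., professor at the School of Electronics & Information Engineering, Suzhou University of Science and Technology. His research interests are big data analysis, building energy efficiency, and intelligent information.
    Fan FENG (1993- ), male, Ph.D. student at Texas A&M University. His main research interest is HVAC optimal control.
    You LU (1977- ), male, Ph.D., associate professor at the School of Electronics & Information Engineering, Suzhou University of Science and Technology. His main research interests are next-generation networks, network management, and machine learning.
    Zhengwei LI (1981- ), male, Ph.D., professor at the School of Mechanical Engineering, Tongji University. His main research interests include the optimal operation and fault diagnosis of building energy systems.
    Shunian QIU (1995- ), male, Ph.D. student at the School of Mechanical Engineering, Tongji University. His main research interest is the application of reinforcement learning in HVAC.
  • Supported by:
    The National Key Research and Development Program of China (2020YFC2006602); The National Natural Science Foundation of China (62072324); The National Natural Science Foundation of China (61876217); The National Natural Science Foundation of China (61876121); The National Natural Science Foundation of China (61772357); The Research and Development Program of Jiangsu Province (BE2020026)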

HVAC model-free optimal control method based on double-pools DQN

Shuai MA1,2,3, Qiming FU1,2,3, Jianping CHEN1,2,3, Fan FENG4, You LU1,2,3, Zhengwei LI5,6, Shunian QIU5,6   

    1 School of Electronics & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
    2 Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou 215009, China
    3 Suzhou Key Laboratory of Mobile Networking and Applied Technology, Suzhou 215009, China
    4 Texas A&M University, College Station, TX 77843, USA
    5 School of Mechanical Engineering, Tongji University, Shanghai 200092, China
    6 Key Laboratory of Performance Evolution and Control for Engineering Structures of Ministry of Education, Tongji University, Shanghai 200092, China
  • Revised: 2021-08-28 Online: 2022-09-15 Published: 2022-09-01
  • Supported by:
    The National Key Research and Development Program of China (2020YFC2006602); The National Natural Science Foundation of China (62072324); The National Natural Science Foundation of China (61876217); The National Natural Science Foundation of China (61876121); The National Natural Science Foundation of China (61772357); The Research and Development Program of Jiangsu Province (BE2020026)

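Abstract:

In HVAC control, model-based optimal control has been widely studied and validated, but it depends heavily on model accuracy, on collecting large amounts of historical data, and on deploying sensors. To address these problems, an HVAC optimal control model is built from EnergyPlus, actual system parameters, and historical data, and an improved double-pool DQN algorithm is proposed. The algorithm is then applied to the combined optimal control of load distribution among chillers of different models, cooling tower fan frequency, and cooling water pump frequency in an HVAC system. Based on the constructed problem model, and to address the sample imbalance that arises during decision optimization, the algorithm extends DQN with two independent experience pools that store load-distribution and non-load-distribution samples respectively. During training, samples are drawn from the two pools at a given ratio to speed up convergence. The proposed method is compared with a model-based control method and a baseline method. The experimental results show that, relative to the baseline, the model-based HVAC controller saves 11.5% of energy (the optimal energy-saving rate), while the double-pool DQN controller saves 7.5% in its first year and, as the system continues to run, approaches the optimal energy-saving rate at around the eighth year. In addition, unlike the model-based HVAC controller, this controller does not depend on a system model and needs less prior knowledge and fewer sensors during online control, which makes it more valuable in practical engineering applications.

Key words: deep reinforcement learning, model-free optimal control, HVAC system, building energy saving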

Abstract:

In the field of HVAC (heating, ventilation and air conditioning) control, the model-based optimal control method has been extensively studied and verified, but it depends heavily on the accuracy of the model, the collection of a large amount of historical data, and the deployment of sensors. To address these problems, an HVAC optimal control model was constructed from EnergyPlus, actual system parameters and historical data, and an improved double pools-based DQN (DPs-DQN) algorithm was proposed. The algorithm was then applied to the combined optimal control of the load distribution among different types of chillers, the cooling tower fan frequency and the cooling water pump frequency in an HVAC system. Based on the constructed problem model, and aiming at the problem of sample imbalance in the decision-making optimization process, the algorithm established two independent experience pools on the basis of DQN to store load distribution and non-load distribution samples respectively. During training, samples were drawn from the two experience pools at a certain ratio to speed up convergence. The proposed method was compared with the model-based control method and the baseline method. The experimental results show that, compared with the baseline method, the model-based HVAC controller can save 11.5% of energy (the optimal energy-saving efficiency), while the DPs-DQN controller can save 7.5% in the first year. As the system continues to run, the DPs-DQN controller obtains results close to the optimal energy-saving efficiency at around the eighth year. In addition, compared with the model-based HVAC controller, the DPs-DQN controller does not depend on the system model and requires less prior knowledge and fewer sensors in the online control process, which makes it more valuable in actual engineering applications.

Key words: deep reinforcement learning, model-free optimal control, HVAC system, building energy saving
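
The double-pool sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `DoublePoolReplay`, the `load_ratio` parameter and its default value, the `is_load_distribution` flag, and the rule for topping up a batch when one pool is short are all assumptions made for illustration. The abstract states only that two independent experience pools are kept and sampled at a certain ratio.

```python
import random
from collections import deque


class DoublePoolReplay:
    """Two independent experience pools sampled at a set ratio (illustrative)."""

    def __init__(self, capacity=10_000, load_ratio=0.5, seed=None):
        # load_ratio is an assumed hyperparameter: the fraction of each
        # minibatch drawn from the load-distribution pool.
        self.load_pool = deque(maxlen=capacity)   # load-distribution samples
        self.other_pool = deque(maxlen=capacity)  # all other samples
        self.load_ratio = load_ratio
        self.rng = random.Random(seed)

    def push(self, transition, is_load_distribution):
        # How a transition is classified is an assumption; here the
        # caller decides and passes a boolean flag.
        pool = self.load_pool if is_load_distribution else self.other_pool
        pool.append(transition)

    def sample(self, batch_size):
        # Take the requested share from the load-distribution pool,
        # capped by how many samples that pool currently holds.
        n_load = min(round(batch_size * self.load_ratio), len(self.load_pool))
        # Fill the remainder from the other pool, also capped by its size.
        n_other = min(batch_size - n_load, len(self.other_pool))
        batch = self.rng.sample(list(self.load_pool), n_load)
        batch += self.rng.sample(list(self.other_pool), n_other)
        self.rng.shuffle(batch)  # avoid a fixed ordering by pool
        return batch


# Usage: rare load-distribution transitions keep a guaranteed share of the batch.
buffer = DoublePoolReplay(capacity=100, load_ratio=0.25, seed=0)
for i in range(5):
    buffer.push(("load", i), is_load_distribution=True)
for i in range(50):
    buffer.push(("other", i), is_load_distribution=False)
minibatch = buffer.sample(8)
```

With a single shared pool, the rarer class of transitions would appear in a minibatch only in proportion to its frequency; keeping it in its own pool lets the sampling ratio be chosen independently of how often such transitions occur.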
