Chinese Journal of Intelligent Science and Technology


HVAC model-free optimal control method based on double-pools DQN

MA Shuai1,2,3, FU Qiming1,2,3, CHEN Jianping2,3, FENG Fan4, LU You1,2,3, LI Zhenwei5,6, QIU Shunian5,6

    1. School of Electronics & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
    2. Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou 215009, China
    3. Suzhou Key Laboratory of Mobile Networking and Applied Technology, Suzhou 215009, China
    4. Texas A&M University, College Station TX 77843, USA
    5. School of Mechanical Engineering, Tongji University, Shanghai 200092, China
    6. Key Laboratory of Performance Evolution and Control for Engineering Structures of Ministry of Education, Tongji University, Shanghai 200092, China

Abstract: In HVAC (heating, ventilation and air conditioning) control, model-based optimal control methods have been extensively studied and verified, but they depend heavily on the accuracy of the model, the collection of a large amount of historical data, and the deployment of sensors. To address these problems, an HVAC optimal control model was constructed using EnergyPlus, actual system parameters and historical data, and an improved DQN algorithm based on double pools (DPs-DQN) was proposed. The algorithm is applied to the load distribution among different types of chillers and to the combined optimal control of cooling tower fan frequency and cooling water pump frequency in an HVAC system. To handle the sample imbalance that arises in the decision-making optimization process of the constructed problem model, the algorithm extends DQN with two independent experience pools that store load distribution samples and non-load distribution samples, respectively. During training, samples are drawn from the two pools at a specified ratio to speed up convergence. The proposed method is compared with a model-based control method and a baseline method. The experimental results show that, relative to the baseline method, the model-based HVAC controller saves 11.5% of energy (the optimal energy-saving efficiency), while the DPs-DQN controller saves 9% in the first year. As the system continues to run, the DPs-DQN controller obtains results close to the optimal energy-saving efficiency in the eighth year. In addition, unlike the model-based HVAC controller, the DPs-DQN controller does not depend on a system model and requires less prior knowledge and fewer sensors during online control, which makes it more valuable in practical engineering applications.

Key words: deep reinforcement learning, model-free optimal control, HVAC system, building energy saving
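
The double-pool sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name DoublePoolReplay, the parameters capacity and load_ratio, the default values, and the rule for topping up a batch when the load distribution pool is short are all assumptions made for illustration, since the abstract only states that two independent pools are kept and sampled at a certain ratio.

```python
import random
from collections import deque


class DoublePoolReplay:
    """Hypothetical replay buffer with two independent experience pools."""

    def __init__(self, capacity=10000, load_ratio=0.5, seed=None):
        # One pool per sample type; the oldest entries are dropped when full.
        self.load_pool = deque(maxlen=capacity)
        self.other_pool = deque(maxlen=capacity)
        # Assumed meaning: fraction of each batch taken from the load pool.
        self.load_ratio = load_ratio
        self.rng = random.Random(seed)

    def add(self, transition, is_load_distribution):
        """Store a transition in the pool that matches its sample type."""
        pool = self.load_pool if is_load_distribution else self.other_pool
        pool.append(transition)

    def sample(self, batch_size):
        """Draw a mixed batch from both pools at the configured ratio."""
        # Take the target share from the load pool, or all of it if it is short.
        n_load = min(round(batch_size * self.load_ratio), len(self.load_pool))
        # Fill the rest of the batch from the other pool (assumed top-up rule).
        n_other = min(batch_size - n_load, len(self.other_pool))
        batch = self.rng.sample(list(self.load_pool), n_load)
        batch += self.rng.sample(list(self.other_pool), n_other)
        self.rng.shuffle(batch)
        return batch
```

In this sketch a DQN training loop would call add() after every environment step and sample() before every gradient update, so that the rarer sample type still appears in each batch even when it makes up a small share of all collected transitions. The returned batch can be smaller than batch_size while both pools are still filling.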
