增量式双自然策略梯度的行动者评论家算法
章鹏,刘全,钟珊,翟建伟,钱炜晟
Actor-critic algorithm with incremental dual natural policy gradient
Peng ZHANG, Quan LIU, Shan ZHONG, Jian-wei ZHAI, Wei-sheng QIAN
Journal on Communications (通信学报). 2017, (4): 166-177. DOI: 10.11959/j.issn.1000-436x.2017089