Reinforcement learning algorithm based on minimum state method and average reward
Quan LIU, Qi-ming FU, Sheng-rong GONG, Yu-chen FU, Zhi-ming CUI
Journal on Communications (通信学报). 2011, (1): 66-71. DOI: 1000-436X(2011)01-0066-06