Journal on Communications


TD algorithm based on double-layer fuzzy partitioning

  

  • Online: 2013-10-25 Published: 2013-10-15

Abstract: In continuous-space problems, traditional Q-iteration algorithms based on lookup tables or function approximation converge slowly and have difficulty producing a continuous policy. To overcome these weaknesses, an on-policy TD algorithm named DFP-OPTD, based on double-layer fuzzy partitioning, was proposed and its convergence was proved. The first layer of fuzzy partitioning was applied to the state space and the second layer to the action space, and the Q-value function was computed by combining the two layers. Based on the Q-value function, the consequent parameters of the fuzzy rules were updated by gradient descent. Experimental results on two classical reinforcement learning problems show that DFP-OPTD not only yields a continuous action policy but also has better convergence performance.
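
The abstract gives no formulas, so the following is only a minimal sketch of how a Q-value built from two fuzzy partitions and updated by gradient descent could look. It is not the DFP-OPTD algorithm itself. Everything concrete in it is an assumption made for illustration: one-dimensional state and action spaces, triangular membership functions, a product combination of the two layers, a SARSA-style on-policy TD error, and the names `triangular_memberships`, `q_value`, `td_update`, and `theta` (the table of consequent parameters).

```python
def triangular_memberships(x, centers):
    """Memberships of x in triangular fuzzy sets peaked at sorted `centers`.

    Assumed partition shape: neighbouring triangles overlap so that the
    memberships always sum to 1; values outside the range are clamped.
    """
    mu = [0.0] * len(centers)
    if x <= centers[0]:
        mu[0] = 1.0
    elif x >= centers[-1]:
        mu[-1] = 1.0
    else:
        for k in range(len(centers) - 1):
            lo, hi = centers[k], centers[k + 1]
            if lo <= x <= hi:
                w = (x - lo) / (hi - lo)
                mu[k], mu[k + 1] = 1.0 - w, w
                break
    return mu


def q_value(theta, state_centers, action_centers, s, a):
    """Q(s, a) as a membership-weighted sum of consequent parameters.

    Layer 1 partitions the state space, layer 2 the action space; the
    product phi[i] * psi[j] is the assumed way of combining them.
    """
    phi = triangular_memberships(s, state_centers)
    psi = triangular_memberships(a, action_centers)
    return sum(phi[i] * psi[j] * theta[i][j]
               for i in range(len(phi)) for j in range(len(psi)))


def td_update(theta, state_centers, action_centers,
              s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One on-policy (SARSA-style) gradient-descent step on theta, in place.

    Q is linear in theta, so dQ/dtheta[i][j] = phi[i] * psi[j].
    Returns the TD error.
    """
    phi = triangular_memberships(s, state_centers)
    psi = triangular_memberships(a, action_centers)
    delta = (r
             + gamma * q_value(theta, state_centers, action_centers, s_next, a_next)
             - q_value(theta, state_centers, action_centers, s, a))
    for i in range(len(phi)):
        for j in range(len(psi)):
            theta[i][j] += alpha * delta * phi[i] * psi[j]
    return delta


# Usage with hypothetical partitions: 3 state sets on [0, 1], 3 action sets on [-1, 1].
state_centers = [0.0, 0.5, 1.0]
action_centers = [-1.0, 0.0, 1.0]
theta = [[0.0] * len(action_centers) for _ in state_centers]
delta = td_update(theta, state_centers, action_centers,
                  s=0.25, a=0.5, r=1.0, s_next=0.25, a_next=0.5)
q_after = q_value(theta, state_centers, action_centers, 0.25, 0.5)
```

Because this Q-value is defined for any real-valued action inside the partitioned range, a policy built on it can in principle choose continuous actions. How DFP-OPTD actually selects actions is not stated in the abstract and is not shown here.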
