To address the slow convergence of the traditional Sarsa algorithm, an improved heuristic Sarsa algorithm based on value function transfer was proposed. The algorithm combined traditional Sarsa with a value function transfer method: it introduced a bisimulation metric to measure the similarity between a new task and historical tasks that share the same state space and action space, so that value functions can be transferred between similar tasks to speed up convergence. In addition, following the heuristic exploration approach, the algorithm introduced Bayesian inference and used variational inference to measure information gain. The resulting information gain was then used to build an intrinsic reward function model that serves as an exploration factor, further speeding up convergence. The proposed algorithm was applied to the classic Grid World problem and compared with the traditional Sarsa algorithm, the Q-Learning algorithm, and the better-converging VFT-Sarsa and IGP-Sarsa algorithms. The experimental results show that the proposed algorithm converges faster and with better stability.
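To make the two ingredients concrete, the following is a minimal sketch of tabular Sarsa on a small grid world, not the authors' implementation. It assumes a hypothetical 4x4 grid with a single goal cell, accepts an optional initial Q-table (`q_init`) standing in for a value function transferred from a similar historical task, and adds an exploration bonus to the extrinsic reward. The bonus here is a simple count-based term `beta / sqrt(visits)`, used only as a stand-in for the information gain that the abstract says is estimated with variational inference; the bisimulation metric used to select the source task is not modelled. All names and parameter values are illustrative assumptions.

```python
import random

# Hypothetical 4x4 grid world: start at (0, 0), goal at (3, 3).
SIZE = 4
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell (walls block movement); reward 1 at the goal, else 0."""
    r = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def epsilon_greedy(q, state, eps, rng):
    """Pick a random action with probability eps, else a greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def sarsa(episodes, q_init=None, beta=0.0, alpha=0.5, gamma=0.95,
          eps=0.1, seed=0, max_steps=200):
    """Tabular Sarsa with optional transferred Q-table and exploration bonus."""
    rng = random.Random(seed)
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    # Value function transfer (assumed form): start from a copy of q_init.
    q = {s: list(q_init[s]) if q_init else [0.0] * len(ACTIONS)
         for s in states}
    visits = {s: 0 for s in states}
    steps_per_episode = []
    for _ in range(episodes):
        state = (0, 0)
        action = epsilon_greedy(q, state, eps, rng)
        for t in range(1, max_steps + 1):
            nxt, reward, done = step(state, action)
            visits[nxt] += 1
            # Intrinsic reward: count-based stand-in for information gain.
            reward += beta / visits[nxt] ** 0.5
            nxt_action = epsilon_greedy(q, nxt, eps, rng)
            # On-policy Sarsa target uses the action actually chosen next.
            target = reward + (0.0 if done else gamma * q[nxt][nxt_action])
            q[state][action] += alpha * (target - q[state][action])
            state, action = nxt, nxt_action
            if done:
                break
        steps_per_episode.append(t)
    return q, steps_per_episode


if __name__ == "__main__":
    q_source, _ = sarsa(episodes=100)
    _, steps = sarsa(episodes=20, q_init=q_source, beta=0.1, seed=1)
    print(steps)
```

Comparing the per-episode step counts with and without `q_init` and `beta` gives a rough feel for how a transferred value function and an exploration bonus can change the learning curve; it does not reproduce the experiments reported in the paper.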