Telecommunications Science ›› 2021, Vol. 37 ›› Issue (2): 82-98. DOI: 10.11959/j.issn.1000-0801.2021031

• Research and Development •

Video recommendation method based on fusion of autoencoders and multi-modal data

Qiuyang GU1, Chunhua JU2, Gongxing WU2   

  1. School of Management, Zhejiang University of Technology, Hangzhou 310023, China
  2. Zhejiang Gongshang University, Hangzhou 310018, China
  • Revised: 2021-01-30; Online: 2021-02-20; Published: 2021-02-01
  • Supported by:
    The National Natural Science Foundation of China (71571162); The Social Science Planning Key Project of Zhejiang Province (20NDJC10Z); The National Social Science Fund Emergency Management System Construction Research Project (20VYJ073); The Zhejiang Philosophy and Social Science Major Project (20YSXK02ZD)

Abstract:

Nowadays, commonly used linear-structure video recommendation methods suffer from non-personalized recommendation results and low accuracy, so developing a high-precision, personalized video recommendation method is extremely urgent. A video recommendation method based on the fusion of autoencoders and multi-modal data was presented. The method fused two modalities, text and vision, for video recommendation. Specifically, it first used bag-of-words and TF-IDF methods to describe the text data, then fused the resulting features with deep convolutional descriptors extracted from the visual data, so that each video document obtained a multi-modal descriptor, and finally constructed a low-dimensional sparse representation with autoencoders. Experiments were performed on the proposed model using three real data sets. The results show that, compared with single-modal recommendation methods, the recommendation results of the proposed method are significantly improved, and its performance is better than that of the reference methods.
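To make the pipeline in the abstract concrete, the following is a minimal, hypothetical Python sketch: TF-IDF-weighted bag-of-words text features are concatenated with stand-in visual descriptors (in the paper these come from a deep convolutional network), and an autoencoder with an L1 sparsity penalty learns a low-dimensional sparse code per video. All names, dimensions, and the sparsity weight are illustrative assumptions, not the authors' implementation.

    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.feature_extraction.text import TfidfVectorizer

    # --- Text modality: bag-of-words weighted by TF-IDF ---
    docs = ["action movie with car chases",
            "romantic comedy set in paris",
            "documentary about deep sea life"]
    text_feats = TfidfVectorizer().fit_transform(docs).toarray()  # (n_videos, vocab)

    # --- Visual modality: stand-in for deep convolutional descriptors ---
    # Random vectors keep the sketch self-contained; in practice these would
    # be pooled features from a pretrained CNN.
    visual_feats = np.random.rand(len(docs), 128)

    # --- Fusion: one multi-modal descriptor per video document ---
    fused = np.hstack([text_feats, visual_feats]).astype(np.float32)

    # --- Autoencoder producing a low-dimensional sparse representation ---
    class SparseAutoencoder(nn.Module):
        def __init__(self, in_dim, code_dim=32):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
            self.decoder = nn.Linear(code_dim, in_dim)

        def forward(self, x):
            code = self.encoder(x)
            return self.decoder(code), code

    x = torch.from_numpy(fused)
    model = SparseAutoencoder(x.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):
        recon, code = model(x)
        # Reconstruction loss plus an L1 penalty that encourages sparse codes
        loss = nn.functional.mse_loss(recon, x) + 1e-3 * code.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    codes = model(x)[1].detach().numpy()  # sparse low-dimensional video representations

The resulting codes could then feed any downstream recommender (e.g., nearest-neighbor retrieval over videos); that downstream step is outside the scope of this sketch.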

Key words: autoencoder, multi-modal representation, data fusion, video recommendation

