Journal on Communications (通信学报) ›› 2022, Vol. 43 ›› Issue (9): 194-208. doi: 10.11959/j.issn.1000-436x.2022178

• Survey •

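Survey on video image reconstruction method based on generative model

Yanwen WANG, Weimin LEI, Wei ZHANG, Huan MENG, Xinyi CHEN, Wenhui YE, Qingyang JING

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
  • Revised: 2022-08-22 Online: 2022-09-25 Published: 2022-09-01
  • About the authors: Yanwen WANG (1998- ), female, native of Liaoyang, Liaoning, Ph.D. student at Northeastern University. Her main research interests are computer vision and video image compression coding.
    Weimin LEI (1969- ), male, native of Pingyao, Shanxi, Ph.D., professor at Northeastern University. His main research interests are multimedia intelligent signal processing, network multipath transmission optimization, and industrial real-time communication technology.
    Wei ZHANG (1980- ), female, native of Jining, Shandong, Ph.D., lecturer and master's supervisor at Northeastern University. Her main research interests are multimedia intelligent signal processing, network multipath transmission optimization, and industrial real-time communication technology.
    Huan MENG (1998- ), female, native of Jinzhou, Liaoning, master's student at Northeastern University. Her main research interests are computer vision and video image compression coding.
    Xinyi CHEN (1994- ), female, native of Chengde, Hebei, Ph.D. student at Northeastern University. Her main research interests are computer vision and video image compression coding.
    Wenhui YE (1991- ), female, native of Yantai, Shandong, Ph.D. student at Northeastern University. Her main research interests are computer vision and video image compression coding.
    Qingyang JING (1994- ), female, native of Shenyang, Liaoning, Ph.D. student at Northeastern University. Her main research interests are computer vision and video image compression coding.
  • Supported by:
    The Fundamental Research Funds for the Central Universities of China (N2216010); The National Key Research and Development Program of China (2018YFB1702000)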

Abstract:

Traditional video compression technology based on pixel correlation has limited room for further performance improvement. Semantic compression has become a new direction for video compression coding, and video image reconstruction is a key step in semantic compression coding. First, video image reconstruction methods that optimize traditional coding are introduced, including the use of deep learning to improve prediction accuracy and the use of super-resolution techniques to enhance reconstruction quality. Second, video image reconstruction methods based on variational auto-encoders, generative adversarial networks, autoregressive models, and Transformer models are discussed. The models are classified according to the different semantic representations of images, and the advantages, disadvantages, and applicable scenarios of each class of methods are compared. Finally, the existing problems of video image reconstruction are summarized, and directions for further research are outlined.

Key words: video compression coding, image reconstruction, generative adversarial network, variational auto-encoder, Transformer model
