Advances in generative adversarial network

doi:10.11959/j.issn.1000-436x.2018032

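Abstract

Abstract (translated from the Chinese):

Generative adversarial networks (GANs) are of far-reaching significance to the development of generative models. Since their introduction they have attracted extensive research and close attention from both the academic and industrial artificial intelligence communities, and with the development of deep learning techniques, generative adversarial models have advanced steadily in both theory and application. This paper first describes the research background and significance of generative adversarial models, then reviews in detail the research progress of GANs in modeling, architecture, training, and performance evaluation, together with the current state of their applications, and finally offers an analysis and summary that identifies the problems in GAN research that most urgently need solving and directions for future work.

Keywords: deep learning, generative adversarial network, convolutional neural network, auto-encoder, adversarial training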

Abstract:

Generative adversarial networks (GANs) became a focus of research on generative models soon after their introduction, and both academic research and industrial applications have progressed steadily alongside the achievements of deep learning. This paper surveys recent advances in GANs. First, the research background and motivation of GANs are introduced. Then, recent theoretical advances in modeling, architectures, training, and evaluation metrics are reviewed, followed by state-of-the-art applications and widely used open-source tools for GANs. Finally, issues that require urgent solutions and directions that deserve further investigation are discussed.

Key words: deep learning, generative adversarial network, convolutional neural network, auto-encoder, adversarial training

CLC number:

TP183

Wanliang WANG, Zhuorong LI. Advances in generative adversarial network[J]. Journal on Communications (通信学报), 2018, 39(2): 135-148.
