Human avatars synthesis technologies: a survey

doi:10.11959/j.issn.2096-0271.2022081

Big Data Research, 2023, Vol. 9, Issue 3: 114-139.

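Yimin DENG¹,², Xulong ZHANG¹, Shijing SI¹,³, Jianzong WANG¹, Jing XIAO¹

¹ Ping An Technology (Shenzhen) Co., Ltd., Shenzhen 518063, China
² University of Science and Technology of China, Hefei 230026, China
³ School of Economics and Finance, Shanghai International Studies University, Shanghai 200083, China

Online: 2023-05-15  Published: 2023-05-01
About the authors:
Yimin DENG (1999- ), female, is a master's student at the University of Science and Technology of China and a member of the China Computer Federation (CCF). Her main research interests include deep learning, computer vision, and the metaverse.
Xulong ZHANG (1988- ), male, Ph.D., is a senior algorithm researcher at Ping An Technology (Shenzhen) Co., Ltd. His main research interests include speech synthesis, voice conversion, music information retrieval, and the application of machine learning and deep learning methods in artificial intelligence.
Shijing SI (1988- ), male, Ph.D., is a senior algorithm researcher at Ping An Technology (Shenzhen) Co., Ltd. and a Shenzhen Overseas High-Level Talent. He was a postdoctoral researcher in artificial intelligence at Duke University, USA, and is a member of the CCF. His main research interests include machine learning and its applications in artificial intelligence.
Jianzong WANG (1983- ), male, Ph.D., is deputy chief engineer, senior artificial intelligence director, and general manager of the Federated Learning Technology Department at Ping An Technology (Shenzhen) Co., Ltd. He was a postdoctoral researcher in artificial intelligence at the University of Florida, USA, is a senior member of the CCF and a member of the CCF Big Data Expert Committee, and was formerly a researcher in the Department of Electrical and Computer Engineering at Rice University, USA. His main research interests include federated learning and artificial intelligence.
Jing XIAO (1972- ), male, Ph.D., is chief scientist of Ping An Group, recipient of the 2019 Wu Wenjun Artificial Intelligence Outstanding Contribution Award, and vice chair of the CCF Shenzhen Chapter. His main research interests include computer graphics, autonomous driving, 3D display, medical diagnosis, and federated learning.
Supported by:
The Key Research and Development Program of Guangdong Province, "New Generation Artificial Intelligence" Major Special Project (2021B0101400003)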

Abstract

With the rise of the metaverse, the demand for efficient modeling of human avatars is becoming increasingly urgent. Building human models from human image datasets has long been a popular topic in computer vision. 3D human avatar synthesis can be regarded as a sub-module of 3D reconstruction that focuses on reproducing the complex articulated structure and surface details of the human body. This paper presents a comprehensive survey of recent literature on human avatar construction, covering full-body avatars, head avatars, and clothing modeling. By analyzing and summarizing the basic principles of existing work, human avatar synthesis methods are divided, according to their technical pipelines, into five categories: mesh-based, image-based, voxel-based, implicit-representation-based, and hybrid-representation methods. The basic principles of each category are introduced first. Specific techniques are then discussed in the context of existing work, and the advantages and disadvantages of each category are pointed out. In addition, commonly used datasets and metrics for evaluating model quality are introduced, and common applications of human avatars are briefly reviewed. Finally, future directions for human avatar synthesis are discussed, with the goal of synthesizing high-quality, high-fidelity, and low-latency human avatars.

Key words: metaverse, human avatars, three-dimensional human reconstruction, computer vision, deep learning, face synthesis

CLC number: TP391

Yimin DENG, Xulong ZHANG, Shijing SI, Jianzong WANG, Jing XIAO. Human avatars synthesis technologies: a survey[J]. Big Data Research, 2023, 9(3): 114-139.

Figures and tables (8): Figures 1-5, Tables 1-3

References

[68]	DENG B Y , LEWIS J P , JERUZALSKI T ,et al. NASA:neural articulated shape approximation[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 612-628.
[69]	CAO Y K , CHEN G Y , HAN K ,et al. JIFF:jointly-aligned implicit face function for high quality single view clothed human reconstruction[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 2719-2729.
[70]	BHATNAGAR B L , SMINCHISESCU C , THEOBALT C ,et al. Combining implicit function learning and parametric models for 3D human reconstruction[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 311-329.
[71]	SAITO S , YANG J L , MA Q L ,et al. SCANimate:weakly supervised learning of skinned clothed avatar networks[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 2885-2896.
[72]	XIU Y L , YANG J L , TZIONAS D ,et al. ICON:implicit clothed humans obtained from normals[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 13286-13296.
[73]	ZHENG Z R , HUANG H , YU T ,et al. Structured local radiance fields for human avatar modeling[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 15872-15882.
[74]	XU T H , FUJITA Y , MATSUMOTO E . Surface-aligned neural radiance fields for controllable 3D human synthesis[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 15862-15871.
[75]	LIU W , PIAO Z X , MIN J ,et al. Liquid warping GAN:a unified framework for human motion imitation,appearance transfer and novel view synthesis[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 5903-5912.
[76]	GRIGOREV A , ISKAKOV K , IANINA A ,et al. StylePeople:a generative model of fullbody human avatars[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 5147-5156.
[77]	RAJ A , ZOLLHÖFER M , SIMON T ,et al. Pixel-aligned volumetric avatars[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 11728-11737.
[78]	GAFNI G , THIES J , ZOLLHÖFER M ,et al. Dynamic neural radiance fields for monocular 4D facial avatar reconstruction[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 8645-8654.
[79]	ZHENG Z R , YU T , LIU Y B ,et al. PaMIR:parametric model-conditioned implicit representation for image-based human reconstruction[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022,44(6): 3170-3184.
[80]	ZHENG Y , SHAO R Z , ZHANG Y X ,et al. DeepMultiCap:performance capture of multiple characters using sparse multiview cameras[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 6219-6229.
[81]	YANG Z , WANG S L , MANIVASAGAM S ,et al. S3:neural shape,skeleton,and skinning fields for 3D human modeling[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 13279-13288.
[82]	PENG S D , ZHANG Y Q , XU Y H ,et al. Neural body:implicit neural representations with structured latent codes for novel view synthesis of dynamic humans[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 9050-9059.
[83]	ZHENG L , SHEN L Y , TIAN L ,et al. Scalable person re-identification:a benchmark[C]// Proceedings of 2015 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2016: 1116-1124.
[84]	IONESCU C , PAPAVA D , OLARU V ,et al. Human3.6M:large scale datasets and predictive methods for 3D human sensing in natural environments[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014,36(7): 1325-1339.
[85]	JING X Y , FENG Q , LAI Y K ,et al. STATE:learning structure and texture representations for novel view synthesis[C]// Proceedings of IEEE International Conference on Computer Vision.[S.l.:s.n.], 2022.
[86]	PATEL P , HUANG C H P , TESCH J ,et al. AGORA:avatars in geography optimized for regression analysis[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 13463-13473.
[87]	ALLDIECK T , MAGNOR M , XU W P ,et al. Video based reconstruction of 3D people models[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 8387-8397.
[88]	ZHANG R , ISOLA P , EFROS A A ,et al. The unreasonable effectiveness of deep features as a perceptual metric[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 586-595.
[89]	BROWNLEE J . How to implement the Frechet inception distance (FID) for evaluating GANs[Z]. 2019.
[90]	SETIADI D R I M . PSNR vs SSIM:imperceptibility quality assessment for image steganography[J]. Multimedia Tools and Applications, 2021,80(6): 8423-8444.
[1]	Artificial Intelligence Industry Alliance, Digital Human Work Committee of Zhongguancun Shuzhi Artificial Intelligence Industry Alliance. 2020 virtual digital human development white paper[R]. 2020. (in Chinese)
[91]	LAZOVA V , INSAFUTDINOV E , PONS-MOLL G . 360-degree textures of people in clothing from a single image[C]// Proceedings of 2019 International Conference on 3D Vision. Piscataway:IEEE Press, 2019: 643-653.
[92]	SIAROHIN A , LATHUILIÈRE S , TULYAKOV S ,et al. First order motion model for image animation[J]. Advances in Neural Information Processing Systems, 2019,32.
[2]	FU K , PENG J S , HE Q W ,et al. Single image 3D object reconstruction based on deep learning:a review[J]. Multimedia Tools and Applications, 2021,80(1): 463-498.
[3]	SHA T , ZHANG W , SHEN T ,et al. Deep person generation:a survey from the perspective of face,pose and cloth synthesis[J]. arXiv preprint, 2021,arXiv:2109.02081.
[93]	KIM H , GARRIDO P , TEWARI A ,et al. Deep video portraits[J]. ACM Transactions on Graphics, 2018,37(4): 1-14.
[94]	ZHAO F Q , YANG W , ZHANG J K ,et al. HumanNeRF:efficiently generated human radiance field from sparse inputs[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 7733-7743.
[4]	CHEN L , PENG S D , ZHOU X W . Towards efficient and photorealistic 3D human reconstruction:a brief survey[J]. Visual Informatics, 2021,5(4): 11-19.
[5]	JOEYDEVRIES. Textures[Z]. 2022.
[95]	ESSER P , SUTTER E . A variational U-net for conditional appearance and shape generation[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 8857-8866.
[96]	REN Y R , YU X M , CHEN J M ,et al. Deep image spatial transformation for person image generation[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 7687-7696.
[6]	ZENG W , OUYANG W L , LUO P ,et al. 3D human mesh regression with dense correspondence[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 7052-7061.
[7]	GATYS L A , ECKER A S , BETHGE M . Texture synthesis using convolutional neural networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. New York:ACM Press, 2015: 262-270.
[97]	SIAROHIN A , SANGINETO E , LATHUILIÈRE S ,et al. Deformable GANs for pose-based human image generation[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 3408-3416.
[98]	LIU M C , WANG K J , JI R H ,et al. Pose transfer generation with semantic parsing attention network for person re-identification[J]. Knowledge-Based Systems, 2021,223.
[8]	RISSER E , WILMOT P , BARNES C . Stable and controllable neural texture synthesis and style transfer using histogram losses[J]. arXiv preprint, 2017,arXiv:1701.08893.
[9]	OECHSLE M , MESCHEDER L , NIEMEYER M ,et al. Texture fields:learning texture representations in function space[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 4530-4539.
[99]	ZHU Z , HUANG T T , SHI B G ,et al. Progressive pose attention transfer for person image generation[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 2342-2351.
[100]	OLSZEWSKI K , TULYAKOV S , WOODFORD O ,et al. Transformable bottleneck networks[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 7647-7656.
[10]	MAJUMDER A , GOPI M . Introduction to visual computing:core concepts in computer vision,graphics,and image processing[M]. Translated by ZHAO Q J,TU H,LIANG J. Beijing: China Machine Press, 2019. (in Chinese)
[101]	YU A , YE V , TANCIK M ,et al. pixelNeRF:neural radiance fields from one or few images[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 4576-4585.
[102]	FANG Z X , CAI L B , WANG G . MetaHuman creator:the starting point of the metaverse[C]// Proceedings of 2021 International Symposium on Computer Technology and Information Science. Piscataway:IEEE Press, 2021: 154-157.
[11]	JONES A , GARDNER A , BOLAS M ,et al. Simulating spatially varying lighting on a live performance[C]// Proceedings of 3rd European Conference on Visual Media Production and the 2nd Multimedia Conference 2006.[S.l.:s.n.], 2006: 127-133.
[12]	PHONG B T . Illumination for computer generated pictures[J]. Communications of the ACM, 1975,18(6): 311-317.
[103]	PATARANUTAPORN P , DANRY V , LEONG J ,et al. AI-generated characters for supporting personalized learning and well-being[J]. Nature Machine Intelligence, 2021,3(12): 1013-1022.
[104]	PATARANUTAPORN P , DANRY V , MAES P . Machinoia,machine of multiple me:integrating with past,future and alternative selves[C]// Proceedings of Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. New York:ACM Press, 2021: 1-7.
[13]	JOEYDEVRIES. Normal mapping[Z]. 2022.
[14]	JOEYDEVRIES. PBR:theory[Z]. 2022.
[105]	KATO R , KIKUCHI Y , YEM V ,et al. Reality avatar for customer conversation in the metaverse[C]// Proceedings of International Conference on Human-Computer Interaction. Cham:Springer, 2022: 131-145.
[106]	CONTI M , GATHANI J , TRICOMI P P . Virtual influencers in online social media[J]. IEEE Communications Magazine, 2022,60(8): 86-91.
[15]	HONG F , MEI J , LI M L . Study on the techniques for 3D reconstruction of medical images[J]. Journal of Image and Graphics, 2003,8(z1): 784-791. (in Chinese)
[107]	SILVA E S , BONETTI F . Digital humans in fashion:will consumers interact?[J]. Journal of Retailing and Consumer Services, 2021,60.
[108]	KÁDEKOVÁ I Z , HOLIENČINOVÁ I M . Influencer marketing as a modern phenomenon creating a new frontier of virtual opportunities[J]. Communication Today, 2018,9(2): 90-105.
[16]	MILDENHALL B , SRINIVASAN P P , TANCIK M ,et al. NeRF:representing scenes as neural radiance fields for view synthesis[C]// Proceedings of 2020 European Conference on Computer Vision. Cham:Springer, 2020: 405-421.
[17]	XU Q G , XU Z X , PHILIP J ,et al. PointNeRF:point-based neural radiance fields[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 5428-5438.
[109]	SHEN H , LIU T L . Integration of virtuality and reality,at arm's length:an all-round attack on virtual digital people[J]. Broadcasting Realm, 2022(3): 5-10. (in Chinese)
[110]	PARK I , SAH Y J , LEE S ,et al. Avatar-mediated communication in video conferencing:effect of self-affirmation on debating participation focusing on moderation effect of avatar[J]. International Journal of Human-Computer Interaction, 2023,39(3): 464-475.
[111]	TAKANO M , YOKOTANI K . Online social support via avatar communication buffers harmful effects of offline bullying victimization[J]. Proceedings of the International AAAI Conference on Web and Social Media, 2022,16: 980-992.
[112]	CHEONG B C . Avatars in the metaverse:potential legal issues and remedies[J]. International Cybersecurity Law Review, 2022,3(2): 467-494.
[18]	ANGUELOV D , SRINIVASAN P , KOLLER D ,et al. SCAPE:shape completion and animation of people[J]. ACM Transactions on Graphics, 2005,24(3): 408-416.
[19]	KAVAN L , COLLINS S , ŽÁRA J ,et al. Geometric skinning with approximate dual quaternion blending[J]. ACM Transactions on Graphics, 2008,27(4): 1-23.
[20]	JACOBSON A , BARAN I , POPOVIĆ J ,et al. Bounded biharmonic weights for real-time deformation[J]. ACM Transactions on Graphics, 2011,30(4): 1-8.
[21]	LOPER M , MAHMOOD N , ROMERO J ,et al. SMPL:a skinned multi-person linear model[J]. ACM Transactions on Graphics, 2015,34(6): 1-16.
[22]	PAVLAKOS G , CHOUTAS V , GHORBANI N ,et al. Expressive body capture:3D hands,face,and body from a single image[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 10967-10977.
[23]	WU S Z , JIN S , LIU W T ,et al. Graph-based 3D multi-person pose estimation using multi-view images[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 11128-11137.
[24]	JIANG B Y , ZHANG Y D , WEI X K ,et al. H4D:human 4D modeling by learning neural compositional representation[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 19333-19343.
[25]	OSMAN A A A , BOLKART T , BLACK M J . STAR:sparse trained articulated human body regressor[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 598-613.
[26]	XU H Y , BAZAVAN E G , ZANFIR A ,et al. GHUM & GHUML:generative 3D human shape and articulated pose models[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 6183-6192.
[27]	KINGMA D P , WELLING M . Auto-encoding variational Bayes[J]. arXiv preprint, 2013,arXiv:1312.6114.
[28]	REZENDE D J , MOHAMED S . Variational inference with normalizing flows[J]. arXiv preprint, 2015,arXiv:1505.05770.
[29]	BHATNAGAR B , TIWARI G , THEOBALT C ,et al. Multi-garment Net:learning to dress 3D people from images[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 5419-5429.
[30]	ALLDIECK T , PONS-MOLL G , THEOBALT C ,et al. Tex2Shape:detailed full human body geometry from a single image[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 2293-2303.
[31]	WENG C Y , CURLESS B , KEMELMACHER-SHLIZERMAN I . Photo wake-up:3D character animation from a single photo[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 5901-5910.
[32]	ALLDIECK T , MAGNOR M , XU W P ,et al. Detailed human avatars from monocular video[C]// Proceedings of 2018 International Conference on 3D Vision. Piscataway:IEEE Press, 2018: 98-109.
[33]	MA Q L , YANG J L , RANJAN A ,et al. Learning to dress 3D people in generative clothing[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 6468-6477.
[34]	ALLDIECK T , MAGNOR M , BHATNAGAR B L ,et al. Learning to reconstruct people in clothing from a single RGB camera[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 1175-1186.
[35]	JIANG B Y , ZHANG J Y , HONG Y ,et al. BCNet:learning body and cloth shape from a single image[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 18-35.
[36]	WEI W L , LIN J C , LIU T L ,et al. Capturing humans in motion:temporal-attentive 3D human pose and shape estimation from monocular video[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 13201-13210.
[37]	BLANZ V , VETTER T . A morphable model for the synthesis of 3D faces[C]// Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques. New York:ACM Press, 1999: 187-194.
[38]	LATTAS A , MOSCHOGLOU S , GECER B ,et al. AvatarMe:realistically renderable 3D facial reconstruction “in-the-wild”[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 757-766.
[39]	ZHENG M W , YANG H Y , HUANG D ,et al. ImFace:a nonlinear 3D morphable face model with implicit neural representations[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 20311-20320.
[40]	ZHENG Y F , ABREVAYA V F , BÜHLER M C ,et al. I M avatar:implicit morphable head avatars from videos[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 13535-13545.
[41]	GECER B , PLOUMPIS S , KOTSIA I ,et al. GANFIT:generative adversarial network fitting for high fidelity 3D face reconstruction[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 1155-1164.
[42]	KARRAS T , LAINE S , AILA T M . A style-based generator architecture for generative adversarial networks[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 4396-4405.
[43]	TEWARI A , ELGHARIB M , BHARAJ G ,et al. StyleRig:rigging StyleGAN for 3D control over portrait images[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 6141-6150.
[113]	LI J X , FENG Z J , SHE Q ,et al. MINE:towards continuous depth MPI with NeRF for novel view synthesis[C]// Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 12558-12568.
[44]	KARRAS T , LAINE S , AITTALA M ,et al. Analyzing and improving the image quality of StyleGAN[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 8107-8116.
[45]	LUO H W , NAGANO K , KUNG H W ,et al. Normalized avatar synthesis using StyleGAN and perceptual refinement[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 11657-11667.
[46]	SHEN Y , LIANG J B , LIN M C . GAN-based garment generation using sewing pattern images[C]// Proceedings of European Conference on Computer Vision. Cham:Springer, 2020: 225-247.
[47]	RAFFIEE A H , SOLLAMI M . GarmentGAN:photo-realistic adversarial fashion transfer[C]// Proceedings of 2020 25th International Conference on Pattern Recognition. Piscataway:IEEE Press, 2021: 3923-3930.
[48]	CURLESS B , LEVOY M . A volumetric method for building complex models from range images[C]// Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques. New York:ACM Press, 1996: 303-312.
[49]	IZADI S , KIM D , HILLIGES O ,et al. KinectFusion:real-time 3D reconstruction and interaction using a moving depth camera[C]// Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology. New York:ACM Press, 2011: 559-568.
[50]	DAI A , NIEßNER M , ZOLLHÖFER M ,et al. BundleFusion:real-time globally consistent 3D reconstruction using on-the-fly surface reintegration[J]. ACM Transactions on Graphics, 2017,36(4): 76a.
[51]	SITZMANN V , THIES J , HEIDE F ,et al. DeepVoxels:learning persistent 3D feature embeddings[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 2432-2441.
[52]	MA X X , SU J J , WANG C Y ,et al. Context modeling in 3D human pose estimation:a unified perspective[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 6234-6243.
[53]	ZHENG Z R , YU T , WEI Y X ,et al. DeepHuman:3D human reconstruction from a single image[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 7738-7748.
[54]	LOMBARDI S , SIMON T , SARAGIH J ,et al. Neural volumes:learning dynamic renderable volumes from images[J]. ACM Transactions on Graphics, 2019,38(4): 1-14.
[55]	MESCHEDER L , OECHSLE M , NIEMEYER M ,et al. Occupancy networks:learning 3D reconstruction in function space[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 4455-4465.
[56]	PARK J J , FLORENCE P , STRAUB J ,et al. DeepSDF:learning continuous signed distance functions for shape representation[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 165-174.
[57]	CHEN Z Q , ZHANG H . Learning implicit fields for generative shape modeling[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 5932-5941.
[58]	SITZMANN V , ZOLLHÖFER M , WETZSTEIN G . Scene representation networks:continuous 3D-structure-aware neural scene representations[J]. arXiv preprint, 2019,arXiv:1906.01618.
[59]	YANG G S , VO M , NEVEROVA N ,et al. BANMo:building animatable 3D neural models from many casual videos[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 2853-2863.
[60]	NEVEROVA N , NOVOTNY D , KHALIDOV V ,et al. Continuous surface embeddings[J]. arXiv preprint, 2020,arXiv:2011.12438.
[61]	PARK J J , FLORENCE P , STRAUB J ,et al. DeepSDF:learning continuous signed distance functions for shape representation[C]// Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 165-174.
[62]	BOŽIČ A , PALAFOX P , ZOLLHÖFER M ,et al. Neural deformation graphs for globally-consistent non-rigid reconstruction[C]// Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2021: 1450-1459.
[63]	MITTAL P , CHENG Y C , SINGH M ,et al. AutoSDF:shape priors for 3D completion,reconstruction and generation[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 306-315.
[64]	SAITO S , HUANG Z , NATSUME R ,et al. PIFu:pixel-aligned implicit function for high-resolution clothed human digitization[C]// Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 2304-2314.
[65]	SAITO S , SIMON T , SARAGIH J ,et al. PIFuHD:multi-level pixel-aligned implicit function for high-resolution 3D human digitization[C]// Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 81-90.
[66]	JADHAV O , PATIL A , SAM J ,et al. Virtual dressing using augmented reality[J]. ITM Web of Conferences, 2021,40.
[67]	ZHU X , LIAO T , LYU J ,et al. MVP-Human dataset for 3D human avatar reconstruction from unconstrained frames[J]. arXiv preprint, 2022,arXiv:2204.11184.

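Dataset	Year	Data scale	Type	Dimensionality
People Snapshot^[87]	2018	24 video sequences, 11 subjects	Video	2D
Short multi-view RGB video sequences^[78]	2021	2 minutes, 6,000 frames, 512×512 resolution	Video	2D
ZJU-MoCap^[82]	2021	9 dynamic human videos	Video	3D
Market-1501^[83]	2015	32,668 images, 1,501 identities	Image	2D
Human^[85]	2021	3D models	Model	3D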

Method	Category	IS	FID	LPIPS	SSIM	PSNR
			People Snapshot (one shot)^[87]
StylePeople^[76]	Mesh-image	1.7469^[76]	272.1^[76]	0.0836^[76]	0.9012^[76]	-
LWGAN^[75]	Image	1.7159^[76]	1771.9^[76]	0.2727^[76]	0.2876^[76]	-
360Degree^[91]	Mesh	1.8643^[76]	1383.1^[76]	0.2123^[76]	0.8079^[76]	-
			Short multi-view RGB video sequences^[78]
DA^[78]	Voxel-implicit	-	-	0.06^[78]	0.95^[78]	26.85^[78]
FOMM^[92]	Image	-	-	0.16^[78]	0.91^[78]	23.77^[78]
DVP^[93]	Image	-	-	0.10^[78]	0.93^[78]	25.67^[78]
			ZJU-MoCap^[82]
SANeRF^[74]	Mesh-implicit	-	-	-	0.902^[74]	24.42^[74]
NB^[82]	Mesh-implicit	-	-	0.0762^[94]	0.885^[74]	23.49^[74]
NV^[54]	Voxel	-	-	0.0999^[94]	0.821^[74]	21.39^[74]
NeRF^[16]	Implicit	-	-	-	0.885^[74]	23.41^[74]
			Market-1501^[83]
VU-Net^[95]	Image	3.214^[3]	20.144^[96]	-	0.353^[3]	-
DGANs^[97]	Image	3.185^[3]	25.364^[3]	-	0.290^[3]	-
PSG^[98]	Image	3.750^[3]	16.742^[3]	-	0.732^[3]	-
PPAT^[99]	Image	3.323^[3]	22.657^[96]	-	0.311^[3]	-
			Human^[85]
TBN^[100]	Image	-	52.262^[85]	0.080^[85]	-	-
pixelNeRF^[101]	Image-implicit	-	61.453^[85]	0.068^[85]	-	-
STATE^[85]	Image	-	57.055^[85]	0.068^[85]	-	-
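
The PSNR column in the table above is reported in decibels. As a rough illustration only, and not the evaluation code used by any of the cited works, PSNR between a reference image and a synthesized image can be sketched as below. The function name `psnr`, the flattened list-of-pixels input, and the default peak value of 1.0 are all assumptions made for this example.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two flattened images.

    Hypothetical helper for illustration: both inputs are flat sequences
    of pixel intensities in the range [0, max_value].
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: a black 4x4 image versus one that is uniformly off by 0.1.
reference = [0.0] * 16
rendered = [0.1] * 16
print(round(psnr(reference, rendered), 2))  # prints 20.0
```

Higher PSNR means a smaller pixel-wise error, which matches the direction of the PSNR values compared in the table. SSIM and LPIPS measure structural and perceptual similarity instead and need more than a per-pixel difference.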

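Method	Category	Target	Uses temporal information	Fidelity	Generalization
SMPL^[21]	Mesh	Body	No	Low	Low
H4D^[24]	Mesh	Body	Yes	Medium	Medium
GANFIT^[41]	Image	Body	No	Medium	Low
NA^[45]	Image	Head	No	Medium	Low
DeepHuman^[53]	Voxel	Body	No	Low	Medium
NV^[54]	Voxel	Head	Yes	Medium	Medium
PIFuHD^[65]	Implicit function	Body	No	High	Low
BANMo^[59]	Implicit function	Body	Yes	High	High
SCANimate^[71]	Mesh-implicit function	Body	No	Medium	Medium
ICON^[72]	Mesh-implicit function	Body	No	Medium	Medium
SANeRF^[74]	Mesh-implicit function	Body	No	Low	High
StylePeople^[76]	Mesh-image	Body	Yes	High	Medium
LWGAN^[75]	Mesh-image	Body	Yes	Medium	Medium
DA^[78]	Voxel-implicit function	Head	Yes	High	Medium
PVA^[77]	Voxel-implicit function	Head	No	High	Low
S3^[81]	Mesh-voxel-implicit function	Body	Yes	High	High
PaMIR^[79]	Mesh-voxel-implicit function	Body	No	High	Low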
