References
[1] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J. Black, “SMPL: A Skinned Multi-Person Linear Model,” Seminal Graphics Papers: Pushing the Boundaries, Volume 2 (1st ed.). Association for Computing Machinery, New York, NY, USA, Article 88, pp. 851–866, Aug. 2023.
[2] Ben Mildenhall et al., “NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis,” arXiv:2003.08934, Aug. 2020.
[3] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkuehler, and George Drettakis, “3D Gaussian Splatting for Real-Time Radiance Field Rendering,” ACM Transactions on Graphics, volume 42(4), Jul. 2023.
[4] G. Pavlakos, X. Zhou and K. Daniilidis, “Ordinal Depth Supervision for 3D Human Pose Estimation,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, pp. 7307-7316, Jun. 2018.
[5] H. Nam, D. S. Jung, Y. Oh and K. M. Lee, “Cyclic Test-Time Adaptation on Monocular Video for 3D Human Mesh Reconstruction,” 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, pp. 14783-14793, Oct. 2023.
[6] Diogo Luvizon et al., “Scene-Aware 3D Multi-Human Motion Capture from a Single Camera,” arXiv:2301.05175, Mar. 2023.
[7] Tianye Li, Timo Bolkart, Michael J. Black, Hao Li, and Javier Romero, “Learning a model of facial shape and expression from 4D scans,” ACM Trans. Graph. 36, 6, Article 194, 17 pages, Dec. 2017.
[8] Yao Feng, Haiwen Feng, Michael J. Black, and Timo Bolkart, “Learning an animatable detailed 3D face model from in-the-wild images,” ACM Trans. Graph. 40, 4, Article 88, 13 pages, Aug. 2021.
[9] R. Daněček, M. J. Black and T. Bolkart, “EMOCA: Emotion Driven Monocular Face Capture and Animation,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 20279-20290, Jun. 2022.
[10] H. Xu, E. G. Bazavan, A. Zanfir, W. T. Freeman, R. Sukthankar and C. Sminchisescu, “GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models,” 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, pp. 6183-6192, Jun. 2020.
[11] Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black, “Expressive Body Capture: 3D Hands, Face, and Body From a Single Image,” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 10967-10977, Jun. 2019.
[12] Javier Romero, Dimitrios Tzionas, and Michael J. Black, “Embodied hands: modeling and capturing hands and bodies together,” ACM Trans. Graph. 36, 6, Article 245, 17 pages, Dec. 2017.
[13] A. Boukhayma, R. de Bem and P. H. S. Torr, “3D Hand Shape and Pose From Images in the Wild,” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 10835-10844, Jun. 2019.
[14] Fangzhou Hong et al., “EVA3D: Compositional 3D Human Generation from 2D Image Collections,” arXiv:2210.04888, Oct. 2022.
[15] C. Patel, Z. Liao and G. Pons-Moll, “TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, pp. 7363-7373, Jun. 2020.
[16] Sahib Majithia, Sandeep N. Parameswaran, Sadbhavana Babar, Vikram Garg, Astitva Srivastava, and Avinash Sharma, “Robust 3D Garment Digitization from Monocular 2D Images for 3D Virtual Try-On Systems,” in 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, pp. 1411-1421, Jan. 2022.
[17] C. Guo, T. Jiang, X. Chen, J. Song and O. Hilliges, “Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, pp. 12858-12868, Jun. 2023.
[18] A. Vaswani et al., “Attention Is All You Need,” arXiv:1706.03762, Dec. 2017.
[19] Yuxuan Wang et al., “Tacotron: Towards End-to-End Speech Synthesis,” arXiv:1703.10135, Apr. 2017.
[20] Karen Simonyan and Andrew Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556, Apr. 2015.
[21] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich, “Going deeper with convolutions,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, pp. 1-9, Jun. 2015.
[22] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, “Gradient-based learning applied to document recognition,” in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[23] K. He, X. Zhang, S. Ren and J. Sun, “Deep Residual Learning for Image Recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 770-778, Jun. 2016.
[24] J. Park, P. Florence, J. Straub, R. Newcombe and S. Lovegrove, “DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 165-174, Jun. 2019.
[25] M. Li, Y. Duan, J. Zhou and J. Lu, “Diffusion-SDF: Text-to-Shape via Voxelized Diffusion,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, pp. 12642-12651, Jun. 2023.
[26] Ben Mildenhall et al., “NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis,” arXiv:2003.08934, Aug. 2020.
[27] Shengjie Ma et al., “3D Gaussian Blendshapes for Head Avatar Animation,” arXiv:2404.19398, May 2024.
[28] Jiefeng Li et al., “HybrIK: Hybrid Analytical-Neural Inverse Kinematics for Body Mesh Recovery,” arXiv:2304.05690, Apr. 2023.
[29] Mingyi Shi et al., “MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency,” arXiv:2006.12075, Jun. 2020.
[30] Z. Tang, Z. Qiu, Y. Hao, R. Hong and T. Yao, “3D Human Pose Estimation with Spatio-Temporal Criss-Cross Attention,” 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, pp. 4790-4799, Jun. 2023.
[31] X. Zhou, Q. Huang, X. Sun, X. Xue and Y. Wei, “Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach,” 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp. 398-407, Oct. 2017.
[32] Dushyant Mehta et al., “VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera,” arXiv:1705.01583, May 2017.
[33] D. Azinović, R. Martin-Brualla, D. Goldman, M. Nießner and J. Thies, “Neural RGB-D Surface Reconstruction,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, pp. 6280-6291, Jun. 2022.
[34] Soyong Shin et al., “WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion,” arXiv:2312.07531, Apr. 2024.
[35] Y. Sun, Q. Bao, W. Liu, T. Mei and M. J. Black, “TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments,” 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, pp. 8856-8866, Jun. 2023.
[36] E. Gartner, M. Andriluka, H. Xu and C. Sminchisescu, “Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, pp. 13096-13105, Jun. 2022.
[37] Nikhila Ravi et al., “Accelerating 3d deep learning with pytorch3d,” arXiv:2007.08501, Jul. 2022.
[38] H.-S. Fang et al., “AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 6, pp. 7157-7173, Jun. 2023.
[39] R. Ranftl, A. Bochkovskiy and V. Koltun, “Vision Transformers for Dense Prediction,” 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 12159-12168, Oct. 2021.
[40] B. Cheng, I. Misra, A. G. Schwing, A. Kirillov and R. Girdhar, “Masked-attention Mask Transformer for Universal Image Segmentation,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, pp. 1280-1289, Jun. 2022.
[41] Y. Sun, Q. Bao, W. Liu, Y. Fu, M. J. Black and T. Mei, “Monocular, One-stage, Regression of Multiple 3D People,” 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 11159-11168, Oct. 2021.
[42] Géry Casiez, Nicolas Roussel, and Daniel Vogel, “1 € filter: a simple speed-based low-pass filter for noisy input in interactive systems,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’12). Association for Computing Machinery, New York, NY, USA, pp. 2527–2530, May 2012.
[43] D. Mehta et al., “Single-Shot Multi-person 3D Pose Estimation from Monocular RGB,” 2018 International Conference on 3D Vision (3DV), Verona, Italy, pp. 120-130, Sep. 2018.