Thesis 107522137: Detailed Record




Name: Chang Yu Hung (張予鴻)    Department: Computer Science and Information Engineering
Thesis Title: Person Re-Identification Using OSNet Combined with Human Body Orientation Estimation
Related Theses
★ An Intelligent Controller Development Platform Integrating the GRAFCET Virtual Machine
★ Design and Implementation of a Distributed Industrial Electronic Kanban Network System
★ Design and Implementation of a Two-Point Touch Screen Based on a Dual-Camera Vision System
★ An Embedded Computing Platform for Intelligent Robots
★ An Embedded System for Real-Time Moving Object Detection and Tracking
★ A Multiprocessor Architecture and Distributed Control Algorithms for Solid-State Drives
★ A Human-Machine Interaction System Based on Stereo-Vision Gesture Recognition
★ Robot System-on-Chip Design Integrating Biomimetic Intelligent Behavior Control
★ Design and Implementation of an Embedded Wireless Image Sensor Network
★ A License Plate Recognition System Based on a Dual-Core Processor
★ Continuous 3D Gesture Recognition Based on Stereo Vision
★ Design and Hardware Implementation of a Miniature, Ultra-Low-Power Wireless Sensor Network Controller
★ Real-Time Face Detection, Tracking, and Recognition in Streaming Video: An Embedded System Design
★ Embedded Hardware Design of a Fast Stereo Vision System
★ Design and Implementation of a Real-Time Image Stitching System
★ An Embedded Gait Recognition System Based on a Dual-Core Platform
Files: Full text available in the thesis system after 2026-08-03.
Abstract: In this study, an innovative person re-identification method is proposed. To address a problem that arises in pedestrian detection, the OpenPose keypoint-extraction method is applied to the cropped pedestrian images so that only complete pedestrian images are retained; the distance between the coordinates of the left and right ankle keypoints is computed for each image, and the image with the maximum distance is taken as the representative image, which reduces the occlusion caused by the two legs overlapping. To counter the influence of viewpoint variation, the study combines person re-identification with human body orientation estimation, using a ResNet18 model to pre-classify pedestrian images by body orientation and thereby improve the accuracy of the subsequent matching. OSNet is used to extract pedestrian features; this model learns omni-scale pedestrian features and has performed well in recent person re-identification work. Finally, experiments on the self-built MIAT multi-view pedestrian dataset show that the method with viewpoint-estimation classification achieves a Rank-1 accuracy of 81% and an mAP of 85%, which are 22% and 17% higher, respectively, than the same method without viewpoint-estimation classification.
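The representative-image selection described in the abstract reduces leg self-occlusion by keeping, for each tracked pedestrian, the frame in which the two ankle keypoints are farthest apart. Below is a minimal Python sketch of that step; it is not the thesis code, and it assumes OpenPose BODY_25 output, in which index 11 is the right ankle, index 14 is the left ankle, and each keypoint is an (x, y, confidence) triple with zeros for missed joints.

    import math

    R_ANKLE, L_ANKLE = 11, 14  # BODY_25 keypoint indices (assumption)

    def ankle_distance(keypoints):
        # Euclidean distance between the two ankle keypoints of one person.
        # `keypoints` is a sequence of (x, y, confidence) triples; returns
        # 0.0 when either ankle was not detected (OpenPose reports zeros).
        rx, ry, rc = keypoints[R_ANKLE]
        lx, ly, lc = keypoints[L_ANKLE]
        if rc == 0.0 or lc == 0.0:
            return 0.0
        return math.hypot(rx - lx, ry - ly)

    def representative_frame(frames_keypoints):
        # Index of the frame with the maximum ankle separation;
        # frames_keypoints[i] holds the tracked person's keypoints in frame i.
        return max(range(len(frames_keypoints)),
                   key=lambda i: ankle_distance(frames_keypoints[i]))

Picking the widest-stance frame keeps a single, minimally self-occluded crop per tracked identity before the orientation-classification stage.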
Keywords ★ Person Re-Identification
★ OSNet
★ Human Body Orientation Estimation
★ Human Pose Estimation
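The abstract above also describes an orientation-gated matching stage: a ResNet18 classifier first assigns each pedestrian crop a body-orientation class, and OSNet features are then compared only within the same orientation class. The following sketch illustrates that idea under stated assumptions; the number of orientation classes, the cosine-similarity metric, and the assumption that OSNet features are supplied precomputed are all placeholders, not details confirmed by this record.

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    # Number of discrete body-orientation classes; a placeholder value,
    # since the record does not state how many views the thesis uses.
    NUM_ORIENTATIONS = 4

    # ResNet18 with a small classification head, standing in for the
    # orientation-estimation module (weights would come from training).
    orientation_net = resnet18(num_classes=NUM_ORIENTATIONS).eval()

    def orientation_class(crops):
        # Predict an orientation class id for a batch of pedestrian
        # crops shaped (B, 3, H, W).
        with torch.no_grad():
            return orientation_net(crops).argmax(dim=1)

    def gated_rank1(query_feat, query_orient, gallery_feats, gallery_orients):
        # Best gallery match for one query, restricted to gallery entries
        # whose predicted orientation matches the query's. `query_feat` is
        # a (D,) OSNet feature vector and `gallery_feats` is (N, D);
        # cosine similarity is an assumed metric, not one confirmed here.
        candidates = (gallery_orients == query_orient).nonzero(as_tuple=True)[0]
        sims = F.cosine_similarity(query_feat.unsqueeze(0),
                                   gallery_feats[candidates], dim=1)
        return candidates[sims.argmax()].item()

Restricting candidates to one orientation class shrinks the search space and removes cross-view mismatches, which is consistent with the Rank-1 and mAP gains reported in the abstract.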
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Research Background
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2: Literature Review
2.1 Object Detection and Object Tracking
2.1.1 YOLOv4 Object Detection
2.1.2 DeepSORT Object Tracking
2.2 Human Pose Estimation
2.2.1 AlphaPose
2.2.2 OpenPose
2.3 Person Re-Identification Models
2.3.1 OSNet
Chapter 3: Person Re-Identification System Design
3.1 System Architecture Design
3.2 Hierarchical Modular Design
3.2.1 YOLOv4 Pedestrian Detection Module
3.2.2 OpenPose Human Keypoint Extraction Module
3.2.3 ResNet18 Body Orientation Estimation Module
3.2.4 OSNet Person Re-Identification Module
3.3 Discrete-Event Modeling of the System
3.3.1 Discrete-Event Modeling of the YOLOv4 Pedestrian Detection Module
3.3.2 Discrete-Event Modeling of the OpenPose Human Keypoint Extraction Module
3.3.3 Discrete-Event Modeling of the ResNet18 Body Orientation Estimation Module
3.3.4 Discrete-Event Modeling of the OSNet Person Re-Identification Module
Chapter 4: Person Re-Identification System Experiments
4.1 Experimental Environment
4.2 Person Re-Identification Datasets
4.2.1 SQ11 Mini DV Camera
4.2.2 MIAT Multi-View Pedestrian Dataset
4.2.3 Large-Scale Person Re-Identification Datasets
4.3 Validation of the ResNet18 Body Orientation Estimation Module
4.4 Recognition Accuracy Evaluation on Per-Viewpoint Query Sets
4.5 Comparison Experiments on Training Sample Size
Chapter 5: Conclusion and Future Work
References
Advisor: Ching-Han Chen (陳慶瀚)    Approval Date: 2021-09-23