A Hand Pairing and Tracking System Based on Associative Embedding

Name

Huan-Ching Shen (沈桓慶)

Department

Department of Computer Science and Information Engineering

Thesis Title

A Hand Pairing and Tracking System Based on Associative Embedding
(基於聯合嵌入之雙手配對與追蹤系統)

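This electronic thesis is authorized for immediate open access.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid breaking the law.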

Abstract

Hand tracking aims to predict the trajectories of multiple hands across an image sequence. It matters for applications such as in-air handwriting, sign language recognition, and gesture recognition, and grouping each person's two hands into a pair lets these applications support more complex functions.
This thesis proposes a method based on YOLOv3 and associative embedding: a single-stage neural network model and accompanying algorithm that integrate multi-object tracking and keypoint detection to achieve real-time tracking of the paired hands of multiple people.
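
The pairing idea can be illustrated with a minimal sketch. It is not the thesis's implementation, which this record does not describe in detail. It assumes each detected hand carries a scalar embedding tag, and that hands belonging to the same person receive similar tags, as in associative embedding. The names `HandDetection`, `pair_hands`, and `max_tag_distance` are hypothetical, and the greedy matching rule is an assumption made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class HandDetection:
    """One detected hand (hypothetical structure, for illustration only)."""
    box: tuple      # bounding box as (x1, y1, x2, y2)
    score: float    # detector confidence
    tag: float      # scalar embedding tag predicted for this hand


def pair_hands(detections, max_tag_distance=0.5):
    """Greedily pair detections whose embedding tags are closest.

    Returns (pairs, singles): a list of index pairs and a list of
    indices of hands left unpaired.
    """
    # Every candidate pair, ordered by tag distance (closest first).
    candidates = sorted(
        (abs(detections[i].tag - detections[j].tag), i, j)
        for i, j in combinations(range(len(detections)), 2)
    )

    used = set()
    pairs = []
    for distance, i, j in candidates:
        if distance > max_tag_distance:
            break  # all remaining candidates are even farther apart
        if i in used or j in used:
            continue  # each hand may belong to at most one pair
        pairs.append((i, j))
        used.update((i, j))

    singles = [i for i in range(len(detections)) if i not in used]
    return pairs, singles


if __name__ == "__main__":
    hands = [
        HandDetection((10, 10, 50, 50), 0.95, 0.10),
        HandDetection((80, 12, 120, 52), 0.91, 0.15),
        HandDetection((200, 30, 240, 70), 0.88, 2.00),
        HandDetection((270, 28, 310, 68), 0.90, 2.10),
        HandDetection((400, 40, 440, 80), 0.70, 5.00),
    ]
    print(pair_hands(hands))
```

A tracker could then carry each pair from frame to frame as a single unit, and the threshold decides when a hand is treated as unpaired, for example when a person's other hand is out of view.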

Keywords

★ deep learning
★ detection system
★ gesture tracking
★ object detection
★ neural network
★ associative embedding

Table of Contents

Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Research Motivation
1.2 Research Objectives and Methods
1.3 Thesis Organization
2. Related Work and Background
2.1 Object Detection
2.2 Non-Maximum Suppression
2.3 Associative Embedding
3. Research Framework
3.1 Research Workflow
3.2 Model Architecture
3.3 Hand Detection
3.4 Hand Pairing and Tracking
4. Experimental Results
4.1 Experimental Environment
4.2 Test Results
4.3 Experimental Data
5. Conclusion
References
Appendix 1

Advisor

Kuo-Chin Fan (范國清)

Approval Date

2021-5-21
