Thesis 109522065: Detailed Record




Author: Wei-Shan Chang (張維珊)    Department: Computer Science and Information Engineering
Title: A Novel Matching Strategy of Detection-Based Multi-Object Tracking
(基於物件偵測之多物件追蹤關聯策略)
Related theses
★ LSTM-Based Chinese In-Air Handwriting Recognition
★ Chinese In-Air Handwriting Recognition from RGB Images without Depth
★ A Lightweight Classification Network Combining Cross-Scale Self-Attention and Partition-Mixing Layers
Full text: available in the system after 2026-08-01
Abstract (Chinese, translated): Multi-object tracking is widely applied: traffic-flow monitoring from surveillance footage, crowd monitoring, pedestrian tracking, and tactical analysis of player movement on the court all rely on it. Its main task is to correctly associate the detection boxes spread across the frames of a video. The difficulty is that tracking errors easily occur when targets are occluded for long periods, disappear, or appear in complex scenes; although many studies have proposed different tracking strategies for this problem, the tracking results still leave room for improvement.
To improve multi-object tracking accuracy, this thesis proposes MEDIATrack, a two-stage online multi-object tracking method built on the ByteTrack framework. We replace the Kalman filter with the NSA Kalman filter, introduce appearance features as reference information for data association, and design a punishment mechanism to mitigate the errors that arise in complex scenes. We also remove the non-activated mechanism for historical tracklets and instead add unmatched high-confidence detection boxes directly to the tracklets. With these changes, our method reaches 79.3% MOTA on the MOT17 dataset, a state-of-the-art level.
Abstract (English): Multi-object tracking (MOT) is widely applied to traffic-flow monitoring, crowd monitoring, pedestrian tracking, and tactical analysis of players on the court. It associates the detection boxes with tracklets for each frame of a video. The challenges of MOT include long-term occlusions, missed detections, and complex scenes. Although many trackers have been proposed to solve these problems, the tracking results still have room for improvement. In this thesis, we propose MEDIATrack, a two-stage online multi-object tracking method based on ByteTrack. We replace the Kalman filter with the NSA Kalman filter, introduce appearance features for track association, and design a punishment mechanism to alleviate errors in complex scenes. In addition, we remove the non-activated strategy and add high-score unmatched detection boxes directly to the tracklets. On MOT17, we achieve 79.3% MOTA, a state-of-the-art result.
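The abstract's two main ingredients can be sketched briefly. The sketch below is an illustrative reconstruction, not the thesis code: the function names, IoU cost, and thresholds are assumptions. The first stage matches high-score detections to tracklets and the second stage matches the leftover tracklets to low-score detections (the BYTE idea that ByteTrack introduced); the NSA Kalman filter's noise scaling follows the formula published in GIAOTracker, where the measurement noise R is scaled by (1 − c) for detection confidence c.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def nsa_noise(R, confidence):
    """NSA Kalman idea (GIAOTracker): scale measurement noise by detection
    confidence so low-confidence boxes perturb the filter state less."""
    return (1.0 - confidence) * R


def iou(box_a, box_b):
    """IoU between two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(tracks, dets, iou_thresh=0.3):
    """Hungarian matching on a (1 - IoU) cost matrix.

    Returns matched (track, det) index pairs plus unmatched indices.
    """
    if not tracks or not dets:
        return [], list(range(len(tracks))), list(range(len(dets)))
    cost = np.array([[1.0 - iou(t, d) for d in dets] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    matches, un_t, un_d = [], set(range(len(tracks))), set(range(len(dets)))
    for r, c in zip(rows, cols):
        if 1.0 - cost[r, c] >= iou_thresh:  # reject weak assignments
            matches.append((r, c))
            un_t.discard(r)
            un_d.discard(c)
    return matches, sorted(un_t), sorted(un_d)


def two_stage_associate(tracks, detections, scores, high_thresh=0.6):
    """Stage 1: tracklets vs. high-score boxes; stage 2: leftover tracklets
    vs. low-score boxes. Indices in the result refer to `detections`."""
    high = [i for i, s in enumerate(scores) if s >= high_thresh]
    low = [i for i, s in enumerate(scores) if s < high_thresh]
    m1, un_t, _ = associate(tracks, [detections[i] for i in high])
    m2, _, _ = associate([tracks[i] for i in un_t],
                         [detections[i] for i in low])
    matches = [(t, high[d]) for t, d in m1]
    matches += [(un_t[t], low[d]) for t, d in m2]
    return matches
```

SciPy's `linear_sum_assignment` is an implementation of the assignment-problem solver that the thesis TOC covers under the Hungarian / Kuhn-Munkres algorithm; the thesis additionally mixes appearance-feature distance into the cost matrix, which this sketch omits.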
Keywords (Chinese, translated) ★ Multi-object tracking
★ Pedestrian tracking
★ Appearance features
★ Data association
★ Kalman filter
Keywords (English) ★ Multiple-Object Tracking
★ Appearance Similarity
★ Data Association
Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Background and Motivation
1.2 Objectives
1.3 Thesis Organization
Chapter 2: Literature Review
2.1 Object Detection: YOLOX
2.2 Data Association
2.2.1 Kalman Filter
2.2.2 NSA Kalman Filter
2.2.3 Hungarian Algorithm
2.2.4 Kuhn-Munkres Algorithm
2.3 Two-Stage Methods
2.4 One-Stage Methods
Chapter 3: Proposed Method
3.1 System Architecture
3.2 MEDIA Block
3.2.1 Feature Extraction
3.2.2 MEDIA Strategy
Chapter 4: Experiments
4.1 Datasets and Evaluation Metrics
4.1.1 Datasets
4.1.2 Evaluation Metrics
4.2 Development Environment and Tools
4.3 Experimental Results
4.3.1 Ablation Study
4.3.2 Comparison with Other Methods
4.3.3 Visualization of Tracking Results
Chapter 5: Conclusion
5.1 Contributions
5.2 Future Work
References
Advisors: Kuo-Chin Fan, Jun-Wei Hsieh (范國清、謝君偉)    Date of approval: 2022-07-29