Adversarial Graph Neural Network for Deep Learning Object Detection


Name	Rong Zhang (張融)	Department	Department of Communication Engineering
Thesis Title	Adversarial Graph Neural Network for Deep Learning Object Detection (使用對抗式圖形神經網路之物件偵測)
File	Electronic full text: never open to the public
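Abstract (Chinese)	Object detection using deep learning has achieved considerable success so far. Deep learning models such as Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs) have been used in this field. However, these traditional techniques do not take the relationships between objects into account. In contrast, Graph Neural Networks (GNNs) are better suited to computing relationships, that is, the relationships between pixels in an image. This thesis proposes a GNN-based deep learning model for object detection. The advantage of GNNs is that, in addition to the hidden features, they also compute the adjacency matrix, that is, the relationships among the hidden features. GNNs can also effectively preserve and learn features at different scales, which is a major advantage when detecting objects of different sizes, as with the Feature Pyramid Network (FPN). Based on the above, we propose a deep learning model built on a GNN and an FPN. The model uses the GNN as the generator and the FPN as the discriminator to obtain a better object detection Average Precision (AP). The proposed model reaches an AP of 50.1 and outperforms other models on the Microsoft Common Objects in Context (MS COCO) dataset.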
Abstract (English)	Object detection using deep learning has achieved great success in recent years. Several deep learning architectures are used in this field, such as Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs). However, these conventional networks do not model relationships between objects. In contrast, Graph Neural Networks (GNNs) are well suited to computing adjacencies, which typically represent relationships between objects or between pixels in an image. In this thesis, we propose a GNN-based object detection model. GNNs are regarded as an efficient framework because they learn over both nodes and edges, and they can effectively preserve and learn the multi-scale features generated by the backbone, which in our case is ResNet101. Furthermore, we propose an adversarial model based on GNNs and the Feature Pyramid Network (FPN). Using a graph autoencoder as the generator and the FPN as the discriminator, we improve overall performance to 50.1 AP and outperform other GNN-based models on a non-trivial dataset, MS COCO 2017.
Keywords	★ Graph neural networks ★ Object detection ★ Adversarial neural networks
Table of Contents	Abstract (Chinese)
Adversarial Graph Neural Networks for Deep Learning Object Detection
Abstract
Acknowledgement
Outline
Outline of Figures
Outline of Tables
Chapter 1 Introduction
  1-1 Background
  1-2 Motivation
  1-3 Thesis Framework
Chapter 2 Traditional Techniques on Object Detection
  2-1 Introduction on Object Detection
  2-2 Introduction on Traditional Methods of Object Detection
  2-3 Conclusion on Traditional Techniques on Object Detection
Chapter 3 Deep Learning Techniques
  3-1 Graph Neural Network
    3-1-1 Nodes and Edges Feature Representation
    3-1-2 Graph Convolution Method
    3-1-3 Graph Attention Method
    3-1-4 Graph-pooling and Graph-unpooling
  3-2 Adversarial Neural Network
    3-2-1 Generator and Discriminator
    3-2-2 Generative Adversarial Network
  3-3 Deep Learning Models on Object Detection
    3-3-1 One Stage Models
    3-3-2 Two Stages Models
Chapter 4 Proposed Method
  4-1 Framework
    4-1-1 Graph Construction
    4-1-2 Backbone
    4-1-3 Adversarial Graph Neural Network
    4-1-4 Prediction Network
    4-1-5 Optimization
    4-2-1 Dataset
    4-2-2 Implementation
    4-2-3 Loss Function
  4-3 Testing Stage
Chapter 5 Experimental Results
  5-1 Experimental Environment
  5-2 Comparison
Chapter 6 Conclusion
Reference
Advisors	Yung-Fang Chen (陳永芳), Pao-Chi Chang (張寶基)	Approval Date	2022-8-4

Master's and doctoral thesis record 109523012: detailed information