Thesis 104522046: Detailed Record




Name: Kuan-Chung Wang (王冠中)    Department: Computer Science and Information Engineering
Thesis title: 基於物件遮罩與邊界引導多尺度遞迴卷積神經網路之語意分割
(Semantic Segmentation with a Multi-Scale Recurrent Convolutional Neural Network Based on Object Mask and Boundary Guidance)
Related theses
★ Single and Multi-Label Environmental Sound Recognition with Gaussian Process
★ Embedded System Implementation of Beamforming and Audio Preprocessing
★ Applications and Design of Speech Synthesis and Voice Conversion
★ A Semantics-Based Public Opinion Analysis System
★ Design and Application of a High-Quality Dictation System
★ Recognition and Detection of Calcaneal Fractures in CT Images Using Deep Learning and Accelerated Robust Features
★ A Personalized Collaborative-Filtering Clothing Recommendation System Based on a Style Vector Space
★ RetinaNet Applied to Face Detection
★ Trend Prediction for Financial Products
★ A Study on Integrating Deep Learning Methods to Predict Age and Aging Genes
★ End-to-End Speech Synthesis for Mandarin
★ Application and Improvement of ORB-SLAM2 on the ARM Architecture
★ ETF Trend Prediction Based on Deep Learning
★ Exploring the Correlation Between Financial News and Market Trends
★ Emotional Speech Analysis Based on Convolutional Neural Networks
★ Using Deep Learning to Predict Alzheimer's Disease Progression and Post-Stroke Surgical Survival
Files: [Endnote RIS format] [BibTeX format]    Full text in the system: never open to the public
Abstract (Chinese) In recent years, deep learning, a branch of machine learning, has played an important role in artificial intelligence. Among its models, the Convolutional Neural Network (CNN) has achieved breakthrough performance in image recognition compared with traditional classification methods. The emergence of the Fully Convolutional Network (FCN) [10] has also made research on semantic image segmentation flourish: instead of clustering by image content such as texture and color as in earlier approaches, semantic information is brought into training to improve segmentation accuracy. This thesis combines the strengths of two networks, one guided by object boundaries to strengthen the edges and the integrity of the objects themselves, and one responsible for predicting the semantic segmentation, and proposes an end-to-end trainable network architecture.
The proposed architecture improves on DT-EdgeNet (Domain Transform with EdgeNet) [11]. Here, we incorporate the OBG-FCN [12] mask network in place of the edge network of [11]; the mask network predicts reference maps for the background, the objects, and the object boundaries. The proposed architecture also uses multi-scale ResNet-101 as the base network and brings in multi-scale atrous convolutions combined in parallel during training, preserving the resolution of the feature maps; this enlarges the receptive field and further improves segmentation accuracy.
In the experiments, we obtain high recognition accuracy on the PASCAL VOC 2012 test set. As an extended application, we combine object bounding boxes extracted by Faster R-CNN with the segmentation results of the proposed architecture to perform instance-level segmentation.
Abstract (English) In recent years, deep learning, a branch of machine learning, has played an important role in artificial intelligence; in particular, the Convolutional Neural Network (CNN) has achieved breakthrough performance in image classification compared with traditional classification methods. The emergence of the Fully Convolutional Network (FCN) [10] has also made the study of semantic image segmentation flourish. In contrast to past work that clusters pixels according to image texture and color, FCN brings semantic information into training to improve the accuracy of semantic segmentation. This thesis combines the advantages of two networks, one guided by object boundaries to strengthen the integrity of edges and of the objects themselves, and one responsible for predicting the semantic segmentation, and proposes an end-to-end trainable network architecture.
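The boundary-guided filtering that DT-EdgeNet builds on ([11], following the domain transform of Gastal and Oliveira [39]) is, at its core, a 1-D recursive filter whose feedback weight collapses wherever the guiding edge signal is strong. A rough NumPy sketch of the idea (an illustration only, not the thesis implementation; the signal, edge map, and parameter values here are made up):

```python
import numpy as np

def domain_transform_filter_1d(x, edge, sigma_s=8.0, sigma_r=0.3):
    """Edge-aware 1-D recursive filtering in the spirit of the domain
    transform: the feedback weight between neighboring samples shrinks
    wherever the guiding edge signal is strong, so smoothing does not
    leak across object boundaries."""
    a = np.exp(-np.sqrt(2.0) / sigma_s)       # base feedback coefficient
    d = 1.0 + (sigma_s / sigma_r) * edge      # transformed distance per sample
    w = a ** d                                # w[i] links samples i-1 and i
    y = x.astype(float).copy()
    for i in range(1, len(y)):                # left-to-right pass
        y[i] = (1.0 - w[i]) * y[i] + w[i] * y[i - 1]
    for i in range(len(y) - 2, -1, -1):       # right-to-left pass
        y[i] = (1.0 - w[i + 1]) * y[i] + w[i + 1] * y[i + 1]
    return y

# Noisy step signal; edge[4] marks the jump between samples 3 and 4.
x = np.array([0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0])
edge = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
y = domain_transform_filter_1d(x, edge)
# Noise on each side is smoothed, but the step at index 4 is preserved.
```

In a segmentation network, the same two-pass filter is applied alternately along rows and columns of the score maps, with the edge/mask network supplying the `edge` signal.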
In this paper, the proposed architecture improves on DT-EdgeNet (Domain Transform with EdgeNet) [11]. Here, we replace the edge network of [11] with the OBG-FCN [12] mask network, which predicts background, object, and object-boundary reference maps. In addition, our architecture uses multi-scale ResNet-101 as the base network and introduces multi-scale atrous convolution branches in parallel during training to preserve the resolution of the feature maps, which enlarges the receptive field and further improves the accuracy of semantic segmentation.
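The parallel multi-scale atrous convolutions described above can be sketched in 1-D with plain NumPy (a toy illustration, not the thesis code; the kernel, the rates, and summation as the fusion rule are assumptions for the example). Dilating the kernel enlarges the receptive field while the output keeps the input's length:

```python
import numpy as np

def atrous_conv1d(x, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding. Inserting
    rate-1 gaps between kernel taps enlarges the receptive field to
    rate*(len(kernel)-1)+1 while the output length equals the input's."""
    k = len(kernel)
    pad = rate * (k - 1) // 2
    xp = np.pad(x.astype(float), pad)
    out = np.zeros(len(x), dtype=float)
    for i in range(len(x)):
        for j in range(k):
            out[i] += kernel[j] * xp[i + j * rate]
    return out

def multiscale_atrous(x, kernel, rates=(1, 2, 4)):
    """Parallel atrous branches at several rates, fused by summation,
    so several receptive-field sizes contribute at every position."""
    return sum(atrous_conv1d(x, kernel, r) for r in rates)

x = np.arange(8, dtype=float)
kernel = np.array([1.0, 1.0, 1.0])
y = multiscale_atrous(x, kernel)
```

The same construction extends to 2-D feature maps, where no pooling or striding is needed to grow the receptive field, which is why the feature-map resolution is preserved.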
In the experiments, we achieve high recognition performance on the PASCAL VOC 2012 test set. In addition, as an extended application, we combine the object bounding boxes generated by Faster R-CNN with the proposed semantic segmentation results to perform instance-level segmentation.
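The instance-level extension, combining detector boxes with the semantic label map, can be illustrated with toy data (a schematic sketch only; the box format and the simple box-mask intersection rule are assumptions for the example, not the thesis implementation):

```python
import numpy as np

def boxes_to_instances(sem_map, boxes):
    """Split a semantic label map into per-instance masks.

    sem_map : (H, W) integer class labels (0 = background)
    boxes   : iterable of (x1, y1, x2, y2, class_id) detections, e.g.
              produced by an object detector such as Faster R-CNN
    Returns one boolean (H, W) mask per detection: the pixels of the
    box's class that fall inside the box."""
    instances = []
    for (x1, y1, x2, y2, cls) in boxes:
        inside = np.zeros(sem_map.shape, dtype=bool)
        inside[y1:y2, x1:x2] = True          # pixels covered by the box
        instances.append(inside & (sem_map == cls))
    return instances

# Toy 6x8 semantic map with two separate regions of class 1.
sem = np.zeros((6, 8), dtype=int)
sem[1:3, 1:3] = 1                            # first object (4 pixels)
sem[3:5, 5:7] = 1                            # second object (4 pixels)
dets = [(0, 0, 4, 3, 1), (4, 2, 8, 6, 1)]    # hypothetical detections
masks = boxes_to_instances(sem, dets)
```

Each detection thus carves its own mask out of the shared class map, turning a semantic segmentation into an instance-level one; overlapping boxes of the same class would need an extra disambiguation step not shown here.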
Keywords (Chinese) ★ 卷積神經網路
★ 語意分割
Keywords (English) ★ Convolutional Neural Network
★ Semantic Segmentation
Thesis outline: Table of Contents
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
1.1 Background
1.2 Motivation and Objectives
1.3 Methodology and Chapter Overview
Chapter 2  Deep Learning
2.1 Artificial Neural Networks
2.1.1 Development of Neural Networks
2.2 The Perceptron
2.3 Back-Propagation Neural Networks
Chapter 3  Semantic Image Segmentation
3.1 Related Work on Semantic Image Segmentation
3.2 Fully Convolutional Networks for Semantic Segmentation
3.2.1 Fully Convolutional Networks
3.2.2 Dense Prediction
3.3 DeepLab
3.3.1 Hole Algorithm
3.3.2 Controlling the Receptive Field and Accelerating the Convolutional Network
3.3.3 Fully Connected Conditional Random Fields
3.4 Object Boundary Guide [12]
3.4.1 OBP-FCN and Boundary Relabeling
3.4.2 Boundary- and Object-Guided Mask Layer
Chapter 4  Proposed Architecture
4.1 ResNet-101
4.1.1 Residual Networks
4.2 Mask Network
4.2.1 OBP-Mask Architecture
4.3 Domain Transform
4.3.1 Domain Transform with Recursive Filtering
4.3.2 Trainable Domain Transform Filters
Chapter 5  Experiments
5.1 Experimental Setup
5.2 Experimental Results
5.2.1 Object Mask Experiments
5.2.3 Comparison with Other Networks
5.3 Extended Application
Chapter 6  Conclusions and Future Work
Chapter 7  References
References
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. ”Imagenet
classification with deep convolutional neural networks.” Advances in neural
information processing systems. 2012.
[2] Simonyan, Karen, and Andrew Zisserman. ”Very deep convolutional
networks for large-scale image recognition.” arXiv preprint
arXiv:1409.1556 (2014).
[3] Szegedy, Christian, et al. ”Going deeper with convolutions.” Proceedings of
the IEEE conference on computer vision and pattern recognition. 2015.
[4] He, Kaiming, et al. ”Deep residual learning for image
recognition.” Proceedings of the IEEE conference on computer vision and
pattern recognition. 2016.
[5] Girshick, Ross, et al. ”Rich feature hierarchies for accurate object detection
and semantic segmentation.” Proceedings of the IEEE conference on
computer vision and pattern recognition. 2014.
[6] Girshick, Ross. ”Fast r-cnn.” Proceedings of the IEEE international
conference on computer vision. 2015.
[7] Ren, Shaoqing, et al. ”Faster R-CNN: Towards real-time object detection
with region proposal networks.” Advances in neural information processing
systems. 2015.
[8] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV,
60(2):91–110, 2004.
[9] N. Dalal and B. Triggs. Histograms of oriented gradients for human
detection. In CVPR, 2005.
[10] Long, Jonathan, Evan Shelhamer, and Trevor Darrell. ”Fully convolutional
networks for semantic segmentation.” Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition. 2015.
[11] L. Chen, J. Barron, G. Papandreou, K. Murphy, and A. Yuille, “Semantic
image segmentation with task-specific edge detection using CNNs and a
discriminatively trained domain transform,” arXiv preprint
arXiv:1511.03328, 2015.
[12] Q. Huang, C. Xia, W. Zheng, Y. Song, H. Xu, and C. C. J. Kuo, “Object
Boundary Guided Semantic Segmentation” arXiv preprint
arXiv:1603.09742, 2016.
[13] W. McCulloch and W. Pitts. “A logical calculus of the ideas immanent in
nervous activity,” The bulletin of mathematical biophysics 5.4 (1943):
115-133.
[14] M. Minsky and S. Papert, “Perceptrons,” M.I.T. Press, 1969.
[15] F. Rosenblatt, “The perceptron: A probabilistic model for information
storage and organization in the brain,” Psychological Review, Vol 65(6),
Nov 1958, 386-408.
[16] D. Hebb, “The Organization of Behavior: A Neuropsychological Theory,”
New York: Wiley, 1949.
[17] D. Rumelhart, G. Hinton, and R. Williams, “Learning representations by
back-propagating errors,” Neurocomputing: foundations of research, James
A. Anderson and Edward Rosenfeld (Eds.). MIT Press, Cambridge, MA,
USA, 696-699, 1988.
[18] 曾定章, 影像處理 (Image Processing).
[19] B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik, “Simultaneous
detection and segmentation,” In ECCV. 2014.
[20] J. Dai, K. He, and J. Sun. “Convolutional feature masking for joint object
and stuff segmentation,” arXiv preprint arXiv: 1412.1283, 2014.
[21] He, Kaiming, et al. ”Spatial pyramid pooling in deep convolutional
networks for visual recognition.” European Conference on Computer Vision.
Springer, Cham, 2014.
[22] H. Noh, S. Hong and B. Han, ”Learning Deconvolution Network for
Semantic Segmentation,” 2015 IEEE International Conference on Computer
Vision (ICCV), Santiago, 2015, pp. 1520-1528.
[23] L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille, “DeepLab:
Semantic Image Segmentation with Deep Convolutional Nets, Atrous
Convolution, and Fully Connected CRFs,” arXiv:1606.00915,2016
[24] G. Papandreou, L. C. Chen, K. P. Murphy and A. L. Yuille, “Weakly-and
Semi-Supervised Learning of a Deep Convolutional Network for Semantic
Image Segmentation,” 2015 IEEE International Conference on Computer
Vision (ICCV), Santiago, 2015, pp. 1742-1750.
[25] X. He, R. S. Zemel, and M. Carreira-Perpindn, “Multiscale conditional
random fields for image labeling,” In CVPR, 2004
[26] C. Rother, V. Kolmogorov, and A. Blake, “Grabcut: Interactive foreground
extraction using iterated graph cuts,” In SIGGRAPH, 2004.
[27] P. Krähenbühl and V. Koltun, “Efficient inference in fully connected
CRFs with Gaussian edge potentials,” in NIPS, 2011.
[28] Zagoruyko, Sergey, et al. ”A multipath network for object detection.” arXiv
preprint arXiv:1604.02135 (2016).
[29] Zheng, Shuai, et al. ”Conditional random fields as recurrent neural
networks.” Proceedings of the IEEE International Conference on Computer
Vision. 2015.
[30] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A.W. Smeulders.
“Selective search for object recognition,”In IJCV, 2013.
[31] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A.W. Smeulders.
“Selective search for object recognition,”In IJCV, 2013.
[32] Arbeláez, P., Pont-Tuset, J., Barron, J., Marques, F., Malik, J.: Multiscale
combinatorial grouping. In: CVPR (2014)
[33] P. Dollár and C. L. Zitnick, “Structured Forests for Fast Edge
Detection,” 2013 IEEE International Conference on Computer Vision,
Sydney, NSW, 2013, pp. 1841-1848.
[34] P. O. Pinheiro, R. Collobert, and P. Dollár, “Learning to segment object
candidates,” arXiv:1506.06204, 2015.
[35] Y. Ganin and V. Lempitsky, “N^4-fields: Neural network nearest neighbor fields for image transforms,” in ACCV, 2014.
[36] D. C. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber. “Deep
neural networks segment neuronal membranes in electron microscopy
images,” in NIPS, pages 2852–2860, 2012.
[37] L.-C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille, “Attention to scale:
Scale-aware semantic image segmentation,” in CVPR, 2016
[38] W. Liu, A. Rabinovich, and A. C. Berg, “ParseNet: Looking wider to see
better,” arXiv preprint arXiv:1506.04579, 2015.
[39] E. S. L. Gastal and M. M. Oliveira, “Domain transform for edge-aware
image and video processing,” In SIGGRAPH, 2011.
[40] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman.
“The PASCAL Visual Object Classes (VOC) Challenge,” IJCV, 2010
[41] B. Hariharan , P. Arbelaez, L. Bourdev , S. Maji ,and J. Malik, “Semantic
Contours from Inverse Detectors,”In ICCV,2011
[42] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of
gated recurrent neural networks on sequence modeling,” arXiv preprint
arXiv:1412.3555 ,2014.
[43] Kingma, Diederik, and Jimmy Ba. ”Adam: A method for stochastic
optimization.” arXiv preprint arXiv:1412.6980 (2014).
Advisor: Jia-Ching Wang (王家慶)    Date of approval: 2017-8-18
