Master's Thesis 103522076 — Detailed Record




Author: Jyun-Hong Li (李俊宏)    Department: Computer Science and Information Engineering
Title: 物件遮罩與邊界引導之遞迴卷積神經網路
(Object Mask and Boundary Guided Recurrent Convolution Neural Network)
Files: full text is permanently restricted (never open for online access)
Abstract (Chinese): Convolutional neural networks (CNNs) deliver outstanding recognition performance: they have improved not only whole-image classification but also region-level recognition. The advent of the fully convolutional network (FCN) also spurred research on semantic image segmentation, raising accuracy substantially over earlier approaches that combined region proposals with a support vector machine (SVM).
This thesis combines two networks to improve performance: one generates masks, and the other performs semantic analysis of the image. Our method improves on DT-EdgeNet (Domain Transform with EdgeNet) [19], which drives the domain transform with an image edge map. Because the edge map produced in [19] contains every possible edge in the image, including edges that belong to no object, the domain transform can be misled by non-object edges and yield incorrect segmentations. Our mask network instead predicts a reference map containing only background, objects, and the boundaries of target objects, which reduces the influence of non-object edges. We further found that driving the domain transform with the object score map rather than the boundary score map improves accuracy, and that besides guiding the domain transform, the mask network also produces effective masks that refine the spatial and regional information of the segmentation result.
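The domain transform [19][20] at the heart of this comparison is, in essence, a 1-D edge-aware recursive filter applied along image rows and columns; smoothing is attenuated wherever the joint map (an edge map in [19], an object score map here) is strong. A minimal 1-D sketch, assuming the multi-pass schedule of [20]; parameter names are illustrative, not taken from the thesis:

```python
import numpy as np

def domain_transform_filter(signal, edge, sigma_s=8.0, sigma_r=0.5, iterations=3):
    """1-D edge-aware recursive filtering via the domain transform.
    `edge` is a per-sample edge strength in [0, 1]; large values shrink
    the feedback weight and block smoothing across that sample."""
    x = signal.astype(np.float64).copy()
    # Derivative of the domain transform: grows with edge strength
    dt = 1.0 + (sigma_s / sigma_r) * np.abs(edge)
    for it in range(iterations):
        # Per-iteration spatial sigma (geometric schedule from Gastal & Oliveira)
        sigma_h = sigma_s * np.sqrt(3.0) * 2.0 ** (iterations - it - 1) \
                  / np.sqrt(4.0 ** iterations - 1.0)
        w = np.exp(-np.sqrt(2.0) / sigma_h) ** dt  # per-sample feedback weight
        for i in range(1, len(x)):                 # left-to-right pass
            x[i] += w[i] * (x[i - 1] - x[i])
        for i in range(len(x) - 2, -1, -1):        # right-to-left pass
            x[i] += w[i + 1] * (x[i + 1] - x[i])
    return x
```

With this filter, a noise spike inside a flat region is smoothed away, while a step flagged in `edge` survives — which is exactly why feeding it non-object edges degrades the result.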
We also improve the OBG-FCN (object boundary guided FCN) [18] architecture: training OBG-FCNs of different strides in a concatenated fashion further improves the accuracy of object and boundary prediction.
On the Pascal VOC2012 validation set, the proposed architecture improves on the baseline network [18] by about 6.6%.
Abstract (English): Convolutional neural networks (CNNs) have outstanding recognition performance: they have improved not only whole-image classification but also local recognition tasks. The fully convolutional network (FCN) likewise advanced semantic image segmentation, significantly improving accuracy over the traditional approach of region proposals combined with a support vector machine.
In this thesis, we combine two networks to improve accuracy: one produces a mask, and the other classifies the label of each pixel. Our first proposal changes the joint image used by the domain transform in DT-EdgeNet [19]. The joint images of DT-EdgeNet are edge maps that also contain edges of objects outside the training classes, so the result of [19] after the domain transform may be influenced by these non-class edges.
Our mask net produces score maps for background, object, and boundary that exclude objects outside the training classes, thereby reducing the influence of non-class objects. The mask net can also produce a mask that refines the spatial information of the segmentation.
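How such background/object/boundary score maps could gate a segmentation net's per-class scores can be sketched as follows; the soft gate and array layout are assumptions for illustration, not the thesis's exact masking layer:

```python
import numpy as np

def mask_guided_labels(class_scores, mask_probs):
    """Sketch of mask-guided fusion. `class_scores` (C, H, W) holds
    non-negative per-class scores from the segmentation branch, with
    class 0 assumed to be background; `mask_probs` (3, H, W) is the
    mask net's softmax over {background, object, boundary}. Foreground
    scores are attenuated where the mask net sees background, so
    non-object pixels cannot win the arg-max."""
    gated = class_scores.copy()
    object_gate = mask_probs[1] + mask_probs[2]  # P(object) + P(boundary)
    gated[1:] *= object_gate[None]               # leave background score intact
    return gated.argmax(axis=0)                  # per-pixel class label
```

For example, a pixel the segmentation branch labels as foreground is pushed back to background when the mask net is confident the pixel is background.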
Our second proposal concatenates the outputs of OBG-FCN [18] at different pixel strides. Adding this concatenation layer during training enhances the accuracy of both object and boundary predictions.
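A minimal sketch of such a multi-stride concatenation, with nearest-neighbour upsampling standing in for the learned upsampling an FCN would use (all names illustrative):

```python
import numpy as np

def upsample_nn(score_map, factor):
    # Nearest-neighbour upsampling of a (C, H, W) score map
    return score_map.repeat(factor, axis=1).repeat(factor, axis=2)

def concat_multi_stride(score_maps, strides):
    """Upsample every coarse score map to the finest stride and stack
    them along the channel axis, so a subsequent layer can be trained
    on all granularities jointly."""
    finest = min(strides)
    upsampled = [upsample_nn(m, s // finest) for m, s in zip(score_maps, strides)]
    return np.concatenate(upsampled, axis=0)
```

The fused tensor then feeds the next trainable layer, letting coarse (large-stride) and fine (small-stride) predictions correct each other during training.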
Finally, we evaluated the proposed architecture on Pascal VOC2012 and achieved a mean IOU about 6.6% higher than the baseline.
Keywords (Chinese) ★ 深層學習 (deep learning)
★ 卷積神經網路 (convolutional neural network)
★ 影像語意分割 (semantic image segmentation)
Keywords (English) ★ deep learning
★ convolution neural network
★ semantic image segmentation
Table of Contents
Chapter 1  Introduction
1.1 Preface
1.2 Motivation and Objectives
1.3 Thesis Organization
Chapter 2  Deep Learning
2.1 Artificial Neural Networks
2.1.1 Development of Artificial Neural Networks
2.2 How the Perceptron Works
2.3 Back-Propagation Neural Networks
2.4 Convolutional Neural Networks
Chapter 3  Semantic Image Segmentation
3.1 Related Work on Semantic Image Segmentation
3.2 Fully Convolutional Networks for Semantic Segmentation
3.2.1 Fully Convolutional Networks
3.2.2 Dense Prediction
3.2.2.1 Refining Segmentation Results
3.3 DeepLab
3.3.1 The Hole Algorithm
3.3.2 Controlling the Receptive Field and Accelerating Convolution
3.3.3 Fully-Connected Conditional Random Fields
3.4 Object Boundary Guide
3.4.1 OBP-FCN and Boundary Relabeling
3.4.2 Object- and Boundary-Guided Masking Layer
Chapter 4  Object Mask and Boundary Guidance
4.1 Overview
4.2 Domain Transform
4.2.1 Domain Transform with Recursive Filtering
4.2.2 Trainable Domain Transform Filter
4.3 Object and Boundary Prediction
4.3.1 OBP-Mask Architecture
4.3.2 OBP-FCN vs. OBP-Mask
4.4 Object and Boundary
4.4.1 Object Mask
4.4.2 Object and Boundary Guidance
Chapter 5  Experiments
5.1 Experimental Setup
5.2 Experimental Results
5.2.1 Object Mask Experiments
5.2.2 Object and Boundary Guidance Experiments
5.2.3 Comparison with Other Networks
Chapter 6  Conclusions and Future Work
References
[1] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[2] A. Krizhevsky, I. Sutskever, and G. Hinton. “ImageNet classification with deep convolutional neural networks,” In NIPS, 2012.
[3] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. “Going deeper with convolutions,” CoRR, abs/1409.4842, 2014.
[4] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. “Gradient-based learning applied to document recognition,” Proc. of the IEEE, 1998.
[5] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 580-587.
[6] R. Girshick, “Fast R-CNN,” 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1440-1448.
[7] S. Ren, K. He, R. Girshick, J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.PP, no.99, pp.1-1.
[8] K. He, X. Zhang, S. Ren and J. Sun, “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904-1916, Sept. 1 2015.
[9] A. Dosovitskiy, P. Fischer, J. Springenberg, M. Riedmiller, and T. Brox, “Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PP, no. 99, pp. 1-1.
[10] S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. Torr., “Conditional Random Fields as Recurrent Neural Networks,” 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1529-1537.
[11] J. Dai, K.He, and J. Sun, “Instance-aware semantic segmentation via multi-task network cascades,” arXiv preprint arXiv: 1512.04412, 2015
[12] S. Zagoruyko, A. Lerer, T. Lin, P. Pinheiro, S. Gross, S. Chintala, and P. Dollár, “A multipath network for object detection,” arXiv preprint arXiv:1604.02135, 2016.
[13] J. Dai, K. He, Y. Li, S. Ren,and J. Sun, “Instance-sensitive Fully Convolutional Networks,” arXiv:1603.08678,2016
[14] D. Pathak, P. Krähenbühl and T. Darrell, “Constrained Convolutional Neural Networks for Weakly Supervised Segmentation,” 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1796-1804.
[15] D. Lin, J. Dai, J. Jia, K. He, J. Sun, “ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation,” arXiv: 1604.05144,2016
[16] W. Zhang, C. Cao, S. Chen, J. Liu and X. Tang, “Style Transfer Via Image Component Analysis,” in IEEE Transactions on Multimedia, vol. 15, no. 7, pp. 1594-1601, Nov. 2013.
[17] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, and M. Lanctot, “Mastering the game of Go with deep neural networks and tree search,” Nature, 529(7587):484-489, 2016.
[18] Q. Huang, C. Xia, W. Zheng, Y. Song, H. Xu, and C. C. J. Kuo, “Object Boundary Guided Semantic Segmentation” arXiv preprint arXiv:1603.09742, 2016.
[19] L. Chen, J. Barron, G. Papandreou, K. Murphy, and A. Yuille, “Semantic image segmentation with task-specific edge detection using CNNs and a discriminatively trained domain transform,” arXiv preprint arXiv:1511.03328, 2015.
[20] E. S. L. Gastal and M. M. Oliveira, “Domain transform for edge-aware image and video processing,” In SIGGRAPH, 2011.
[21] W. McCulloch and W. Pitts. “A logical calculus of the ideas immanent in nervous activity,” The bulletin of mathematical biophysics 5.4 (1943): 115-133.
[22] M. Minsky, S. Papert, “Perceptrons,” M.I.T. Press Perceptrons, 1969
[23] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain,” Psychological Review, Vol 65(6), Nov 1958, 386-408.
[24] D. Hebb, “The Organization of Behavior: A Neuropsychological Theory,” New York: Wiley, 1949.
[25] D. Rumelhart, G. Hinton, and R. Williams, “Learning representations by back-propagating errors,” Neurocomputing: foundations of research, James A. Anderson and Edward Rosenfeld (Eds.). MIT Press, Cambridge, MA, USA 696-699, 1988.
[26] 曾定章, Image Processing (影像處理).
[27] P. Arbeláez, B. Hariharan, C. Gu, S. Gupta, L. Bourdev and J. Malik, “Semantic segmentation using regions and parts,” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, Providence, RI, 2012, pp. 3378-3385.
[28] J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu. “Semantic segmentation with second-order pooling,” In ECCV, 2012.
[29] B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik, “Simultaneous detection and segmentation,” In ECCV, 2014.
[30] J. Dai, K. He, and J. Sun. “Convolutional feature masking for joint object and stuff segmentation,” arXiv preprint arXiv: 1412.1283, 2014.
[31] J. Dai, K. He and J. Sun, “BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation,” 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1635-1643.
[32] G. Papandreou, L. C. Chen, K. P. Murphy and A. L. Yuille, “Weakly-and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation,” 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1742-1750.
[33] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A.W. Smeulders. “Selective search for object recognition,”In IJCV, 2013.
[34] P. Arbeláez, J. Pont-Tuset, J. Barron, F. Marques and J. Malik, “Multiscale Combinatorial Grouping,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 328-335.
[35] P. Dollár and C. L. Zitnick, “Structured Forests for Fast Edge Detection,” 2013 IEEE International Conference on Computer Vision, Sydney, NSW, 2013, pp. 1841-1848.
[36] P. O. Pinheiro, R. Collobert, and P. Dollár, “Learning to segment object candidates,” arXiv: 1506.06204, 2015.
[37] J. Long, E. Shelhamer and T. Darrell, “Fully convolutional networks for semantic segmentation,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 3431-3440.
[38] Y. Ganin and V. Lempitsky, “N^4-Fields: Neural network nearest neighbor fields for image transforms,” In ACCV, 2014.
[39] D. C. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, “Deep neural networks segment neuronal membranes in electron microscopy images,” In NIPS, pages 2852-2860, 2012.
[40] S. Gupta, R. Girshick, P. Arbelaez, and J. Malik, “Learning rich features from RGB-D images for object detection and segmentation,” In ECCV. Springer, 2014.
[41] L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Semantic image segmentation with deep convolutional nets and fully connected crfs,” arXiv preprint arXiv: 1412.7062.2014
[42] B. Hariharan, P. Arbeláez, R. Girshick and J. Malik, “Hypercolumns for object segmentation and fine-grained localization,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 447-456.
[43] W. Liu, A. Rabinovich, and A. C. Berg, “ParseNet: Looking wider to see better,” arXiv preprint arXiv:1506.04579, 2015.
[44] G. Papandreou, I. Kokkinos, and P.-A. Savalle, “Untangling local and global deformations in deep convolutional networks for image classification and sliding window detection,” arXiv: 1412.0296,2014.
[45] X. He, R. S. Zemel, and M. Carreira-Perpindn, “Multiscale conditional random fields for image labeling,” In CVPR, 2004
[46] C. Rother, V. Kolmogorov, and A. Blake, “Grabcut: Interactive foreground extraction using iterated graph cuts,” In SIGGRAPH, 2004.
[47] P. Krähenbühl and V. Koltun, “Efficient inference in fully connected CRFs with Gaussian edge potentials,” In NIPS, 2011.
[48] S. Xie and Z. Tu, “Holistically-Nested Edge Detection,” 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1395-1403.
[49] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv:1412.3555, 2014.
[50] L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille, “DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs,” arXiv:1606.00915,2016
[51] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. “The PASCAL Visual Object Classes (VOC) Challenge,” IJCV, 2010
[52] B. Hariharan , P. Arbelaez, L. Bourdev , S. Maji ,and J. Malik, “Semantic Contours from Inverse Detectors,”In ICCV,2011
[53] H. Noh, S. Hong and B. Han, "Learning Deconvolution Network for Semantic Segmentation," 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1520-1528.
Advisor: Jia-Ching Wang (王家慶)    Date of approval: 2016-08-30
