Object Mask and Boundary Guided Recurrent Convolution Neural Network (物件遮罩與邊界引導之遞迴卷積神經網路)


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/72295

Title:	Object Mask and Boundary Guided Recurrent Convolution Neural Network (物件遮罩與邊界引導之遞迴卷積神經網路)
Author:	Li, Jyun-Hong (李俊宏)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	deep learning; convolution neural network; semantic image segmentation
Date:	2016-08-30
Uploaded:	2016-10-13 14:37:38 (UTC+8)
Publisher:	National Central University
Abstract:	Convolutional neural networks (CNNs) perform very well on recognition tasks: they have improved not only whole-image classification but also the recognition of local image regions. The fully convolutional network (FCN) has in turn driven rapid progress in semantic image segmentation, substantially improving accuracy over earlier approaches that combined region proposals with a support vector machine (SVM). This thesis combines two networks to improve performance: one generates masks, and the other classifies the label of each pixel. Our method improves the part of DT-EdgeNet (Domain Transform with EdgeNet) [19] that uses an image edge map to guide the domain transform. The edge map produced by [19] contains every possible edge in the image, including edges that do not belong to objects, so the domain transform may be misled by non-object edges and produce segmentation errors. Our mask network instead predicts a reference map containing only background, object, and boundary, in which the boundary covers only the edges of target objects; this reduces the influence of non-object edges. We also found that performing the domain transform with the object score map, rather than the boundary score map, further improves accuracy. Besides assisting the domain transform, the mask network produces masks that refine the spatial and regional information of the segmentation result. In addition, we improve the architecture of OBG-FCN (object boundary guided FCN) [18]: concatenating OBG-FCNs of different pixel strides during training further improves accuracy on objects and boundaries. On the Pascal VOC2012 validation data, the proposed architecture improves mean IoU by about 6.6% over the baseline network [18].
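The abstract's point about non-object edges can be illustrated with a minimal sketch of a 1-D recursive domain-transform filter. This is not the thesis's implementation: it assumes the commonly used recursive form y[i] = (1 - w[i]) * x[i] + w[i] * y[i-1] with w[i] = exp(-sqrt(2) * d[i] / sigma_s) and d[i] = 1 + g[i] * sigma_s / sigma_r, where g is the guidance (edge) signal. The function name `domain_transform_1d`, the parameter values, and the toy signals are all hypothetical.

```python
import math


def domain_transform_1d(scores, edges, sigma_s=10.0, sigma_r=0.5):
    """One left-to-right and one right-to-left recursive filtering pass.

    scores: per-pixel class scores along a row (list of floats).
    edges:  per-pixel guidance strength in [0, 1]; edges[i] describes the
            transition between pixel i-1 and pixel i (an assumption here).
    """
    # A strong edge gives a large distance d, hence a small weight w,
    # so little smoothing is propagated across that position.
    weights = [
        math.exp(-math.sqrt(2.0) / sigma_s * (1.0 + g * sigma_s / sigma_r))
        for g in edges
    ]
    out = list(scores)
    for i in range(1, len(out)):  # left-to-right pass
        out[i] = (1.0 - weights[i]) * out[i] + weights[i] * out[i - 1]
    for i in range(len(out) - 2, -1, -1):  # right-to-left pass
        out[i] = (1.0 - weights[i + 1]) * out[i] + weights[i + 1] * out[i + 1]
    return out


# Toy row: an object (score 1) next to background (score 0).
row = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

# Guidance with an edge only at the true object boundary (index 3).
object_boundary_only = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
# Guidance with no edges at all: the filter blurs across the boundary.
no_edges = [0.0] * 6

sharp = domain_transform_1d(row, object_boundary_only)
blurred = domain_transform_1d(row, no_edges)
print("with boundary edge:", [round(v, 3) for v in sharp])
print("without any edge:  ", [round(v, 3) for v in blurred])
```

In this sketch every nonzero entry in the guidance signal blocks smoothing at that position, whether or not it lies on an object boundary. That is why a guidance map restricted to target-object boundaries, as the abstract proposes, would avoid spurious breaks inside or outside objects.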
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	663

All items in NCUIR are protected by original copyright.
