Global and Local context and Coarse to Fine Semantic Segmentation

Please use this permanent URL to cite or link to this document: https://ir.lib.ncu.edu.tw/handle/987654321/81080

Title:	Global and Local context and Coarse to Fine Semantic Segmentation (基於深度學習之結合全局及局部資訊和修復分割細節的語義分割方法)
Author:	Chiu, Yi-Cheng (邱亦成)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	deep learning; semantic segmentation; convolutional neural network
Date:	2019-07-15
Upload time:	2019-09-03 15:33:30 (UTC+8)
Publisher:	National Central University
Abstract:	Semantic image segmentation is a widely studied problem in computer vision and artificial intelligence. Ground-truth annotations for segmentation are time-consuming and labor-intensive to produce, so one goal of this thesis is to train models that segment accurately enough to reduce the cost of producing annotated data. Recent deep-learning approaches to semantic segmentation usually downsample the input, both to run in real time in road scenes and to fit within GPU memory limits, and this discards detail in the scene. We examine the methods proposed in several well-known semantic segmentation architectures, from autoencoders to attention models, and analyze their contributions, advantages, and disadvantages. We also modify these architectures and propose JCF, an architecture made up of two modules: one module extracts detail from the high-resolution image, and a final channel-weighting step combines the two feature maps to refine the segmentation from coarse to fine. Our final architecture, GLNet, combines global attention information with local multi-scale context, which helps the model understand the relationships between objects across different scenes and reduces classification errors. A channel-weight module then brings in information from earlier layers of the convolutional neural network to repair the boundaries and details of segmented objects. In our experiments, the proposed architecture improves on several well-known current methods.
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations
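
The abstract describes combining two feature maps through channel weights so that detail from a high-resolution branch refines a coarse segmentation. This record gives no layer-level details, so the following is only a minimal sketch of one common way to do this kind of fusion (a squeeze-and-excitation style channel gate), not the thesis's actual JCF or GLNet module. The function names, the single fully connected layer, and the parameters `w` and `b` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)


def channel_weighted_fusion(coarse, fine, w, b):
    """Fuse a coarse, low-resolution feature map with a fine, high-resolution one.

    coarse: (C, H/f, W/f) features from the deep part of a network
    fine:   (C, H, W) features from an early layer
    w, b:   hypothetical learned parameters of one dense layer (2C -> C)
    """
    factor = fine.shape[1] // coarse.shape[1]
    coarse_up = upsample_nearest(coarse, factor)

    # Global average pooling over space: one descriptor value per channel.
    descriptor = np.concatenate([coarse_up, fine], axis=0).mean(axis=(1, 2))

    # One weight in (0, 1) per channel of the fine branch.
    weights = sigmoid(w @ descriptor + b)

    # Re-weight the fine features channel by channel, then add them
    # to the upsampled coarse features.
    fused = coarse_up + weights[:, None, None] * fine
    return fused, weights


# Usage with random stand-in features and untrained, made-up parameters.
rng = np.random.default_rng(0)
C = 4
coarse = rng.standard_normal((C, 8, 8))
fine = rng.standard_normal((C, 32, 32))
w = rng.standard_normal((C, 2 * C)) * 0.1
b = np.zeros(C)

fused, weights = channel_weighted_fusion(coarse, fine, w, b)
print(fused.shape)  # (4, 32, 32)
```

In this sketch the gate can only scale whole channels of the fine branch up or down, which keeps the cost of the fusion small compared with a spatial attention map. A trained network would learn `w` and `b` rather than sample them at random.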

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	170

All items in NCUIR are protected by their original copyright.
