Attention Based Semantic Segmentation for Object Localization

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/83919

Title:	Attention Based Semantic Segmentation for Object Localization
Author:	席朗斯; Hirunsirisombut, Phanuvich
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Semantic Segmentation; Deep Learning; Dilated Convolution; Attention Network
Date:	2020-07-10
Upload time:	2020-09-02 17:42:00 (UTC+8)
Publisher:	National Central University
Abstract:	Many researchers now use deep learning to solve problems in computer vision. Semantic segmentation, one of the most popular of these problems, assigns a category label to every pixel in an image. U-Net, a well-known approach, was proposed in 2015 for biomedical image segmentation. However, U-Net has a small receptive field, which degrades its results, and it performs poorly on small objects. In addition, earlier work that used self-attention to enhance semantic segmentation has a problem caused by the Rectified Linear Unit (ReLU): this activation function maps every negative value to zero. To address these problems, this work proposes a dilated-attention-based semantic segmentation method for object localization. First, a standard U-Net serves as the main network to extract features from the input. Then, attention modules are placed on each skip connection of the U-Net to prevent relevant information from being lost as the network goes deeper. Each attention module uses atrous (dilated) convolution instead of ordinary convolution to enlarge its receptive field, so that it can collect and pass features from the coarse layer to the fine layer. In real scenes, objects such as cars may be very close to one another or overlap. Segmenting two or more such objects leads to a problem called "merging regions". To solve it, the Watershed transform is applied as a post-processing step to separate the objects. Experimental results show that, measured by the Dice score coefficient (DSC), the proposed method outperforms the baseline model on the semantic segmentation task across combinations of the models with several well-known loss functions.
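As a rough illustration of two quantities the abstract mentions, the sketch below computes the Dice score coefficient for a pair of binary masks and the span covered by a single dilated (atrous) convolution kernel. It is not taken from the thesis: the function names `dice_score` and `effective_kernel_size`, the nested-list mask format, and the smoothing term `eps` are assumptions made for this example only.

```python
from typing import Sequence


def dice_score(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,
) -> float:
    """Dice score coefficient for two binary masks given as nested 0/1 lists.

    DSC = 2 * |P ∩ T| / (|P| + |T|). The eps term is an assumed smoothing
    constant that avoids division by zero when both masks are empty.
    """
    intersection = 0
    total = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # counts pixels set in both masks
            total += p + t         # accumulates |P| + |T|
    return (2.0 * intersection + eps) / (total + eps)


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span along one axis covered by a single dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(round(dice_score(pred, target), 4))
    for d in (1, 2, 4):
        print(d, effective_kernel_size(3, d))
```

With a 3x3 kernel, raising the dilation rate from 1 to 2 widens the covered span from 3 to 5 pixels without adding weights, which is the sense in which dilated convolution enlarges the receptive field.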
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	103

All items in NCUIR are protected by original copyright.
