Attention-based plugin for CNN improvement: Front end and Back end (利用注意力插件改善卷積網路：使用前置與後置方法)

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81077

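Title:	Attention-based plugin for CNN improvement: Front end and Back end (利用注意力插件改善卷積網路：使用前置與後置方法)
Author:	Wu, Chia-Lin (吳佳霖)
Contributor:	Department of Computer Science and Information Engineering (資訊工程學系)
Keywords:	convolutional network; CNN; plugin; attention model
Date:	2019-07-15
Uploaded:	2019-09-03 15:33:13 (UTC+8)
Publisher:	National Central University (國立中央大學)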
Abstract:	Image classification is a common task for convolutional neural networks (CNNs), and the model structure has been extended to other kinds of work. For example, semantic segmentation and object detection both build on architectures similar to those used for classification. Thanks to the feature recognition ability of CNNs, they offer a certain performance improvement over traditional methods on these tasks. Most CNN designs take the raw image as input in both the training and testing phases, because suitable features in computer vision are not always predictable and the network's own learning can extract more appropriate ones. When the target described by a task does not cover the entire image, however, the CNN may learn patterns during training that do not belong to the target objects. To increase the correctness and robustness of a CNN model without discarding any possible information in the image, we provide attention mask information to the model in several forms. For comparison in our experiments, we design the methods around two main ideas. The first, the front end, supplies the attention information in different forms at the model's input stage, giving the model an additional feature for prediction. The second, the back end, aims to improve the model's ability to judge correct positions by applying an additional loss function, in the form of an auxiliary training sub-task, during the training phase. Of the two, the back-end methods provide more reasonable improvements and better compatibility for the target tasks in our experiments.
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and doctoral theses
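
The two ideas named in the abstract can be illustrated with a minimal sketch. This record does not give the thesis's actual formulation, so everything below is an assumption made for illustration: the front end is shown as appending the attention mask to the image as an extra input channel, and the back end as a classification loss plus a weighted mean-squared-error term on a predicted attention map. The function names and the `weight` parameter are hypothetical.

```python
import math

def add_mask_channel(image, mask):
    """Front end (assumed form): append the attention mask as an extra channel.

    image: H x W nested list of [r, g, b] pixels.
    mask:  H x W nested list of values in [0, 1].
    Returns a new H x W image whose pixels are [r, g, b, m].
    """
    return [
        [pixel + [mask[y][x]] for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]

def cross_entropy(probs, label):
    """Classification loss for one sample, given class probabilities."""
    return -math.log(probs[label])

def mask_loss(pred_map, mask):
    """Auxiliary loss (assumed form): mean squared error between a predicted
    attention map and the ground-truth attention mask."""
    cells = [
        (p - m) ** 2
        for p_row, m_row in zip(pred_map, mask)
        for p, m in zip(p_row, m_row)
    ]
    return sum(cells) / len(cells)

def total_loss(probs, label, pred_map, mask, weight=0.5):
    """Back end (assumed form): classification loss plus a weighted
    auxiliary term that penalizes attention on the wrong positions."""
    return cross_entropy(probs, label) + weight * mask_loss(pred_map, mask)
```

In this sketch the front end changes what the network sees at both training and test time, while the back end leaves the input untouched and only changes the training objective, which is one possible reading of the compatibility advantage the abstract reports for the back-end methods.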

Files in this item:

File	Description	Size	Format	Views
index.html		0 KB	HTML	125

All items in NCUIR are protected by original copyright.
