Monocular Depth Estimation based on encoder-decoder with Non-Local Decoder-Squeeze-Excitation Network and Adaptive Depth List

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/88366

Title:	Monocular Depth Estimation based on encoder-decoder with Non-Local Decoder-Squeeze-Excitation Network and Adaptive Depth List
Authors:	萬偉中;Wan, Wei-Chung
Contributors:	Department of Electrical Engineering
Keywords:	Monocular depth estimation; Decoder-Squeeze-and-Excitation network; Adaptive Depth List
Date:	2022-04-15
Issue Date:	2022-07-14 00:24:15 (UTC+8)
Publisher:	National Central University
Abstract:	Monocular depth estimation is an essential topic in computer vision. In recent years, end-to-end encoder-decoder architectures based on convolutional neural networks (CNNs) have shown reasonable results. On the encoder side, most research relies on a powerful feature extractor to obtain good features, which are then up-sampled to reconstruct the depth map. With a strong encoder, even a simple up-sampling process can achieve good accuracy. For higher-quality depth estimation, however, improving the decoder is more critical, and even now few sound and demonstrably useful methods address the up-sampling process. In this paper, we present a novel network design for monocular depth estimation. More precisely, we propose a CNN-based module that considers the whole up-sampling process from a global perspective. The module builds on the concept of SE-Net and uses a global-perspective attention mechanism to recalibrate the feature maps at different resolutions throughout the decoder; we call it the Decoder-Squeeze-and-Excitation (DSE) module. We further combine it with the Non-local network attention mechanism to design the Non-Local Decoder-Squeeze-and-Excitation (NL-DSE) module for the whole up-sampling process. In addition, we propose Adaptive Depth List (ADL), a method that limits the output range, to improve the accuracy of near-distance estimation. Combining these techniques, we evaluate our method on the NYU Depth V2 dataset, where it outperforms state-of-the-art CNN-based approaches in accuracy.
Appears in Collections:	[Graduate Institute of Electrical Engineering] Electronic Thesis & Dissertation

Files in This Item:

File	Description	Size	Format
index.html		0Kb	HTML	42	View/Open
