Thesis/Dissertation 110523001: Full Metadata Record

DC Field | Value | Language
DC.contributor | 通訊工程學系 | zh_TW
DC.creator | 張鎮宇 | zh_TW
DC.creator | Cheng-Yu Chang | en_US
dc.date.accessioned | 2023-8-15T07:39:07Z |
dc.date.available | 2023-8-15T07:39:07Z |
dc.date.issued | 2023 |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110523001 |
dc.contributor.department | 通訊工程學系 | zh_TW
DC.description | 國立中央大學 | zh_TW
DC.description | National Central University | en_US
dc.description.abstract | 現今科技快速發展的時代,硬體上持續的突破使得人工智慧的研究日益進展,許多的研究都逐漸由二維平面拓展到三維空間中,例如:自駕車產業、娛樂影視業,三維的人體建模、醫學美容等相關的領域。 在二維圖像估計中人們可以很好的判斷場景的三維距離,但是在電腦視覺中,由單一的二維圖像推估三維場景一直以來都是一項值得關注的議題,因為人們能很快速地由圖像中辨識出物體並且能夠很好的預估物體的位置資訊,於是現今獲取三維空間資訊幾乎皆使用激光雷達或是深度相機,這些雖然能暫時解決三維空間資訊不足問題,但這些設備通常更加的昂貴且需要額外的輸入,因此由純視覺方法估計三維場景並且語意分割與補全語意意圖更好更快的解決場景理解的問題。 同時間在三維體素的場景在訓練與應用的階段會使用到大量的記憶體,因此如何在有限的資源限制之下能夠提高效能並重建三維場景也是在語意場景補全重要的一部份。 本研究將由單張RGB圖像重建三維場景並完成語意場景補全,在模型中加入注意力機制,對於不同尺度特徵在不同層級特徵對於使語意場景補全的影響,並提高語意場景補全模型的品質,減少訓練時間,並且分析在使用之記憶體與模型效能之間之優點。本研究在客觀的評估(IoU, mIoU)上皆有傑出的表現。 | zh_TW
dc.description.abstract | In today's era of rapid technological development, continuing hardware breakthroughs have driven steady progress in artificial intelligence research. A growing number of studies are extending from the two-dimensional plane into three-dimensional space, in fields such as autonomous driving, film and entertainment, 3D human body modeling, and medical aesthetics. People can judge the three-dimensional distances in a scene well from a two-dimensional image, because they quickly recognize the objects in it and estimate their positions accurately. In computer vision, however, inferring a 3D scene from a single 2D image has long been an open problem. As a result, 3D spatial information is today acquired almost exclusively with LiDAR or depth cameras. These devices work around the lack of 3D spatial information for now, but they are typically more expensive and require additional inputs. A purely vision-based approach that estimates the 3D scene and performs semantic segmentation and completion therefore aims to solve the scene understanding problem better and faster. At the same time, voxel-based 3D scenes consume a large amount of memory during both training and deployment, so improving performance and reconstructing 3D scenes under limited resources is an important part of semantic scene completion. This study reconstructs a 3D scene from a single RGB image and performs semantic scene completion. It adds an attention mechanism to the model, examines how features at different scales and levels affect semantic scene completion, improves the quality of the semantic scene completion model, reduces training time, and analyzes the trade-off between memory usage and model performance. The proposed approach performs well on the objective evaluation metrics IoU and mIoU. | en_US
DC.subject | 語意場景補全 | zh_TW
DC.subject | 注意力機制 | zh_TW
DC.subject | 深度學習 | zh_TW
DC.subject | 語意分割 | zh_TW
DC.subject | semantic scene completion | en_US
DC.subject | Attention mechanism | en_US
DC.subject | deep learning | en_US
DC.subject | semantic segmentation | en_US
DC.title | 基於多層次注意力機制之單目相機語意場景補全技術 | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Monoscene camera semantic scene completion technique based on multi-level attention mechanisms | en_US
DC.type | 博碩士論文 | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US
