In recent years, deep learning has been widely applied to robotics, and the interaction between a robot's vision and language has emerged as a particularly challenging research problem that calls for new breakthroughs. Many studies in this area are evaluated on the ALFRED (Action Learning From Realistic Environments and Directives) benchmark, in which a robot must carry out everyday indoor household tasks by following natural-language instructions. This thesis argues that equipping a robot with both visual semantic understanding and language semantic understanding improves its inference ability. We propose a novel method, VSGM (Visual Semantic Graph Memory), which uses a semantic graph representation to obtain better visual image features and strengthen the robot's visual understanding. Using prior knowledge and a scene graph generation network, the objects detected in the image, their attributes, and the predicted relations among them are converted into a graph-based representation provided to the robot; the detected objects are also projected onto a top-down egocentric map. Finally, the object features most relevant to the current task are extracted by a graph neural network. The proposed method is validated in the ALFRED environment, where adding VSGM to the model improves the task success rate by 6 to 10%.
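The sketch below illustrates, under stated assumptions, the kind of graph neural network step the abstract describes: detected-object nodes and their predicted relations are aggregated into task-relevant object features. It is a minimal illustration in plain PyTorch; the class name SemanticGraphEncoder, the feature dimensions, and the toy adjacency matrix are hypothetical and not taken from the thesis implementation.

```python
# Minimal sketch (assumed names/dimensions) of graph-based object-feature
# extraction over a semantic graph, in the spirit of the method above.
import torch
import torch.nn as nn


class SemanticGraphEncoder(nn.Module):
    """One graph-convolution layer over detected-object nodes.

    Each node carries an object feature vector (e.g., from a detector);
    the adjacency matrix encodes object-object relations predicted by a
    scene graph generation network.
    """

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (N, in_dim); adj: (N, N). Add self-loops so each node
        # keeps its own feature, then mean-aggregate over neighbours.
        adj = adj + torch.eye(adj.size(0), device=adj.device)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        h = (adj @ node_feats) / deg
        return torch.relu(self.linear(h))


if __name__ == "__main__":
    # 5 detected objects with 256-d features and a sparse relation graph
    # (e.g., "mug on table" links node 0 and node 1).
    feats = torch.randn(5, 256)
    adj = torch.zeros(5, 5)
    adj[0, 1] = adj[1, 0] = 1.0
    encoder = SemanticGraphEncoder(256, 128)
    task_feats = encoder(feats, adj)  # (5, 128) task-relevant node features
    print(task_feats.shape)
```

In practice these per-object features would be combined with the top-down egocentric map and the language instruction before action prediction; that fusion step is omitted here.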