Informative path planning (IPP) has drawn growing attention in the artificial intelligence community. Unlike conventional robot path planning, which steers a robot toward a goal while avoiding obstacles, IPP seeks a path that maximizes the information gathered. Different definitions of information lead to different applications, such as detecting infected crops, inspecting building structures for cracks, mountain search and rescue, detecting poaching and illegal logging, monitoring urban pollution distribution, and automatically building 3D maps of an environment. These problems have been shown to be NP-hard, so in practice only approximate solutions are computed. To advance the state of IPP research, this project proposes a deep inverse reinforcement learning approach that analyzes how humans solve IPP problems in daily life and uses the result to improve the IPP performance of robots. The project will run for three years. The first year focuses on using deep inverse reinforcement learning to explore the reward functions underlying how humans solve IPP problems. The second year focuses on analyzing how humans transfer learning across different IPP problems. The third year focuses on human-robot cooperative IPP. The research addresses three questions: (1) Is IPP learnable, and if so, how much data is needed? (2) How do humans transfer what they have learned when solving IPP problems in different environments? (3) What are the differences and complementary strengths of humans and robots in IPP?