Abstract: | In the era of the Internet of Things (IoT), large numbers of low-power wireless communication nodes will be widely deployed. Nodes deployed in complex or hazardous areas, e.g., deserts, wilderness, disaster zones, and battlefields, must rely on batteries as their power source, and solar energy harvesting is regarded as an effective way to achieve perpetual wireless communications. Owing to their high mobility, flexible deployment, and low cost, unmanned aerial vehicles (UAVs) can be dispatched to collect sensing data from widely distributed ground wireless nodes, thereby improving the energy efficiency of wireless communications. However, UAV flight is limited by the capacity of the onboard battery, so allocating the resources of UAV communications effectively is a major design challenge. In this paper, we consider multiple solar-powered wireless nodes that use harvested solar energy to transmit collected data to multiple UAVs in the uplink. In this context, we jointly design the UAV flight trajectories, the UAV-node communication association, and the uplink power control strategy to make effective use of the harvested energy and to manage co-channel interference over a finite time horizon. To ensure fairness among the wireless nodes, the design goal is to maximize the minimum sum rate among the nodes. 
The joint design problem is highly non-convex and requires non-causal (future) knowledge of the instantaneous energy harvesting information (EHI) and channel state information (CSI), which are difficult to predict in practice. To overcome these design challenges, we first propose an offline method based on convex optimization that uses only the statistical average EHI and CSI. By applying successive convex approximation (SCA) and alternating optimization, the method decomposes the problem into three convex sub-problems that yield the offline strategy for the UAV trajectories, the UAV-node communication association, and the uplink power control. Building on the offline strategy, we further design an online reinforcement learning (RL) method that improves the system performance using real-time environmental information. We propose regulated flight corridors for the multiple UAVs, defined around the offline-optimized flight paths, to avoid unnecessary flight exploration; this improves both the learning efficiency and the system performance compared with the conventional RL method. |
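
The regulated flight corridor idea can be illustrated with a minimal sketch. The abstract does not specify the corridor geometry, so the code below assumes a circular corridor of fixed radius around each offline-optimized waypoint and a 2-D position. The function name `clip_to_corridor` and its parameters are hypothetical and are not taken from the original work. A position proposed by the RL agent is projected back onto the corridor whenever it strays too far from the offline path.

```python
import math


def clip_to_corridor(proposed, reference, radius):
    """Project a proposed 2-D UAV position into a disc around a waypoint.

    Assumption (not from the source): the corridor at each time slot is a
    disc of the given radius centred on the offline-optimized waypoint.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")

    dx = proposed[0] - reference[0]
    dy = proposed[1] - reference[1]
    dist = math.hypot(dx, dy)

    # Inside the corridor: the RL action is accepted unchanged.
    if dist <= radius:
        return (proposed[0], proposed[1])

    # Outside the corridor: pull the position back to the disc boundary
    # along the line joining it to the offline waypoint.
    scale = radius / dist
    return (reference[0] + dx * scale, reference[1] + dy * scale)
```

Under these assumptions, an RL environment would apply this projection to every position update, so that exploration stays close to the offline trajectory instead of covering the whole flight area.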