Name Da-Yu Huang (黃大祐) Department Department of Communication Engineering
Thesis title Joint Trajectory Design and BS Association for Cellular-Connected UAV: An Imitation Augmented Deep Reinforcement Learning Approach
(連網無人機路徑規劃與基地台連線策略之共同設計:使用模仿增強的深度強化學習方法)
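Abstract (Chinese) In this thesis, we consider the trajectory design and base station (BS) association problem for a cellular-connected unmanned aerial vehicle (UAV).
To meet the requirement for reliable connectivity, the UAV must remain connected to the cellular network throughout its flight.
Because the antenna gain of a ground BS toward the UAV varies with the UAV's position, the association strategy must be designed jointly with the trajectory, and we formulate this joint problem with the objective of minimizing the mission completion time.
We then propose a deep learning framework to solve this non-convex optimization problem.
For the UAV-BS association strategy, we construct a signal strength radio map of a given area and use it to train a deep neural network that approximates the nonlinear mapping between the UAV position and the optimal serving BS.
To handle the coupling between BS association and trajectory decisions, we develop a deep reinforcement learning method in which the UAV learns its policy from its own past good experiences.
Our results show that, in terms of path length, the proposed method outperforms existing deep reinforcement learning methods.
In addition, nearest-BS association fails to provide reliable cellular connections, whereas the proposed approach maintains strong connectivity along the whole trajectory.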
Abstract (English) This paper concerns the problem of trajectory design and base station (BS) association for cellular-connected unmanned aerial vehicles (UAVs).
To support safety-critical functions, one primary requirement for UAVs is to maintain reliable cellular connectivity at every time instant during the flight mission.
Since the antenna gain of a ground BS (GBS) changes with the position of the UAV, the UAV-GBS association strategy should be considered jointly with the trajectory design, a coupling that has not been studied in prior work.
In this paper, we first formulate the problem of joint BS association and trajectory design with the objective of minimizing the mission completion time under a connectivity outage constraint.
Then, a deep learning framework is proposed to solve the formulated non-convex optimization problem in a decoupled manner.
For the UAV-GBS association strategy, the signal strength radio map of a given area is constructed, which is used to train a deep neural network (DNN) to approximate the nonlinear mapping from the UAV position to the optimal GBS.
To tackle the high complexity due to the coupled decision variables of GBS association and UAV movement, a novel deep reinforcement learning (DRL) approach is developed to learn the optimal trajectory, in which the UAV can learn from its own past good experiences.
Our simulation results confirm the superiority of the proposed DRL approach over conventional DRL approaches in terms of trajectory length.
Additionally, it is demonstrated that the nearest association scheme fails to provide reliable cellular connections, whereas our proposed approach can ensure strong connectivity with the GBS during the whole trajectory.
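The contrast between radio-map-based association and nearest association can be illustrated with a minimal sketch. It is not the thesis implementation: the GBS positions, the radio map values, and the helper names `strongest_gbs`, `nearest_gbs`, and `build_training_set` are all hypothetical, chosen only to show how a radio map could yield (position, optimal GBS) training labels for a DNN, and how those labels can differ from the nearest GBS when antenna gain depends on the UAV position.

```python
import math

# Hypothetical GBS ground positions (x, y) in metres.
GBS_POSITIONS = [(0.0, 0.0), (100.0, 0.0)]

# Hypothetical radio map: UAV position -> received signal strength (dBm) per GBS.
# The values are invented so that the closest GBS is not always the strongest.
RADIO_MAP = {
    (10.0, 0.0): [-95.0, -80.0],
    (40.0, 0.0): [-82.0, -88.0],
    (90.0, 0.0): [-79.0, -96.0],
}


def strongest_gbs(strengths):
    """Index of the GBS with the highest signal strength at one position."""
    return max(range(len(strengths)), key=strengths.__getitem__)


def nearest_gbs(position, gbs_positions):
    """Index of the GBS with the smallest horizontal distance to the UAV."""
    return min(
        range(len(gbs_positions)),
        key=lambda i: math.dist(position, gbs_positions[i]),
    )


def build_training_set(radio_map):
    """(position, label) pairs a classifier could be trained on (assumed setup)."""
    return [(pos, strongest_gbs(s)) for pos, s in sorted(radio_map.items())]


if __name__ == "__main__":
    for pos, label in build_training_set(RADIO_MAP):
        print(pos, "strongest:", label, "nearest:", nearest_gbs(pos, GBS_POSITIONS))
```

In this toy map the two rules disagree at the first and last positions, which is the kind of mismatch a learned position-to-GBS mapping is meant to capture.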
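Keywords (Chinese) ★ 無人機 (unmanned aerial vehicle)
★ 蜂窩網路 (cellular network)
★ 基地台連線 (base station association)
★ 路徑設計 (trajectory design)
★ 深度強化學習 (deep reinforcement learning)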
Keywords (English) ★ Unmanned aerial vehicle (UAV)
★ cellular networks
★ cell association
★ trajectory design
★ deep reinforcement learning
Table of contents Abstract (Chinese)
Abstract
Table of Contents
List of Tables
1. Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
2. Literature Review
2.1 Trajectory Design for Aerial Base Stations
2.2 Trajectory Design for Cellular-Connected UAVs
2.3 Summary
3. System Model and Problem Formulation
3.1 Channel Model
3.2 Base Station and UAV Antenna Models
3.3 Problem Description
4. Deep Reinforcement Learning Based Algorithm
4.1 Reinforcement Learning Model
4.1.1 State Space
4.1.2 Action Space
4.1.3 Reward Function
4.2 Imitation Augmented Deep Reinforcement Learning
4.3 Double Deep Q-Network for Flight Decisions
4.4 Deep Neural Network for the UAV-BS Association Strategy
4.5 Learning Algorithm
4.6 Algorithm Analysis
5. Simulation Results and Analysis
5.1 Simulation Results for Trajectory Design
5.2 Simulation Results for Base Station Association
6. Conclusion and Contributions
References
Appendix 1
Advisor Yu-Jia Chen (陳昱嘉) Approval date 2020-12-3
