Thesis Record 111523077: Detailed Information




Author: HOU PO YUN (侯博允)   Department: Department of Communication Engineering
Title: A Novel Mechanism Based on Deep Reinforcement Learning and Contract Incentives for Task Offloading and Resource Allocation in Mobile Edge Computing Environments
(Chinese title: 行動邊緣計算環境下基於深度強化學習和契約激勵的任務卸載與資源分配機制)
Related theses
★ A Feature-Similarity-Based Search Method for Unstructured Peer-to-Peer Networks
★ A Mobile Peer-to-Peer Network Topology Based on Hierarchical Cluster Reputation
★ Design and Implementation of a Topic-Based Event Monitoring Mechanism for Online RSS News Streams
★ A Density-Aware Routing Method for Delay-Tolerant Networks
★ A Home Multimedia Gateway Integrating P2P and UPnP Content-Sharing Services: Design and Implementation
★ Design and Implementation of a Simple Seamless Audio/Video Streaming Service for Home Networks
★ Message Delivery Time Analysis and High-Performance Routing Algorithm Design for Delay-Tolerant Networks
★ An Adjustable Allocation Method and Performance Measurement of Downloader-Side Network Resources in the BitTorrent P2P File System
★ A Data Dissemination Mechanism Using Message Coding and Reassembly Conditions in Delay-Tolerant Networks
★ A Routing Mechanism Based on Human Mobility Patterns in Delay-Tolerant Networks
★ A Packet Delivery Mechanism Using Data Aggregation to Improve Transmission Performance in Vehicular Networks
★ A Vehicle Clustering Method for Intersection Environments
★ A Message Broadcasting Mechanism with Roadside-Unit Assistance in Vehicular Networks
★ Optimizing Message Delivery Performance with Static Relay Nodes (Throwboxes) in Delay-Tolerant Networks
★ A Message Delivery Mechanism Built on Dynamic Cluster Awareness in Delay-Tolerant Networks
★ Design and Implementation of a Cross-Device Audio/Video Convergence Platform
Files: full text viewable in the system after 2026-8-19
Abstract (Chinese) With the rapid development of the Internet of Things (IoT), the number of computation-sensitive end devices has grown significantly. By offloading these devices' computational tasks to edge servers, edge computing has shown clear benefits in reducing task latency and relieving the computational burden on cloud servers. However, offloading computational tasks without a coherent strategy can lead to inefficient use of edge-server resources, increasing both latency and computational cost. Designing effective task offloading and resource allocation strategies that optimize latency and energy consumption is therefore a central research focus and challenge; reducing the task latency and computational burden on cloud servers has become an important issue. Without an appropriate incentive mechanism, edge servers may be unwilling to share their resources, so offering suitable rewards is crucial. Moreover, given the risk of privacy leakage, mobile users may be reluctant to disclose private information, which creates information asymmetry between the cloud platform and the edge servers. Prior studies usually assume that the cloud platform has complete information about the edge servers, which does not hold in practice. This thesis proposes a contract incentive mechanism based on deep reinforcement learning (DRL). Unlike traditional methods, DRL can operate without prior knowledge of the environment's details; by learning and adapting, it designs incentive mechanisms that effectively motivate participants to complete tasks in dynamic and uncertain environments, maximizing the cloud platform's utility. The contributions of this thesis include formulating the joint resource allocation and computation offloading incentive problem under information asymmetry, systematically analyzing the necessary and sufficient conditions for optimal contracts, casting the contract incentive problem as a Markov decision process under incomplete information, and designing a deep deterministic policy gradient (DDPG) method to obtain computing-resource and incentive-reward strategies over high-dimensional action and state spaces.
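The "necessary and sufficient conditions for optimal contracts" mentioned in both abstracts refer to the standard feasibility constraints of contract theory. As a hedged illustration (the symbols below are chosen for exposition and are not notation taken from the thesis), suppose the cloud platform offers a contract menu {(f_i, r_i)} of computing resource f_i and reward r_i to edge servers of hidden type θ_i with utility U_s; a feasible contract must then satisfy individual rationality (IR) and incentive compatibility (IC):

\begin{align}
  \text{(IR)}\quad & U_s(\theta_i, f_i, r_i) \geq 0, && \forall i \in \{1, \dots, N\}, \\
  \text{(IC)}\quad & U_s(\theta_i, f_i, r_i) \geq U_s(\theta_i, f_j, r_j), && \forall i \neq j.
\end{align}

IR guarantees that each server gains non-negative utility from the contract designed for its type, and IC guarantees that no server can do better by picking a contract intended for a different type; together they are the usual starting point for reducing an optimal-contract problem to a tractable form.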
Abstract (English) With the rapid development of the Internet of Things (IoT), the number of computation-sensitive end devices has significantly increased. By offloading the computational tasks of these devices to edge servers, edge computing has demonstrated its benefits in reducing task latency and alleviating the computational burden on cloud servers. However, indiscriminately offloading computational tasks may lead to inefficient use of edge server resources, resulting in increased latency and higher computational costs. Therefore, designing effective task offloading and resource allocation strategies to optimize latency and energy consumption is currently a key research focus and challenge. Reducing the task latency and computational burden on cloud servers has become an important issue. Without appropriate incentive mechanisms, edge servers may be unwilling to share resources, making the provision of suitable rewards crucial. Traditional incentive mechanisms, such as auction theory and Stackelberg games, rely on frequent information exchange, leading to high signaling costs. Considering the risk of privacy leaks, mobile users may be reluctant to disclose private information, resulting in information asymmetry between cloud platforms and edge servers. Previous research often assumed that cloud platforms have complete information about edge servers, which is not the case in practice. This paper proposes a contract incentive mechanism based on deep reinforcement learning (DRL). Unlike traditional methods, DRL can operate without prior knowledge of the environment's details. DRL learns and adapts to design incentive mechanisms, effectively motivating participants to complete tasks in dynamic and uncertain environments, achieving the maximum utility of the cloud platform. The contributions of this paper include proposing the joint resource allocation and computation offloading incentive problem under information asymmetry, systematically analyzing the necessary and sufficient conditions for optimal contracts, formulating the contract incentive problem as a Markov decision process under incomplete information, and designing a deep deterministic policy gradient (DDPG) method to obtain computation resource and incentive reward strategies in high-dimensional action and state spaces.
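To make the last contribution concrete, the following is a minimal sketch of the kind of DDPG actor-critic update the abstract describes, written in PyTorch. Everything here is an assumption for illustration, not the thesis's implementation: the state and action dimensions, network sizes, and hyperparameters are invented, and the action is taken to be a normalized vector of contract items (computing resources and rewards), with the environment supplying transitions left undefined.

# Hedged DDPG sketch (PyTorch): the actor outputs a normalized contract
# (resource/reward) vector; the critic estimates Q(s, a). All dimensions
# and hyperparameters below are illustrative assumptions, not thesis values.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 4            # hypothetical dimensions
GAMMA, TAU, LR = 0.99, 0.005, 1e-3      # hypothetical hyperparameters

def mlp(in_dim, out_dim, out_act=None):
    layers = [nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

actor = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())    # contract items in [0, 1]
critic = mlp(STATE_DIM + ACTION_DIM, 1)             # Q(s, a)
actor_t = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())  # target nets for stability
critic_t = mlp(STATE_DIM + ACTION_DIM, 1)
actor_t.load_state_dict(actor.state_dict())
critic_t.load_state_dict(critic.state_dict())
opt_a = torch.optim.Adam(actor.parameters(), lr=LR)
opt_c = torch.optim.Adam(critic.parameters(), lr=LR)

def ddpg_update(s, a, r, s2, done):
    """One update from a replay batch; r would be the platform's utility."""
    with torch.no_grad():                            # TD target from target nets
        q_next = critic_t(torch.cat([s2, actor_t(s2)], dim=1))
        y = r + GAMMA * (1.0 - done) * q_next
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, y)       # critic: minimize TD error
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # Deterministic policy gradient: ascend Q along the actor's own action.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    for net, tgt in ((actor, actor_t), (critic, critic_t)):  # Polyak averaging
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1.0 - TAU).add_(TAU * p.data)

During training, exploration noise would be added to the actor's output before a contract is offered, and the transitions (s, a, r, s2, done) would be drawn from a replay buffer; both are standard DDPG machinery and are omitted from the sketch.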
Keywords (Chinese) ★ Incentive mechanism (激勵機制)
★ Game theory (賽局)
★ Contract theory (契約理論)
★ Reinforcement learning (強化學習)
Keywords (English) ★ Incentive mechanism
★ Game theory
★ Contract theory
★ Reinforcement Learning
Table of Contents
Abstract (Chinese) i
Abstract (English) ii
List of Figures v
List of Tables vi
1 Introduction 1
2 Background and Related Work 4
2.1 Computation Offloading and Resource Allocation 4
2.1.1 Computation Offloading 5
2.1.2 Resource Allocation 6
2.2 Contract Theory 7
2.2.1 Contract Theory for Computation Offloading 8
2.2.2 Contract Theory for Resource Allocation 10
2.3 Reinforcement Learning 12
2.3.1 Background on Combining Contract Theory and Reinforcement Learning 14
3 Methodology 15
3.1 System Architecture 17
3.1.1 Transmission Model 18
3.1.2 Computation Energy Consumption Model 18
3.2 Contract Model 19
3.2.1 Contract Model for Task Offloading 20
3.2.2 Principal's Utility Function 21
3.2.3 Agents' Utility Function 22
3.2.4 Feasibility Conditions 23
3.2.5 Joint Offloading Problem Formulation under Contract Theory 24
3.2.6 Properties of Feasible Contracts 25
3.2.7 Optimal Contract Problem Formulation 30
3.3 Summary 31
3.4 Deep Reinforcement Learning 33
3.4.1 Reinforcement Learning Model 34
4 Experiments and Results Analysis 40
4.1 Baselines 41
4.2 Experimental Environment 42
4.2.1 Experimental Parameters 43
4.2.2 Model Parameters 44
4.3 Effects of Hyperparameter Tuning 45
4.3.1 Effect of Learning Rate on DDPG 45
4.3.2 Effect of Decay Rate on DDPG 47
4.4 Experimental Results 48
4.4.1 Utility and Cost Comparison under Different Exchange Rates 49
4.4.2 Utility Comparison under a Normal Willingness Distribution 51
4.4.3 Utility Comparison under a Uniform Willingness Distribution 53
4.4.4 Utility Comparison under a Bimodal Willingness Distribution 56
5 Conclusion and Future Work 58
References 59
Advisor: Chih-Lin Hu (胡誌麟)   Review date: 2024-8-20
