In this thesis, we investigate how to efficiently transmit hybrid data streams (URLLC, eMBB, mMTC) under the 5G O-RAN architecture, with particular attention to the limited storage resources of O-RAN components. We propose a reinforcement learning method based on a Markov decision process (MDP) formulation and a Dueling DQN. The method combines network slicing with data preprocessing to improve bandwidth utilization and the packet reception rate while reducing data-stream loss. Simulation results in a smart-factory environment show significant performance gains, particularly in overall bandwidth utilization, transmission delay, and loss rate, demonstrating the method's effectiveness in practical applications. Specifically, this study addresses two key issues: (1) how to ensure that data streams are executed reliably after entering the system, thereby improving the overall packet reception rate; and (2) how to find the optimal spatial allocation for data streams while reserving capacity for urgent data, thereby improving overall bandwidth utilization. In conclusion, the proposed method shows strong application potential for the hybrid data-stream transmission problem under the 5G O-RAN architecture and offers useful guidance for the future development of 5G O-RAN.
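For readers unfamiliar with the Dueling DQN architecture mentioned above, the sketch below illustrates its core idea: the Q-value for each action is decomposed into a state-value head and an advantage head, combined as Q(s, a) = V(s) + A(s, a) − mean_a A(s, a). This is a minimal NumPy illustration under assumed feature and weight shapes, not the network actually trained in this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def dueling_q(features, w_value, w_advantage):
    """Combine the two heads of a Dueling DQN into per-action Q-values:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    v = features @ w_value        # scalar state value V(s), shape (1,)
    a = features @ w_advantage    # advantage A(s, a), one entry per action
    # Subtracting the mean advantage makes the V/A decomposition identifiable.
    return v + a - a.mean()

# Hypothetical dimensions: 8 state features, 4 scheduling actions.
features = rng.normal(size=8)
w_value = rng.normal(size=(8, 1))
w_advantage = rng.normal(size=(8, 4))

q = dueling_q(features, w_value, w_advantage)
print(q.shape)  # one Q-value per action: (4,)
```

Because the mean advantage is subtracted, the mean of the resulting Q-values equals the state value V(s), which is what lets the network learn state quality independently of per-action preferences.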