Master's/Doctoral Thesis 111521059 — Detailed Record
Author  Po-Wei Lu (陸柏崴)    Graduate Department  Department of Electrical Engineering
Thesis Title  Reinforcement Learning for Robotic Arm Ultrasound Scanning of the Abdomen
(使用強化學習於機械手臂在腹部之超音波掃描)
Related Theses
★ A Comb Filter for Phase-Coded Steady-State Visually Evoked Potential (SSVEP) Brain-Computer Interfaces
★ Electroluminescent Devices for SSVEP-Based Brain-Computer Interface Detection
★ Development of a Real-Time Physiological Display on Smartphones
★ A Multi-Frequency Phase-Coded Flash Visual Evoked Potential Brain-Computer Interface
★ Empirical Mode Decomposition Analysis of SSVEPs for Brain-Computer Interfaces
★ Extraction of Auditory Evoked Magnetoencephalographic Signals Using Empirical Mode Decomposition
★ Application of Flicker Visual Evoked Potentials to Remote Controls
★ Real-Time Control of an SSVEP EEG Remote-Controlled Car Using Ensemble Empirical Mode Decomposition
★ Fuzzy-Theory-Based Detection for SSVEP Brain-Computer Interfaces
★ Forward-Model-Based Spatial Filter Design for Noise Reduction in Visual Evoked Potential Brain-Computer Interfaces
★ An Intelligent Remote ECG Monitoring System
★ Hidden Markov Model Detection for SSVEP Brain-Computer Interfaces and Its Application to EEG-Controlled Remote Cars
★ Neural-Network Prediction of Human Joint Angles from Limb EMG Signals
★ Finger Vein Image Segmentation Using Level-Set Methods and Image Inhomogeneity Correction
★ Wavelet Coding for Multichannel Physiological Signal Transmission
★ Target Detection in Phase-Coded Visual Brain-Computer Interfaces Combining Gaussian Mixture Models and Expectation-Maximization
Full Text  Available in the system after 2026-06-30
Abstract (Chinese)  This study combines reinforcement learning with a robotic arm to perform ultrasound scanning of the abdomen. A depth camera is first used for 3D modeling to construct virtual abdominal models, and these virtual objects are used to train the robotic arm to scan uncertain surfaces. During training, inverse kinematics control based on the Denavit-Hartenberg (DH) parameter table of the UR5e robotic arm precisely drives the virtual arm's motion. The reinforcement learning method adopts Proximal Policy Optimization (PPO), through which the virtual arm learns autonomously via a reward-and-penalty mechanism: the Actor selects actions, the Critic evaluates their value, and the reward signal adjusts the Actor's policy. This lets the arm continually learn and refine its policy in uncertain environments to achieve the best possible ultrasound scanning performance. After training, an Intel RealSense D435 depth camera captures depth images of the actual surface object, which are imported into PyBullet for modeling, replacing the random virtual objects used during training and enabling real-world application. The trained reinforcement learning model is applied to the object model generated in PyBullet to obtain the joint angles of the virtual arm's scanning path. These joint angles are transmitted over Real-Time Data Exchange (RTDE) to the physical UR5e arm, which precisely reproduces the virtual arm's motion to carry out the ultrasound scan. During control of the physical arm, to avoid excessive force at the tool center point (TCP), the UR5e's built-in force-sensing system keeps the applied force within a set range, ensuring safety and accuracy throughout the scan. Experimental results show that the trained arm performs precise ultrasound scans of the abdomen. The modeling error is only 7.9×10⁻⁷ m², the scanning force stays stably between 9 N and 10 N with a mean of 9.65 N, and the mean cosine distance is 0.0017, close to 0, indicating that the ultrasound probe is highly aligned with the abdominal surface normals.
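The cosine distance quoted above (mean 0.0017) measures how well the probe axis matches the abdominal surface normal, with 0 meaning perfect alignment. A minimal sketch of the metric — the function name and the example vectors are illustrative, not taken from the thesis:

```python
import math

def cosine_distance(u, v):
    """Cosine distance = 1 - cos(angle between u and v); 0 means perfectly aligned."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

probe_axis = (0.0, 0.0, 1.0)       # tool z-axis in the world frame (assumed)
surface_normal = (0.0, 0.02, 1.0)  # slightly tilted surface normal (assumed)
print(round(cosine_distance(probe_axis, surface_normal), 4))  # → 0.0002
```

A value near 0, as in the thesis, means the probe is nearly perpendicular to the skin surface.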
Abstract (English)  This study integrates reinforcement learning with a robotic arm to enable ultrasound scanning of abdominal surfaces. A depth camera was first used for 3D modeling to construct virtual abdominal models, and these virtual objects were used to train the robotic arm to perform ultrasound scans on uncertain surfaces. During training, inverse kinematics control was implemented based on the Denavit-Hartenberg (DH) parameter table of the UR5e robotic arm, ensuring precise control of the virtual arm's movements. The reinforcement learning method employed Proximal Policy Optimization (PPO), through which the virtual arm learned autonomously via a reward mechanism: the Actor selects actions, while the Critic evaluates their value and adjusts the Actor's policy through the reward signal. This approach enables the arm to continuously learn and improve its strategy in uncertain environments, achieving optimal ultrasound scanning performance.
After model training, an Intel RealSense D435 depth camera was used to capture depth images of actual surface objects, which were then imported into PyBullet for modeling. This replaced the random virtual objects used during training and enabled real-world application. The trained reinforcement learning model was applied to the object models generated in PyBullet to determine the joint angles of the virtual arm's scanning path. These joint angles were transmitted to the physical UR5e arm via the Real-Time Data Exchange (RTDE) communication protocol, enabling it to replicate the virtual arm's motions and perform ultrasound scanning.
For control of the physical arm, the UR5e's built-in force-sensing system maintained the applied force within a safe range, preventing excessive force at the tool center point (TCP) and ensuring safety and accuracy during scanning. The force-control mechanism prevented damage to the scanned objects from overexertion while improving the accuracy and reliability of the scan data.
Experimental results demonstrated that the trained robotic arm could perform precise ultrasound scans on various abdominal surfaces, even under uncertainty. The modeling error was only 7.9×10⁻⁷ m², the applied scanning force was consistently maintained between 9 N and 10 N with an average of 9.65 N, and the mean cosine distance was 0.0017, close to 0, indicating high alignment of the ultrasound probe with the abdominal surface normals.
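The 9–10 N force band described above can be held with a very simple corrective rule. The sketch below is only an assumption about how such a clamp might look: the band constants come from the reported results, but the gain `STEP` and the function `depth_correction` are illustrative, and in the actual system the contact force is read from the UR5e's built-in sensor with commands exchanged over RTDE:

```python
FORCE_MIN, FORCE_MAX = 9.0, 10.0  # newtons, target band from the reported results
STEP = 0.0005                     # metres per correction step (assumed gain)

def depth_correction(measured_force_n):
    """Return a probe-depth adjustment along the surface normal:
    retract when pressing too hard, advance when contact is too light."""
    if measured_force_n > FORCE_MAX:
        return -STEP   # back off the probe
    if measured_force_n < FORCE_MIN:
        return STEP    # press further in
    return 0.0         # inside the band: hold the current depth

print(depth_correction(10.4), depth_correction(9.65), depth_correction(8.2))
# → -0.0005 0.0 0.0005
```

Each control cycle would add the returned correction to the probe's commanded position along the local surface normal before sending the next joint target.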
Keywords (Chinese)  ★ robotic arm (機械手臂)
★ inverse kinematics (逆向運動學)
★ robot kinematics (機器人運動學)
★ reinforcement learning (強化學習)
Keywords (English)  ★ robotic arm
★ inverse kinematics
★ robot kinematics
★ reinforcement learning
Table of Contents  Chinese Abstract vii
Abstract viii
Table of Contents x
List of Figures xii
List of Tables xv
Chapter 1  Introduction 1
1-1 Motivation and Objectives 1
1-2 Thesis Organization 4
Chapter 2  Principles 5
2-1 Robot Kinematics 5
2-1-1 The D-H Parameter Method 5
2-1-2 Forward Kinematics 9
2-1-3 Inverse Kinematics 12
2-2 Reinforcement Learning 16
2-2-1 Introduction to Reinforcement Learning 16
2-2-2 Proximal Policy Optimization (PPO) 18
Chapter 3  Research Design and Methods 21
3-1 System Architecture 21
3-1-1 User System 22
3-1-2 Reinforcement Learning System 22
3-1-3 3D Modeling System 23
3-1-4 Image Removal System 24
3-1-5 Robotic Arm Control System 25
3-1-6 Ultrasound Scanning Equipment 27
3-2 Building the Virtual Robotic Arm 28
3-2-1 Virtual Robotic Arm Modeling 28
3-2-2 Structural Design of the Robotic Arm 28
3-2-3 Integration with PyBullet Simulation 31
3-3 Building the Training Environment 32
3-3-1 3D Modeling and Depth Preprocessing 32
3-3-2 Point-Cloud Normal Estimation and Plane Fitting 33
3-3-3 3D Modeling and Density Filtering 34
3-3-4 Random Target Points for Diversified Training Paths 35
3-4 Reinforcement Learning Training 37
3-4-1 The Basic Reinforcement Learning Framework 37
3-4-2 Observation and Action Design 38
3-4-3 Reward Design 42
3-4-4 Training with PPO 44
3-5 Robotic Arm Inverse Kinematics 49
3-5-1 The Jacobian Matrix 49
3-5-2 The Inverse Jacobian Matrix 50
3-5-3 Inverse Kinematics Computation Procedure 51
3-6 Data Transmission 52
3-7 Experimental Setup 53
Chapter 4  Results and Discussion 55
4-1 Robotic Arm Scanning Results 55
4-1-1 Position Error and Cosine Distance 55
4-1-2 Experiment 1 57
4-1-3 Experiment 2 58
4-1-4 Experiment 3 59
4-2 Robotic Arm Force Feedback Results 61
4-2-1 Results of the Reinforcement-Learning Robotic Arm System 62
4-2-2 Manual Test 1 63
4-2-3 Manual Test 2 64
4-2-4 Manual Test 3 65
4-2-5 Comparison of the Reinforcement-Learning System and Manual Tests 66
4-3 Ultrasound Images 68
4-4 Discussion 69
Chapter 5  Conclusions and Future Work 73
Chapter 6  References 74
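Sections 3-5-1 through 3-5-3 outline Jacobian-based inverse kinematics. One common formulation of such an update is damped least squares; the sketch below is illustrative only (the thesis may compute its IK differently, e.g. via PyBullet's built-in solver), and the toy identity Jacobian is an assumption used purely for the sanity check:

```python
import numpy as np

def dls_ik_step(J, dx, damping=0.01):
    """One damped-least-squares IK update:
    dq = J^T (J J^T + lambda^2 I)^-1 dx,
    where J is the task-space Jacobian and dx the task-space error."""
    lam2 = damping ** 2
    JJt = J @ J.T
    return J.T @ np.linalg.solve(JJt + lam2 * np.eye(JJt.shape[0]), dx)

# Sanity check: with no damping and an identity Jacobian, dq must equal dx
J = np.eye(3)
dx = np.array([0.01, 0.0, -0.02])
print(np.allclose(dls_ik_step(J, dx, damping=0.0), dx))  # → True
```

The damping term trades a small tracking error for numerical stability near singular arm configurations, which is why it is often preferred over a plain pseudoinverse.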
Advisor  Po-Lei Lee (李柏磊)    Date of Approval  2025-01-20
