A reinforcement learning based motion tracking approach for remote humanoid robot manipulation

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/90050

Title:	使用強化學習之動作同步遠端操控人型機器人;A reinforcement learning based motion tracking approach for remote humanoid robot manipulation
Author:	吳誌銘;Wu, Jhih-Ming
Contributor:	Department of Electrical Engineering
Keywords:	Inertial sensor;IMU;Inverse kinematics;Reinforcement learning;Motion retargeting
Date:	2022-08-25
Upload time:	2022-10-04 12:09:11 (UTC+8)
Publisher:	National Central University
Abstract:	This study uses wearable inertial measurement unit (IMU) sensors to capture joint rotations. The IMU data are processed in Unity to reconstruct the human skeleton pose, and a purpose-built motion capture system records time-series data of human motion. Using motion retargeting, the operator's skeleton pose controls a Nao V6 humanoid robot remotely. The operator wears a VR headset that shows the robot's field of view for an immersive experience, and a voice system provides spoken communication between the operator side and the robot side. The Nao V6 carries several joint sensors and walks stably by combining their feedback with a dynamic model (the Linear Inverted Pendulum). For the robot's safety, foot control is threshold-triggered: moving forward, sidestepping, and turning are started only when a threshold condition is met. Hand gestures serve many everyday needs, so hand control must be more precise. The study therefore retargets human arm posture to the robot with two methods, inverse kinematics and reinforcement learning, and compares their strengths and weaknesses.
The inverse kinematics method builds a Denavit-Hartenberg (D-H) parameter model for each robot arm, maps the Cartesian position of the current human pose to the robot's dimensions, and uses the inverse kinematics solution to back-calculate the robot's joint angles from the current arm position; these angles then drive the robot. The reinforcement learning method uses an actor-critic network. Six target motions are designed in advance on the robot side: cheering, waving, pointing, pressing the palms together, saluting, and wiping the face. During training, the model learns autonomously through a reward-and-penalty mechanism, so that the subject's pose data automatically generate the robot's pose. Experiments show that the system recognizes the operator's motions in real time and that the robot can be controlled smoothly and reproduces the motions correctly. The proposed model also generalizes and can perform some motions it was not trained on. Based on the average Fréchet distance, the system's average trajectory error is about 1.9 cm, indicating high stability in retargeting control.
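The abstract reports trajectory error as an average Fréchet distance but does not say how it was computed. As a rough illustration only, the sketch below computes the discrete Fréchet distance between two sampled trajectories with the standard dynamic-programming recurrence, then averages it over several trajectory pairs. The function names, the choice of the discrete variant, and the sample points are assumptions made for this example and are not taken from the thesis.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two trajectories.

    p and q are non-empty sequences of points (tuples of equal dimension),
    for example hand positions sampled over time. Hypothetical helper,
    not the thesis's implementation.
    """
    if not p or not q:
        raise ValueError("both trajectories must be non-empty")
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance for the prefixes p[:i+1], q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])  # Euclidean distance (Python 3.8+)
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


def mean_frechet(pairs):
    """Average discrete Frechet distance over (human, robot) trajectory pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one trajectory pair is required")
    return sum(discrete_frechet(p, q) for p, q in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up sample data: two parallel straight-line paths one unit apart.
    human = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    robot = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
    print(discrete_frechet(human, robot))
```

The distance is returned in whatever unit the input points use, so trajectories expressed in centimetres would give an error in centimetres. The discrete variant depends on how densely the trajectories are sampled; a continuous Fréchet distance would need a different algorithm.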
Appears in collections:	[Graduate Institute of Electrical Engineering] Master's and Doctoral Theses

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	60

All items in NCUIR are protected by original copyright.
