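Abstract: | Chinese Abstract (translated) This study focuses on the challenges of continuous human action recognition in precision assembly environments, specifically addressing the limitations of camera-only approaches in recognizing small actions, handling transition instability, and achieving overall accuracy. To this end, a novel multi-module system is proposed, comprising a camera that recognizes body movements, a bracelet that recognizes precise hand movements, and a secondary camera that confirms whether hand action recognition meets its startup conditions. These signals are fed to the body and hand action recognition modules, and a decision module then integrates the two outputs to produce better recognition results. System performance is evaluated on practical tasks, namely LEGO car assembly and electronic connector assembly, and three deep learning models are compared: AE+LSTM, LSTM+Attention, and plain LSTM. The results show that the LSTM+Attention model performs best in both hand and body action recognition. In addition, the proposed method significantly improves recognition of both large-scale body movements and small hand actions, and it significantly outperforms a camera-only system in fine-action recognition. Finally, the decision model effectively manages transition instability, improving the overall reliability of the HAR system. In summary, this study offers a robust solution for precision assembly environments and contributes to the HAR field, with the potential to improve safety, efficiency, and human-robot collaboration in industrial settings. Future work should focus on improving the algorithms to better handle noise, and on ensuring that the HAR system is user-friendly and effective in dynamic industrial environments.;Abstract This study addresses the challenges of continuous Human Action Recognition (HAR) in precision assembly environments, focusing on the limitations of camera-based systems in recognizing small actions, handling transition instability, and achieving overall accuracy. To this end, a novel multi-module system is proposed, comprising a camera that recognizes body movements, a bracelet that recognizes precise hand movements, and a secondary camera that confirms whether hand movement recognition meets the startup conditions. The sensing signals are input to the body and hand action recognition modules, and the decision-making module then integrates their outputs to form better recognition results. The system's performance was evaluated experimentally on LEGO car assembly and electronic connector assembly tasks. Three deep learning models were compared: AE + LSTM, LSTM + Attention, and plain LSTM. The results show that the LSTM + Attention model delivered superior performance in both hand and body action recognition. The proposed system also achieved significant improvements in recognizing both large-scale body movements and small hand actions, with the wearable sensor outperforming the camera-based system in fine-action recognition. Finally, the decision-making model effectively managed transition instability and enhanced the overall reliability of the HAR system. 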
This research contributes to the field of HAR by proposing a robust solution for precision assembly environments, potentially improving safety, efficiency, and human-robot collaboration in industrial settings. Future work should focus on refining the algorithms to better handle noise. Additionally, emphasis should be placed on ensuring that the HAR system is user-friendly and effective in dynamic industrial settings. |