Name |
泰利(Ashiq Hussain Teeli)
Department |
Department of Mechanical Engineering |
Thesis Title |
Advancing Human Action Recognition for Precision Assembly Using Vision and Mechanomyography Signals
|
File |
- This electronic thesis is approved for immediate open access.
- The released electronic full text is authorized for academic research only: personal, non-profit retrieval, reading, and printing.
- Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
|
Abstract (Chinese) |
Chinese Abstract
This study focuses on the challenge of continuous human action recognition in precision assembly environments, in particular addressing the limitations of camera-only approaches in recognizing small actions, handling transition instability, and achieving overall accuracy. To this end, a novel multi-module system is proposed, comprising a camera that recognizes body actions, a bracelet that recognizes fine hand actions, and a secondary camera that confirms whether hand action recognition meets the activation condition. These signals are fed into the body and hand action recognition modules, and a decision module finally integrates the two outputs to form a better recognition result. The system's performance was evaluated on real tasks, including LEGO car assembly and electronic connector assembly, and three deep learning models were compared: AE+LSTM, LSTM+Attention, and plain LSTM. The results show that the LSTM+Attention model performed best in both hand and body action recognition. The proposed method also achieved significant improvements in recognizing both large-scale body actions and small hand actions, and substantially outperformed the camera-only system in fine-action recognition. Finally, the decision model effectively managed transition instability and improved the overall reliability of the HAR system. In summary, this research offers a robust solution for precision assembly environments and contributes to the field of HAR, with the potential to improve safety, efficiency, and human-robot collaboration in industrial settings. Future work should focus on improving the algorithms to better handle noise, and on ensuring that the HAR system is user-friendly and effective in dynamic industrial environments. |
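The abstracts describe the fusion of the two recognizers only in prose. Below is a minimal Python sketch of how such a decision module might combine per-frame outputs; the gating rule, the majority-vote window, and all names here are illustrative assumptions, not the thesis's actual method.

from collections import deque, Counter

import numpy as np

class DecisionModule:
    """Hypothetical fusion of body- and hand-action predictions.

    Assumptions (not from the thesis): each recognizer emits a class
    probability vector per frame, the secondary camera provides a
    boolean 'hand in workspace' gate, and a sliding majority vote
    suppresses transition instability.
    """

    def __init__(self, window: int = 15):
        self.history = deque(maxlen=window)  # recent fused labels

    def step(self, body_probs: np.ndarray, hand_probs: np.ndarray,
             hand_gate: bool) -> int:
        # Trust the bracelet for fine hand actions only when the
        # secondary camera confirms the activation condition.
        if hand_gate and hand_probs.max() >= body_probs.max():
            label = int(hand_probs.argmax())
        else:
            label = int(body_probs.argmax())
        self.history.append(label)
        # Majority vote over the window smooths spurious flips
        # at action boundaries.
        return Counter(self.history).most_common(1)[0][0]

In this sketch, the sliding majority vote is what suppresses spurious label flips at action boundaries, i.e. the "transition instability" the abstracts refer to.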
Abstract (English) |
Abstract
This study addresses the challenges of continuous Human Action Recognition (HAR) in
precision assembly environments, focusing on the limitations of camera-based systems in
recognizing small actions, transition instability, and overall accuracy. To this end, a novel
multi-module system is proposed, including a camera that recognizes body movements, a
bracelet that recognizes precise hand movements, and a secondary camera that confirms
whether hand movement recognition meets the startup conditions. The sensing signals are input
to the body and hand action recognition modules, and finally the decision-making module
integrates their outputs to form better recognition results. This research employed an
experimental approach using LEGO car assembly and electronic connector assembly tasks to
evaluate the performance of the system. Three deep learning models, AE + LSTM, LSTM +
Attention, and LSTM, were compared. The results show that the LSTM + Attention model
demonstrated superior performance in both hand and body action recognition. The proposed method
also achieved significant improvements in recognizing both large-scale body movements and small
hand actions, with the wearable sensor outperforming the camera-based system in fine-action recognition.
Finally, the decision-making model effectively managed transition instability and enhanced the
overall reliability of the HAR system. This research contributes to the field of HAR by
proposing a robust solution for precision assembly environments, potentially improving safety,
efficiency, and human-robot collaboration in industrial settings. Future work should focus on
refining the algorithms to better handle noise. Additionally, emphasis should be placed on
ensuring that the HAR system is user-friendly and effective in dynamic industrial settings. |
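The model definitions themselves are not part of this record. As a minimal PyTorch sketch of an LSTM + Attention sequence classifier of the kind compared in the thesis (layer sizes, input shapes, and the class count are illustrative assumptions):

import torch
import torch.nn as nn

class LSTMAttention(nn.Module):
    """Hypothetical LSTM + Attention action classifier.

    Input: (batch, time, features) sequences, e.g. MMG channels from
    the bracelet or skeleton key points from the camera. The layer
    sizes and the five-class output are illustrative only.
    """

    def __init__(self, n_features: int, n_classes: int = 5, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)        # scores each time step
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)                           # (B, T, H)
        weights = torch.softmax(self.attn(out), dim=1)  # (B, T, 1)
        context = (weights * out).sum(dim=1)            # attention-pooled summary
        return self.head(context)                       # class logits

# Example: 40-frame windows of 8-channel features (hypothetical shapes)
model = LSTMAttention(n_features=8)
logits = model(torch.randn(2, 40, 8))  # -> (2, 5)

The attention layer here is a simple learned pooling over time steps; the thesis may use a different attention variant.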
Keywords (Chinese) |
★ Human Action Recognition ★ Precision Assembly ★ Wearable Sensors ★ Hand Action Recognition ★ Deep Learning Model ★ Transition Instability ★ Industrial Safety ★ Human-Robot Collaboration ★ Fine-Action Recognition |
Keywords (English) |
★ Human Action Recognition ★ Precision Assembly ★ Wearable Sensors ★ Hand Action Recognition ★ Deep Learning Model ★ Transition Instability ★ Industrial Safety ★ Human-Robot Collaboration ★ Fine-Action Recognition |
Table of Contents |
Abstract
Chinese Abstract
Acknowledgment
Content
Figures
Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Objective
1.3 Thesis Outline
Chapter 2 Literature Review
2.1 Deep Learning
2.1.1 Convolutional Neural Networks
2.1.2 Recurrent Neural Networks
2.1.3 Attention Models
2.1.4 MediaPipe
2.2 Data Capture Techniques
2.2.1 Mechanomyography (MMG)
2.2.2 Body Tracking
2.3 Industrial Applications
2.4 Related Studies
2.5 Synthesis of Literature
Chapter 3 Methodology
3.1 Research Design
3.2 Equipment Used in This Study
3.2.1 ZED 2 Camera
3.2.2 CoolSo Bracelet
3.2.3 Simple Camera
3.2.4 Computing Environment
3.3 Dataset
3.3.1 ZED 2 Camera
3.3.2 CoolSo Bracelet
3.3.3 Pre-processing
3.4 Deep Learning Models
3.4.1 Auto-Encoder and LSTM Model
3.4.2 LSTM and Attention Model
3.4.3 LSTM Model
Chapter 4 Experiment and Result
4.1 Dataset
4.2 Action Sequence and Experimental Setup
4.3 Training Parameters
4.4 Experimental Results
4.4.1 Hand Action Recognition Model
4.4.2 Body Action Recognition Model
4.4.3 Decision Making Model
4.4.4 Results and Analysis of Real-Time Action Recognition
4.5 Precision Electronic Connector Assembly
Chapter 5 Discussion
5.1 Deep Learning Model
5.2 HAR System
5.3 Decision Model
5.4 System Effectiveness and General Applicability
Chapter 6 Conclusion and Future Work
6.1 Contribution
6.2 Future Work
References |
Figures |
Figure 1 Components used in assembly
Figure 2 Single-layer neural network [1]
Figure 3 The overall structural design of the CNN model [3]
Figure 4 Auto-encoder architecture diagram [5]
Figure 5 Recurrent network system [6]
Figure 6 Architecture of the LSTM model [7]
Figure 7 MediaPipe hand tracking [10]
Figure 8 MMG signals [11]
Figure 9 Overview of the human-machine interaction system [12]
Figure 10 Module diagram of the system [13]
Figure 11 Timeline and framework of the system [14]
Figure 12 Discrete and continuous action recognition comparison [15]
Figure 13 Recognized gestures in [17]
Figure 14 System architecture of continuous action recognition
Figure 15 ZED 2 binocular vision camera
Figure 16 Body key points
Figure 17 CoolSo wearable device
Figure 18 Logitech HD camera
Figure 19 Hand key points in rectangular boundary
Figure 20 Relationship between computers
Figure 21 Skeleton frame in the sequence
Figure 22 Data collection system
Figure 23 Gesture data collection system
Figure 24 Auto-encoder + LSTM model architecture
Figure 25 LSTM + Attention model architecture
Figure 26 LSTM model architecture
Figure 27 Five hand actions recognized in this experiment
Figure 28 Four body actions recognized in this experiment
Figure 29 LEGO car assembly recipe
Figure 30 Effect of sequence length
Figure 31 Training and validation loss curves and the test-set confusion matrix for the hand action recognition model
Figure 32 Training and validation loss curves and the test-set confusion matrix for the body action recognition model
Figure 33 Algorithm of the decision module
Figure 34 Sequence diagram of prediction results of real-time continuous action recognition
Figure 35 Tiny parts (flex and terminal) used in connector assembly
Figure 36 Confusion matrix derived from the bracelet data input
Figure 37 Confusion matrix derived from the camera data input
Figure 38 Actions A0 (Pick up Flex) and A2 (Install Flex) |
Advisor |
林錦德(Lin, Chin-Te)
|
Approval Date |
2024-11-11 |