Graduate Thesis 111323034: Detailed Record




Name Zi-Cheng Liao (廖子程)    Department Mechanical Engineering
Thesis Title 應用機器視覺於機械手臂隨機物件夾取與三維人體姿態偵測
(Model Learning Based on Machine Vision for Unknown Object 6DoF Grasp and 3D Human Pose Reconstruction)
Files Full text available via the library system after 2026-09-01.
Abstract (Chinese) With the recent AI boom and the arrival of various collaborative and humanoid robots, the environmental perception and task decision-making capabilities of collaborative robots have become a focal research direction in both academia and industry. This thesis investigates an autonomous grasping algorithm that lets a collaborative robotic arm grasp objects based on their 3D features, together with 3D reconstruction of human pose from a multi-camera array, and integrates both in the same workspace, so that the arm can perceive the operator's 3D pose in real time while performing grasping tasks and respond immediately for safety.

The autonomous grasping algorithm analyzes the 3D information captured by a depth camera and autonomously decides grasp poses for picking arbitrarily piled workpieces. It randomly seeds points on the point cloud and progressively filters out usable 6-DoF grasp poses; these randomly generated candidates then pass through collision detection and a 3D convolutional neural network classifier, which raises the grasp success rate while keeping task execution as safe as possible. The accuracy and stability of the algorithm in real grasping tasks are verified in this study, along with how different task environments and parameter changes affect the algorithm's final output.

The 3D reconstruction of human pose from the multi-camera array uses HRNet as its foundation, trained on a customized human keypoint dataset for 2D keypoint detection, followed by multi-camera triangulation of the detected 2D image coordinates across the array. The thesis details the HRNet training strategy for the customized dataset, the intrinsic and extrinsic calibration strategy for a camera array arranged around the workspace, and the 3D human pose optimization algorithm built by combining the two. Experiments evaluate the 3D reconstruction accuracy of the camera array, present the final reconstructed 3D human poses step by step, and examine the effect of the optimization algorithm on 3D human pose in single frames and continuous frames.
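The record itself carries no code, but the candidate-generation loop the abstract describes (random seed points on the point cloud, a sweep of gripper roll angles per seed, collision filtering, then classifier scoring on a voxel grid) can be sketched compactly. The sketch below is an illustrative assumption, not the thesis implementation: the box-shaped finger collision model, the grid dimensions, and every helper name are hypothetical; only the Robotiq 2F-85's 85 mm opening comes from the hardware named in Chapter 4.

```python
# Sketch of 6-DoF grasp candidate generation on a point cloud (assumed, not thesis code).
import numpy as np

GRIPPER_WIDTH = 0.085   # Robotiq 2F-85 maximum opening, metres
GRIPPER_DEPTH = 0.04    # usable finger depth along the approach axis (assumed)

def propose_grasps(points, normals, n_seeds=32, n_rotations=8, rng=None):
    """Sample seed points and sweep gripper roll angles around each surface
    normal (assumed unit length) to produce 6-DoF grasp candidates."""
    rng = rng if rng is not None else np.random.default_rng()
    seeds = rng.choice(len(points), size=min(n_seeds, len(points)), replace=False)
    candidates = []
    for i in seeds:
        approach = -normals[i]                      # approach opposite the surface normal
        ref = np.array([1.0, 0.0, 0.0]) if abs(approach[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        x = np.cross(approach, ref)
        x /= np.linalg.norm(x)
        y = np.cross(approach, x)
        for k in range(n_rotations):                # discrete roll sweep ("rotation steps")
            theta = 2.0 * np.pi * k / n_rotations
            closing = np.cos(theta) * x + np.sin(theta) * y
            R = np.stack([closing, np.cross(approach, closing), approach], axis=1)
            candidates.append((points[i], R))       # gripper frame: z = approach, x = closing
    return candidates

def collides(grasp, points, finger_thick=0.01, finger_height=0.02, clearance=0.003):
    """Reject grasps whose finger volumes (modelled as two boxes) contain cloud points."""
    t, R = grasp
    local = (points - t) @ R                        # cloud expressed in the gripper frame
    in_depth = (local[:, 2] > -GRIPPER_DEPTH) & (local[:, 2] < 0.0)
    for side in (-1.0, 1.0):                        # left and right finger
        cx = side * (GRIPPER_WIDTH + finger_thick) / 2.0
        hit = (in_depth
               & (np.abs(local[:, 0] - cx) < finger_thick / 2.0 + clearance)
               & (np.abs(local[:, 1]) < finger_height / 2.0))
        if hit.any():
            return True
    return False

def voxelize_local(grasp, points, grid=16, size=0.12):
    """Binary occupancy grid around the grasp: the 3D CNN classifier's input."""
    t, R = grasp
    local = (points - t) @ R
    idx = np.floor((local / size + 0.5) * grid).astype(int)
    keep = np.all((idx >= 0) & (idx < grid), axis=1)
    vox = np.zeros((grid, grid, grid), dtype=np.float32)
    vox[tuple(idx[keep].T)] = 1.0
    return vox
```

A candidate that survives `collides` would have its `voxelize_local` grid scored by the trained 3D CNN classifier, and the highest-scoring pose would be the one sent to the arm.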
Abstract (English) With the recent AI boom and the emergence of various collaborative and humanoid robots, the environmental perception and task decision-making capabilities of collaborative robots have become a key research direction in academia and industry. This thesis discusses an autonomous grasping algorithm based on 3D features for collaborative robotic arms, and the 3D reconstruction of human pose using a multi-camera array. The grasping algorithm analyzes the 3D information captured by a depth camera to grasp objects autonomously. It uses random seeding on the point cloud to progressively filter out usable 6-DoF grasp poses; these randomly generated candidates are filtered through collision detection and a 3D convolutional neural network classifier, improving the grasp success rate while helping to ensure safe execution of the grasping task. The accuracy and stability of the algorithm in real grasping tasks are also verified in this study. The 3D reconstruction of human pose using a multi-camera array is based on HRNet, trained on a customized human keypoint dataset for 2D keypoint detection; multi-camera triangulation is then performed on the 2D keypoints detected across the camera array. This study details the training strategy of HRNet for the customized dataset, the intrinsic and extrinsic calibration strategy for a multi-camera array arranged around the environment, and the construction of the final 3D human pose optimization algorithm. The 3D reconstruction accuracy of the multi-camera array is evaluated experimentally, and the final 3D human pose results are presented step by step, together with the impact of the optimization algorithm on 3D human pose in single and continuous frames.
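Likewise, the triangulation step of the pose pipeline reduces to a standard linear solve once the array is calibrated. The following sketch is an assumption rather than the thesis code: it implements textbook DLT triangulation, in which each view's 3×4 projection matrix contributes two linear equations per keypoint, plus per-view reprojection errors of the kind an optimization stage could use for outlier rejection or weighting.

```python
# Sketch of multi-view DLT triangulation of a 2D keypoint (assumed, not thesis code).
import numpy as np

def triangulate_dlt(proj_mats, points_2d):
    """Linear (DLT) triangulation of one keypoint seen in N >= 2 views.

    proj_mats : list of (3, 4) projection matrices P_i = K_i [R_i | t_i]
    points_2d : (N, 2) array of pixel coordinates, one row per view
    returns   : (3,) 3D point in the world frame
    """
    A = []
    for P, (u, v) in zip(proj_mats, points_2d):
        A.append(u * P[2] - P[0])   # each view contributes two linear equations
        A.append(v * P[2] - P[1])
    _, _, vh = np.linalg.svd(np.asarray(A))
    X = vh[-1]                      # right singular vector of the smallest singular value
    return X[:3] / X[3]             # de-homogenize

def reprojection_errors(proj_mats, points_2d, X):
    """Pixel residual per view; large residuals flag views to drop or down-weight."""
    Xh = np.append(X, 1.0)
    errs = []
    for P, uv in zip(proj_mats, points_2d):
        x = P @ Xh
        errs.append(np.linalg.norm(x[:2] / x[2] - uv))
    return np.asarray(errs)
```

Calling `triangulate_dlt` once per detected keypoint across, say, four surrounding cameras yields a full 3D skeleton per frame; joints whose reprojection error exceeds a few pixels can be discarded before temporal smoothing.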
Keywords (Chinese) ★ Robotic arm
★ 6-DoF bin picking
★ Depth camera
★ 3D convolutional neural network
★ HRNet
★ Multi-camera
★ Triangulation
★ 3D human pose
Keywords (English) ★ Robotic arm
★ 6DoF grasp
★ Depth camera
★ 3D convolutional neural network
★ HRNet
★ Camera array
★ Triangulation
★ 3D human pose
Table of Contents Abstract (Chinese) i
Abstract (English) ii
Table of Contents iii
List of Figures vii
List of Tables xvi
Chapter 1: Introduction 1
1-1 Research Motivation 1
1-2 Literature Review 2
1-3 Thesis Outline 6
Chapter 2: Principles of Robotic Arm and Vision Sensor Integration 8
2-1 Pinhole Camera Model 8
2-2 Triangulation Principles 10
2-3 Bundle Adjustment 11
2-3-1 Reprojection Error 11
2-3-2 Trust-Region Methods 12
2-4 ArUco Markers 14
2-4-1 ArUco Candidate Detection 14
2-4-2 ArUco Marker Verification 14
2-4-3 ArUco Marker 3D Pose Transformation 15
2-5 Coordinate Relationships Between the Robotic Arm and the Vision System 16
2-5-1 Eye-to-Hand 16
2-5-2 Eye-in-Hand 16
2-6 Eye-in-Hand Hand-Eye Calibration 17
2-7 Depth Camera and Point Cloud Post-Processing 18
2-7-1 Multi-View Point Cloud Stitching 18
2-7-2 Point Cloud Boundary Definition and Cropping 19
2-7-3 Voxelization 20
2-7-4 Point Cloud Normal Estimation 20
Chapter 3: Principles of the Robotic Arm's Adaptive Grasping Algorithm 33
3-1 Algorithm Overview 33
3-1-1 Scene Segmentation of Point Cloud Data 33
3-1-2 Grasp Pose Candidate Generation 34
3-1-3 Grasp Pose Collision Prevention 36
3-1-4 Overview of the 3D Neural Network Classifier 36
3-2 3D Convolutional Neural Network Classifier 37
3-2-1 Input Layer 37
3-2-2 3D Convolutional Layers 38
3-2-3 Dense and Output Layers 39
3-2-4 Loss Function 39
3-3 Analysis and Collection of Grasp-Pose Voxel Training Data 40
3-3-1 Custom Software for Training Data Collection 41
3-3-2 Definition of Positive Training Samples 41
3-3-3 Definition of Negative Training Samples 41
3-3-4 Overview of the Grasp-Pose Voxel Training Data 42
3-4 Classifier Training and Evaluation 43
3-4-1 Classifier Training 43
3-4-2 Classifier Evaluation 43
Chapter 4: Experiments and Discussion on Robotic Arm Adaptive Grasping 55
4-1 Experimental Equipment 55
4-1-1 Techman TM5-900 Collaborative Robot 55
4-1-2 Robotiq 2F-85 Force-Feedback Gripper 55
4-1-3 Intel RealSense D435i Depth Camera 55
4-1-4 Custom ArUco Work Tray 56
4-1-5 Foam Blocks 57
4-2 Grasping Algorithm Workflow and Parameters 57
4-3 Relationship Between Model Score and Grasp Success Rate 57
4-4 Relationship Between Candidate Generation Resolution and Classifier Score 59
4-4-1 Effect of Seed Point Cloud Resolution 60
4-4-2 Effect of the Number of Rotation Steps 61
4-4-3 Correlation Between Candidate Generation Resolution and Classifier Score 62
4-5 Relationship Between Object Stacking and Classifier Score 63
4-5-1 Single-Object Random Placement Experiments 63
4-5-2 Multi-Object Random Placement Experiments 64
Chapter 5: Two-Stage HRNet Whole-Body Pose Detection in 3D with a Multi-Camera Array 107
5-1 2D Human Pose Models 107
5-2 HRNet Network Architecture 107
5-2-1 ResNet Residual Modules (BasicBlock, Bottleneck) 107
5-2-2 Stem Layer: Preprocessing 108
5-2-3 Layer 1: Convolutional Channel Expansion 108
5-2-4 Stage 2: Two Branches 108
5-2-5 Stage 3: Three Branches 109
5-2-6 Stage 4: Four Branches 110
5-2-7 Final Layer: Branch Fusion 110
5-3 Keypoint Extraction from Heatmaps (Post-Processing) 110
5-4 Multi-Person Keypoint Grouping and Person-Crop Preprocessing 112
5-5 HRNet Whole-Body Dataset Training 112
5-5-1 Training Data Preprocessing for Two-Stage Human Detection 113
5-5-2 Overview of the Halpe-Full-Body Dataset 113
5-5-3 Training on the Halpe-Full-Body Dataset 114
5-6 Multi-Camera Array Calibration 115
5-6-1 Intrinsic Calibration of the Camera Array 115
5-6-2 Extrinsic Calibration of the Camera Array 116
5-7 Triangulation Optimization for the Camera Array 118
5-8 Smoothing of Reconstructed Human Motion 119
Chapter 6: Experiments and Discussion on 3D Human Pose Detection 139
6-1 Experimental Equipment 139
6-1-1 Wide-Angle Webcams 139
6-1-2 Checkerboard Calibration Board 139
6-1-3 ArUco Marker Box 139
6-2 Intrinsic Calibration Accuracy Tests for the Camera Array 140
6-3 Extrinsic Calibration Accuracy Tests for the Camera Array 140
6-3-1 Error Tests in the Godot Virtual Environment 140
6-3-2 Error Tests in the Real Environment 141
6-4 Single-Person 3D Pose Detection Results with the Camera Array 141
6-4-1 Static 3D Human Pose Detection 141
6-4-2 Dynamic 3D Human Pose Detection 142
6-5 Multi-Person 3D Pose Detection Results with the Camera Array 143
6-6 Integration of Multi-Person 3D Pose Detection and Adaptive Grasping 143
Chapter 7: Conclusions and Future Outlook 154
7-1 Conclusions 154
7-2 Future Outlook 156
References 158
Advisor Chan-Yi Liao (廖展誼)    Date of Approval 2024-8-15