Abstract (English)
Three-dimensional point cloud data differs significantly from regular two-dimensional images in terms of storage, data characteristics, and classification network architecture. 3D scanning devices have become increasingly common, from structured-light scanners in smartphones to radar/lidar systems in next-generation cars. As the volume of 3D point cloud data grows, so does the demand for more accurate classification networks designed specifically for such data. PointNet is the pioneering and effective model among 3D point cloud classification networks, but it considers only the global features of a point cloud; PointNet++ was introduced to incorporate simple local features and overcome this limitation, helping the network better learn detailed local geometric structures. In this thesis, we introduce PointGPS, a novel 3D point cloud classification network architecture that leverages self-attention and plane fitting to perceive local geometric features.
Our proposed network architecture is built on PointMLP, which in turn builds upon PointNet++. PointMLP improves accuracy by incorporating residual multi-layer perceptron (MLP) modules. The architecture begins with an embedding module that lifts the point cloud into higher-dimensional features, followed by four rounds of a geometric feature mapping module and feature extraction modules. Each geometric feature mapping module reduces the point cloud by half with farthest point sampling and then selects the neighboring points of each sampled point. The high-dimensional features of the sampled point are subtracted from those of its neighbors to obtain relative features, which are retained alongside the sampled point's own features. In this module, we use Singular Value Decomposition (SVD) to fit a plane to the neighbors and employ self-attention to compute more detailed local geometric structure features. The feature extraction modules, consisting of MLP modules with residual connections, are then applied. Finally, the features are aggregated by a max pooling layer and passed to a classifier comprising fully connected layers, batch normalization layers, activation functions, and dropout to improve the model's generalization to unseen data. With these design choices, our proposed model achieves significant improvements in accuracy.
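The sampling and grouping step described above can be sketched as follows. This is a minimal NumPy illustration of farthest point sampling (halving the cloud) and kNN grouping with relative features, not the thesis implementation; function and parameter names are illustrative.

```python
import numpy as np

def farthest_point_sample(points, n_samples):
    """Greedy farthest point sampling: repeatedly pick the point
    farthest from the already-selected set. points: (N, 3)."""
    n = points.shape[0]
    selected = np.zeros(n_samples, dtype=int)
    dist = np.full(n, np.inf)
    selected[0] = 0  # start from an arbitrary point
    for i in range(1, n_samples):
        # update each point's distance to the selected set
        diff = points - points[selected[i - 1]]
        dist = np.minimum(dist, np.einsum("ij,ij->i", diff, diff))
        selected[i] = int(np.argmax(dist))
    return selected

def group_relative_features(points, feats, centers_idx, k):
    """For each sampled center, gather k nearest neighbors and build
    relative features (neighbor minus center), concatenated with the
    center's own features."""
    centers = points[centers_idx]                                   # (M, 3)
    d2 = ((points[None, :, :] - centers[:, None, :]) ** 2).sum(-1)  # (M, N)
    knn = np.argsort(d2, axis=1)[:, :k]                             # (M, k)
    neighbor_feats = feats[knn]                                     # (M, k, C)
    center_feats = feats[centers_idx][:, None, :]                   # (M, 1, C)
    relative = neighbor_feats - center_feats                        # (M, k, C)
    return np.concatenate(
        [relative, np.broadcast_to(center_feats, relative.shape)],
        axis=-1)                                                    # (M, k, 2C)
```

In a full pipeline this grouping would be repeated at each of the four stages, with the sampled set from one stage serving as the input cloud of the next.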
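Fitting a plane to a local neighborhood with SVD, as the geometric feature mapping module does, amounts to taking the direction of least variance of the centered neighbor coordinates. A small sketch under that standard formulation (names are illustrative, not the thesis code):

```python
import numpy as np

def fit_plane_svd(neighbors):
    """Fit a plane to a local neighborhood (k, 3) via SVD of the
    centered coordinates; the right-singular vector with the smallest
    singular value is the plane normal."""
    centroid = neighbors.mean(axis=0)
    centered = neighbors - centroid
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]            # direction of least variance
    d = -normal @ centroid     # plane equation: normal . x + d = 0
    return normal, d
```

The normal and the residual distances of neighbors to this plane give a compact description of the local surface orientation and flatness.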
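The self-attention used to refine local geometric features follows the scaled dot-product form of Vaswani et al. [32]. Below is a minimal single-head NumPy sketch applied to the k neighbor features of one local group; the projection matrices are assumed inputs, and head splitting, normalization, and residual connections are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over one local group.
    x: (k, C) neighbor features; wq/wk/wv: (C, C) projections."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (k, k) pairwise affinities
    return softmax(scores, axis=-1) @ v      # attention-weighted features
```

Each neighbor feature is thus re-expressed as a weighted combination of all features in the group, letting the module capture relations within the neighborhood before the residual MLP stages.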
References
[1] QI, Charles R., et al. Pointnet: Deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 652-660.
[2] QI, Charles Ruizhongtai, et al. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Advances in neural information processing systems, 2017, 30.
[3] MA, Xu, et al. Rethinking network design and local geometry in point cloud: A simple residual MLP framework. arXiv preprint arXiv:2202.07123, 2022.
[4] CAMUFFO, Elena; MARI, Daniele; MILANI, Simone. Recent advancements in learning algorithms for point clouds: An updated overview. Sensors, 2022, 22.4: 1357.
[5] GUO, Yulan, et al. Deep learning for 3d point clouds: A survey. IEEE transactions on pattern analysis and machine intelligence, 2020, 43.12: 4338-4364.
[6] MIRBAUER, Martin, et al. Survey and evaluation of neural 3d shape classification approaches. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44.11: 8635-8656.
[7] JOSEPH-RIVLIN, Mor; ZVIRIN, Alon; KIMMEL, Ron. Momen(e)t: Flavor the moments in learning to classify shapes. In: Proceedings of the IEEE/CVF international conference on computer vision workshops. 2019.
[8] ZHAO, Hengshuang, et al. Pointweb: Enhancing local neighborhood features for point cloud processing. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. p. 5565-5573.
[9] DUAN, Yueqi, et al. Structural relational reasoning of point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. p. 949-958.
[10] YAN, Xu, et al. Pointasnl: Robust point clouds processing using nonlocal neural networks with adaptive sampling. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020. p. 5589-5598.
[11] CHENG, Silin, et al. Pra-net: Point relation-aware network for 3d point cloud analysis. IEEE Transactions on Image Processing, 2021, 30: 4436-4448.
[12] WU, Wenxuan; QI, Zhongang; FUXIN, Li. Pointconv: Deep convolutional networks on 3d point clouds. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition. 2019. p. 9621-9630.
[13] THOMAS, Hugues, et al. Kpconv: Flexible and deformable convolution for point clouds. In: Proceedings of the IEEE/CVF international conference on computer vision. 2019. p. 6411-6420.
[14] LIU, Yongcheng, et al. Relation-shape convolutional neural network for point cloud analysis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. p. 8895-8904.
[15] LI, Yangyan, et al. Pointcnn: Convolution on x-transformed points. Advances in neural information processing systems, 2018, 31.
[16] XU, Yifan, et al. Spidercnn: Deep learning on point sets with parameterized convolutional filters. In: Proceedings of the European conference on computer vision (ECCV). 2018. p. 87-102.
[17] KOMARICHEV, Artem; ZHONG, Zichun; HUA, Jing. A-cnn: Annularly convolutional neural networks on point clouds. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. p. 7421-7430.
[18] KUMAWAT, Sudhakar; RAMAN, Shanmuganathan. Lp-3dcnn: Unveiling local phase in 3d convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. p. 4903-4912.
[19] ZHOU, Hui, et al. Cylinder3d: An effective 3d framework for driving-scene lidar semantic segmentation. arXiv preprint arXiv:2008.01550, 2020.
[20] FAN, Hehe, et al. Pstnet: Point spatio-temporal convolution on point cloud sequences. arXiv preprint arXiv:2205.13713, 2022.
[21] FAN, Hehe; YANG, Yi. PointRNN: Point recurrent neural network for moving point cloud processing. arXiv preprint arXiv:1910.08287, 2019.
[22] SIMONOVSKY, Martin; KOMODAKIS, Nikos. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 3693-3702.
[23] WANG, Yue, et al. Dynamic graph cnn for learning on point clouds. Acm Transactions On Graphics (tog), 2019, 38.5: 1-12.
[24] ZHANG, Kuangen, et al. Linked dynamic graph cnn: Learning on point cloud via linking hierarchical features. arXiv preprint arXiv:1904.10014, 2019.
[25] SHEN, Yiru, et al. Mining point cloud local structures by kernel correlation and graph pooling. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. p. 4548-4557.
[26] CHEN, Chao, et al. Clusternet: Deep hierarchical cluster network with rigorously rotation-invariant representation for point cloud analysis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. p. 4994-5002.
[27] XU, Qiangeng, et al. Grid-gcn for fast and scalable point cloud learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. p. 5661-5670.
[28] ZHANG, Yingxue; RABBAT, Michael. A graph-cnn for 3d point cloud classification. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018. p. 6279-6283.
[29] LANDRIEU, Loic; SIMONOVSKY, Martin. Large-scale point cloud semantic segmentation with superpoint graphs. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. p. 4558-4567.
[30] DEMANTKÉ, Jérôme, et al. Dimensionality based scale selection in 3D lidar point clouds. The international archives of the photogrammetry, remote sensing and spatial information sciences, 2012, 38: 97-102.
[31] GUINARD, Stéphane; LANDRIEU, Loic; VALLET, Bruno. Weakly supervised segmentation-aided classification of urban scenes from 3D LiDAR point clouds. 2017.
[32] VASWANI, Ashish, et al. Attention is all you need. Advances in neural information processing systems, 2017, 30.
[33] WU, Zhirong, et al. 3d shapenets: A deep representation for volumetric shapes. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. p. 1912-1920.
[34] UY, Mikaela Angelina, et al. Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data. In: Proceedings of the IEEE/CVF international conference on computer vision. 2019. p. 1588-1597.