用於動作辨識中的時空圖卷積網路之可重構硬體架構設計;A Reconfigurable Hardware Architecture for Spatial Temporal Graph Convolutional Network in action recognition


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92832

Title:	用於動作辨識中的時空圖卷積網路之可重構硬體架構設計;A Reconfigurable Hardware Architecture for Spatial Temporal Graph Convolutional Network in action recognition
Author:	陳子捷;CHEN, TZU-CHIEH
Contributor:	Department of Electrical Engineering
Keywords:	graph convolutional neural network; hardware accelerator; reconfigurable architecture; ASIC; FPGA
Date:	2023-08-09
Upload time:	2023-10-04 16:11:35 (UTC+8)
Publisher:	National Central University
Abstract:	Human action recognition can draw on several kinds of input data, chiefly RGB images and skeleton data. Traditional approaches take RGB images as input and apply convolutional neural network (CNN) or recurrent neural network (RNN) models. However, these methods suffer from low accuracy because of the background noise present in the images. To improve accuracy, some studies have explored skeleton data as an alternative input. Nevertheless, CNN and RNN models cannot fully exploit the spatial and temporal relationships inherent in skeleton data, which limits their effectiveness. In recent years, graph convolutional networks (GCNs) have gained significant attention because of their wide applicability to tasks such as social network analysis and recommendation systems. GCNs are particularly well suited to non-Euclidean data such as human skeleton joints, which, unlike RGB images, are unaffected by environmental factors. However, the computational complexity and data sparsity of GCNs often lead to high latency and low power efficiency on CPU or GPU platforms. Dedicated hardware accelerators play a crucial role in addressing these challenges.
In this paper, we propose a highly parallel and flexible architecture for the Spatial-Temporal Graph Convolutional Network (ST-GCN), a model widely used in human action recognition. The architecture is built from general-purpose processing elements (PEs) that can be grouped into combination engines and aggregation engines to compute GCN layers. These PEs can also be interconnected when processing temporal convolutional network (TCN) layers. With the proposed method, the accelerator also scales well. We implemented the design on both ASIC and FPGA platforms. Compared with other works that implement hardware designs for ST-GCN, the proposed method achieves up to a 39.5% reduction in latency and up to a 2.23× improvement in power efficiency.
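As a rough software illustration of the two GCN phases named in the abstract, the sketch below computes one spatial graph convolution as a combination step (features times weights) followed by an aggregation step (normalized adjacency times the combined features), then applies a simple temporal convolution across frames. It is not the thesis's hardware design: the function names, the row normalization with self-loops, the three-joint skeleton, and the per-channel temporal kernel are all assumptions chosen to keep the example small.

```python
# Illustrative sketch only. Names, shapes, and the normalization scheme are
# assumptions for this example and are not taken from the thesis.

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def normalize_adjacency(adj):
    """Add self-loops, then row-normalize: D^-1 (A + I)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def gcn_layer(adj_norm, feats, weight):
    """Spatial graph convolution for a single frame."""
    combined = matmul(feats, weight)      # combination: X @ W
    return matmul(adj_norm, combined)     # aggregation: A_norm @ (X @ W)

def tcn_layer(frames, kernel):
    """Temporal convolution (no padding): for each joint and channel, a
    weighted sum over len(kernel) consecutive frames. Simplified: the same
    kernel is applied to every channel, with no channel mixing."""
    k = len(kernel)
    joints, channels = len(frames[0]), len(frames[0][0])
    return [[[sum(kernel[d] * frames[t + d][v][c] for d in range(k))
              for c in range(channels)]
             for v in range(joints)]
            for t in range(len(frames) - k + 1)]

# Hypothetical 3-joint chain skeleton (0 - 1 - 2) with 2 feature channels.
skeleton_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
adj_norm = normalize_adjacency(skeleton_adj)
frame_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]

# Four identical frames, each passed through the spatial layer.
spatial = [gcn_layer(adj_norm, frame_feats, weight) for _ in range(4)]
# A 3-tap temporal kernel yields 4 - 3 + 1 = 2 output frames.
temporal = tcn_layer(spatial, [0.25, 0.5, 0.25])
```

The split into `combined` and the final aggregation mirrors the combination-engine and aggregation-engine grouping only at the level of arithmetic. How the PEs are scheduled, interconnected, or reconfigured for TCN layers is not specified in the abstract and is not modeled here.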
Appears in Collections:	[Graduate Institute of Electrical Engineering] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	28

All items in NCUIR are protected by their original copyright.
