Use deep learning to generate human posture skeleton based on temporal and spatial features

Name

Jui-Chu Hsieh (謝瑞筑)

Department

Department of Computer Science and Information Engineering

Thesis Title

Use deep learning to generate human posture skeleton based on temporal and spatial features
(利用深度學習基於時間與空間特徵生成人體姿態骨架)

Files

Full text available for viewing in the system after 2026-7-5

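Abstract (translated from the Chinese)

In recent years, human pose estimation has matured considerably, with applications in medical settings, gesture recognition, and head motion tracking. Beyond the familiar computer vision approach, wireless signals are increasingly used for human pose estimation. Computer vision, however, has one serious problem: privacy. Suppose we need to monitor elderly residents or patients in a nursing home or hospital ward. Recording the people in a room with conventional cameras takes away their privacy, and there is also the concern that the footage could be misused for illegal purposes. Because WiFi is heavily used and nearly ubiquitous today, we exploit the properties of WiFi radio waves and use the WiFi physical-layer signal, CSI (channel state information), to obtain the skeleton pose of a person moving in a space. WiFi CSI signals reflect off every object in the space, so the signal is mixed with a great deal of environmental noise. This noise lowers the accuracy of the skeletons generated by our deep learning network, with the limbs affected most, so we denoise the amplitude and phase of the CSI in our dataset.
To generate accurate human poses, we note that the CSI signal is a time series and that the sequential relationship between human movements also affects the waveform. We therefore use a temporal convolutional network (TCN) to extract temporal features, and then use an Attention U-Net to generate the coordinates of the skeleton joints from the temporal and spatial features. Our results reach 97% accuracy at PCK@50.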
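
The temporal feature extraction mentioned above relies on causal, dilated convolutions, which the table of contents lists as components of the TCN. The following is a minimal pure-Python sketch of that building block, not the thesis implementation: the function names, the kernel weights, and the one-convolution-per-layer receptive field formula are all assumptions chosen for illustration.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Apply a 1-D causal convolution with the given dilation.

    The output at step t depends only on x[t], x[t - d], x[t - 2d], ...
    Positions before the start of the sequence are treated as zeros.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation  # look backwards only, never ahead
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Time steps visible to a stack of layers (assumes one conv per layer)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical toy signal standing in for one CSI feature over time.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = causal_dilated_conv1d(signal, weights=[1.0, 1.0], dilation=2)
print(y)  # [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(receptive_field(2, [1, 2, 4]))  # 8
```

Doubling the dilation at each layer lets the receptive field grow exponentially with depth, which is why a shallow stack can still cover a long window of CSI samples.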

Abstract (English)

In recent years, human posture estimation has advanced significantly, with applications in medical scenarios, gesture recognition, and head motion tracking. Computer vision is the most common approach, but it raises privacy concerns, especially in settings such as nursing homes or hospital rooms, where camera monitoring can invade privacy and the images risk being misused. Given the widespread use of WiFi, we use its physical-layer signal, Channel State Information (CSI), to capture human skeleton postures. WiFi CSI signals reflect off every object in the environment, not only human bodies, which introduces noise that degrades the accuracy of the generated skeletons, particularly for the limbs. To address this, we apply noise reduction to the amplitude and phase of the CSI signals in our dataset.
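
The record does not describe the noise reduction itself, although the table of contents lists phase unwrapping under data preprocessing. As a hedged illustration only, the sketch below unwraps the phase across subcarriers and then subtracts a least-squares line, a common CSI phase sanitization step that is assumed here and may differ from the thesis method. The function names and the synthetic phase values are hypothetical.

```python
import math


def unwrap(phases):
    """Remove 2*pi jumps so consecutive samples differ by at most pi."""
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        delta -= 2.0 * math.pi * round(delta / (2.0 * math.pi))
        out.append(out[-1] + delta)
    return out


def remove_linear_trend(values):
    """Subtract the least-squares line fitted over the subcarrier index.

    Requires at least two values.
    """
    n = len(values)
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return [v - (mean_y + slope * (i - mean_x)) for i, v in enumerate(values)]


# Synthetic example: a purely linear phase across 8 subcarriers, wrapped
# into (-pi, pi] the way a receiver would report it.
true_phase = [0.8 * i + 0.3 for i in range(8)]
wrapped = [math.atan2(math.sin(p), math.cos(p)) for p in true_phase]
cleaned = remove_linear_trend(unwrap(wrapped))
print(max(abs(v) for v in cleaned) < 1e-9)  # True: only the linear part was present
```

In this toy case the whole phase is a linear offset, so nothing remains after sanitization; with real CSI, the residual is the part of the phase that carries information about the environment.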
To generate accurate human postures, we note that CSI signals are time-series data and that the sequential relationship between human movements affects the waveform. We therefore use a Temporal Convolutional Network (TCN) to capture temporal features and then an Attention U-Net to generate human skeleton joint coordinates from both temporal and spatial features. Our model achieves 97% accuracy at PCK@50.
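
For readers unfamiliar with the metric, PCK@50 (Percentage of Correct Keypoints) counts a predicted joint as correct when its distance to the ground truth is at most 50% of a reference length. The sketch below shows the computation for a single frame. The reference length is an assumption, since the record does not say which normalization the thesis uses, and the coordinates are made up for illustration.

```python
import math


def pck(pred, truth, norm_length, alpha=0.5):
    """Fraction of joints whose error is at most alpha * norm_length."""
    threshold = alpha * norm_length
    correct = sum(
        1 for p, t in zip(pred, truth) if math.dist(p, t) <= threshold
    )
    return correct / len(truth)


# Hypothetical 2-D joints; norm_length stands in for a torso or box size.
truth = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0), (0.0, 20.0)]
pred = [(1.0, 0.0), (10.0, 3.0), (10.0, 20.0), (0.0, 32.0)]
print(pck(pred, truth, norm_length=20.0))  # 0.75
```

Here the threshold is 10 units, so the first three joints (errors 1, 3, and 0) count as correct and the last (error 12) does not.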

Keywords (translated from the Chinese)

★ Wireless signals
★ Human pose estimation
★ Deep learning

Keywords (English)

Table of Contents

1 Introduction
2 Related Work
2.1 Human Pose Estimation
2.1.1 Reconstructing The Human Skeleton
2.1.2 Existing Research
2.1.2.1 Coordinate Regression-Based Estimation
2.1.2.2 Heatmap-Based Detection
2.2 Human activity recognition
2.3 Indoor human localization
3 Methods & Model
3.1 Data Preprocessing
3.1.1 Phase unwrapping
3.2 System Overview
3.2.1 Bidirectional TCN Model
3.2.2 TCN Model
3.2.3 Causal Convolutions
3.2.4 Dilated Convolutions
3.3 Attention U-NET
4 Experiments
4.1 Dataset
4.1.1 Introduction
4.1.2 Data description
4.1.3 Data collection and design
4.1.3.1 Device
4.1.3.2 Environment
4.1.3.3 Process
4.2 Training
4.3 Experimental Results
5 Conclusion
Bibliography

Advisor

施國琛

Approval Date

2024-7-29
