Remote photoplethysmography (rPPG) shows significant potential for remote health monitoring; however, its accuracy and generalization are susceptible to real-world factors such as varying illumination and head movements, which poses a key challenge to its widespread application. To address this issue, this research proposes a novel deep learning architecture that combines a 3D Convolutional Neural Network (3D CNN) with a selective state space model (Mamba). The architecture uses the 3D CNN to extract joint spatio-temporal features from facial video sequences, then leverages Mamba's long-sequence modeling capability to capture the temporal dependencies within the rPPG signal. Experimental results demonstrate that the proposed model achieves superior performance in cross-dataset validation, with a Mean Absolute Error (MAE) as low as 0.39 bpm and a Root Mean Square Error (RMSE) as low as 0.9 bpm. Furthermore, efficiency analysis confirms that the model maintains high accuracy while remaining lightweight, with only 1.21M parameters and real-time inference exceeding 96 FPS.
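To make the described pipeline concrete, the following is a minimal PyTorch sketch of a 3D CNN front end feeding a Mamba block, assuming the open-source `mamba_ssm` package (state-spaces/mamba). All layer sizes, the pooling scheme, and the per-frame regression head are illustrative assumptions, not the configuration evaluated in this work.

```python
# Hedged sketch: a 3D-CNN stem collapses the spatial dimensions of a face
# clip into per-frame feature vectors; a Mamba block then models the time
# axis; a linear head regresses a per-frame rPPG value.
# Assumes the `mamba_ssm` package; sizes are illustrative, not the thesis config.
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # selective state space model (CUDA required)

class RPPG3DCNNMamba(nn.Module):
    def __init__(self, d_model: int = 64):
        super().__init__()
        # 3D CNN: joint spatio-temporal feature extraction from the clip.
        self.stem = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.BatchNorm3d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),      # pool space, keep time
            nn.Conv3d(32, d_model, kernel_size=3, padding=1),
            nn.BatchNorm3d(d_model),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d((None, 1, 1)),       # -> (B, C, T, 1, 1)
        )
        # Mamba captures long-range temporal dependencies in the sequence.
        self.mamba = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2)
        self.head = nn.Linear(d_model, 1)             # per-frame rPPG value

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 3, T, H, W) face video clip
        feats = self.stem(x).flatten(2).transpose(1, 2)  # (B, T, d_model)
        feats = self.mamba(feats)                        # temporal modeling
        return self.head(feats).squeeze(-1)              # (B, T) waveform

# Usage (hypothetical): a 10 s clip at 30 fps with 64x64 face crops.
# model = RPPG3DCNNMamba().cuda()
# wave = model(torch.randn(1, 3, 300, 64, 64, device="cuda"))  # -> (1, 300)
```

Heart rate in bpm would then be estimated from the dominant frequency of the predicted waveform (e.g., via a power spectral density peak), which is the usual final step in rPPG pipelines.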