Name: Teng Hao (鄧皓)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Thesis Title: Fusion of Dynamic and Static Features for Robust Cross-Dataset Deepfake Detection (動態與靜態特徵融合應用於跨資料集的深偽偵測)
Full Text: available in the system after 2030-02-01
Abstract (Chinese)
This thesis proposes a deepfake detection model that combines blended images and optical-flow maps, effectively improving the ability to recognize forgery methods. Against the backdrop of rapid progress in deep learning, beyond the familiar image-recognition applications, fields such as recommendation systems and medical diagnosis also deeply influence our daily lives. These technical advances, however, are accompanied by potential risks, such as deepfake technology that can threaten privacy and security.
Deepfake technology has become an increasingly serious problem because of its ability to generate fake images and synthesize realistic videos. As the technology evolves, new forgery methods keep appearing, so training deepfake detection models faces many challenges: in particular, model training requires large amounts of data, while data for novel forgery methods is often difficult to obtain.
The rapid development of deepfakes raises the important issue of cross-manipulation: a model trained on specific forgery methods must still cope with different, unseen forgery methods. This places higher demands on the model's generalization ability and further increases the difficulty of telling real from fake. It also gives rise to the cross-dataset problem: besides recognizing different forgery methods, a detector must handle the distribution shifts between datasets that degrade detection accuracy.
To address the cross-dataset problem, this study proposes a forgery detection method that combines dynamic and static features. The model not only significantly improves detection performance in cross-dataset tests but also maintains stable and effective detection in cross-manipulation scenarios.
The experimental results demonstrate the advantage of combining dynamic and static features: dynamic features capture subtle changes between consecutive frames, especially the temporal anomalies introduced by forgery techniques, while static features effectively extract the local details and texture information of single frames. By integrating these two kinds of features, the model identifies forgeries more accurately and exhibits excellent classification ability and better generalization in cross-manipulation scenarios.

Abstract (English)
This paper proposes a deepfake detection model that combines blended and optical flow
features, offering effective improvements in the ability to identify various forgery methods.
In the context of rapid advancements in deep learning, its applications extend beyond image
recognition to domains such as recommendation systems and medical diagnostics, profoundly
impacting daily life. However, alongside these advancements come potential risks, such as the
misuse of technologies that threaten privacy and security, exemplified by deepfake technology.
Deepfake technology has become an increasingly pressing issue due to its ability to generate fake images and synthesize realistic videos. As the technology evolves, forgery methods continue to emerge, posing challenges to training deepfake detection models. Chief among these challenges is the need for large datasets to support model training, while acquiring data for novel forgery methods remains a significant obstacle.
The rapid development of deepfake technology has raised critical concerns about the cross-manipulation problem. Cross-manipulation refers to the requirement that a model trained to recognize specific types of forgeries generalize effectively to unseen forgery methods. This necessitates stronger generalization capabilities, further complicating the task of identifying authenticity. Additionally, this issue extends to the cross-dataset problem, where the model must not only identify various forgery techniques but also cope with distributional differences between datasets that lead to a decline in detection accuracy.
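To make the cross-dataset setting concrete: a detector can be trained on one dataset (e.g., FaceForensics++ [2]) and then evaluated, without fine-tuning, on another (e.g., Celeb-DF [22]), reporting AUC. The sketch below illustrates that evaluation protocol only; `model` and `celeb_df_loader` are hypothetical placeholders, not this thesis's actual pipeline.

```python
# Minimal cross-dataset evaluation sketch (assumed names throughout):
# train on FaceForensics++, then measure AUC on unseen Celeb-DF.
import torch
from sklearn.metrics import roc_auc_score

def evaluate_cross_dataset(model, celeb_df_loader, device="cuda"):
    """Frame-level AUC on a dataset never seen during training."""
    model.eval()
    scores, labels = [], []
    with torch.no_grad():
        for frames, y in celeb_df_loader:        # hypothetical loader
            logits = model(frames.to(device))    # (B, 1) real/fake logits
            scores += torch.sigmoid(logits).squeeze(1).cpu().tolist()
            labels += y.tolist()
    # AUC is threshold-free, so it remains meaningful even when score
    # distributions shift between datasets.
    return roc_auc_score(labels, scores)
```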
To address the cross-dataset problem, this study proposes a forgery detection method that
integrates dynamic and static features. The model demonstrates significant improvements in
detection performance during cross-dataset testing and maintains stable and efficient detection
capabilities in cross-manipulation scenarios.
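As a rough illustration of where the dynamic features could come from, dense optical flow can be computed between consecutive frames; the cited detectors use optical-flow-based CNNs [10, 11] and learned estimators such as PWC-Net [9], so the classical Farnebäck estimator used below is only a convenient stand-in.

```python
# Illustrative dynamic-feature extraction: dense optical flow between
# consecutive frames (Farnebäck as a stand-in for learned estimators).
import cv2

def optical_flow_maps(frames):
    """frames: list of HxWx3 uint8 BGR images -> list of HxWx2 flow maps."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    flows = []
    for prev, curr in zip(grays, grays[1:]):
        # Positional args: pyr_scale=0.5, levels=3, winsize=15,
        # iterations=3, poly_n=5, poly_sigma=1.2, flags=0
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)  # per-pixel (dx, dy) motion; forgery-induced
                            # temporal inconsistencies show up here
    return flows
```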
Experimental results highlight the advantages of combining dynamic and static features. Dynamic features capture subtle variations between consecutive frames, particularly temporal anomalies introduced by forgery techniques. Meanwhile, static features effectively extract local details and texture information from individual frames. By integrating these two types of features, the model achieves more accurate forgery detection, demonstrating robust classification performance and superior generalization in cross-manipulation scenarios.
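A minimal sketch of such a two-branch design follows: one small CNN encodes single RGB frames (the static branch), another encodes two-channel optical-flow maps (the dynamic branch), and the embeddings are concatenated before a binary real/fake classifier. The encoder architecture and fusion-by-concatenation are illustrative assumptions, not the exact model proposed in this thesis.

```python
# Two-branch feature-fusion sketch (assumed architecture, for illustration).
import torch
import torch.nn as nn

def small_cnn(in_ch, feat_dim=128):
    """A tiny CNN encoder; a real system would use a stronger backbone."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, feat_dim), nn.ReLU())

class FusionDetector(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.static_branch = small_cnn(3, feat_dim)   # texture / local detail
        self.dynamic_branch = small_cnn(2, feat_dim)  # temporal anomalies
        self.classifier = nn.Linear(2 * feat_dim, 1)  # real/fake logit

    def forward(self, frame, flow):
        z = torch.cat([self.static_branch(frame),
                       self.dynamic_branch(flow)], dim=1)
        return self.classifier(z)

# Shape check: one 224x224 RGB frame and its 2-channel flow map.
logit = FusionDetector()(torch.randn(1, 3, 224, 224),
                         torch.randn(1, 2, 224, 224))
```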
In summary, this study provides an innovative and robust solution to deepfake detection,
addressing both cross-dataset and cross-manipulation challenges, and laying a solid foundation
for future research in this critical area.

Keywords (Chinese): ★ 深度偽造偵測 (deepfake detection) ★ 交叉偽造 (cross-manipulation) ★ 光流 (optical flow) ★ 特徵融合 (feature fusion)
Keywords (English): ★ Deepfake detection ★ cross dataset ★ optical flow ★ feature fusion

Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Symbols and Definitions
1. Introduction
   1.1 Research Background
   1.2 Motivation
   1.3 Problem Definition
   1.4 Contributions
2. Related Work
   2.1 Deepfake Detection
   2.2 Optical-Flow-Based Deepfake Detection
   2.3 Blended Deepfake Detection
3. Method
4. Experimental Results
   4.1 Implementation Details
       4.1.1 Data Preprocessing
       4.1.2 Training
       4.1.3 Testing
   4.2 Datasets
       4.2.1 FaceForensics++ (FF++)
       4.2.2 Celeb-DF (CDF)
       4.2.3 FaceForensics in the Wild (FFIW)
   4.3 Evaluation Metrics
   4.4 Experiments
       4.4.1 Experiment 1
       4.4.2 Experiment 2
       4.4.3 Experiment 3
       4.4.4 Experiment 4
5. Conclusion
6. References
Appendix A
   A.1 t-SNE Plots
   A.2 AUC
   A.3 Grad-CAM

References
[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, et al., "Generative adversarial networks," Communications of the ACM, vol. 63, no. 11, pp. 139–144, 2020.
[2] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, "FaceForensics++: Learning to detect manipulated facial images," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1–11.
[3] Deepfakes Community, Deepfakes: Faceswap, https://github.com/deepfakes/faceswap, Accessed: 2025-01-20, 2025.
[4] J. Thies, M. Zollhöfer, M. Stamminger, C. Theobalt, and M. Nießner, "Face2Face: Real-time face capture and reenactment of RGB videos," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2016, pp. 2387–2395.
[5] M. Kowalski, FaceSwap by Marek Kowalski, https://github.com/MarekKowalski/FaceSwap/, Accessed: 2025-01-20, 2025.
[6] J. Thies, M. Zollhöfer, and M. Nießner, "Deferred neural rendering: Image synthesis using neural textures," ACM Transactions on Graphics (TOG), vol. 38, no. 4, pp. 1–12, 2019.
[7] J. L. Barron, D. J. Fleet, and S. S. Beauchemin, "Performance of optical flow techniques," International Journal of Computer Vision, vol. 12, pp. 43–77, 1994.
[8] K. Shiohara and T. Yamasaki, "Detecting deepfakes with self-blended images," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 18720–18729.
[9] D. Sun, X. Yang, M.-Y. Liu, and J. Kautz, "PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 8934–8943.
[10] I. Amerini, L. Galteri, R. Caldelli, and A. Del Bimbo, "Deepfake video detection through optical flow based CNN," in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2019.
[11] R. Caldelli, L. Galteri, I. Amerini, and A. Del Bimbo, "Optical flow based CNN for detection of unlearnt deepfake manipulations," Pattern Recognition Letters, vol. 146, pp. 31–37, 2021.
[12] P. Saikia, D. Dholaria, P. Yadav, V. Patel, and M. Roy, "A hybrid CNN-LSTM model for video deepfake detection by leveraging optical flow features," in IEEE International Joint Conference on Neural Networks (IJCNN), 2022, pp. 1–7.
[13] L. Li, J. Bao, T. Zhang, et al., "Face X-ray for more general face forgery detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 5001–5010.
[14] W. Bai, Y. Liu, Z. Zhang, B. Li, and W. Hu, "AUNet: Learning relations between action units for face forgery detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 24709–24719.
[15] H. Li, Y. Li, J. Zhou, B. Li, and J. Dong, "FreqBlender: Enhancing deepfake detection by blending frequency knowledge," arXiv preprint arXiv:2404.13872, 2024.
[16] A. Horé and D. Ziou, "Image quality metrics: PSNR vs. SSIM," in IEEE International Conference on Pattern Recognition, 2010, pp. 2366–2369.
[17] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017, pp. 1251–1258.
[18] A. Vaswani, N. Shazeer, N. Parmar, et al., "Attention is all you need," in Advances in Neural Information Processing Systems (NeurIPS), 2017.
[19] P. Foret, A. Kleiner, H. Mobahi, and B. Neyshabur, "Sharpness-aware minimization for efficiently improving generalization," arXiv preprint arXiv:2010.01412, 2020.
[20] J. Deng, J. Guo, Y. Zhou, J. Yu, I. Kotsia, and S. Zafeiriou, "RetinaFace: Single-stage dense face localisation in the wild," arXiv preprint arXiv:1905.00641, 2019.
[21] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[22] Y. Li, X. Yang, P. Sun, H. Qi, and S. Lyu, "Celeb-DF: A large-scale challenging dataset for deepfake forensics," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 3207–3216.
[23] T. Zhou, W. Wang, Z. Liang, and J. Shen, "Face forensics in the wild," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 5778–5788.
[24] L. Van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, no. 11, 2008.

Advisor: Jia-Yu Lin (林家瑜)
Review Date: 2025-01-22