Abstract (English)
With the rapid development of deep learning, a growing number of deep-learning systems have become part of our daily life, such as autonomous driving and face recognition systems. However, we often overlook that a deep-learning system may be manipulated by an attacker into making wrong predictions, which can lead to serious personal injury and property damage. For example, an attacker may feed an adversarial example into the deep-learning system and cause the model to make a wrong decision. This fact exposes the unreliability of deep-learning models and increases the potential risk to downstream tasks. For instance, the speed-violation detection sub-system may be subject to adversarial attacks, causing the autonomous driving system to behave unexpectedly and increasing the corresponding risk.
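As a concrete illustration of this kind of attack, the minimal sketch below crafts an adversarial example with the fast gradient sign method (FGSM); the toy classifier, the random image tensor, and the epsilon value are illustrative assumptions for this sketch, not the systems studied in this thesis.

import torch
import torch.nn.functional as F

# Toy classifier standing in for the deployed deep-learning system.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 32 * 32, 10),
)
model.eval()

def fgsm_example(x, label, epsilon=8 / 255):
    # Perturb x within an epsilon ball in the direction that increases the
    # loss, which pushes the model toward a wrong decision.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

# Hypothetical usage on a random CIFAR-sized image labeled with class 0.
x_adv = fgsm_example(torch.rand(1, 3, 32, 32), torch.tensor([0]))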
A common way to defend a system against adversarial attacks is adversarial training, in which the model is trained on adversarial examples generated by such attacks. Although the resulting model can withstand adversarial attacks to some degree, adversarial training also degrades performance on the original task and reduces the generalization of the model. Therefore, we propose a framework that trains the model with self-supervised learning, in which the model learns to distinguish adversarial examples from the original data without the correct labels being provided. The proposed framework enhances both the robustness and the generalization of the trained model against adversarial attacks.
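The following is a hedged, minimal sketch of this self-supervised idea, not the thesis's exact design: a small encoder is trained on a pretext task whose labels come for free from how each batch is built, namely telling original data apart from perturbed copies, so no task labels are needed. The tiny encoder, the random-noise stand-in for a real attack, and all hyper-parameters are assumptions made only for illustration.

import torch
import torch.nn.functional as F

encoder = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 32 * 32, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 2),   # two pretext classes: clean vs. adversarial
)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def perturb(x, epsilon=8 / 255):
    # Stand-in for an adversarial attack; a real setup would use FGSM or PGD.
    return (x + epsilon * torch.randn_like(x).sign()).clamp(0, 1)

def train_step(x_clean):
    x_adv = perturb(x_clean)
    inputs = torch.cat([x_clean, x_adv])
    # Pretext labels are free: 0 for original data, 1 for adversarial examples.
    targets = torch.cat([torch.zeros(len(x_clean)),
                         torch.ones(len(x_adv))]).long()
    loss = F.cross_entropy(encoder(inputs), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Hypothetical usage on a random batch of CIFAR-sized images.
loss = train_step(torch.rand(16, 3, 32, 32))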