With the rapid development of deep learning, a growing number of deep learning systems, such as autonomous driving and face recognition systems, have become part of daily life. However, it is easy to overlook that a wrong decision by such a system can lead to serious personal injury and property damage. In practice, many deep models can be maliciously attacked into making wrong decisions: for example, an attacker can insert an adversarial perturbation into the input data to impair the system's judgment. This demonstrates that deep neural networks are not secure, and it exposes downstream tasks to corresponding risks. For instance, the speed limit detection subsystem of an autonomous driving system may be subjected to an adversarial attack, causing the vehicle to misrecognize its input and suddenly stop or slow down on a highway, or to behave in other unexpected ways, which increases the risk of traffic accidents.

The most common defense against adversarial attacks is adversarial training, in which the adversarial examples generated by an attack are added to the training data. Although a model trained this way can defend against adversarial examples to some degree, its ability to classify clean samples also suffers, which reduces its generalization. We therefore propose a self-supervised learning framework in which the model learns, without being given the correct labels, to distinguish adversarial examples from the original data.
The proposed framework improves both the robustness and the generalization of the trained model against adversarial attacks, while training with a small amount of labeled data.
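
To make the terms "adversarial perturbation" and "adversarial training" concrete, the following is a minimal sketch on a toy logistic-regression classifier. It is not the framework proposed in this work: the abstract does not name a specific attack, so the sign-of-gradient perturbation used here (in the style of the fast gradient sign method) is an assumption, and all function names, weights, and parameter values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def perturb(w, b, x, y, eps):
    """Assumed attack: move each feature by eps in the direction that raises the loss."""
    grad = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

def adversarial_training_step(w, b, x, y, eps, lr):
    """One gradient-descent update on the clean sample, then one on its adversarial copy."""
    x_adv = perturb(w, b, x, y, eps)
    for sample in (x, x_adv):
        err = predict_proba(w, b, sample) - y
        w = [wi - lr * err * si for wi, si in zip(w, sample)]
        b = b - lr * err
    return w, b

# Hypothetical toy model and a sample with true label 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.1], 1

x_adv = perturb(w, b, x, y, eps=0.3)
print("clean prediction:      ", predict_proba(w, b, x))
print("adversarial prediction:", predict_proba(w, b, x_adv))

w_new, b_new = adversarial_training_step(w, b, x, y, eps=0.3, lr=0.5)
print("after one training step:", predict_proba(w_new, b_new, x_adv))
```

In this toy setting the perturbed input pushes the predicted probability of the true class below 0.5, and a single update that includes the adversarial copy raises it again. The sketch covers only the supervised baseline described above; it does not touch the label-free objective of the proposed framework.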