Abstract: In deep learning applications, whether detection or recognition, the factor that most affects prediction accuracy is data, especially training data, so data collection is a central issue in deep learning. For a binary classification problem, the two classes must be reasonably balanced: when normal samples are plentiful but defective samples are scarce, prediction and test results are easily skewed by the class imbalance. This study therefore uses generative adversarial networks (GANs) to generate plausible defective samples and mitigate the data-imbalance problem. The work proceeds in two steps: the first step generates reasonable defective samples, and the second step verifies the synthetic defective samples and compares them against real ones. Compared with other generative models, a GAN adds a discriminator that supervises network training, so its generated samples are of high quality, but training is difficult: besides the risk of vanishing or exploding gradients, GANs are prone to mode collapse, in which the generated images lack diversity.
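The imbalance problem above can be made concrete with a hypothetical toy example (the 99:1 split and the trivial classifier are illustrative assumptions, not figures from this study): a model that always predicts "normal" scores very high accuracy while detecting no defects at all.

```python
import numpy as np

# Hypothetical 99:1 normal/defective split (illustrative, not thesis data).
labels = np.array([0] * 990 + [1] * 10)   # 0 = normal, 1 = defective
always_normal = np.zeros_like(labels)     # trivial majority-class "model"

accuracy = (always_normal == labels).mean()
defect_recall = (always_normal[labels == 1] == 1).mean()

print(accuracy)       # 0.99 -- looks excellent
print(defect_recall)  # 0.0  -- yet no defect is ever found
```

This is why synthesizing additional defective samples, rather than relying on raw accuracy over imbalanced data, is the focus of the study.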
We applied different training strategies in the hope of making GAN training more stable. Beyond stability, we also want the generated samples to carry specific defect features, so we designed two network architectures from different perspectives to learn defect features and synthesize defective images. In the first architecture, we use adaptive instance normalization (AdaIN) to learn the features of each resolution layer; since different resolutions express different features, we observe what each resolution contributes, which makes the generated samples more meaningful. We further use modulation and demodulation to learn per-resolution features, and improve the quality of the generated samples through skip connections and residual networks. In the second architecture, we use a variational-autoencoder GAN (VAE-GAN), whose encoder and decoder learn the defect features; after training, we feed in normal samples so that they are fused with the defect features, yielding the defective images we want. Finally, we replace the variational-autoencoder architecture to further improve generation quality. In our experiments, we used StyleGAN for style mixing and searched the random combinations of normal and defective samples for the best match. We also combined an autoencoder with a GAN: without modifying the network architecture, the mean SSIM between generated and real samples was 0.3917; adding a residual self-attention layer raised the mean SSIM to 0.5039, and adding a residual network architecture raised it to 0.6157.
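The AdaIN operation mentioned above can be sketched in a few lines of NumPy. This is a minimal, generic illustration of adaptive instance normalization, not the thesis code: each channel of a content feature map is normalized to zero mean and unit variance, then rescaled and shifted by style statistics; the shapes and variable names are assumptions for the sketch.

```python
import numpy as np

def adain(content, style_mean, style_std, eps=1e-5):
    """Adaptive instance normalization (sketch).

    content: (C, H, W) feature map; style_mean, style_std: (C,) per-channel
    statistics taken from a style source. Each channel is normalized with its
    own mean/std, then given the style's mean/std.
    """
    mu = content.mean(axis=(1, 2), keepdims=True)
    sigma = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mu) / (sigma + eps)
    return style_std[:, None, None] * normalized + style_mean[:, None, None]

x = np.random.default_rng(1).normal(size=(3, 8, 8))
out = adain(x,
            style_mean=np.array([1.0, 2.0, 3.0]),
            style_std=np.array([0.5, 0.5, 0.5]))
# Each output channel now carries the requested style statistics.
print(out.mean(axis=(1, 2)))  # ≈ [1.0, 2.0, 3.0]
```

Applying this at every resolution layer of the generator is what lets the style (here, defect-feature statistics) be injected scale by scale.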
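The SSIM scores reported above come from the standard structural-similarity index. As a hedged sketch, the global (single-window) form of SSIM is shown below; the thesis most likely uses a windowed implementation (e.g. with a sliding Gaussian window), so this simplified version is for orientation only.

```python
import numpy as np

def ssim(x, y, data_range=1.0):
    """Global SSIM over two images (single-window sketch).

    Uses the standard constants C1 = (0.01 L)^2, C2 = (0.03 L)^2,
    where L is the dynamic range of the pixel values.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

img = np.random.default_rng(2).random((32, 32))
print(ssim(img, img))               # identical images score exactly 1.0
print(ssim(img, 1.0 - img) < 1.0)   # a dissimilar image scores lower
```

Higher mean SSIM between generated and real defect samples indicates that the synthesized defects are structurally closer to real ones, which is how the 0.3917 → 0.5039 → 0.6157 progression should be read.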