Abstract (English)
Wafer wire saw machines are used in one stage of silicon wafer manufacturing, sawing wafers into individual dies. However, when a machine shuts down or the sawing wire breaks unexpectedly, the batch of wafers in process becomes secondary-grade or scrap, increasing cost. The task also poses a challenging issue: a severely imbalanced dataset, with a ratio of normal to abnormal data of 21:1. Therefore, an anomaly detection strategy is proposed, composed of three parts: representation learning methods, supervised classifiers, and alarm rules. K-means clustering and autoencoders serve as the representation learning methods, learning normal features from normal data only; this not only addresses the data imbalance but also helps the four supervised classifiers evaluated (random forest, Naïve Bayes, support vector machine, and extreme learning machine) perform better, while the alarm rules reduce false alarms. The anomaly detection strategy is evaluated on two machines from a real semiconductor silicon wafer material manufacturer, achieving a catching rate of 0.57 and a false alarm rate of 0.10. Moreover, the predictive system has been implemented and tested on the production line, and we report the engineering considerations that are most relevant to the models.
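The three-part strategy described above can be sketched in miniature. The following is a hypothetical illustration, not the thesis's actual implementation: K-means is fit on normal data only, the distances to the learned centroids serve as the representation, a random forest classifies those features, and a simple consecutive-hit alarm rule suppresses isolated false positives. All data, cluster counts, and thresholds here are synthetic and illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(210, 8))    # majority class
abnormal = rng.normal(3.0, 1.5, size=(10, 8))   # minority class (~21:1 ratio)

# (1) Representation learning: fit K-means on normal data only, so the
# representation encodes what "normal" looks like.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(normal)

def features(x):
    # Distance to each normal-cluster centroid is the learned feature vector.
    return km.transform(x)

# (2) Supervised classifier trained on the learned features.
X = np.vstack([features(normal), features(abnormal)])
y = np.array([0] * len(normal) + [1] * len(abnormal))
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# (3) Alarm rule: only alarm after k consecutive abnormal predictions,
# which filters out isolated false positives.
def alarm(predictions, k=3):
    run = 0
    for p in predictions:
        run = run + 1 if p == 1 else 0
        if run >= k:
            return True
    return False

stream = clf.predict(features(rng.normal(3.0, 1.5, size=(5, 8))))
print(alarm(stream))
```

An autoencoder could replace K-means in step (1) by using per-feature reconstruction error as the representation; the classifier and alarm rule stay unchanged.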