Anomaly Detection for Univariate Time-Series with Statistics and Deep Learning

Name

Jian-Bin Kao (高健賓)

Department

Department of Computer Science and Information Engineering

Thesis Title

Anomaly Detection for Univariate Time-Series with Statistics and Deep Learning
(基於統計與深度學習之單變數時間序列異常檢測)

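This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.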

Abstract (Chinese)

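In recent years, with the rapid development of Internet of Things (IoT) technology, sensors of all kinds spread throughout our surroundings have been continuously accumulating massive amounts of time series data. Consequently, the demand for analyzing time series data has grown rapidly, and anomaly detection is one of the most important of these demands. This thesis proposes an anomaly detection framework for univariate time series data. Based on the characteristics of the data, the framework first uses the Dickey-Fuller test, the fast Fourier transform, and the Pearson product-moment correlation coefficient to classify the time series into three categories: (1) stationary time series, (2) periodic time series, and (3) non-stationary and non-periodic time series. It then applies different statistics-based and deep-learning-based methods to each category to detect anomalies.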
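For stationary time series, we compute the rate of change between the means of a larger and a smaller sliding window, and set a threshold on this rate of change to detect anomalies in real time. For periodic time series, we compute the ratio of the standard deviations of the data within the time windows of the current period and the previous period, and set a threshold on this ratio to detect anomalies. Finally, for non-stationary and non-periodic time series, we use a gated recurrent unit (GRU) neural network model to predict the time series, and detect anomalies by passing the prediction error through the cumulative distribution function of the normal distribution.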
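The two statistical detectors described in this paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formulas, so the relative-difference definition of the rate of change, the window sizes, the threshold value, and all function names here are hypothetical choices rather than the thesis's actual settings.

```python
from statistics import mean, pstdev

EPS = 1e-12  # guards against division by zero (an illustrative choice)


def mean_change_rate(series, t, small=5, large=30):
    """Relative change of the short-window mean against the long-window mean.

    Both windows end at index t. The relative-difference form is an
    assumption; the abstract only says a rate of change is computed.
    """
    long_mean = mean(series[t - large + 1 : t + 1])
    short_mean = mean(series[t - small + 1 : t + 1])
    return abs(short_mean - long_mean) / (abs(long_mean) + EPS)


def detect_stationary(series, small=5, large=30, threshold=0.2):
    """Return the indices whose change rate exceeds the threshold."""
    return [
        t
        for t in range(large - 1, len(series))
        if mean_change_rate(series, t, small, large) > threshold
    ]


def std_ratio(series, t, period):
    """Ratio of the standard deviation of the period ending at index t
    to that of the period immediately before it."""
    current = series[t - period + 1 : t + 1]
    previous = series[t - 2 * period + 1 : t - period + 1]
    return pstdev(current) / (pstdev(previous) + EPS)


# Usage: a flat signal with a short level shift starting at index 40.
flat_with_spike = [10.0] * 40 + [20.0] * 5 + [10.0] * 15
flagged = detect_stationary(flat_with_spike)

# Usage: a period-4 signal whose amplitude grows fivefold in the last period.
periodic = [0.0, 1.0, 0.0, -1.0] * 2 + [0.0, 5.0, 0.0, -5.0]
ratio = std_ratio(periodic, t=11, period=4)
```

A periodic detector would compare `ratio` against a threshold in the same way `detect_stationary` does for the change rate. Because both statistics only look backward from index `t`, they can be evaluated as each new sample arrives.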
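As experimental data, we use four real-world datasets and one artificial dataset that Numenta, a U.S. company, has released publicly on its Nupic platform, and we compare against related methods including ADSaS, STL, SARIMA, LSTM, and LSTM with STL. The results show that the proposed anomaly detection framework achieves the best F1-score.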

Abstract (English)

Owing to the rapid development of Internet of Things (IoT) technology, sensors in our everyday surroundings have recently accumulated a wide variety of time series data. As a result, the demand for analyzing time series data is increasing rapidly, and anomaly detection is one of the most important analysis tasks. This thesis proposes an anomaly detection framework for univariate time series data. First, the time series data are divided into three categories according to their characteristics, using the Dickey-Fuller test, the fast Fourier transform (FFT), and the Pearson product-moment correlation coefficient: (1) stationary time series data, (2) periodic time series data, and (3) non-stationary and non-periodic time series data. Different schemes based on statistics and deep learning are then applied to each category to perform anomaly detection.
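One plausible way to use the FFT and the Pearson coefficient for the periodicity part of this classification is sketched below. The abstract does not say how the two are combined, so the procedure here (take the dominant frequency as a candidate period, then confirm it by correlating two consecutive periods) is an assumption, as are the function names and the 0.9 correlation cutoff. The stationarity check is omitted; in practice an augmented Dickey-Fuller test such as `adfuller` from statsmodels could fill that role.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def estimate_period(series):
    """Candidate period from the strongest non-zero frequency bin."""
    n = 1 << (len(series).bit_length() - 1)  # largest power of two <= length
    head = series[:n]
    center = sum(head) / n
    spectrum = fft([v - center for v in head])
    k = max(range(1, n // 2), key=lambda i: abs(spectrum[i]))
    return round(n / k)


def is_periodic(series, min_corr=0.9):
    """Return (flag, period): flag is True when the first two candidate
    periods are strongly correlated."""
    period = estimate_period(series)
    if 2 * period > len(series):
        return False, period
    first, second = series[:period], series[period : 2 * period]
    return pearson(first, second) >= min_corr, period


# Usage: a sine wave with period 16 versus a linear ramp.
sine = [math.sin(2 * math.pi * t / 16) for t in range(128)]
ramp = [float(t) for t in range(128)]
sine_result = is_periodic(sine)
ramp_result = is_periodic(ramp)
```

The hand-written FFT keeps the sketch free of third-party dependencies. A library routine would normally replace it.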
For stationary time series data, the ratio of the means of a large sliding time window and a small sliding time window is calculated, and an anomaly is assumed to occur if the ratio exceeds a threshold value. For periodic time series data, the period of the data is first derived, and the ratio of the standard deviations of the data in two consecutive periods is then calculated. Again, an anomaly is assumed to occur if the ratio exceeds a threshold value. For non-stationary and non-periodic time series data, a gated recurrent unit (GRU) neural network model is applied to predict the time series values, and anomalies are detected by applying the cumulative distribution function of the normal distribution to the prediction error.
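The last step, turning prediction errors into anomaly decisions through the normal cumulative distribution function, might look like the sketch below. It takes the predictions as given, so no GRU is trained here. The two-sided score, the idea of fitting the normal distribution on an initial slice of errors, the 0.999 threshold, and the function names are all assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev


def anomaly_scores(actual, predicted, fit_size):
    """Score each point by how far its prediction error lies in the tails
    of a normal distribution fitted to the first fit_size errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    reference = errors[:fit_size]
    dist = NormalDist(mean(reference), stdev(reference))
    # cdf(e) near 0 or 1 means an extreme error; map both tails to a
    # score near 1 and typical errors to a score near 0.
    return [abs(2.0 * dist.cdf(e) - 1.0) for e in errors]


def detect(actual, predicted, fit_size, threshold=0.999):
    """Return the indices whose score exceeds the threshold."""
    scores = anomaly_scores(actual, predicted, fit_size)
    return [i for i, s in enumerate(scores) if s > threshold]


# Usage: a model that always predicts 0 for a signal with one large jump.
predicted = [0.0] * 20
actual = [0.1, -0.1] * 9 + [5.0, 0.1]
anomalies = detect(actual, predicted, fit_size=18)
```

Scoring through the CDF puts errors on a common 0 to 1 scale, so a single threshold can be reused across datasets whose raw error magnitudes differ.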
Four open real-world datasets and an artificial dataset released on the Nupic platform maintained by Numenta are used to evaluate the performance of the proposed framework. The evaluation results are compared with those of related methods, namely ADSaS, STL, SARIMA, LSTM, and LSTM with STL. The comparisons show that the proposed framework achieves the best F1 score for anomaly detection.
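For reference, the F1 score used in the comparison is the harmonic mean of precision and recall. The sketch below computes it point by point from binary labels. Whether the thesis counts detections per point or per anomaly window is not stated in the abstract, so the point-wise convention and the function name are assumptions.

```python
def f1_score(true_labels, predicted_labels):
    """Point-wise F1 score for binary anomaly labels (1 = anomaly)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)        # true positives
    fp = sum(1 for t, p in pairs if not t and p)    # false positives
    fn = sum(1 for t, p in pairs if t and not p)    # false negatives
    if tp == 0:
        return 0.0  # no correct detections, so precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Usage: two of three anomalies found, with one false alarm.
score = f1_score([0, 1, 1, 0, 1], [0, 1, 0, 1, 1])
```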

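Keywords (Chinese)

★ Internet of Things
★ statistical analysis
★ anomaly detection
★ Dickey-Fuller test
★ fast Fourier transform
★ Pearson product-moment correlation coefficient
★ gated recurrent unit neural network
★ deep learning
★ univariate time series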

Keywords (English)

★ Internet of Things
★ big data analysis
★ statistical analysis
★ anomaly detection
★ Dickey-Fuller test
★ fast Fourier transform
★ Pearson product-moment correlation coefficient
★ GRU neural network
★ deep learning
★ univariate time series

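Table of Contents

Chinese Abstract
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1. Introduction
1.1. Research Background and Motivation
1.2. Research Objectives and Contributions
1.3. Related Work
1.4. Thesis Organization
2. Background
2.1. Anomaly Detection
2.2. Periodic Time Series
2.2.1. Fast Fourier Transform
2.2.2. Pearson Correlation Coefficient
2.2.3. Seasonal-Trend Decomposition Based on Loess
2.2.4. Seasonal Autoregressive Moving Average Model
2.3. Stationary Time Series
2.3.1. Dickey-Fuller Test
2.4. Deep Learning
2.4.1. Artificial Neural Networks
2.4.1.1. Feedforward Neural Networks
2.4.1.2. Backpropagation Algorithm
2.4.2. Introduction to Deep Learning
2.4.2.1. Supervised Learning
2.4.2.2. Unsupervised Learning
2.4.2.3. Semi-Supervised Learning
2.4.2.4. Reinforcement Learning
2.4.3. Recurrent Neural Networks
2.4.4. Long Short-Term Memory
2.4.5. Gated Recurrent Unit
3. Methodology
3.1. Datasets
3.2. Data Preprocessing
3.3. Framework Architecture
3.4. Evaluation Criteria
4. Experiments and Analysis
4.1. Experimental Environment
4.2. Experimental Results and Analysis
5. Conclusion and Future Work
References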

Advisor

Jehn-Ruey Jiang (江振瑞)

Approval Date

2019-7-25
