Abstract (English) |
A wide variety of time series data has recently been accumulated from sensors in our daily lives, owing to the rapid development of Internet of Things (IoT) technology. As a result, the demand for analyzing time series data is rapidly increasing, and anomaly detection is one of the important analysis tasks. This paper proposes an anomaly detection framework for univariate time series data. First, the time series data are divided into three categories according to their characteristics, based on the Dickey-Fuller test, the fast Fourier transform (FFT), and the Pearson product-moment correlation coefficient: (1) stationary time series data, (2) periodic time series data, and (3) non-stationary and non-periodic time series data. Different schemes based on statistics and deep learning are then applied to each category to perform anomaly detection.
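The periodicity check in the categorization step can be illustrated with a minimal sketch. The abstract does not state how the FFT and the Pearson coefficient are combined, so the following is an assumption: the strongest non-DC frequency gives a candidate period, and the correlation between the first two period-length segments confirms it. The function names and the `min_corr` threshold are hypothetical, a naive DFT stands in for the FFT to keep the example dependency-free, and the Dickey-Fuller stationarity test is omitted.

```python
import cmath
import math


def dominant_period(series):
    """Estimate a candidate period from the strongest non-DC bin of a naive DFT."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    if best_k == 0:
        return None  # constant series: no dominant frequency
    return round(n / best_k)


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def looks_periodic(series, min_corr=0.9):
    """Return (is_periodic, period); min_corr is an assumed threshold."""
    period = dominant_period(series)
    if period is None or period < 2 or 2 * period > len(series):
        return False, period
    first = series[:period]
    second = series[period:2 * period]
    return pearson(first, second) >= min_corr, period
```

For a sine wave sampled with period 8, `looks_periodic` returns `True` together with the period 8, whereas a linear ramp is rejected because its strongest frequency spans the whole series.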
For stationary time series data, the ratio of the means of a large sliding time window and a small window is calculated, and an anomaly is assumed to occur if the ratio exceeds a threshold value. For periodic time series data, the period of the data is first derived, and the ratio of the standard deviations of the data in two consecutive periods is then calculated; again, an anomaly is assumed to occur if the ratio exceeds a threshold value. For non-stationary and non-periodic time series data, a gated recurrent unit (GRU) neural network is applied to predict the time series values, and anomalies are detected on the basis of the cumulative distribution function of the normal distribution over the prediction error.
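The three detection schemes can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the window sizes, thresholds, ratio direction (small window over large window, current period over previous period), and the tail probability are all hypothetical, and the GRU itself is not reproduced, so its prediction errors are taken as given inputs.

```python
import statistics


def mean_ratio_flags(series, large=20, small=4, threshold=1.5):
    """Stationary data: flag index i when the small-window mean divided by
    the large-window mean exceeds the threshold (both windows end at i)."""
    flags = []
    for i in range(large - 1, len(series)):
        big = statistics.fmean(series[i - large + 1:i + 1])
        recent = statistics.fmean(series[i - small + 1:i + 1])
        if big != 0 and recent / big > threshold:
            flags.append(i)
    return flags


def period_std_ratio_flags(series, period, threshold=2.0):
    """Periodic data: flag period p when its standard deviation divided by
    that of period p - 1 exceeds the threshold."""
    flags = []
    for p in range(1, len(series) // period):
        prev = series[(p - 1) * period:p * period]
        curr = series[p * period:(p + 1) * period]
        prev_std = statistics.pstdev(prev)
        if prev_std > 0 and statistics.pstdev(curr) / prev_std > threshold:
            flags.append(p)
    return flags


def error_cdf_flags(errors, reference_errors, tail=0.001):
    """Other data: fit a normal distribution to reference prediction errors
    and flag errors whose CDF value lies in either extreme tail."""
    dist = statistics.NormalDist.from_samples(reference_errors)
    flags = []
    for i, err in enumerate(errors):
        c = dist.cdf(err)
        if min(c, 1.0 - c) < tail:
            flags.append(i)
    return flags
```

In this sketch a single spike keeps the mean-ratio detector raised for as long as the spike remains inside the small window, so consecutive flags would normally be merged into one anomaly event.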
Four open real-world datasets and an artificial dataset released on the NuPIC platform maintained by Numenta are used to evaluate the performance of the proposed framework. The evaluation results are compared with those of related methods, namely ADSaS, STL, SARIMA, LSTM, and LSTM with STL. The comparisons show that the proposed framework achieves the best F1 score for anomaly detection.
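For reference, the F1 score is the harmonic mean of precision and recall. The sketch below assumes point-wise matching between labeled and detected anomaly indices; the abstract does not say how detections are matched to labels, so the matching rule and the function name are assumptions.

```python
def f1_score(true_idx, predicted_idx):
    """Point-wise F1 score over anomaly indices (assumed matching rule)."""
    true_set, pred_set = set(true_idx), set(predicted_idx)
    tp = len(true_set & pred_set)  # detections that hit a labeled anomaly
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(true_set)
    return 2 * precision * recall / (precision + recall)
```

With labeled anomalies at indices 3, 7, and 9 and detections at 3, 9, 12, and 15, precision is 1/2, recall is 2/3, and the F1 score is 4/7.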
|
References |
[1] The Numenta Anomaly Benchmark, https://github.com/numenta/NAB
[2] S. Lee, H. K. Kim (November 2018). ADSaS: Comprehensive Real-time Anomaly Detection System. arXiv preprint arXiv:1811.12634v1
[3] V. Chandola, V. Mithal, V. Kumar (2008). Comparative evaluation of anomaly detection techniques for sequence data. Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pp. 743-748, doi:10.1109/ICDM.2008.151.
[4] C. Van Loan (1992). Computational Frameworks for the Fast Fourier Transform. SIAM.
[5] J. W. Cooley, J. W. Tukey (1965). An algorithm for the machine calculation of complex Fourier series.
[6] K. Pearson (20 June 1895). Notes on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58: 240–242.
[7] R. B. Cleveland, W. S. Cleveland, I. Terpenning (1990). STL: A seasonal-trend decomposition procedure based on loess. Journal of Official Statistics, 6(1).
[8] D. A. Dickey, W. A. Fuller (1979). Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association, 74, pp. 427–431.
[9] Y. LeCun, D. Touretzky, G. Hinton, T. Sejnowski (June 1988). A theoretical framework for back-propagation. In Proceedings of the 1988 Connectionist Models Summer School (pp. 21-28). CMU, Pittsburgh, Pa: Morgan Kaufmann.
[10] S. Hochreiter, J. Schmidhuber (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
[11] K. Cho, F. Bougares, H. Schwenk, D. Bahdanau, Y. Bengio. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. Association for Computational Linguistics.
[12] W. Wang, R. Battiti (2005). Identifying Intrusions in Computer Networks based on Principal Component Analysis. First International Conference on Availability, Reliability and Security, IEEE.
[13] N. L. Tasfi, W. A. Higashino, K. Grolinger, M. A. M. Capretz (2017). Deep Neural Networks With Confidence Sampling For Electrical Anomaly Detection. 2017 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData).
[14] H. Takanori, O. Jun, M. Masahiro, O. Tetsuji (2018). Tandem Connectionist Anomaly Detection Use of Faulty Vibration Signals in Feature Representation Learning. 2018 IEEE International Conference on Prognostics and Health Management (ICPHM)
[15] D. P. Kingma, J. Ba (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
[16] I. Kang, M. K. Jeong, and D. Kong. A differentiated one-class classification method with applications to intrusion detection. Expert Syst. Appl., vol. 39, no. 4, pp. 3899–3905, 2012.
[17] P. Casas, J. Mazel, and P. Owezarski. Unsupervised network intrusion detection systems: Detecting the unknown without knowledge. Comput. Commun., vol. 35, no. 7, pp. 772–783, 2012.
[18] F. Simmross-Wattenberg, J. I. Asensio-Perez, P. Casaseca-de-la-Higuera, M. Martin-Fernandez, I. A. Dimitriadis, C. Alberola-Lopez. Anomaly detection in network traffic based on statistical inference and alpha-stable modeling. IEEE Trans. Depend. Sec. Comput., vol. 8, no. 4, pp. 494–509, 2011.
[19] R. Chalapathy, S. Chawla (2019). Deep Learning for Anomaly Detection: A Survey. arXiv preprint arXiv:1901.03407.
|