Discovering time series discord based on a discrete method

Authors

  • Thanh Son Nguyen, Ho Chi Minh City University of Technology and Education (HCMUTE), Vietnam

Corresponding author email:

sonnt@hcmute.edu.vn

Keywords:

time series, time series discord, SAX method, discord discovery, early abandoning

Abstract

A time series is a sequence of data points indexed in time order, most commonly sampled at successive, equally spaced points in time. A discord in a long time series is the subsequence that differs most from all other subsequences of that series. Time series discord discovery has received considerable attention recently. In this paper, we propose a new algorithm for time series discord discovery based on the discretization method called Symbolic Aggregate approXimation (SAX). The algorithm uses a distance measure in SAX space together with the Euclidean distance, combined with the idea of early abandoning. The proposed method needs to scan the database only twice to discover the discord exactly, and it is simple to implement. Experimental results show that the proposed method outperforms the similar method of Yankov et al. in runtime while achieving the same accuracy.
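To make the two central notions of the abstract concrete, the following is a minimal sketch of the discord definition and of early abandoning in Python. It is not the algorithm proposed in the paper: it is a plain brute-force search with no SAX-based pruning and no two-scan structure, and the names early_abandon_dist and find_discord are hypothetical, chosen only for this illustration. Overlapping subsequences are skipped as trivial matches, which is a common convention assumed here.

```python
import math


def early_abandon_dist(a, b, best_so_far):
    """Euclidean distance between a and b.

    Returns math.inf as soon as the running sum of squares shows that the
    distance cannot be smaller than best_so_far (early abandoning).
    """
    limit = best_so_far * best_so_far
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= limit:
            return math.inf  # abandon: no improvement possible
    return math.sqrt(total)


def find_discord(series, m):
    """Brute-force search for the top discord of length m.

    Returns (start_index, distance_to_nearest_non_overlapping_neighbor).
    """
    n = len(series) - m + 1
    best_idx, best_dist = -1, 0.0
    for i in range(n):
        nn = math.inf  # nearest-neighbor distance of candidate i
        for j in range(n):
            if abs(i - j) < m:  # skip overlapping (trivial) matches
                continue
            d = early_abandon_dist(series[i:i + m], series[j:j + m], nn)
            if d < nn:
                nn = d
            if nn < best_dist:  # candidate i can no longer be the discord
                break
        if nn != math.inf and nn > best_dist:
            best_idx, best_dist = i, nn
    return best_idx, best_dist


if __name__ == "__main__":
    # A periodic series with one injected anomaly at position 21.
    data = [0, 1, 0, -1] * 10
    data[21] = 5
    print(find_discord(data, 4))
```

The inner loop stops early in two ways: a single distance computation is abandoned once it exceeds the candidate's current nearest-neighbor distance, and a whole candidate is dropped once its nearest-neighbor distance falls below the best discord distance found so far.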


References

E. Keogh, J. Lin and A. Fu, "HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence," in the 5th IEEE International Conference on Data Mining (ICDM 2005), 2005.

D. Yankov, E. Keogh and U. Rebbapragada, "Disk aware discord discovery: Finding unusual time series in terabyte sized datasets," Knowledge and Information Systems, vol. 17, no. 2, pp. 241-261, 2008.

E. Keogh and S. Kasetty, "On the Need for Time Series Data Mining Benchmarks: A Survey and," in the 8th ACM SIGKDD Int'l Conference on Knowledge, Edmonton, Alberta, Canada, 2002.

A. Mueen, E. Keogh, Q. Zhu, S. Cash and B. Westover, "Exact Discovery of Time Series Motifs," in Proc. of SIAM Int. Conf. on Data Mining, 2009.

J. Lin, E. Keogh, L. Wei and S. Lonardi, "Experiencing SAX: a novel symbolic representation of time series," Journal of Data Mining and Knowledge Discovery, vol. 15, no. 2, pp. 107-144, 2007.

A. Fu, O. Leung, E. Keogh and J. Lin, "Finding Time Series Discords Based on Haar Transform," in Lecture Notes in Computer Science, Advanced Data Mining and Applications, Heidelberg, Springer Berlin, 2006.

Y. Bu, T-W. Leung, A. Fu, E. Keogh, J. Pei and S. Meshkin, "WAT: Finding Top-K Discords in Time Series Database," in the 2007 SIAM International Conference on Data Mining (SDM'07), Minneapolis, MN, USA, 2007.

M. C. Chuah and F. Fu, "ECG anomaly detection via time series analysis," in Frontiers of High Performance Computing and Networking ISPA 2007 Workshops, Springer Berlin Heidelberg, 2007.

Y. Lin, M. D. McCool and A. A. Ghorbani, "Motif and anomaly discovery of time series based on subseries join," in IAENG International Conference on Data Mining and Applications, 2010.

H. T. Q. Buu and D. T. Anh, "Time Series Discord Discovery Based on iSAX Symbolic Representation," in the Third International Conference on Knowledge and System Engineering (KSE 2011), Hanoi, Vietnam, 2011.

N. D. K. Khanh and D. T. Anh, "Time series discord discovery using WAT algorithm and iSAX representation," in the Third Symposium on Information and Communication Technology (SoICT’12), ACM New York, NY, USA, 2012.

W. Luo, M. Gallagher and J. Wiles, "Parameter-free search of time-series discord," Journal of Computer Science and Technology, vol. 28, no. 2, pp. 300-310, 2013.

M. Jones, D. Nikovski, M. Imamura and T. Hirata, "Anomaly Detection in Real-Valued Multidimensional Time Series," in Proc. of 2014 ASE BIGDATA/SOCIALCOM/CYBERSECURITY Conference, Stanford University, 2014.

P. Senin, J. Lin, X. Wang, T. Oates and S. Gandhi, "Time series anomaly discovery with grammar-based compression," in 18th International Conference on Extending Database Technology (EDBT), Brussels, Belgium, 2015.

T. S. Nguyen, "Time series discord discovery based on R*-tree," Journal of Science, HCM City University of Education, Special Issue: Natural Science and Technology, vol. 12, no. 90, pp. 133-144, 2016.

C. Zhang, H. Liu and Y. Li, "Time Series Discord Discovery Under Multi-party Privacy Preserving," in 2017 IEEE Second International Conference on Data Science in Cyberspace (DSC), 2017.

C. M. Pham, M. D. Bui and T. A. Duong, "Discord Discovery in Streaming Time Series based on an Improved HOT SAX Algorithm," in The Ninth International Symposium on Information and Communication Technology (SoICT 2018), Danang, Vietnam, 2018.

M. Hu, X. Feng, Z. Ji, K. Yan and S. Zhou, "A novel computational approach for discord search with local recurrence rates in multivariate time series," Information Sciences, vol. 477, no. March 2019, pp. 220-233, 2019.

E. Keogh and T. Folias, "The UCR Time Series Data Mining Archive," 2013. [Online]. Available: http://www.cs.ucr.edu/~eamonn/.

E. Keogh, S. Lonardi and B. Chiu, "Finding surprising patterns in a time series database in linear time and space," in KDD 2002: Proceedings of 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 2002.


Published

2019-12-27

How to Cite

[1]
T. S. Nguyen, “Discovering time series discord based on a discrete method”, JTE, vol. 14, no. 5, pp. 64–72, Dec. 2019.


Section

Research article
