dc.description.abstract | A time series is a collection of measurements obtained sequentially. Such data are common in many application domains, e.g., stock market fluctuations, observations from sensor networks, and medical and biological signals. Because a time series usually contains a large number of data points, i.e., it is high-dimensional, processing and storing it in raw form is very expensive. To manage time series data effectively and efficiently, several representation methods have been proposed. A representation method reduces the dimensionality of a time series while preserving its fundamental characteristics. However, each representation method suits only certain types of time series in terms of compression rate and information loss, so no single method is effective for all types. Therefore, this study proposes a system that identifies the most suitable representation method for different types of time series data. Specifically, this study first conducts an extensive performance evaluation to identify the most suitable representation method for each training time series. Afterward, by computing the similarities between a new time series and the training time series, the system determines the most suitable representation method for the new time series. Our experimental results show that the proposed system identifies the most suitable representation method for 46% to 76% of time series data. For the remaining time series, the evaluation results show that the selected representation still produces acceptable results, with a difference of less than 2.19% compared with the best representation method. In addition, the experimental results demonstrate that the proposed system identifies the most suitable representation 17 to 300 times faster than the naïve solution. | en_US |