Efficient Algorithms for Processing Change Queries on Time Series

FAST QUERY METHODS OF FINDING SUBSEQUENCES OF SATISFIED CHANGES ON TIME SERIES

    Abstract: A change in the values of a time series often signals that an event has occurred. A change query on time series data finds the subsequences, within a given length, whose values increase or decrease by at least a given threshold; such queries can be used to mine events and have significant practical value. Existing methods cannot answer them efficiently. To address this, a method based on segmenting the series and constructing a segment relationship graph is proposed. Experiments show that the method still returns results within 100 milliseconds on time series of one million points, and that the storage overhead of the segment relationship graph is small: for data sets with little fluctuation, the graph occupies less than 30% of the original data set size. Two further optimizations are also proposed, which reduce the storage overhead by roughly another 50% without significantly affecting query efficiency.
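
    To make the query definition concrete, here is a minimal brute-force sketch in Python. It is not the segment-relationship-graph method proposed in the paper, only a naive baseline under an assumed reading of the definition: a subsequence matches when its endpoints are at most `max_span` steps apart and the value difference between them reaches the threshold. The function name `naive_change_query` and the parameters `max_span` and `delta` are hypothetical.

```python
from typing import List, Sequence, Tuple


def naive_change_query(
    series: Sequence[float], max_span: int, delta: float
) -> List[Tuple[int, int]]:
    """Brute-force change query (illustrative baseline, not the paper's method).

    Returns every index pair (start, end) with 0 < end - start <= max_span such that
      - series[end] - series[start] >= delta   when delta > 0 (increase), or
      - series[end] - series[start] <= delta   when delta < 0 (decrease).
    The interpretation of "change" as the endpoint difference is an assumption.
    """
    hits: List[Tuple[int, int]] = []
    n = len(series)
    for start in range(n):
        # Only consider end points at most max_span steps after start.
        last_end = min(start + max_span, n - 1)
        for end in range(start + 1, last_end + 1):
            diff = series[end] - series[start]
            if (delta > 0 and diff >= delta) or (delta < 0 and diff <= delta):
                hits.append((start, end))
    return hits


if __name__ == "__main__":
    data = [1.0, 2.0, 5.0, 4.0, 0.0]
    print(naive_change_query(data, max_span=2, delta=3.0))
    print(naive_change_query(data, max_span=2, delta=-4.0))
```

    This baseline examines up to `max_span` end points for every start point, so its cost grows with both the series length and the window length, which is the kind of overhead an index over segments is meant to avoid.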

     
