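Time series decomposition function in Python

Asked: 2013-12-19 02:10:32

Tags: python time-series

Time series decomposition is a method that separates a time-series data set into three (or more) components. For example: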

x(t) = s(t) + m(t) + e(t)

where

t is the time coordinate
x is the data
s is the seasonal component
e is the random error term
m is the trend
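To make the additive model concrete, here is a minimal sketch that builds a synthetic series from the three components. The function name `make_series`, the period of 12, and the amplitudes are illustrative assumptions, not part of the original question.

```python
import numpy as np


def make_series(n=48, period=12, seed=0):
    """Build x(t) = s(t) + m(t) + e(t) from synthetic components (illustrative values)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)                              # time coordinate
    m = 0.5 * t                                   # trend: a straight line
    s = 10.0 * np.sin(2 * np.pi * t / period)     # seasonal: repeats every `period` steps
    e = rng.normal(scale=1.0, size=n)             # random error term
    x = s + m + e                                 # observed data
    return t, x, s, m, e


t, x, s, m, e = make_series()
```

A decomposition routine tries to go the other way: given only x, recover estimates of s, m and e.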

In R I would use the functions decompose and stl. How can I do this in Python?

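4 answers:

Answer 0: (score: 58)

I have been having a similar issue and am trying to find the best path forward. Try moving your data into a Pandas DataFrame and then calling StatsModels tsa.seasonal_decompose. See the following example: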

import statsmodels.api as sm

dta = sm.datasets.co2.load_pandas().data
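# Fill missing values by interpolation; seasonal_decompose does not handle NaNs.
# Assign the result back instead of calling interpolate(inplace=True) on dta.co2,
# which may not modify the DataFrame in recent pandas versions.
dta["co2"] = dta["co2"].interpolate()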

res = sm.tsa.seasonal_decompose(dta.co2)
resplot = res.plot()

Three plots produced from above input

You can then recover the individual components of the decomposition from:

res.resid
res.seasonal
res.trend

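I hope this helps!

Answer 1: (score: 8)

I have already answered this question elsewhere, but below is a quick function showing how to do this with rpy2. It lets you use R's robust statistical decomposition with loess, but from Python!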

import numpy as np
import pandas as pd
from rpy2.robjects import r, FloatVector


def decompose(series, frequency, s_window='periodic', log=False, **kwargs):
    '''
    Decompose a time series into seasonal, trend and irregular components
    using loess, acronym STL.
    https://www.rdocumentation.org/packages/stats/versions/3.4.3/topics/stl

    params:
        series: a time series (pandas Series)

        frequency: the number of observations per “cycle”
                   (normally a year, but sometimes a week, a day or an hour)
                   https://robjhyndman.com/hyndsight/seasonal-periods/

        s_window: either the character string "periodic" or the span
                  (in lags) of the loess window for seasonal extraction,
                  which should be odd and at least 7, according to Cleveland
                  et al.

        log: boolean. take log of series

        **kwargs: See other params for stl at
            https://www.rdocumentation.org/packages/stats/versions/3.4.3/topics/stl
    '''
    if log:
        series = np.log(series)
    length = len(series)

    df = pd.DataFrame()
    df['date'] = series.index
    # Build an R numeric vector, then wrap it in an R ts object
    s = r.ts(FloatVector(series.astype(float).values), frequency=frequency)
    # 'time.series' is a matrix with columns seasonal, trend, remainder;
    # iterating over it yields the values column by column.
    # **kwargs is forwarded so that extra stl parameters actually take effect.
    decomposed = list(r.stl(s, s_window, **kwargs).rx2('time.series'))
    df['observed'] = series.values
    df['seasonal'] = decomposed[0:length]
    df['trend'] = decomposed[length:2*length]
    df['residuals'] = decomposed[2*length:3*length]
    return df

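The function above assumes that your series has a datetime index. It returns a data frame with the individual components, which you can then plot with your favorite graphing library.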
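One possible way to plot such a data frame is sketched below with matplotlib. The helper name `plot_components` is hypothetical, and the demo frame is a stand-in filled with synthetic values that only mimics the column layout (date, observed, trend, seasonal, residuals) returned by the function above, so the sketch runs without R installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_components(df):
    """Plot each component in its own panel, sharing the date axis (hypothetical helper)."""
    cols = ["observed", "trend", "seasonal", "residuals"]
    fig, axes = plt.subplots(len(cols), 1, sharex=True, figsize=(8, 8))
    for ax, col in zip(axes, cols):
        ax.plot(df["date"], df[col])
        ax.set_ylabel(col)
    fig.tight_layout()
    return fig


# Stand-in data with the same columns as the decompose() output (synthetic values)
t = np.arange(48)
demo = pd.DataFrame({"date": pd.date_range("2000-01-01", periods=48, freq="MS")})
demo["trend"] = 0.5 * t
demo["seasonal"] = 10 * np.sin(2 * np.pi * t / 12)
demo["residuals"] = np.random.default_rng(0).normal(size=48)
demo["observed"] = demo["trend"] + demo["seasonal"] + demo["residuals"]

fig = plot_components(demo)
fig.savefig("decomposition.png")
```

Drop the `matplotlib.use("Agg")` line when working interactively, for example in a notebook.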

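You can pass any of the stl parameters (see the documentation linked in the docstring), but change every period in a parameter name to an underscore. For example, the positional argument in the function above is s_window, whereas in the R documentation it is s.window. I also found some of the code above in a repository.

Answer 2: (score: 5)

You can call R functions from Python using rpy2. Install rpy2 with pip: pip install rpy2. Then use this wrapper: https://gist.github.com/andreas-h/7808564 to call the STL functionality provided by R.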

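Answer 3: (score: 0)

Have you been introduced to scipy yet? From what I have seen in a few PDFs and sites, this is doable. However, without seeing a specific example, it would be hard for someone to show you a code example. Scipy is great; I use it in my research and it has not let me down yet.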
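For illustration, here is a minimal sketch of a naive classical additive decomposition (centered moving average for the trend, per-phase averages for the seasonal part) written with plain NumPy, on which SciPy builds. It is not STL and not what R's stl computes; the function name `classical_decompose` and the test signal are assumptions made for this example.

```python
import numpy as np


def classical_decompose(x, period):
    """Naive additive decomposition; expects at least two full periods of data."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Centered moving-average weights; an even period needs half weights at the ends
    if period % 2 == 0:
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    half = len(w) // 2
    trend = np.full(n, np.nan)                    # undefined at both ends
    trend[half:n - half] = np.convolve(x, w, mode="valid")
    detrended = x - trend
    # Seasonal figure: mean of the detrended values at each position in the cycle
    phase = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    phase -= phase.mean()                         # make the seasonal part sum to zero
    seasonal = np.tile(phase, n // period + 1)[:n]
    resid = x - trend - seasonal
    return trend, seasonal, resid


# Noise-free test signal: linear trend plus a cycle of length 12
t = np.arange(72)
x = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
trend, seasonal, resid = classical_decompose(x, period=12)
```

On real data with noise, outliers, or a seasonal pattern that changes over time, a loess-based method such as STL is usually the better choice; this sketch only shows the basic mechanics.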