
Question

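I am trying to run a time series analysis, and as part of it I am performing a Dickey-Fuller test to check whether my dataframe is stationary.

I keep getting the error ValueError: too many values to unpack (expected 2). I have already dropped every row containing NaN from the dataframe. The only suspects I can think of are dftest[0:4] and dftest[4] near the end of the code below. I don't know what those values mean, and they may be what causes the error. I tried Shift+Tab to bring up the documentation, but it did not help. I also tried dftest[0:1], with no luck. For reference, my dataframe has only 2 columns.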

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

def test_stationarity(homepriceTS):

    # Determine rolling statistics
    # (pd.rolling_mean / pd.rolling_std were removed from pandas; .rolling() replaces them)
    rolmean = homepriceTS.rolling(window=12).mean()
    rolstd = homepriceTS.rolling(window=12).std()

    # Plot rolling statistics
    orig = plt.plot(homepriceTS, color='blue', label='Original')
    mean = plt.plot(rolmean, color='red', label='Rolling Mean')
    std = plt.plot(rolstd, color='black', label='Rolling Std')
    plt.legend(loc='best')
    plt.title('Rolling Mean & Standard Deviation')
    plt.show(block=False)

    # Perform Dickey-Fuller test
    print('Results of Dickey-Fuller Test:')
    dftest = adfuller(homepriceTS, autolag='AIC')
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic','p-value','#Lags Used','Number of Observations Used'])
    for key, value in dftest[4].items():
        dfoutput['Critical Value (%s)' % key] = value
    # Print the summary once, after all critical values have been added
    print(dfoutput)

I have been following this very good step-by-step time series tutorial: https://www.analyticsvidhya.com/blog/2016/02/time-series-forecasting-codes-python/

Answer 1

You need to pass a proper series to test_stationarity().

If your time series is in the following format:

ts.head()

            value
month
2015-08-01    120
2015-09-01    130
2015-10-01    133
2015-11-01    178
2015-12-01    135
...

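try the following: test_stationarity(ts['value'])

This selects a single column from the dataframe, which gives the series that the function expects. Alternatively, you can convert it before passing it to the function: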

ts = ts['value']
ts.head()

month
2015-08-01    120
2015-09-01    130
2015-10-01    133
2015-11-01    178
2015-12-01    135
...
test_stationarity(ts)

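Since there is no sample data, I cannot be sure this is your exact problem. However, I recently hit the same error message while testing a time series, and I can say with confidence that passing the unconverted dataframe raises the same ValueError: too many values to unpack (expected 2).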
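To make the conversion step concrete, here is a minimal sketch of a guard you could run before calling test_stationarity(). The helper name as_series and its column parameter are hypothetical (they are not part of pandas or statsmodels), and the sample values simply mirror the layout shown above. The idea is only to guarantee that a one-dimensional Series reaches the test.

```python
import pandas as pd


def as_series(ts, column=None):
    """Return a 1-D pandas Series from a Series or a DataFrame.

    Hypothetical helper for illustration; not part of pandas or statsmodels.
    """
    if isinstance(ts, pd.Series):
        return ts
    if isinstance(ts, pd.DataFrame):
        if column is not None:
            # Explicit column choice, e.g. as_series(df, "value")
            return ts[column]
        if ts.shape[1] == 1:
            # Only one column, so the choice is unambiguous
            return ts.iloc[:, 0]
        raise ValueError(
            "DataFrame has %d columns; pass column=... to pick one" % ts.shape[1]
        )
    raise TypeError("expected a pandas Series or DataFrame")


# Sample data mirroring the layout shown in this answer
ts = pd.DataFrame(
    {"value": [120, 130, 133, 178, 135]},
    index=pd.to_datetime(
        ["2015-08-01", "2015-09-01", "2015-10-01", "2015-11-01", "2015-12-01"]
    ),
)
ts.index.name = "month"

series = as_series(ts)
print(ts.ndim, series.ndim)  # the DataFrame is 2-D, the Series is 1-D
```

With a two-column dataframe like the one in the question, the helper refuses to guess, so you would call as_series(df, "your_column") and pass the result to test_stationarity().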

Answer 2

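This line,

for key,value in dftest[4].items():

looks to me like the only place in the code that unpacks values. For the unpacking to work, each item yielded by dftest[4].items() needs to look like one of the right-hand sides of these assignments: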

>>> k,v = 1,2
>>> k,v = 2,[3,4,5]
>>> k,v = [1,2], [3,4,5]
>>> k,v = [1,2], {2: 3}

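If I were in your position, I would print dftest[4].items() just before that for loop to see what kind of structure it is. (If I had to guess, I would say it is a list.) Or take another look at the documentation.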
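As a rough illustration of that check, the sketch below uses a mock tuple in the shape the statsmodels documentation gives for adfuller's default return value: test statistic, p-value, lags used, number of observations, a dictionary of critical values, and the best information criterion. The numbers are made up, and mock_dftest is a stand-in, not real output. If dftest[4] really is that dictionary, the loop unpacks cleanly, which suggests reading the traceback to see which line actually raises.

```python
# Mock of the documented adfuller return shape; every number here is made up.
mock_dftest = (
    -2.5,                                      # test statistic
    0.11,                                      # p-value
    3,                                         # lags used
    140,                                       # observations used
    {"1%": -3.48, "5%": -2.88, "10%": -2.58},  # critical values
    812.3,                                     # best information criterion
)

# dict.items() yields (key, value) pairs, so two-name unpacking works.
labels = []
for key, value in mock_dftest[4].items():
    labels.append("Critical Value (%s)" % key)
print(labels)

# Unpacking into two names fails as soon as an item has three elements.
message = None
try:
    for key, value in [("1%", -3.48, "extra")]:
        pass
except ValueError as exc:
    message = str(exc)
print(message)
```

The first four entries are what dftest[0:4] slices out for the summary series, and dftest[4] is the dictionary the loop walks over.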

Answer 3

Here are the minor code changes I made:

Import only the few libraries needed:

import pandas as pd  # needed for pd.Series below
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, kpss

Then the Dickey-Fuller test:

def test_stationarity(timeseries):
    print("Results of Dickey-Fuller Test:")
    dftest = adfuller(timeseries, autolag='AIC')
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic','p-value','#Lags Used','Number of Observations Used'])
    for key, value in dftest[4].items():
        dfoutput['Critical Value (%s)' % key] = value
    print(dfoutput)

test_stationarity(ts)

Hope this helps.

Python time series Dickey-Fuller ValueError: too many values to unpack (expected 2)
