Question

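I have two dataframes that hold values and timestamps of the same parameter for two different kinds of people. The time series of values differ between the two, and I want to find a threshold beyond which a value no longer belongs in one dataframe and would more plausibly appear in the other:

For example, if a value in the first dataframe is too large, it could be treated as a value from the second dataframe.

To do this, I computed the following: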

df1['val'].describe()
df2['val'].describe()

With the following results:

df1:
count     5699.000000
mean        76.170276
std         12.266000
min         34.888889
25%         67.583333
50%         75.083333
75%         83.626786
max        119.000000
Name: val, dtype: float64

df2:
count    18070.000000
mean        74.344559
std          8.757128
min         34.000000
25%         69.154583
50%         73.917517
75%         79.250000
max        119.000000
Name: val, dtype: float64
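
To make the idea of a cutoff concrete, here is a minimal sketch of one possible reading, not a definitive method. It assumes each group's `val` is roughly normally distributed and that both groups are weighted equally (the different sample sizes, 5699 vs. 18070, are ignored). Under those assumptions, the cutoffs are the points where the two normal densities are equal. The function names `normal_pdf` and `gaussian_crossings` are hypothetical helpers introduced only for this illustration; the means and standard deviations are the ones reported by `describe()` above.

```python
import numpy as np


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and std sd at x."""
    return np.exp(-((x - mu) ** 2) / (2 * sd**2)) / (sd * np.sqrt(2 * np.pi))


def gaussian_crossings(mu1, sd1, mu2, sd2):
    """Return the sorted x values where N(mu1, sd1) and N(mu2, sd2) have equal density.

    Setting the two log-densities equal gives a quadratic a*x^2 + b*x + c = 0.
    """
    a = 1.0 / (2 * sd2**2) - 1.0 / (2 * sd1**2)
    b = mu1 / sd1**2 - mu2 / sd2**2
    c = mu2**2 / (2 * sd2**2) - mu1**2 / (2 * sd1**2) + np.log(sd2 / sd1)
    roots = np.roots([a, b, c])  # leading zeros are dropped if a == 0
    # Keep only real solutions (tiny imaginary parts are numerical noise)
    real_roots = [float(r.real) for r in roots if abs(r.imag) < 1e-9]
    return sorted(real_roots)


if __name__ == "__main__":
    # Summary statistics taken from the describe() output in the question
    cuts = gaussian_crossings(76.170276, 12.266, 74.344559, 8.757128)
    print("Equal-density cutoffs:", cuts)
    for x in (50.0, 75.0, 100.0):
        d1 = normal_pdf(x, 76.170276, 12.266)
        d2 = normal_pdf(x, 74.344559, 8.757128)
        print(x, "more likely from", "df1" if d1 > d2 else "df2")
```

Because df1 has the larger standard deviation, this approach yields two cutoffs: values between them are more typical of df2, and values outside them (very small or very large) are more typical of df1. If the normality assumption does not hold for the real data, the same comparison could be done with empirical or kernel density estimates instead.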

After this, I tried using the AR() or ARIMA() methods and checking for stationarity, but now I am not sure how to proceed. Here is some of the code I tried:

import matplotlib.pyplot as plt

# Rolling statistics over a 12-observation window
rolling_mean = df.rolling(window=12).mean()
rolling_std = df.rolling(window=12).std()

# Plot the original series together with its rolling mean and std
plt.plot(df, color='blue', label='Original')
plt.plot(rolling_mean, color='red', label='Rolling Mean')
plt.plot(rolling_std, color='black', label='Rolling Std')
plt.legend(loc='best')
plt.title('Rolling Mean & Rolling Standard Deviation')
plt.show()

from statsmodels.tsa.stattools import adfuller

# Augmented Dickey-Fuller test for stationarity
result = adfuller(df['Value'])
print('ADF Statistic: {}'.format(result[0]))
print('p-value: {}'.format(result[1]))
print('Critical Values:')
for key, value in result[4].items():
    print('\t{}: {}'.format(key, value))

'''
ARIMA model with AR of order 10, differencing of order 1 and MA of order 2
'''
decomposition = seasonal_decompose(df_log, freq=1)
model = ARIMA(df_log, order=(10,1,2))
results = model.fit(disp=-1)
plt.plot(df_log_shift)
plt.plot(results.fittedvalues, color='red')
plt.show()

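I got some results, along with somewhat better linearity and fitted lines, but I don't know how to find the cutoff point or how to take the analysis further.

Do you know how I should go about this? Or is there a better solution?

Thank you very much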

Finding a threshold line between pandas dataframes

0 Answers: