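I'm using the excellent pmdarima package and its auto_arima() function for time series analysis and forecasting on stock data. I can't quite figure out how it works, because all my attempts to correctly predict the test data have failed.
Any help is welcome: any solution could serve as a guide for experienced traders in the future.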
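Using two years of hourly historical OHLC data saved in a csv file, I let auto_arima() find the best ARIMA model. The market is cryptocurrency, so there is no seasonal term: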
import pandas as pd
import pmdarima as pm
from pmdarima.arima.utils import ndiffs

# `file` is the path to the tab-separated OHLC file
df = pd.read_csv(file, sep='\t')

# Divide the dataframe in train & test parts (chronological split, no shuffling)
train_size = int(0.8*len(df))
df_train, df_test = df[:train_size], df[train_size:]

# Estimate the number of differences to apply using the ADF test :
n_adf = ndiffs(df['close'], test='adf') # -> 0

# Use auto_arima to find the best p, q parameters for the ARIMA model, that minimizes AIC
model = pm.auto_arima(y = df_train['close'],
                      d = n_adf,
                      start_p = 1,
                      start_q = 1,
                      max_p = 5,
                      max_q = 5,
                      trend = 'ct', # add a constant and trend term to the equation
                      seasonal = False,
                      stepwise = True)
print(model.summary())
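This gives:
[model summary output not included]
The model is an AR(1): very simple, but why not? Let's call it model and try to predict the first 50 points of the test data: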
predictions = list(model.predict(n_periods=50))
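If we plot this:
[plot not included]
Why did auto_arima() choose it? The ct option (intercept and drift terms) also leads to an AR(1), but with much worse results (also a straight line). Any guidance would be greatly appreciated, thanks!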
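For context, here is a minimal sketch of how a multi-step forecast from an AR(1) is built up, which may help when reading a flat-looking prediction. It assumes the fitted equation has the form y_t = const + drift*t + phi*y_{t-1} + e_t and that future shocks are replaced by their expected value of zero. The function name `ar1_forecast` and every number below are hypothetical, chosen only for illustration; none of them come from the fitted model in this post.

```python
def ar1_forecast(last_value, phi, const=0.0, drift=0.0, t_start=0, n_periods=50):
    """Recursive multi-step forecast for y_t = const + drift*t + phi*y_{t-1} + e_t.

    Future shocks e_t are unknown, so each step uses their expected value (zero)
    and feeds the previous forecast back in as if it were an observation.
    """
    forecasts = []
    prev = last_value
    for h in range(1, n_periods + 1):
        prev = const + drift * (t_start + h) + phi * prev
        forecasts.append(prev)
    return forecasts


# Hypothetical parameters (NOT taken from the model above).
path = ar1_forecast(last_value=100.0, phi=0.9, const=5.0, n_periods=50)

# With no drift and |phi| < 1, the forecast decays toward this level.
long_run_mean = 5.0 / (1 - 0.9)

# With phi very close to 1, 50 steps are too few for any visible decay,
# so the forecast path looks almost like a straight line.
near_unit_root = ar1_forecast(last_value=100.0, phi=0.999, const=0.05, n_periods=50)

print(path[:3], path[-1], long_run_mean)
print(near_unit_root[0], near_unit_root[-1])
```

Because no new observations enter the recursion, the path carries no noise: it is a smooth curve toward the long-run level, or toward a linear trend when a drift term is present. If the goal is to compare against the test data point by point, one option is to forecast one step at a time and feed each observed value back in before the next step. I believe pmdarima's `ARIMA.update()` is meant for that, but please check the documentation for the version you have installed.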