I have a vector of time series model names, shown below; each element of the vector is the name of one model:
[1] "ARIMA(2,1,0) with drift" "ARIMA(2,0,0) with non-zero mean" "ARIMA(2,0,0) with non-zero mean" "ARIMA(2,0,0) with non-zero mean" "ARIMA(0,0,1) with non-zero mean"
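Each element contains five distinct parts:
1) Model name: the model name always comes before the parentheses; here "ARIMA" is the model name. (ARIMA, short for AutoRegressive Integrated Moving Average, is a forecasting technique that predicts future values of a series based entirely on its own inertia.)
2) Autoregressive part (the AR part, called "p"): the first number inside the parentheses, before the first comma. For the vector above, the AR values are 2, 2, 2, 2, 0.
3) Differencing part (called "d"): the second element inside the parentheses, after the first comma, is the order of differencing. In this example the values are 1, 0, 0, 0, 0.
4) Moving average part (the MA part, called "q"): the last element inside the parentheses. In this example the values are 0, 0, 0, 0, 1.
5) The two other possible parts after "with": drift and non-zero mean.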
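The problem is that I need to extract these elements from the vector of model names.
Given a model name, I want to write a program that extracts the following: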
1. Name of the model eg: ARIMA
2. Number of AR coefficients
3. Number of MA coefficients
4. Order of differencing
5. Whether the model has a drift or not
6. Whether it has a non-zero mean or not
My output should look like this:
Model p d q outcome_with_drift outcome_with_non_zero_mean
1 ARIMA 2 1 0 1 0
2 ARIMA 2 0 0 0 1
3 ARIMA 2 0 0 0 1
4 ARIMA 2 0 0 0 1
5 ARIMA 0 0 1 0 1
Answer 0 (score: 2)
You can use library(stringr) to split the vector into separate columns. For example, if vect is a vector with the following input:
vect <- c("ARIMA(2,1,0) with drift", "ARIMA(2,0,0) with non-zero mean" ,"ARIMA(2,0,0) with non-zero mean" ,
"ARIMA(2,0,0) with non-zero mean" ,"ARIMA(0,0,1) with non-zero mean")
Then use str_split_fixed to split it into separate columns, as follows:
library(stringr)
df <- data.frame(str_split_fixed(vect,"\\s|\\(|\\)|,",n=5))
# The separators are whitespace (\\s), parentheses (\\( and \\)) and commas (,);
# n=5 keeps everything after the closing parenthesis together as the fifth piece
names(df) <- c("Model","p","d","q","outcome")
# Columns are named to match the question, following the ARIMA(p,d,q) convention:
# p is the number of autoregressive (AR) terms
# d is the number of nonseasonal differences needed for stationarity (order of differencing)
# q is the number of lagged forecast errors in the prediction equation (MA terms)
df$outcome_ <- gsub("\\s|-","_",trimws(df$outcome))
#cleaning the outcome column by replacing spaces and dashes with underscores
dummy_mat <- data.frame(model.matrix(~outcome_-1,data=df))
#using model.matrix to calculate the dummies for drift and non zero mean, for the value of 1 meaning True and 0 meaning False
df_final <- data.frame(df[,1:4],dummy_mat)
Result:
# Model p d q outcome_with_drift outcome_with_non_zero_mean
# 1 ARIMA 2 1 0 1 0
# 2 ARIMA 2 0 0 0 1
# 3 ARIMA 2 0 0 0 1
# 4 ARIMA 2 0 0 0 1
# 5 ARIMA 0 0 1 0 1
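As a complement to the stringr approach above, here is a minimal base R sketch of the same extraction using a single regular expression. The function name parse_arima_names is hypothetical (it is not part of any package), and the sketch assumes every name has the form NAME(p,d,q) optionally followed by a "with ..." suffix. It returns p, d and q as integers rather than strings, and it computes the two flags with grepl, so both flag columns are always present regardless of which outcomes occur in the input.

```r
# Hypothetical helper: parse strings such as "ARIMA(2,1,0) with drift".
# Assumes the form NAME(p,d,q) followed by an optional suffix.
parse_arima_names <- function(x) {
  # Capture groups: model name, p, d, q, and whatever follows the ")"
  pattern <- "^([A-Za-z]+)\\(([0-9]+),([0-9]+),([0-9]+)\\) *(.*)$"
  m <- regmatches(x, regexec(pattern, x))

  # Each successful match has 6 entries: the full match plus 5 groups
  ok <- lengths(m) == 6
  if (!all(ok)) {
    stop("Could not parse: ", paste(x[!ok], collapse = "; "))
  }

  mat <- do.call(rbind, m)  # one row per model name
  data.frame(
    Model = mat[, 2],
    p = as.integer(mat[, 3]),
    d = as.integer(mat[, 4]),
    q = as.integer(mat[, 5]),
    # 1 means the suffix mentions the term, 0 means it does not
    outcome_with_drift = as.integer(grepl("drift", mat[, 6], fixed = TRUE)),
    outcome_with_non_zero_mean =
      as.integer(grepl("non-zero mean", mat[, 6], fixed = TRUE)),
    stringsAsFactors = FALSE
  )
}

vect <- c("ARIMA(2,1,0) with drift", "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(2,0,0) with non-zero mean", "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(0,0,1) with non-zero mean")
df_base <- parse_arima_names(vect)
print(df_base)
```

Stopping on unparseable names is a design choice for this sketch; returning NA rows instead would be an equally reasonable alternative if the input may contain other name formats.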