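I'm trying to automate a process in which I have several similarly structured data frames that I want to compare row by row (note that the accuracy measures below come from two different ARIMA models).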
#example:
>rfaList
[[1]]
ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
Accuracy of arima: training set with 25 observations -16.33875 88.05937 44.30480 -3.328660 11.07727 0.4333803 0.27705862 NA
test set: Observations 26 to 31 -182.39756 230.02043 182.39756 -49.975717 49.97572 1.7841747 0.09691896 0.6185846
Accuracy of arima: training set with 26 observations -17.85131 87.38139 45.85696 -5.374825 12.98175 0.4412353 0.27525790 NA
test set: Observations 27 to 32 -211.75054 260.66359 211.75054 -45.792264 45.79226 2.0374623 0.29132058 0.5969906
Accuracy of arima: training set with 27 observations -17.91701 85.88447 45.52783 -5.639418 13.05428 0.4380484 0.27217167 NA
test set: Observations 28 to 33 -202.97126 255.09522 202.97126 -43.606832 43.60683 1.9528986 0.10820455 0.4102644
Accuracy of arima: training set with 28 observations -21.68377 89.78126 50.36505 -7.270820 14.70567 0.4504420 0.23118459 NA
test set: Observations 29 to 34 -139.19230 219.54884 170.97994 -37.980345 41.73775 1.5291666 0.18220954 1.0032026
Accuracy of arima: training set with 29 observations -19.83452 90.02895 51.04194 -7.174426 14.72279 0.4718395 0.17271443 NA
test set: Observations 30 to 35 -194.76221 246.17176 194.76221 -64.531004 64.53100 1.8004120 0.37381607 1.6133421
Accuracy of arima: training set with 30 observations -25.56542 97.05813 56.99899 -7.840041 15.30587 0.4913706 0.05417828 NA
test set: Observations 31 to 36 -114.66254 166.27998 122.79252 -42.088922 46.53154 1.0585562 0.27582130 0.6896271
[[2]]
ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
Accuracy of arima: training set with 25 observations -4.3705239 133.8019 98.73919 -7.131231 23.93680 0.9658461 0.322173290 NA
test set: Observations 26 to 31 -47.3515232 123.8478 94.81189 -15.432437 21.46345 0.9274301 -0.133914435 0.3455463
Accuracy of arima: training set with 26 observations -5.2737700 129.5009 95.00013 -7.538341 23.56918 0.9140906 0.319144732 NA
test set: Observations 27 to 32 -48.3440533 131.9764 111.74678 -11.416319 22.63950 1.0752267 0.119835485 0.3682129
Accuracy of arima: training set with 27 observations -4.0336280 125.4884 91.61947 -6.562458 22.94511 0.8815215 0.311706961 NA
test set: Observations 28 to 33 -71.0070836 150.0291 133.95266 -13.869465 30.24818 1.2888325 -0.004516046 0.3097133
Accuracy of arima: training set with 28 observations -7.5355474 125.2524 92.29724 -7.760418 23.60083 0.8254644 0.298939410 NA
test set: Observations 29 to 34 51.1304141 143.2634 128.10709 37.084918 51.30762 1.1457314 0.177443928 0.8306461
Accuracy of arima: training set with 29 observations 0.4647981 128.6868 96.13299 -6.450063 23.46775 0.8886682 0.214122590 NA
test set: Observations 30 to 35 -196.8589380 228.0361 196.85894 -81.747826 81.74783 1.8197944 0.304335026 2.0842492
Accuracy of arima: training set with 30 observations -5.9968050 133.1665 100.46128 -7.364505 23.96390 0.8660455 0.117534084 NA
test set: Observations 31 to 36 17.5029009 120.7836 115.17301 46.629747 64.77070 0.9928707 0.316970745 1.0940114
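My initial idea was to put the data frames in a list and then select the same-numbered row from every item in the list, but I couldn't get that to work.
I can easily find ways to select multiple items from a list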
rfaList[c(1,2)]
or to select one item from a list and then select rows from it
rfaList[[1]][2,]
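but I can't find a way to do both at once.
Ideally, I want to break the data frames apart and recombine them into pairs (or triples, quadruples, or however many models I need to compare) of all the rows with the same row number.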
#Example
>rbind(model1rfa[2,], model2rfa[2,])
ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
test set: Observations 26 to 31 -182.39756 230.0204 182.39756 -49.97572 49.97572 1.7841747 0.09691896 0.6185846
test set: Observations 26 to 31-1 -47.35152 123.8478 94.81189 -15.43244 21.46345 0.9274301 -0.13391443 0.3455463
Is this possible, or do I have to write some kind of double loop to do it?
Edit: here is the function, following jimmyb's answer:
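mergeRFAs <- function(rfaList){ # assumes all rfa's have the same number of rows
  iterations <- nrow(rfaList[[1]])
  rlist <- list()
  # every second row (2, 4, ...) is a test-set row; seq_len() avoids the
  # 1:0 trap when there are fewer than two rows
  for (i in seq_len(iterations %/% 2)){
    rlist[[i]] <- do.call(rbind, lapply(rfaList, function(z) z[2*i,]))
  }
  return(rlist)
}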
Answer 0 (score: 4)
Use lapply, then:
do.call(rbind, lapply(rfaList, function(z) z[2,]))
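To make the one-liner above easier to try out, here is a minimal sketch that wraps it in a helper taking the row index as an argument. The helper name pickRow and the toy data frames are hypothetical; the toy values are just the first few ME and RMSE entries from the question, rounded, and the list is assumed to be named so the rows can be labelled by model.

```r
# Toy stand-ins for the accuracy tables (first four rows of ME and RMSE
# from the question, rounded to one decimal)
model1 <- data.frame(ME   = c(-16.3, -182.4, -17.9, -211.8),
                     RMSE = c(88.1, 230.0, 87.4, 260.7))
model2 <- data.frame(ME   = c(-4.4, -47.4, -5.3, -48.3),
                     RMSE = c(133.8, 123.8, 129.5, 132.0))
dfList <- list(model1 = model1, model2 = model2)

# Hypothetical helper: collect row i from every data frame in the list
pickRow <- function(dfList, i) {
  out <- do.call(rbind, lapply(dfList, function(z) z[i, , drop = FALSE]))
  rownames(out) <- names(dfList)  # label each row by the list element it came from
  out
}

# Row 2 of every model, side by side
row2 <- pickRow(dfList, 2)

# One comparison table per row index (assumes equal row counts)
allRows <- lapply(seq_len(nrow(dfList[[1]])), function(i) pickRow(dfList, i))
```

Passing a different index vector to the outer lapply (for example seq(2, nrow(dfList[[1]]), by = 2)) would restrict the result to the test-set rows only.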
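Answer 1 (score: 0)
I would add ids to all the data.frames to keep track of the data, rbind them together, and then split along the row id. You end up with a list in which each element is a data.frame of all the nth rows, which you can process with lapply.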
# create starting data.frames and combined list
l1 <- data.frame(r1 = c(1,3,4,6,7),r2 = c(5,7,2,4,5))
l2 <- data.frame(r1 = c(6,3,2,3,6),r2 = c(8,56,4,3,5))
l3 <- data.frame(r1 = c(6,3,7,3,2),r2 = c(8,1,3,3,5))
xlist <- list(l1,l2,l3)
# add data.frame and row id's
xlist <- mapply(FUN = function(i, y){
  y$dataframeid <- i
  y$rowid <- seq_len(nrow(y))
  y
}, i = seq_along(xlist), y = xlist, SIMPLIFY = FALSE)
# rbind the list (other faster methods are available if needed)
df <- do.call("rbind",xlist)
# split into list of matching rows
rowlist <- split(df,f = df$rowid)
# do comparison on each data.frame of rows
results <- lapply(rowlist, function(x){
  # comparison function
  paste('mean', mean(x$r1), 'sd', sd(x$r1))
})
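If the comparison should return a value rather than a string, the same tag, rbind, split steps can feed vapply instead. This is only a sketch on the toy data above: the names tagged and best are made up for illustration, and "smallest r1 wins" is an assumed comparison rule, not something from the question.

```r
# Same toy data frames as above
l1 <- data.frame(r1 = c(1,3,4,6,7), r2 = c(5,7,2,4,5))
l2 <- data.frame(r1 = c(6,3,2,3,6), r2 = c(8,56,4,3,5))
l3 <- data.frame(r1 = c(6,3,7,3,2), r2 = c(8,1,3,3,5))
xlist <- list(l1, l2, l3)

# Tag each row with its source data frame and its row position
tagged <- Map(function(i, y) {
  y$dataframeid <- i
  y$rowid <- seq_len(nrow(y))
  y
}, seq_along(xlist), xlist)

# Stack everything, then split into one data.frame per row position
df <- do.call(rbind, tagged)
rowlist <- split(df, df$rowid)

# For each row position, the id of the data frame with the smallest r1
# (which.min() returns the first minimum, so ties go to the earlier data frame)
best <- vapply(rowlist, function(x) x$dataframeid[which.min(x$r1)], integer(1))
```

vapply is used here instead of lapply so that the result is a plain integer vector, one entry per row position, and a comparison function that returns the wrong type fails loudly.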