Question

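I'm trying to automate a process in which I have several similarly structured data frames that I want to compare row by row (note that the accuracy tables below come from two different arima models).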

#example:
>rfaList
[[1]]
                                                             ME      RMSE           MAE        MPE     MAPE      MASE       ACF1 Theil's U
Accuracy of arima: training set with 25 observations  -16.33875  88.05937      44.30480  -3.328660 11.07727 0.4333803 0.27705862        NA
test set: Observations 26 to 31                      -182.39756 230.02043     182.39756 -49.975717 49.97572 1.7841747 0.09691896 0.6185846
Accuracy of arima: training set with 26 observations  -17.85131  87.38139      45.85696  -5.374825 12.98175 0.4412353 0.27525790        NA
test set: Observations 27 to 32                      -211.75054 260.66359     211.75054 -45.792264 45.79226 2.0374623 0.29132058 0.5969906
Accuracy of arima: training set with 27 observations  -17.91701  85.88447      45.52783  -5.639418 13.05428 0.4380484 0.27217167        NA
test set: Observations 28 to 33                      -202.97126 255.09522     202.97126 -43.606832 43.60683 1.9528986 0.10820455 0.4102644
Accuracy of arima: training set with 28 observations  -21.68377  89.78126      50.36505  -7.270820 14.70567 0.4504420 0.23118459        NA
test set: Observations 29 to 34                      -139.19230 219.54884     170.97994 -37.980345 41.73775 1.5291666 0.18220954 1.0032026
Accuracy of arima: training set with 29 observations  -19.83452  90.02895      51.04194  -7.174426 14.72279 0.4718395 0.17271443        NA
test set: Observations 30 to 35                      -194.76221 246.17176     194.76221 -64.531004 64.53100 1.8004120 0.37381607 1.6133421
Accuracy of arima: training set with 30 observations  -25.56542  97.05813      56.99899  -7.840041 15.30587 0.4913706 0.05417828        NA
test set: Observations 31 to 36                      -114.66254 166.27998     122.79252 -42.088922 46.53154 1.0585562 0.27582130 0.6896271

[[2]]
                                                               ME     RMSE       MAE        MPE     MAPE      MASE         ACF1 Theil's U
Accuracy of arima: training set with 25 observations   -4.3705239 133.8019      98.73919  -7.131231 23.93680 0.9658461  0.322173290        NA
test set: Observations 26 to 31                       -47.3515232 123.8478      94.81189 -15.432437 21.46345 0.9274301 -0.133914435 0.3455463
Accuracy of arima: training set with 26 observations   -5.2737700 129.5009      95.00013  -7.538341 23.56918 0.9140906  0.319144732        NA
test set: Observations 27 to 32                       -48.3440533 131.9764     111.74678 -11.416319 22.63950 1.0752267  0.119835485 0.3682129
Accuracy of arima: training set with 27 observations   -4.0336280 125.4884      91.61947  -6.562458 22.94511 0.8815215  0.311706961        NA
test set: Observations 28 to 33                       -71.0070836 150.0291     133.95266 -13.869465 30.24818 1.2888325 -0.004516046 0.3097133
Accuracy of arima: training set with 28 observations   -7.5355474 125.2524      92.29724  -7.760418 23.60083 0.8254644  0.298939410        NA
test set: Observations 29 to 34                        51.1304141 143.2634     128.10709  37.084918 51.30762 1.1457314  0.177443928 0.8306461
Accuracy of arima: training set with 29 observations    0.4647981 128.6868      96.13299  -6.450063 23.46775 0.8886682  0.214122590        NA
test set: Observations 30 to 35                      -196.8589380 228.0361     196.85894 -81.747826 81.74783 1.8197944  0.304335026 2.0842492
Accuracy of arima: training set with 30 observations   -5.9968050 133.1665     100.46128  -7.364505 23.96390 0.8660455  0.117534084        NA
test set: Observations 31 to 36                        17.5029009 120.7836     115.17301  46.629747 64.77070 0.9928707  0.316970745 1.0940114

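My first idea was to put the data frames in a list and then select the rows with the same row number from every item in the list, but I can't get it to work.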

I can easily find ways to select multiple items from a list:

rfaList[c(1,2)]

or to select one item from the list and then select rows from it:

rfaList[[1]][2,]

But I can't find a way to do both at once.

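Ideally, I'd like to break the data frames apart and reassemble them into pairs (or triples, or quadruples, or however many models I need to compare) of all the rows with the same row number.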

#Example
>rbind(model1rfa[2,], model2rfa[2,])
                                        ME     RMSE       MAE       MPE     MAPE      MASE        ACF1 Theil's U
test set: Observations 26 to 31  -182.39756 230.0204 182.39756 -49.97572 49.97572 1.7841747  0.09691896 0.6185846
test set: Observations 26 to 31-1  -47.35152 123.8478  94.81189 -15.43244 21.46345 0.9274301 -0.13391443 0.3455463

Is this possible, or do I have to write some kind of double loop to do it?

Edit: following jimmyb's answer:

mergeRFAs <- function(rfaList){ # assumes all rfa's have the same number of rows
  iterations <- nrow(rfaList[[1]])
  rlist <- list()
  # rows alternate training/test; collect each test-set (even) row across all models
  for (i in seq_len(iterations %/% 2)){
    rlist[[i]] <- do.call(rbind, lapply(rfaList, function(z) z[2*i,]))
  }
  return(rlist)
}
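
The same regrouping can probably be written without an explicit loop. The following is only a sketch, assuming every data frame has the same number of rows; `groupRows`, `m1`, `m2`, and the rounded values are hypothetical names and data made up for illustration, not part of the original code.

```r
# Hypothetical helper: returns one data frame per requested row index,
# each holding that row taken from every element of dfList
groupRows <- function(dfList, rows = seq_len(nrow(dfList[[1]]))) {
  lapply(rows, function(i) {
    do.call(rbind, lapply(dfList, function(z) z[i, , drop = FALSE]))
  })
}

# Toy stand-ins for two accuracy tables (training and test rows alternate)
m1 <- data.frame(ME = c(-16, -182, -18, -212), RMSE = c(88, 230, 87, 261))
m2 <- data.frame(ME = c(-4, -47, -5, -48), RMSE = c(134, 124, 130, 132))

# Keep only the even (test set) rows, as mergeRFAs does
testRows <- groupRows(list(m1, m2), rows = seq(2, nrow(m1), by = 2))
print(testRows[[1]])
```

Passing the row indices as an argument means the same helper could also return the training rows (odd indices) or every row, which is the default.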

Answer 1

Use lapply, then do.call:

do.call(rbind, lapply(rfaList, function(z) z[2,]))
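
As a minimal sketch of what that one-liner does: `lapply` pulls row 2 out of every list element, and `do.call(rbind, ...)` stacks the pieces into a single data frame. The data frames `a` and `b`, the list `toyList`, and their rounded values are hypothetical stand-ins, not the real `rfaList`.

```r
# Hypothetical stand-ins for the accuracy tables in rfaList
a <- data.frame(ME = c(-16.3, -182.4), RMSE = c(88.1, 230.0))
b <- data.frame(ME = c(-4.4, -47.4), RMSE = c(133.8, 123.8))
toyList <- list(model1 = a, model2 = b)

# Take row 2 from every data frame, then stack the rows into one data frame
row2 <- do.call(rbind, lapply(toyList, function(z) z[2, ]))
print(row2)
```

Replacing the `2` with a loop variable gives one combined data frame per row number, which is what the edit in the question does.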

Answer 2

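I would combine all the data.frames: add ids to keep track of the data, rbind them, and then split along the row id. You end up with a list in which each element is a data.frame holding the n-th row of every input, which you can then process with lapply.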

# create starting data.frames and combined list
l1 <- data.frame(r1 = c(1,3,4,6,7),r2 = c(5,7,2,4,5))
l2 <- data.frame(r1 = c(6,3,2,3,6),r2 = c(8,56,4,3,5))
l3 <- data.frame(r1 = c(6,3,7,3,2),r2 = c(8,1,3,3,5))
xlist <- list(l1,l2,l3)

# add data.frame and row id's
xlist <- mapply(FUN = function(i,y){
  y$dataframeid <- i
  y$rowid <- seq_len(nrow(y))
  y
},i = seq_along(xlist),y=xlist,SIMPLIFY = FALSE)

# rbind the list (other faster methods are available if needed)
df <- do.call("rbind",xlist)

# split into list of matching rows
rowlist <- split(df,f = df$rowid)

# do comparison on each data.frame of rows
results <- lapply(rowlist,function(x){
  # comparison function
  paste('mean',mean(x$r1),'sd',sd(x$r1))
})
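
If a numeric result per row position is more convenient than a pasted string, the same tag, stack, and split steps can feed `sapply` instead. This is a reduced sketch on two made-up data frames; `d1`, `d2`, `dfs`, `tagged`, `stacked`, `byRow`, and `meanR1` are hypothetical names chosen for illustration.

```r
# Two small made-up data frames with the same columns
d1 <- data.frame(r1 = c(1, 3), r2 = c(5, 7))
d2 <- data.frame(r1 = c(6, 3), r2 = c(8, 56))
dfs <- list(d1, d2)

# Tag each row with its source data frame and its row number
tagged <- Map(function(i, y) {
  y$dataframeid <- i
  y$rowid <- seq_len(nrow(y))
  y
}, seq_along(dfs), dfs)

# Stack everything, then split so each element holds the n-th row of every input
stacked <- do.call(rbind, tagged)
byRow <- split(stacked, stacked$rowid)

# One number per row position: the mean of r1 across the data frames
meanR1 <- sapply(byRow, function(x) mean(x$r1))
print(meanR1)
```

The names of `byRow` and `meanR1` are the row ids, so each result can be traced back to its row position.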

R - Is it possible to select multiple items from a list and, at the same time, select items from within each of them?