Split, apply, and combine multiple data frames into one data frame

Date: 2013-12-26 21:08:18

Tags: r split dataframe apply dbf

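I have computed an origin-destination (OD) cost matrix in ArcGIS for travel over a street network (23 origins, roughly 600,000 destinations), and used a Python script to split the resulting matrix into DBF tables by store ID. I loaded each DBF table into an R session as follows: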

library(foreign) # Provides read.dbf()

# Import OD cost matrix results for each store
origins <- read.dbf('ODM_origins.dbf')
store_17318 <- read.dbf('table_17318.dbf')
store_17358 <- read.dbf('table_17358.dbf')
store_17601 <- read.dbf('table_17601.dbf')
store_17771 <- read.dbf('table_17771.dbf')
store_18068 <- read.dbf('table_18068.dbf')
store_18261 <- read.dbf('table_18261.dbf')
store_18289 <- read.dbf('table_18289.dbf')
store_18329 <- read.dbf('table_18329.dbf')
store_18393 <- read.dbf('table_18393.dbf')
store_18503 <- read.dbf('table_18503.dbf')
store_18522 <- read.dbf('table_18522.dbf')
store_19325 <- read.dbf('table_19325.dbf')
store_19454 <- read.dbf('table_19454.dbf')
store_20068 <- read.dbf('table_20068.dbf')
store_20238 <- read.dbf('table_20238.dbf')
store_20292 <- read.dbf('table_20292.dbf')
store_20435 <- read.dbf('table_20435.dbf')
store_20465 <- read.dbf('table_20465.dbf')
store_20999 <- read.dbf('table_20999.dbf')
store_22686 <- read.dbf('table_22686.dbf')
store_22715 <- read.dbf('table_22715.dbf')
store_24445 <- read.dbf('table_24445.dbf')
store_24446 <- read.dbf('table_24446.dbf')
ID <- as.vector(origins$Name) # Create vector of store IDs
object_list <- ls(pattern = "^store_") # Create character vector of DBF object names
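The repeated read.dbf calls above could also be generated in a loop. The following is a minimal sketch, not the code actually used here: it assumes the files follow the table_<id>.dbf naming shown above, and it writes two tiny made-up DBF files to a temporary directory (the names dbf_dir, toy, and stores are hypothetical) so that it runs on its own.

```r
library(foreign)  # provides read.dbf() and write.dbf()

# Hypothetical setup so the sketch runs anywhere: write two tiny DBF files
# with made-up values into a temporary directory.
dbf_dir <- tempfile("odm_")
dir.create(dbf_dir)
for (id in c("17318", "17358")) {
  toy <- data.frame(ORIGINID = 25L, TOTAL_TRAV = c(0.2, 0.4, 0.6))
  write.dbf(toy, file.path(dbf_dir, paste0("table_", id, ".dbf")))
}

# Find every table_<id>.dbf file and read each one into a list element
files <- list.files(dbf_dir, pattern = "^table_.*\\.dbf$", full.names = TRUE)
stores <- lapply(files, read.dbf)

# Name the elements store_<id>, mirroring the object names used above
names(stores) <- sub("^table_(.*)\\.dbf$", "store_\\1", basename(files))

names(stores)
```

Keeping the tables in one named list, rather than in 23 separate objects, means later steps can iterate over the list directly instead of looking objects up by name.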

Each data frame has the following layout:

> head(store_17318)
  OID_          NAME ORIGINID DESTINATIO DESTINAT_1 TOTAL_TRAV SHAPE_LENG
1    0 17318 - 17318       25       5367          1  0.2056914   202.2393
2    0 17318 - 17318       25       5368          2  0.2056914   202.2393
3    0 17318 - 17318       25       5381          5  0.2432538   224.3947
4    0 17318 - 17318       25       5382          6  0.2432538   224.3947
5    0 17318 - 17318       25       5362          7  0.3670772   294.8987
6    0 17318 - 17318       25       5363          8  0.3670772   294.8987

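For each data frame, I want to compute summary statistics (mean, SD) of travel time by store ID and write them to a new data frame. This looks like a standard split-apply-combine workflow, except that the data are already split across multiple objects. Any help with this would be much appreciated.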

1 Answer:

Answer 0: (score: 2)

You can use sapply:

res <- sapply(ls(pattern = "store_"), function(x) {
  tmp <- get(x)$TOTAL_TRAV
  c(mean = mean(tmp), SD = sd(tmp))
})

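This returns a matrix. Each column corresponds to one store (the column names are the object names, e.g. store_17318). The two rows contain the mean and the standard deviation.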
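To see the shape of the result without the real DBF tables, the same call can be tried on toy data. This is only a sketch: the two data frames below are made-up stand-ins for the store tables, and it assumes no other objects whose names start with store_ exist in the workspace.

```r
# Made-up stand-ins for two of the store tables (values are illustrative only)
store_17318 <- data.frame(NAME = "17318 - 17318", TOTAL_TRAV = c(0.2, 0.4, 0.6))
store_17358 <- data.frame(NAME = "17358 - 17358", TOTAL_TRAV = c(1.0, 3.0))

# Same approach as above: look up each object by name and summarise one column
res <- sapply(ls(pattern = "^store_"), function(x) {
  tmp <- get(x)$TOTAL_TRAV
  c(mean = mean(tmp), SD = sd(tmp))
})

dim(res)       # rows: the two statistics; columns: one per matching object
rownames(res)  # "mean" "SD"
colnames(res)  # the object names returned by ls()
```

Anchoring the pattern with ^ restricts the match to names that start with store_, which avoids picking up unrelated objects that merely contain that string.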

You can convert this matrix to a (transposed) data frame with

as.data.frame(t(res))

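Here, the two columns contain the mean and the standard deviation. The row names identify the stores (they are the object names, e.g. store_17318).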
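A different route to the same summary, closer to the split-apply-combine wording in the question, is to stack the tables first and then group by the NAME column. This is a sketch of an alternative, not the method given in the answer above; the toy data frames and the names stores, all_stores, and summary_df are assumptions made for illustration.

```r
# Made-up stand-ins for the store tables; in the session from the question this
# list could be built with mget(ls(pattern = "^store_")).
stores <- list(
  store_17318 = data.frame(NAME = "17318 - 17318", TOTAL_TRAV = c(0.2, 0.4, 0.6)),
  store_17358 = data.frame(NAME = "17358 - 17358", TOTAL_TRAV = c(1.0, 3.0))
)

# Combine: stack all tables into one long data frame
all_stores <- do.call(rbind, stores)

# Split and apply: one mean and one SD per value of NAME
means <- aggregate(TOTAL_TRAV ~ NAME, data = all_stores, FUN = mean)
sds   <- aggregate(TOTAL_TRAV ~ NAME, data = all_stores, FUN = sd)

# Combine the two summaries into a single data frame keyed by NAME
summary_df <- merge(means, sds, by = "NAME", suffixes = c("_mean", "_sd"))
summary_df
```

Here the store identifier ends up in an ordinary column (NAME) rather than in the row names, which can be more convenient when the result is written to a file or joined to other tables.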