Question

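I have a directory of CSV files that all share the same structure. I'm trying to load all of them into a single data.frame. Currently I use lapply() and read.csv() to get a list of data.frames, and I'm looking for an elegant way to convert this list into one data.frame without an explicit loop.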

The result of my lapply(list.of.file.names, read.csv) can be approximated by this structure:

list.of.dfs <- list(data.frame(A=sample(seq(from = 1, to = 10), size = 5),
                               B=sample(seq(from = 1, to = 10), size = 5)), 
                    data.frame(A=sample(seq(from = 1, to = 10), size = 5),
                               B=sample(seq(from = 1, to = 10), size = 5)), 
                    data.frame(A=sample(seq(from = 1, to = 10), size = 5),
                               B=sample(seq(from = 1, to = 10), size = 5))
                    )

I'm looking for an elegant version of the following line that works for a list of arbitrary length:

one.data.frame <- rbind(list.of.dfs[[1]],list.of.dfs[[2]],list.of.dfs[[3]])

I could do this with a for loop, but is there a vector-based solution?

Answer 1

do.call is the base R way to do this.

do.call(rbind, list.of.dfs)
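To tie this back to the directory of CSV files in the question, here is a minimal sketch of the whole pipeline. The temporary directory, the file names (part1.csv and so on), and the column values are made up for illustration; only list.files(), read.csv(), lapply(), and do.call() from base R are used.

```r
# Hypothetical setup: write three small CSV files with identical columns
# into a fresh subdirectory of tempdir() so the example is self-contained.
csv.dir <- file.path(tempdir(), "csv_example")
dir.create(csv.dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(A = 1:5, B = (1:5) * i),
            file.path(csv.dir, paste0("part", i, ".csv")),
            row.names = FALSE)
}

# Collect the file paths, read each file, then bind all rows in one call.
list.of.file.names <- list.files(csv.dir, pattern = "\\.csv$", full.names = TRUE)
list.of.dfs <- lapply(list.of.file.names, read.csv)

# do.call() passes every element of the list as a separate argument to rbind(),
# so this works for a list of any length.
one.data.frame <- do.call(rbind, list.of.dfs)
dim(one.data.frame)
```

Because do.call() expands the list into the arguments of rbind(), the call is equivalent to writing rbind(list.of.dfs[[1]], list.of.dfs[[2]], ...) by hand.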

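However, it can be slow if you have many data frames. Other discussions on S.O. focus on how to speed it up by using custom functions or the data.table or plyr packages. E.g.: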

Why is rbindlist "better" than rbind?

Can rbind be parallelized in R?

Performance of rbind.data.frame

Answer 2

@thelatemail alluded to it, but for speed you may want to use the following:

rbindlist(list.of.dfs)

(requires library(data.table))
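One thing to be aware of is that rbindlist() returns a data.table rather than a plain data.frame. The following is a small sketch, with made-up example data, that converts the result back and falls back to do.call(rbind, ...) when data.table is not installed; the fallback branch is an assumption added for illustration, not part of the original answer.

```r
# Made-up example data: two data frames with the same columns.
list.of.dfs <- list(data.frame(A = 1:3, B = 4:6),
                    data.frame(A = 7:9, B = 10:12))

if (requireNamespace("data.table", quietly = TRUE)) {
  combined <- data.table::rbindlist(list.of.dfs)
  # rbindlist() returns a data.table; convert if a plain data.frame is needed.
  one.data.frame <- as.data.frame(combined)
} else {
  # Fallback (illustrative assumption) when data.table is not available.
  one.data.frame <- do.call(rbind, list.of.dfs)
}

str(one.data.frame)
```

If the rest of the code already works with data.table objects, the as.data.frame() step can be skipped.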

Simplify an arbitrarily long list of data frames into a single data frame
