Using lapply with read_csv_chunked

Time: 2019-06-26 17:09:20

Tags: r apply tidyverse readr

I am trying to use lapply with read_csv_chunked to import multiple .csv files of at least 1 GB each. To create sample data, see:

library(tidyverse)  # attaches readr (write_csv, read_csv_chunked) and stringr (word)
A <- data.frame(id = sample(c("A","B","C"), 100, replace = T), k = rnorm(100), j = rnorm(100), l = rnorm(100)); write_csv(A, "A.csv")
B <- data.frame(id_limp = sample(c("A","B","C"), 100, replace = T), k = rnorm(100), j = rnorm(100), l = rnorm(100)); write_csv(B, "B.csv")
C <- data.frame(id = sample(c("A","B","C"), 100, replace = T), k = rnorm(100), j = rnorm(100), l = rnorm(100)); write_csv(C, "C.csv")
D <- data.frame(id_samp = sample(c("A","B","C"), 100, replace = T), k = rnorm(100), j = rnorm(100), l = rnorm(100)); write_csv(D, "D.csv")
E <- data.frame(id = sample(c("A","B","C"), 100, replace = T), k = rnorm(100), j = rnorm(100), l = rnorm(100)); write_csv(E, "E.csv")

The function used with read_csv_chunked is as follows:

f <- function(x, pos) if ("id" %in% names(x)) {subset(x, id == "A")} else if ("id_limp" %in% names(x)) {subset(x, id_limp == "A")} else {subset(x, id_samp == "A")}

file_list  <- list.files(pattern = ".csv")
data       <- lapply(file_list, read_csv_chunked, callback = DateFrameCallback$new(f),chunk_size = 10)
names(data)<- tolower(word(gsub("[[:punct:]]+"," ",file_list), 1))
list2env(data, envir = .GlobalEnv)

rm(data)

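Here f is the callback function that read_csv_chunked requires (or at least it should be), file_list is the list of .csv files in the folder, data is the result of passing them through lapply, names(data) names the tibbles in data, and list2env writes each tibble into an environment (the call above passes envir = .GlobalEnv).

The error I get is:

Error in as_chunk_callback(callback) : object 'DateFrameCallback' not found

Can someone explain why I am getting this error? I do not get the error message if I read one csv at a time and do not build data. Is there a workaround? Or is there a better solution?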
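Judging only from the error message, one likely cause is the spelling: readr exports a callback class named DataFrameCallback, and no object called DateFrameCallback exists. Because lapply forwards callback lazily, the name is only looked up inside read_csv_chunked, which would explain why the error surfaces in as_chunk_callback. The following is a minimal sketch under that assumption, not a confirmed fix. The temporary directory, the three sample files, and the helper names make_sample and keep_a are hypothetical and chosen only for illustration.

```r
library(readr)

# Hypothetical sample data, written to a temporary directory
dir <- tempfile("chunked_")
dir.create(dir)
set.seed(1)

make_sample <- function(id_col) {
  df <- data.frame(k = rnorm(100), j = rnorm(100), l = rnorm(100))
  df[[id_col]] <- sample(c("A", "B", "C"), 100, replace = TRUE)
  df
}

write_csv(make_sample("id"),      file.path(dir, "A.csv"))
write_csv(make_sample("id_limp"), file.path(dir, "B.csv"))
write_csv(make_sample("id_samp"), file.path(dir, "D.csv"))

# Callback applied to every chunk: keep rows whose id-like column equals "A"
keep_a <- function(x, pos) {
  id_col <- intersect(c("id", "id_limp", "id_samp"), names(x))[1]
  x[x[[id_col]] == "A", , drop = FALSE]
}

# "\\.csv$" matches a literal ".csv" at the end of the file name
file_list <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Note the spelling: DataFrameCallback, not DateFrameCallback
data <- lapply(file_list, function(path) {
  read_csv_chunked(path,
                   callback   = DataFrameCallback$new(keep_a),
                   chunk_size = 10)
})

# Name each element after its file, lower-cased, without the extension
names(data) <- tolower(tools::file_path_sans_ext(basename(file_list)))
```

Wrapping the call in an anonymous function creates a fresh callback object for each file, so results from one file cannot accumulate into the next. Keeping the tibbles in the named list data, instead of copying them into the global environment with list2env, is also an option if the files are processed the same way afterwards.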

0 answers:

No answers