Importing many large CSV files into R at once

Asked: 2018-10-19 16:09:36

Tags: r csv import readr

I have 70 CSV files in one folder, all with the same columns, and each about 0.5 GB. I want to import them into a single data frame in R.

Normally I import a single file like this, and it works correctly:

df <- read_delim("file.csv", 
"|", escape_double = FALSE, col_types = cols(pc_no = col_character(), 
    id_key = col_character()), trim_ws = TRUE)

To import all of them, I wrote the code below, which fails with the error: argument "delim" is missing, with no default

tbl <-
list.files(pattern = "*.csv") %>% 
map_df(~read_delim("|", escape_double = FALSE, col_types = cols(pc_no = col_character(), id_key = col_character()), trim_ws = TRUE))

Using read_csv instead does import the files, but the result has only one column, which contains all the column names and values.

 tbl <-
 list.files(pattern = "*.csv") %>% 
 map_df(~read_csv(., col_types = cols(.default = "c")))

1 Answer:

Answer 0: (score: 0)

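In your second code block you are missing the ., so the first positional argument "|" is taken as the file and read_delim interprets the call as read_delim(file="|", delim=<nothing provided>, ...). Try: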

tbl <- list.files(pattern = "*.csv") %>% 
  map_df(~ read_delim(., delim = "|", escape_double = FALSE,
                      col_types = cols(pc_no = col_character(), id_key = col_character()),
                      trim_ws = TRUE))
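The corrected call can be tried end to end without the real data. The following is only a minimal sketch: the folder `pipe_demo`, the file names, and the `value` column are made up for illustration, and it assumes readr, purrr, and dplyr (which map_df needs) are installed. It also uses the regular expression "\\.csv$" rather than "*.csv", because list.files treats pattern as a regular expression, not a shell glob.

```r
library(readr)
library(purrr)

# Hypothetical demo data: two small pipe-delimited files in a temp folder
demo_dir <- file.path(tempdir(), "pipe_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("pc_no|id_key|value", "001|a1|10", "002|a2|20"),
           file.path(demo_dir, "part1.csv"))
writeLines(c("pc_no|id_key|value", "003|a3|30"),
           file.path(demo_dir, "part2.csv"))

# full.names = TRUE returns paths that can be read from any working directory
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# .x is the current file path; it fills the `file` argument of read_delim
tbl <- map_df(files, ~ read_delim(.x, delim = "|", escape_double = FALSE,
                                  col_types = cols(pc_no = col_character(),
                                                   id_key = col_character()),
                                  trim_ws = TRUE))
```

Here `tbl` holds the rows of both files stacked together, and pc_no stays a character column, so leading zeros such as "001" are preserved.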

I named delim= explicitly here, although that is not strictly necessary. However, had you done so in your first attempt, you would have seen

readr::read_delim(delim = "|", escape_double = FALSE,
                  col_types = cols(pc_no = col_character(), id_key = col_character()),
                  trim_ws = TRUE)
# Error in read_delimited(file, tokenizer, col_names = col_names, col_types = col_types,  : 
#   argument "file" is missing, with no default

which points more directly at the actual problem.
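As for the read_csv attempt in the question: read_csv splits fields on commas, so in a pipe-delimited file each whole line ends up as a single field. A small sketch of that behaviour, using a made-up one-row sample written to a temporary file (the `value` column is hypothetical):

```r
library(readr)

# Hypothetical pipe-delimited sample written to a temp file
path <- tempfile(fileext = ".csv")
writeLines(c("pc_no|id_key|value", "001|a1|10"), path)

# read_csv() looks for commas, finds none, and keeps each line as one field
one_col <- read_csv(path, col_types = cols(.default = "c"))
ncol(one_col)    # 1
names(one_col)   # "pc_no|id_key|value"

# read_delim() with delim = "|" splits the same file into three columns
three_col <- read_delim(path, delim = "|", col_types = cols(.default = "c"))
ncol(three_col)  # 3
```

This is why passing the file as . together with delim = "|" resolves both symptoms described in the question.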