Select specific columns and add the csv file name to the final csv file

Time: 2016-01-28 18:06:49

Tags: r dplyr

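I am trying to extract the same first 16 columns from many csv files located in different subdirectories, and to add the csv file name to every row of the final csv. My code: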

getwd()
root<-list.dirs(".", recursive=TRUE)
# get list of files ending in csv in directory root
dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
# read files into data frames
lapply(FUN = read.csv) %>%
# bind all data frames into a single data frame
rbind_all %>%
# write into a single csv file
write.csv("all.csv")

I would like to know where to put the code that selects the columns and adds the file name.

Solution:

library(dplyr)

getwd()
root <- list.dirs(".", recursive = TRUE)
# get list of files ending in csv in directory root
dir(root, pattern = 'csv$', recursive = TRUE, full.names = TRUE) %>%
  # read files into data frames, select first 16 columns and add filename
  lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name = p)) %>%
  # bind all data frames into a single data frame
  bind_rows() %>%
  # write into a single csv file
  write.csv("all.csv")

1 answer:

Answer 0 (score: 2)

You should do this inside lapply, because that is the last step at which you still have access to the file name/path:

dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
  lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name=p)) %>%
  bind_rows() %>%
  write.csv("all.csv")
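
For reference, here is a minimal self-contained sketch of the same idea. The helper names `read_with_name` and `combine_csvs` are hypothetical and not part of the original post. It rests on a few assumptions: every csv has at least 16 columns, matching columns have compatible types across files, and the output file should not be read back in on a rerun. It also calls `list.files()` once on a single root directory, because a recursive listing already descends into subdirectories. Passing the output of `list.dirs()` together with `recursive = TRUE` would, as far as I can tell, list files in subdirectories more than once.

```r
library(dplyr)

# Hypothetical helper: read one csv, keep its first 16 columns,
# and record the path it came from in a file_name column.
read_with_name <- function(p) {
  read.csv(p, stringsAsFactors = FALSE) %>%
    select(1:16) %>%
    mutate(file_name = p)
}

# Hypothetical wrapper: combine every csv under root_dir into out_file.
combine_csvs <- function(root_dir = ".", out_file = "all.csv") {
  # One recursive listing from a single root; "\\.csv$" only matches
  # names that end in a literal ".csv".
  files <- list.files(root_dir, pattern = "\\.csv$",
                      recursive = TRUE, full.names = TRUE)

  # Assumption: skip any file with the same name as the output file,
  # so a second run does not read the previous result.
  files <- files[basename(files) != basename(out_file)]

  combined <- files %>%
    lapply(read_with_name) %>%
    bind_rows()

  # row.names = FALSE avoids writing an extra index column.
  write.csv(combined, out_file, row.names = FALSE)
  invisible(combined)
}
```

Calling `combine_csvs(".")` from the top-level directory should produce the same kind of `all.csv` as the pipeline above. If only the file name is wanted rather than the full path, `mutate(file_name = basename(p))` could be used instead.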