Question

I have hundreds of .csv files, each structured like this:

xyz25012013 <- data.frame(province = c("AB", "BC", "ON"), high = c(30, 20, 25), low = c(5, 2, 3))
xyz13122014 <- data.frame(province = c("AB", "BC", "ON"), high = c(20, 34, 25), low = c(1, 8, 3))
xyz30042014 <- data.frame(province = c("AB", "BC", "ON"), high = c(50, 21, 27), low = c(1, 9, 26))
xyz04072015 <- data.frame(province = c("AB", "BC", "ON"), high = c(26, 07, 90), low = c(4, 7, 3))

I would like to import all of them and merge/row-bind them, keeping the date metadata contained in each file name:

as.Date(substr(<filename>, 4, 11), format = "%d%m%Y")

I would like the final output to look like this:

date <- c(rep("25012013", 3), rep("13122014", 3), rep("30042014", 3), rep("04072015", 3))
xyz <- rbind(xyz25012013, xyz13122014, xyz30042014, xyz04072015)
xyz <- cbind(xyz, date)
xyz$date <- as.Date(xyz$date, format = "%d%m%Y")
print(xyz)

Answer 1

I think this does what you want and should be reasonably efficient:

##  Create a file list to operate on (pattern is a regular expression, not a glob):
files <- list.files(path=".", pattern="\\.csv$")

##  Read in our data from each CSV into a list structure:
csvs <- lapply(files, function(x) {
  d <- read.csv(x)
  ##  Characters 4-11 of the file name hold the date as ddmmyyyy:
  d$date <- as.Date(substr(x, 4, 11), format="%d%m%Y")
  d
})

##  rbind our CSV data together:
d <- do.call(rbind, csvs)

Result:

> head(d)
  X province high low       date
1 1       AB   26   4 2015-07-04
2 2       BC    7   7 2015-07-04
3 3       ON   90   3 2015-07-04
4 1       AB   20   1 2014-12-13
5 2       BC   34   8 2014-12-13
6 3       ON   25   3 2014-12-13
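
To try the approach above without touching real data, here is a minimal, self-contained sketch. It writes two of the question's sample data frames to a temporary folder and reads them back. The folder name `csv_demo` and the helper `read_with_date` are made up for illustration. It also assumes the files may live outside the working directory, so it uses `full.names = TRUE` together with `basename()` to keep the `substr()` positions correct.

```r
# Write two of the question's sample data frames to a scratch folder.
tmp <- file.path(tempdir(), "csv_demo")   # hypothetical scratch folder
dir.create(tmp, showWarnings = FALSE)

samples <- list(
  xyz25012013 = data.frame(province = c("AB", "BC", "ON"),
                           high = c(30, 20, 25), low = c(5, 2, 3)),
  xyz13122014 = data.frame(province = c("AB", "BC", "ON"),
                           high = c(20, 34, 25), low = c(1, 8, 3))
)
for (nm in names(samples)) {
  # row.names = FALSE avoids the extra "X" column when the file is read back
  write.csv(samples[[nm]], file.path(tmp, paste0(nm, ".csv")), row.names = FALSE)
}

# full.names = TRUE returns paths that read.csv() can open from any directory
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

read_with_date <- function(path) {   # hypothetical helper name
  d <- read.csv(path)
  # basename() drops the folder, so characters 4-11 are the ddmmyyyy digits
  d$date <- as.Date(substr(basename(path), 4, 11), format = "%d%m%Y")
  d
}

combined <- do.call(rbind, lapply(files, read_with_date))
print(combined)
```

The `X` column in the result shown above most likely comes from row names that were written into the original files; writing with `row.names = FALSE`, as in this sketch, leaves it out.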

Answer 2

Assuming all your files are in the "test" folder:

library(readr)
files = list.files("test/")
dd = vector("list", length = length(files))
for (i in seq_along(files)){
  dd[[i]] = read_csv(file = paste0("test/", files[i]))
  dd[[i]]$date = as.Date(substr(files[i], 4, 11), format = "%d%m%Y")
}

merged = do.call(rbind, dd)
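
Both answers rely on `substr(..., 4, 11)`, which only works while every file name starts with a three-character prefix such as `xyz`. If the prefix length varies, one possible alternative is to pull out the eight digits with a regular expression. This is only a sketch: `extract_date` is a made-up helper name, and it assumes each file name contains exactly one run of eight digits in ddmmyyyy order.

```r
# Hypothetical helper: take the first run of eight digits in each file name
# and parse it as ddmmyyyy. Names without such a run are dropped by regmatches().
extract_date <- function(filename) {
  digits <- regmatches(filename, regexpr("[0-9]{8}", filename))
  as.Date(digits, format = "%d%m%Y")
}

dates <- extract_date(c("xyz25012013.csv", "weather_04072015.csv"))
print(dates)
```

Inside either loop above, `extract_date(files[i])` could then stand in for the `as.Date(substr(...))` call.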

rbind() hundreds of .csv files with filename metadata
