I currently have a vector containing paths to files, such as:
files <- c("C:/Users/Me/Desktop/cc/canada/2016/Ontario.BRU",
"C:/Users/Me/Desktop/cc/canada/2017/Ontario.BRU",
"C:/Users/Me/Desktop/cc/canada/2018/Ottawa.BRU",
"C:/Users/Me/Desktop/cc/canada/2018/Ontario.BRU")
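I want to combine the files that end with the same city name, one after another, into a single data frame. If a city appears only once, I would still save its data frame as a CSV file at the end. Here is the code I have just started: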
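cad <- NULL
for (b in seq_along(files)) {
  # Directory name that precedes the year, e.g. "canada"
  country <- sub(".*/ *(.*?) */[[:digit:]].*", "\\1", files[b])
  if (country == "canada") {
    cad <- c(cad, files[b])
  }
  # Unique city names: the file name without the .BRU extension
  cad_cities <- unique(sub(".*/ *(.*?) *\\.BRU.*", "\\1", cad))
  for (c in seq_along(cad_cities)) {
    # Stuck here: this only re-extracts every city name from cad
    city <- sub(".*/ *(.*?) *\\.BRU.*", "\\1", cad)
  }
}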
I am stuck after this part. Thanks.
Edit: sample of a data file
2018,1,0,9999,-20.70,-23.00,-22.10,81.00,0.00,000,-991,-991,-991,-2.41,-991,-991,8.90,353,97.36,-991,-991,19.00,-991
2018,1,100,9999,-21.40,-22.70,-22.00,80.00,0.00,100,-991,-991,-991,-2.42,-991,-991,7.80,264,97.36,-991,-991,18.00,-991
2018,1,200,9999,-21.40,-22.50,-21.90,79.00,0.00,200,-991,-991,-991,-2.42,-991,-991,10.30,270,97.34,-991,-991,19.00,-991
2018,1,300,9999,-20.80,-21.90,-21.40,78.00,0.00,300,-991,-991,-991,-2.43,-991,-991,10.70,263,97.32,-991,-991,18.00,-991
Answer 0 (score: 0)
Perhaps something like the following. (First, run the code in the question.)
Untested, because no data files were available.
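for (city in cad_cities) {
  # Paths whose file name is exactly <city>.BRU
  tmp <- grep(paste0("/", city, "\\.BRU$"), files, value = TRUE)
  # The sample data has no header row, which matches read.table's default
  tmp <- lapply(tmp, read.table, sep = ",")
  tmp <- do.call(rbind, tmp)
  write.csv(tmp, file = paste0(city, ".csv"), row.names = FALSE)
}
rm(tmp)  # tidy up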
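Since the loop above is untested, here is a self-contained sketch that tries the same idea on throwaway files in a temporary directory. The directory `cc_demo`, the made-up three-column rows, and the helper `combine_city` are all assumptions for illustration; they are not part of the question's data.

```r
# Hypothetical sandbox mimicking the question's <country>/<year>/<city>.BRU layout
root <- file.path(tempdir(), "cc_demo")
paths <- file.path(root, "canada",
                   c("2016", "2017", "2018", "2018"),
                   c("Ontario.BRU", "Ontario.BRU", "Ottawa.BRU", "Ontario.BRU"))

for (p in paths) {
  dir.create(dirname(p), recursive = TRUE, showWarnings = FALSE)
  # Two made-up rows per file: no header, comma separated
  year <- as.integer(basename(dirname(p)))
  write.table(data.frame(year, 1, c(0, 100)), p, sep = ",",
              row.names = FALSE, col.names = FALSE)
}

# Hypothetical helper: read and stack every file that belongs to one city
combine_city <- function(city, files) {
  hits <- files[basename(files) == paste0(city, ".BRU")]
  do.call(rbind, lapply(hits, read.table, sep = ","))
}

cities <- unique(sub("\\.BRU$", "", basename(paths)))
combined <- lapply(setNames(cities, cities), combine_city, files = paths)

# One CSV per city, written next to the demo data
for (city in names(combined)) {
  write.csv(combined[[city]], file.path(root, paste0(city, ".csv")),
            row.names = FALSE)
}
```

Matching on `basename()` instead of running `grep()` over the full path is a deliberate choice in this sketch: it avoids accidental hits when a city name also appears in a directory name.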
Answer 1 (score: 0)
First, extract the cities from the file names:
cities <- sub("\\.BRU$", "", basename(files))
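As a quick check of what that extraction yields, the sketch below applies it to the four paths from the question. The comparison with `tools::file_path_sans_ext` is an addition for illustration, not part of the original answer.

```r
files <- c("C:/Users/Me/Desktop/cc/canada/2016/Ontario.BRU",
           "C:/Users/Me/Desktop/cc/canada/2017/Ontario.BRU",
           "C:/Users/Me/Desktop/cc/canada/2018/Ottawa.BRU",
           "C:/Users/Me/Desktop/cc/canada/2018/Ontario.BRU")

# basename() drops the directories; sub() strips the trailing .BRU
cities <- sub("\\.BRU$", "", basename(files))
print(cities)
# [1] "Ontario" "Ontario" "Ottawa"  "Ontario"

# Alternative that does not hard-code the extension (the tools package ships with R)
cities_alt <- tools::file_path_sans_ext(basename(files))

# How many files each city contributes
file_counts <- table(cities)
print(file_counts)
```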
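Now read all the files:
dataz <- lapply(files, read.csv, header = FALSE, as.is = TRUE)
# header = FALSE because the sample data has no header row;
# as.is = TRUE keeps text columns as character instead of converting them to factors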
Then regroup the data that come from the same city:
lapply(split(dataz, cities), function(x) do.call(rbind,x))
This strategy should work, but it may need small adjustments because it is untested.
[Edit]
A test case with random data:
dataz <- lapply(1:4, function(iii) as.data.frame(replicate(3, rnorm(5))))
lapply(split(dataz, cities), function(x) do.call(rbind,x))
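To make the result of that test case easier to inspect, here is a self-contained variant. It assumes `cities` holds the four names extracted from the question's paths; the fixed seed and the name `combined` are additions for illustration.

```r
set.seed(1)  # only so the random numbers are reproducible

# Assumed to match the cities extracted from the question's four paths
cities <- c("Ontario", "Ontario", "Ottawa", "Ontario")

# Four stand-in "files": 5 rows x 3 columns of random numbers each
dataz <- lapply(1:4, function(i) as.data.frame(replicate(3, rnorm(5))))

# split() groups the list elements by city; rbind stacks each group
combined <- lapply(split(dataz, cities), function(x) do.call(rbind, x))

names(combined)         # one element per city
sapply(combined, nrow)  # Ontario: 3 files x 5 rows = 15; Ottawa: 1 file x 5 rows = 5
```

Because the result is a named list, each element can afterwards be passed to `write.csv` under its city name.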