Question

I have a dataset named data:

  Model Garage        City    
  Honda      C     Chicago       
 Maruti      B      Boston  
Porsche      A    New York    
  Honda      B     Chicago  
  Honda      C    New York

It has 100,000 rows. I want to split the data by Model, Garage, and City, and save each split as a separate CSV file.

split(Data, with(Data, interaction(Model,City,Garage)), drop = TRUE)

This code returns a list. How do I unlist it and save a separate CSV file for each split?

For example, Honda would have three split files: Honda C Chicago, Honda B Chicago, and Honda C New York.

Thanks

Answer 1

# create all combinations of data.frames possible based on unique values of Model, Garage, City
l = split(x, list(x$Model, x$Garage, x$City))

# create csv files only if the data.frame has any rows in it
# ("path" is a placeholder prefix for the output location)
invisible(lapply(names(l), function(nm) if (nrow(l[[nm]]) > 0) write.csv(l[[nm]], paste0("path", nm, ".csv"))))
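As a variation on the above, passing drop = TRUE to split() discards the combinations that never occur, so the row-count check is not needed. This is only a sketch: the cars data frame is rebuilt from the sample rows in the question, and out_dir is a hypothetical output folder under tempdir().

```r
# Sample rows from the question, rebuilt by hand (assumption: character columns)
cars <- data.frame(
  Model  = c("Honda", "Maruti", "Porsche", "Honda", "Honda"),
  Garage = c("C", "B", "A", "B", "C"),
  City   = c("Chicago", "Boston", "New York", "Chicago", "New York"),
  stringsAsFactors = FALSE
)

# drop = TRUE keeps only the Model/Garage/City combinations present in the data
pieces <- split(cars, list(cars$Model, cars$Garage, cars$City), drop = TRUE)

# Hypothetical output folder; replace with a real path
out_dir <- file.path(tempdir(), "splits")
dir.create(out_dir, showWarnings = FALSE)

# One CSV per combination, named after the list element
for (nm in names(pieces)) {
  write.csv(pieces[[nm]], file.path(out_dir, paste0(nm, ".csv")), row.names = FALSE)
}
```

Each file name comes from the list element name, which split() builds by joining the three values with a dot.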

Answer 2

Just to add another option, you can use data.table:

library(data.table)
x <- as.data.table(x)
# write.csv gives comma-separated output; note that .SD leaves out the grouping columns
x[, write.csv(.SD, paste0("path/file_", Model, "_", Garage, "_", City, ".csv"), row.names = FALSE), by = c("Model", "Garage", "City")]
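One thing to be aware of with the by-group approach is that .SD does not include the columns named in by, so each file would lack Model, Garage, and City. If those columns should stay in every file, a possible alternative is data.table's split() method followed by fwrite(). This is a sketch under assumptions: cars is rebuilt from the sample rows in the question, and out_dir is a hypothetical folder under tempdir().

```r
library(data.table)

# Sample rows from the question, rebuilt by hand
cars <- data.table(
  Model  = c("Honda", "Maruti", "Porsche", "Honda", "Honda"),
  Garage = c("C", "B", "A", "B", "C"),
  City   = c("Chicago", "Boston", "New York", "Chicago", "New York")
)

# split() on a data.table keeps the grouping columns in each piece by default
groups <- split(cars, by = c("Model", "Garage", "City"))

# Hypothetical output folder; replace with a real path
out_dir <- file.path(tempdir(), "dt_splits")
dir.create(out_dir, showWarnings = FALSE)

# fwrite() writes comma-separated files without row names
for (nm in names(groups)) {
  fwrite(groups[[nm]], file.path(out_dir, paste0("file_", nm, ".csv")))
}
```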

Answer 3

You can easily use a loop. That shouldn't be a problem for 100k rows or so.

x <- read.table(text = "Model, Garage, City
                   Honda, C, Chicago
                   Maruti, B, Boston
                   Porsche, A, New York
                   Honda, B, Chicago
                   Honda, C, New York", sep = ",", header = TRUE, strip.white = TRUE)
x
#   Model Garage      City
#   Honda      C   Chicago
#  Maruti      B    Boston
# Porsche      A  New York
#   Honda      B   Chicago
#   Honda      C  New York

library(dplyr)

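You simply loop over all unique combinations of Model, Garage, and City in your data.frame, filter on each one, and then export the temporary data.frame to a CSV file.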

uni <- unique(x[,c("Model", "Garage", "City")])

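for (j in seq_len(nrow(uni))) {
  i <- uni[j,]
  # keep only the rows matching this Model/Garage/City combination
  tmp <- x %>% filter(Model == i$Model, Garage == i$Garage, City == i$City)

  # paste0 avoids stray spaces in the file name; write.csv produces comma-separated output
  write.csv(tmp, paste0(i$Model, "_", i$City, "_", i$Garage, ".csv"), row.names = FALSE)
}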

Split data in R and save all split files as CSV
