Question

I have a CSV file with 4 columns (YY, MM, DD, RR). The current format has 9861 rows and these 4 columns (daily data for 1981-2007). Here is a sample:

YY,MM,DD,RR
1981,1,1,0
1981,1,2,0
1981,1,3,-9999
1981,1,4,-9999
1981,1,5,0
1981,1,6,0
.....
.....
2007,1,31,-9999

I would like to split the CSV file by year. The output should be 27 CSV files with the same number of columns. For example, 1981.csv contains:

YY, MM, DD, RR
1981, 1, 1, 0.4
1981, 1, 2, 0
.....
.....
1981, 12, 31, 0.5

Here is my script:

dat <- read.csv("test_dat.csv", header = TRUE, sep = ",")
spt1 <- split(dat, dat$YY)   # one data frame per year
lapply(names(spt1), function(x) {
  # paste0() takes no sep argument, so the file name is built directly
  write.csv(spt1[[x]], file = paste0("output", x, ".csv"), row.names = FALSE)
})

I would then like to bind the output CSV files row by row.

Leap years should have 366 days.

Is there an easy way to do this in R?

I would appreciate any help.

Answer 1

Assuming you have a data frame like the one below, you can run a loop over the years:

YY <- seq(1981, 2007, 1)      # define the years
RR <- runif(27, 0, 30)        # random placeholder column; replace with your own columns

df <- data.frame(YY, RR)      # create the data frame
df$YY <- as.factor(df$YY)     # skip this step if your year column is already a factor

for (year in levels(df$YY)) {           # loop over each year
  df.subset <- df[df$YY %in% year, ]    # subset the data for that year
  # save the subset in a new file named after the year, e.g. 1981.csv
  write.csv(df.subset, file = paste(year, "csv", sep = "."), row.names = FALSE)
}
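
The loop above only covers the splitting step. For the second half of the question, here is a minimal sketch, assuming "bind row by row" means stacking the yearly files back into one long table. The synthetic data frame, the `out_dir` folder and the `is_leap` helper are illustrative names, not part of the original post; swap in your own `read.csv("test_dat.csv")` result.

```r
# Assumption: a small synthetic daily series stands in for test_dat.csv.
dates <- seq(as.Date("1981-01-01"), as.Date("1984-12-31"), by = "day")
dat <- data.frame(
  YY = as.integer(format(dates, "%Y")),
  MM = as.integer(format(dates, "%m")),
  DD = as.integer(format(dates, "%d")),
  RR = 0
)

# Hypothetical output folder inside the session's temporary directory.
out_dir <- file.path(tempdir(), "by_year")
dir.create(out_dir, showWarnings = FALSE)

# Split by year and write one file per year (1981.csv, 1982.csv, ...).
by_year <- split(dat, dat$YY)
for (yr in names(by_year)) {
  write.csv(by_year[[yr]], file.path(out_dir, paste0(yr, ".csv")),
            row.names = FALSE)
}

# Read the yearly files back in sorted order and bind them row by row.
files <- sort(list.files(out_dir, pattern = "^[0-9]{4}\\.csv$",
                         full.names = TRUE))
combined <- do.call(rbind, lapply(files, read.csv))

# Check day counts: leap years should have 366 rows, other years 365.
is_leap <- function(y) (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
days_found <- table(combined$YY)
years <- as.integer(names(days_found))
days_expected <- ifelse(is_leap(years), 366, 365)
incomplete <- years[as.integer(days_found) != days_expected]
```

`incomplete` lists any year whose row count does not match the calendar, which is one way to confirm that leap years kept all 366 days after the round trip.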

Split a CSV by the values in a column, then merge/bind the outputs row by row
