I have a large dataset that looks like this:
df1 <- data.frame(matrix(vector(),ncol=4, nrow = 4))
colnames(df1) <- c("Date","A","A","C")
df1[1,] <- c("2000-01-30","0","1","0")
df1[2,] <- c("2000-01-31","2","0","3")
df1[3,] <- c("2000-02-29","1","2","1")
df1[4,] <- c("2000-03-31","2","1","3")
df1
        Date A A C
1 2000-01-30 0 1 0
2 2000-01-31 2 0 3
3 2000-02-29 1 2 1
4 2000-03-31 2 1 3
I want:
     Date A A C
1 2000-01 2 1 3
2 2000-02 1 2 1
3 2000-03 2 1 3
Here is what I did:
library(zoo)
library(dplyr)
df1[,-1] = lapply(df1[,-1], as.numeric)
df1 %>% mutate(Date = as.yearmon(Date)) %>%
group_by(Date) %>%
summarise_each(funs(sum))
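This works well when there are no duplicate column names. However, depending on the size of the data, some columns may share the same name, which causes the error "found duplicated column name: A". I do not want to combine the columns; I want the result shown above. Please advise.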
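For reference, here is a minimal sketch of one way this might be done in base R without tripping over the duplicated names. It is not a fix for the dplyr pipeline itself. It assumes the columns can be addressed by position rather than by name, and it swaps in rowsum() for the grouped summarise. The names month, value_cols, vals, sums and result are made up for illustration.

```r
# Rebuild the example data; names<- allows the duplicated column name "A".
df1 <- data.frame(
  Date = c("2000-01-30", "2000-01-31", "2000-02-29", "2000-03-31"),
  A = c(0, 2, 1, 2),
  B = c(1, 0, 2, 1),
  C = c(0, 3, 1, 3)
)
names(df1) <- c("Date", "A", "A", "C")

# Year-month key for each row, e.g. "2000-01".
month <- format(as.Date(df1$Date), "%Y-%m")

# Pull the value columns out by position so duplicated names never matter.
value_cols <- seq_along(df1)[-1]
vals <- vapply(value_cols, function(j) as.numeric(df1[[j]]), numeric(nrow(df1)))

# Sum every column within each month; rows come back sorted by month.
sums <- rowsum(vals, group = month)

# Reassemble a data frame and restore the original (duplicated) names.
result <- data.frame(Date = rownames(sums), unname(sums))
names(result) <- names(df1)
result
```

Because everything is done by column position and the names are only put back at the very end, the two "A" columns stay separate instead of being merged or rejected. Note that Date comes back as a "2000-01" style string here, not as a zoo yearmon object.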