How can I group the rows of a data frame by quarter?

Asked: 2019-10-19 09:54:54

Tags: r dataframe

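I have a data frame with 213 rows and 2 columns (Date and Article). The goal is to reduce the number of rows by grouping the dates by quarter, with the text in the Article column merged accordingly.

Here is an example.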

Date <- c("2000-01-05", "2000-02-03", "2000-03-02", "2000-03-30", "2000-04-13", "2000-05-11", "2000-06-08", "2000-07-06", "2000-09-14", "2000-10-05", "2000-10-19", "2000-11-02", "2000-12-14")
Article <- c("Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text")

Date <- data.frame(Date)
Article <- data.frame(Article)

df <- cbind(Date, Article)

#Dataframe

Date           Article
1  2000-01-05 Long Text
2  2000-02-03 Long Text
3  2000-03-02 Long Text
4  2000-03-30 Long Text
5  2000-04-13 Long Text
6  2000-05-11 Long Text
7  2000-06-08 Long Text
8  2000-07-06 Long Text
9  2000-09-14 Long Text
10 2000-10-05 Long Text
11 2000-10-19 Long Text
12 2000-11-02 Long Text
13 2000-12-14 Long Text

The final output I would like to obtain is the following:

Date         Article
1  2000 Q1   Long Text, Long Text, Long Text, Long Text
2  2000 Q2   Long Text, Long Text, Long Text
3  2000 Q3   Long Text, Long Text
4  2000 Q4   Long Text, Long Text, Long Text, Long Text

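In essence, the rows are grouped by quarter and the corresponding text is merged together.

I have looked around, but unfortunately I could not work out how to do this.

Can anyone help me?

Thanks!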

3 Answers:

Answer 0: (score: 3)

One option using dplyr and lubridate could be:

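library(dplyr)
library(lubridate)
# Note: recent lubridate releases may prefer type = "year.quarter" over with_year = TRUE
df %>%
 group_by(Date = as.character(lubridate::quarter(ymd(Date), with_year = TRUE))) %>%
 summarise(Article = paste0(Article, collapse = ","))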

  Date   Article                                
  <chr>  <chr>                                  
1 2000.1 Long Text,Long Text,Long Text,Long Text
2 2000.2 Long Text,Long Text,Long Text          
3 2000.3 Long Text,Long Text                    
4 2000.4 Long Text,Long Text,Long Text,Long Text
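
To get the `2000 Q1` style labels from the question instead of `2000.1`, the grouping key can be built from the year and the quarter separately. This is a minimal sketch, assuming dplyr and lubridate are installed; the `quarter_label` helper and the shortened sample data are illustrative and not part of the answer above.

```r
library(dplyr)
library(lubridate)

# Shortened sample data in the same shape as the question (illustrative only)
df <- data.frame(
  Date = c("2000-01-05", "2000-03-30", "2000-04-13", "2000-09-14", "2000-12-14"),
  Article = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn a date string into a label such as "2000 Q1"
quarter_label <- function(x) {
  d <- ymd(x)
  paste0(year(d), " Q", quarter(d))
}

# Group on the label and join the texts of each group with ", "
result <- df %>%
  group_by(Date = quarter_label(Date)) %>%
  summarise(Article = paste(Article, collapse = ", "))

print(result)
```

Using `paste(..., collapse = ", ")` rather than `","` also reproduces the comma-plus-space separator shown in the question.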

Answer 1: (score: 2)

We can use as.yearqtr from zoo to do the summarising:

library(zoo)
library(data.table)
setDT(df)[, .(Article = toString(Article)),.(Date = as.yearqtr(as.IDate(Date)))]
#   Date                                    Article
#1: 2000 Q1 Long Text, Long Text, Long Text, Long Text
#2: 2000 Q2            Long Text, Long Text, Long Text
#3: 2000 Q3                       Long Text, Long Text
#4: 2000 Q4 Long Text, Long Text, Long Text, Long Text
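
To see what `as.yearqtr` contributes on its own, the sketch below converts a few dates and formats them. It assumes zoo is installed; the variable names `dates`, `q`, and `labels` are illustrative.

```r
library(zoo)

# as.yearqtr() maps each date to the quarter that contains it
dates <- as.Date(c("2000-03-30", "2000-04-13", "2000-12-14"))
q <- as.yearqtr(dates)

# format() with "%Y Q%q" yields labels in the "2000 Q1" style
labels <- format(q, "%Y Q%q")
print(labels)
```

Because the `yearqtr` values order chronologically, grouping on them keeps the quarters in time order without any extra sorting step.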

Answer 2: (score: 1)

Base R solution:

# Concatenate the Article values within each year & quarter group.
# collapse (not sep) is what joins a group's values into one string.
aggregate(list(Article = df$Article),
          by = list(Date = paste(gsub("[-].*", "", df$Date), quarters(df$Date), sep = " ")),
          paste, collapse = ", ")

Data:

df <- data.frame(Date = as.Date(c("2000-01-05",
                                   "2000-02-03",
                                   "2000-03-02",
                                   "2000-03-30", "2000-04-13", "2000-05-11", "2000-06-08",
                                   "2000-07-06", "2000-09-14", "2000-10-05", "2000-10-19",
                                   "2000-11-02", "2000-12-14"),
                                 "%Y-%m-%d"),
            Article = c("Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text","Long Text"))
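
As a quick sanity check of the base R approach, the sketch below builds the quarter label with `format()` and `quarters()` and collapses each group with `paste(..., collapse = ", ")`. It needs no packages; the shortened sample data and the names `key` and `res` are illustrative assumptions.

```r
# Shortened sample data (illustrative); Date must be of class Date for quarters()
df <- data.frame(
  Date = as.Date(c("2000-01-05", "2000-03-30", "2000-04-13", "2000-09-14", "2000-12-14")),
  Article = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Build a "YYYY Qn" key for every row
key <- paste(format(df$Date, "%Y"), quarters(df$Date))

# Collapse the Article text within each quarter into a single string
res <- aggregate(list(Article = df$Article), by = list(Date = key),
                 FUN = paste, collapse = ", ")

print(res)
```

Passing `collapse` matters here: with `sep` alone, `paste` returns one element per row of the group instead of a single joined string.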