Return date range (by group)

Time: 2019-07-19 13:44:41

Tags: r date aggregate

I want to group by color and compute the date range for each color. I have already tried group_by(), summarize(), and aggregate().

#Data:
df1 <- as.Date(c('Jul 1', 'Jun 26', 'July 5', 'July 15'), format = '%B %d')
df2 <- c("red", "blue", "red", "blue")

df1 <- data.frame(df1,df2)

What I want to get:

#   Group.1  x
# 1     red  4
# 2    blue 19

What I have been trying:

df <- aggregate(df1[,1], list(df1[,2]), as.numeric(max(df1[,1]) - min(df1[,1]), units="days"))

I have tested as.numeric(max(df1[,1]) - min(df1[,1]), units="days") on its own, and it returns the value I am looking for, but I don't know how to return that value for each color.

Below is my error message, but really I am just going about this the wrong way.

 Error in match.fun(FUN) : 
      'as.numeric(max(df1$date) - min(df1$date), units = "days")' is not a function, character or symbol

After reading the aggregate() documentation carefully, I tried using formula = as the last argument, which returned the following error:

Error in match.fun(FUN) : argument "FUN" is missing, with no default

3 Answers:

Answer 0: (score: 2)

Using dplyr

 library(dplyr)

 df1 %>% 
   group_by(df2) %>% 
   summarise(Range = max(df1) - min(df1))
# A tibble: 2 x 2
  df2   Range  
  <fct> <drtn> 
1 blue  19 days
2 red    4 days
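
If a plain number of days is wanted instead of a difftime column, one possible variation is sketched below. It is not part of the answer above: the column names `date` and `color`, the object names `dat` and `out`, and the explicit year 2019 are assumptions made so the example is self-contained and does not depend on the current year or locale.

```r
library(dplyr)

# Hypothetical rebuild of the question's data with explicit years
# (2019 is assumed) and unambiguous column names.
dat <- data.frame(
  date  = as.Date(c("2019-07-01", "2019-06-26", "2019-07-05", "2019-07-15")),
  color = c("red", "blue", "red", "blue")
)

# difftime() with units = "days" fixes the unit; as.numeric() drops the
# difftime class so the column is an ordinary number.
out <- dat %>%
  group_by(color) %>%
  summarise(range_days = as.numeric(difftime(max(date), min(date), units = "days")))

print(out)
```

With this data, `range_days` should be 19 for blue and 4 for red.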

Answer 1: (score: 1)

Using aggregate

aggregate(df1~ df2, df1, function(x) diff(range(x)))

Note that the column names of 'df1' are themselves 'df1' and 'df2', which causes some confusion. Instead, it is better to create the data as

df1 <- data.frame(x = df1, Group = df2)

and then use the formula method

aggregate(x ~ Group, df1, function(x) diff(range(x)))
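
The match.fun error in the question arises because FUN was given an already-evaluated expression rather than a function. A minimal self-contained sketch of the fix follows: it wraps the asker's own expression in a function that aggregate() calls once per group. The names `dat`, `range_days`, and `res`, and the explicit year 2019, are assumptions added for illustration.

```r
# Hypothetical rebuild of the data with explicit years (2019 is assumed),
# built directly from the vectors so no name is reused.
dates  <- as.Date(c("2019-07-01", "2019-06-26", "2019-07-05", "2019-07-15"))
colors <- c("red", "blue", "red", "blue")
dat    <- data.frame(x = dates, Group = colors)

# FUN must be a function. Here it receives the dates of one group and
# returns the span between the latest and earliest date, in days.
range_days <- function(d) as.numeric(max(d) - min(d), units = "days")

res <- aggregate(x ~ Group, data = dat, FUN = range_days)
print(res)
```

With this data, `res$x` should be 19 for blue and 4 for red. Taking max minus min, rather than a bare diff, keeps one value per group even when a color has more than two dates or the dates are unsorted.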

Answer 2: (score: 1)

require(dplyr)

df001 <- as.Date(c('Jul 1', 'Jun 26', 'July 5', 'July 15'), format = '%B %d')
df002 <- c("red", "blue", "red", "blue")

df003 <- data.frame(df001,df002)


df003 %>%  rename(dates = df001, colors = df002) %>% 
  group_by(colors) %>% 
  summarise(min_date = min(dates), max_date = max(dates)) %>%  
  mutate(range = max_date - min_date) %>%  
  select(colors, range)


# 
# # A tibble: 2 x 2
# colors range 
# <fct>  <time>
#   1 blue   19    
#   2 red    4