Getting start and end dates from a single column of data

Time: 2019-02-08 20:33:24

Tags: r

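I am trying to find the start and end dates of a product's sale periods from a data column that is an on-sale dummy variable (1 when the product is on sale, 0 otherwise). Here is a stand-in for the kind of data I am working with: [image: sample data]

The result I am looking for is:

[image: desired output]

The actual data set I am working with is much larger than this and does not necessarily cover only 2010-01 to 2011-12.

Thanks!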

1 answer:

Answer 0: (score: 0)

Assuming each product goes on sale only once:

library(tidyverse)  # attaches dplyr and the %>% pipe

df <- data.frame(product = 'Product A', 
                 month = seq(as.Date('2010-01-01'),
                             as.Date('2010-10-01'),
                             by = 'month'
                             ),
                 onSale = c(rep(0,3), rep(1,4),rep(0,3))
                 )


df %>% 
  group_by(product) %>% 
  # min()/max() over the on-sale months give the first and last month of the sale
  summarise(saleStart = min(month[onSale == 1]),
            saleEnd   = max(month[onSale == 1])
            )
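
The single-period assumption is worth checking before relying on the earliest and latest on-sale month per product, since a product with two separate sales would otherwise be reported as one long period. Below is a minimal base R sketch of such a check; `count_periods` is a hypothetical helper name, not part of the answer, and it assumes the rows are sorted by month within each product.

```r
# Hypothetical helper: count how many separate sale periods a 0/1 vector contains.
# A period starts wherever onSale is 1 and the previous row (if any) is 0.
count_periods <- function(onSale) {
  previous <- c(0, head(onSale, -1))   # onSale shifted down by one row
  sum(onSale == 1 & previous == 0)
}

df <- data.frame(product = 'Product A',
                 month = seq(as.Date('2010-01-01'),
                             as.Date('2010-10-01'),
                             by = 'month'),
                 onSale = c(rep(0,3), rep(1,4), rep(0,3)))

# Number of sale periods per product; any value above 1 breaks the assumption
n_periods <- tapply(df$onSale, df$product, count_periods)
print(n_periods)
```

If any product shows more than one period, a per-period approach is needed instead.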

Edit: if a product can go on sale more than once:

df <- data.frame(product = 'Product A', 
                 month = seq(as.Date('2010-01-01'),
                             as.Date('2011-09-01'),
                             by = 'month'
                             ),
                 onSale = c(rep(0,3), rep(1,4),rep(0,3), rep(1,4),rep(0,3), rep(1,4))
                 )


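df %>% 
  arrange(product, month) %>% 
  group_by(product) %>% 
  # period increases by 1 each time onSale switches between 0 and 1
  mutate(period = cumsum(onSale != lag(onSale, default = first(onSale)))) %>% 
  filter(onSale == 1) %>% 
  group_by(product, period) %>% 
  # one row per sale period: first and last on-sale month
  summarise(saleStart = min(month), saleEnd = max(month)) %>% 
  ungroup() %>% 
  select(-period)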
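
For comparison, the same question (one start and one end date per sale period) can also be approached in base R with run-length encoding. This is only a sketch under stated assumptions: `sale_periods` is a hypothetical helper name, `month` is assumed to be sorted in ascending order within each product, and the end date is taken to be the last month in which `onSale` is 1.

```r
# Hypothetical helper: one row per sale period for a single product.
sale_periods <- function(month, onSale) {
  runs   <- rle(onSale)                # lengths and values of consecutive runs
  ends   <- cumsum(runs$lengths)       # index of the last row in each run
  starts <- ends - runs$lengths + 1    # index of the first row in each run
  keep   <- runs$values == 1           # keep only the on-sale runs
  data.frame(saleStart = month[starts[keep]],
             saleEnd   = month[ends[keep]])
}

df <- data.frame(product = 'Product A',
                 month = seq(as.Date('2010-01-01'),
                             as.Date('2011-09-01'),
                             by = 'month'),
                 onSale = c(rep(0,3), rep(1,4), rep(0,3), rep(1,4), rep(0,3), rep(1,4)))

# Apply the helper to each product separately and stack the results
periods <- do.call(rbind, lapply(split(df, df$product), function(d) {
  p <- sale_periods(d$month, d$onSale)
  if (nrow(p) == 0) return(NULL)       # product was never on sale
  cbind(product = d$product[1], p)
}))
print(periods)
```

For the example data this yields three periods: 2010-04-01 to 2010-07-01, 2010-11-01 to 2011-02-01, and 2011-06-01 to 2011-09-01.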