Find the last ActivityDate where ActivityType = "Sale"

Time: 2015-05-18 21:34:22

Tags: r

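I am new to R and Stack Overflow. My data frame is sorted by Name and ActivityDate, and I am trying to add a LastSaleDate column to it. For each row, LastSaleDate should be the date of the most recent earlier sale within the same group (by Name). I tried a few things with dplyr, but to no avail. Any help is much appreciated.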

Name      ActivityType      ActivityDate    SalesAmount  LastSaleDate(Desired)          
John       Email            1/1/2014        NA            NA            
John       Sale             2/1/2014        1000          NA            
John       Sale             3/1/2014        2000          2/1/2014          
John       Seminar          4/1/2014        NA            3/1/2014          
John       Webinar          5/1/2014        NA            3/1/2014          
Tom        Email            1/1/2014        NA            NA            
Tom        Sale             2/1/2015        1000          NA            
Tom        Sale             3/1/2015        2000          2/1/2015          
Tom        Seminar          4/1/2015        NA            3/1/2015          
Tom        Webinar          5/1/2015        NA            3/1/2015          
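
For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a sketch: it assumes the dates are stored as character strings rather than factors or Date objects, and it borrows the name custlife from the answer's code, since the question itself does not name the data frame.

```r
# Sample data from the question (LastSaleDate is the column to be computed).
# Assumption: ActivityDate is character, so ifelse() returns the dates unchanged.
custlife <- data.frame(
  Name = rep(c("John", "Tom"), each = 5),
  ActivityType = rep(c("Email", "Sale", "Sale", "Seminar", "Webinar"), times = 2),
  ActivityDate = c("1/1/2014", "2/1/2014", "3/1/2014", "4/1/2014", "5/1/2014",
                   "1/1/2014", "2/1/2015", "3/1/2015", "4/1/2015", "5/1/2015"),
  SalesAmount = rep(c(NA, 1000, 2000, NA, NA), times = 2),
  stringsAsFactors = FALSE
)

str(custlife)
```

If ActivityDate were a factor instead, ifelse() would return the underlying integer codes, so converting with as.character() first would be needed.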

1 Answer:

Answer 0 (score: 1)

This way:

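library(dplyr)  # group_by(), mutate(), lag(), %>%
library(zoo)    # na.locf()

custlife %>%
  group_by(Name) %>%
  # keep the date on Sale rows, shift it down one row, then carry it forward
  mutate(lastsale = na.locf(lag(ifelse(ActivityType == "Sale", ActivityDate, NA)),
                            na.rm = FALSE))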

It seems to match:

Source: local data frame [10 x 6]
Groups: Name

   Name ActivityType ActivityDate SalesAmount LastSaleDate.Desired. lastsale
1  John        Email     1/1/2014          NA                    NA       NA
2  John         Sale     2/1/2014        1000                    NA       NA
3  John         Sale     3/1/2014        2000              2/1/2014 2/1/2014
4  John      Seminar     4/1/2014          NA              3/1/2014 3/1/2014
5  John      Webinar     5/1/2014          NA              3/1/2014 3/1/2014
6   Tom        Email     1/1/2014          NA                    NA       NA
7   Tom         Sale     2/1/2015        1000                    NA       NA
8   Tom         Sale     3/1/2015        2000              2/1/2015 2/1/2015
9   Tom      Seminar     4/1/2015          NA              3/1/2015 3/1/2015
10  Tom      Webinar     5/1/2015          NA              3/1/2015 3/1/2015

How it works:

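  • ifelse keeps ActivityDate on rows where ActivityType is "Sale" and puts NA everywhere else
  • lag shifts that result down one row within each Name group, so a row never sees its own sale date
  • na.locf from zoo fills the NAs with the most recent non-NA value (if any); na.rm=FALSE keeps the leading NAs so the column length is unchanged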
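
To see the three steps in isolation, here is a minimal sketch on John's rows only. It assumes the dates are character strings; the variable names (activity_type, sale_date, prev_sale, last_sale) are made up for illustration and are not part of the original answer.

```r
library(dplyr)  # lag()
library(zoo)    # na.locf()

# John's rows from the question, as plain character vectors (assumption)
activity_type <- c("Email", "Sale", "Sale", "Seminar", "Webinar")
activity_date <- c("1/1/2014", "2/1/2014", "3/1/2014", "4/1/2014", "5/1/2014")

# Step 1: keep the date on Sale rows, NA elsewhere
sale_date <- ifelse(activity_type == "Sale", activity_date, NA)

# Step 2: shift down one row so a row does not see its own sale
prev_sale <- dplyr::lag(sale_date)

# Step 3: carry the last non-NA value forward; na.rm = FALSE keeps leading NAs
last_sale <- na.locf(prev_sale, na.rm = FALSE)

print(data.frame(activity_type, activity_date, sale_date, prev_sale, last_sale))
```

Inside mutate() after group_by(Name), the same three steps run once per group, which is why Tom's first rows do not inherit John's last sale date.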