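I'm new to R and Stack Overflow. I have a data frame that is sorted by Name and ActivityDate, and I'm trying to add a LastSaleDate column to it. For each row, LastSaleDate should be the date of the most recent earlier sale within the same group (by Name). I tried a few things with dplyr, but to no avail. Any help is much appreciated.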
Name ActivityType ActivityDate SalesAmount LastSaleDate(Desired)
John Email 1/1/2014 NA NA
John Sale 2/1/2014 1000 NA
John Sale 3/1/2014 2000 2/1/2014
John Seminar 4/1/2014 NA 3/1/2014
John Webinar 5/1/2014 NA 3/1/2014
Tom Email 1/1/2014 NA NA
Tom Sale 2/1/2015 1000 NA
Tom Sale 3/1/2015 2000 2/1/2015
Tom Seminar 4/1/2015 NA 3/1/2015
Tom Webinar 5/1/2015 NA 3/1/2015
Answer 0 (score: 1)
Here is one way:
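library(dplyr)  # group_by, mutate, lag, %>%
library(zoo)    # na.locf
# Assumes ActivityDate is a character column; on a factor, ifelse() would return integer codes
custlife %>%
  group_by(Name) %>%
  mutate(lastsale = na.locf(lag(ifelse(ActivityType == "Sale", ActivityDate, NA)), na.rm = FALSE))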
The result seems to match the desired column:
Source: local data frame [10 x 6]
Groups: Name
Name ActivityType ActivityDate SalesAmount LastSaleDate.Desired. lastsale
1 John Email 1/1/2014 NA NA NA
2 John Sale 2/1/2014 1000 NA NA
3 John Sale 3/1/2014 2000 2/1/2014 2/1/2014
4 John Seminar 4/1/2014 NA 3/1/2014 3/1/2014
5 John Webinar 5/1/2014 NA 3/1/2014 3/1/2014
6 Tom Email 1/1/2014 NA NA NA
7 Tom Sale 2/1/2015 1000 NA NA
8 Tom Sale 3/1/2015 2000 2/1/2015 2/1/2015
9 Tom Seminar 4/1/2015 NA 3/1/2015 3/1/2015
10 Tom Webinar 5/1/2015 NA 3/1/2015 3/1/2015
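How it works:
ifelse keeps ActivityDate on rows where ActivityType is "Sale" and substitutes NA everywhere else.
lag shifts those values down by one row within each group, so each row sees the previous row's value rather than its own.
na.locf from zoo fills in the remaining NAs with the most recent non-NA value, if there is one; na.rm=FALSE keeps the leading NAs so the result has the same length as the group.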
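To see each of the three steps separately, here is a minimal sketch that splits the one-liner into intermediate columns. The data frame is a hand-typed reconstruction of the sample in the question, with dates stored as character strings; the column names sale_date and prev_sale are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(zoo)

# Hypothetical reconstruction of the question's data (dates kept as character strings)
custlife <- data.frame(
  Name = rep(c("John", "Tom"), each = 5),
  ActivityType = rep(c("Email", "Sale", "Sale", "Seminar", "Webinar"), times = 2),
  ActivityDate = c("1/1/2014", "2/1/2014", "3/1/2014", "4/1/2014", "5/1/2014",
                   "1/1/2014", "2/1/2015", "3/1/2015", "4/1/2015", "5/1/2015"),
  SalesAmount = rep(c(NA, 1000, 2000, NA, NA), times = 2),
  stringsAsFactors = FALSE
)

steps <- custlife %>%
  group_by(Name) %>%
  mutate(
    # Step 1: keep the date on sale rows, NA on every other row
    sale_date = ifelse(ActivityType == "Sale", ActivityDate, NA),
    # Step 2: shift down one row within the group, so a sale does not count itself
    prev_sale = dplyr::lag(sale_date),
    # Step 3: carry the last non-NA value forward; leading NAs are kept
    lastsale = na.locf(prev_sale, na.rm = FALSE)
  ) %>%
  ungroup()

print(as.data.frame(steps))
```

The lastsale column produced here should line up with the LastSaleDate(Desired) column from the question. If ActivityDate were stored as a Date or a factor instead of character, the ifelse step would return plain numbers, so converting back (or working on character and parsing afterwards) would be needed.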