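For the sample dataset below, I need to remove any rows for a customer (CustomerID) that come after that customer's first purchase (CustomerStatus = Purchased). Some customers never purchase the product, and I still want to keep all observations for those customers. The Date variable is important.
I am having trouble removing rows within groups. The original data is not grouped like this; I have tried to simplify the problem I am running into. Any help is appreciated.
Here is a sample dataset: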
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Amanda 2001 3/17/2017 Meeting
Amanda 2001 3/20/2017 Email
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
Sharon 2006 1/30/2017 Conference
Sharon 2006 2/10/2017 Purchased
The output should look like this:
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
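Answer 0 (score: 2)
We can group by 'SalesPerson' and 'CustomerID' and build a logical index for filter. A row is kept only if no earlier row in its group has CustomerStatus == "Purchased", so the first purchase itself stays and everything after it is dropped: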
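library(dplyr)
# df1 is the sample data frame from the question
df1 %>%
  group_by(SalesPerson, CustomerID) %>%
  # lag() shifts the purchase flag down one row, so the purchase row itself is still kept;
  # cumsum() counts purchases seen on earlier rows of the group
  filter(cumsum(lag(CustomerStatus == "Purchased", default = FALSE)) < 1)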
# A tibble: 15 x 4
# Groups: SalesPerson, CustomerID [4]
# SalesPerson CustomerID Date CustomerStatus
# <chr> <int> <chr> <chr>
# 1 Amanda 2000 1/5/2017 Intro
# 2 Amanda 2000 1/6/2017 Email
# 3 Amanda 2000 1/15/2017 PhoneCall
# 4 Amanda 2000 2/15/2017 Purchased
# 5 Amanda 2001 1/3/2017 Intro
# 6 Amanda 2001 1/4/2017 Email
# 7 Amanda 2001 1/12/2017 PhoneCall
# 8 Amanda 2001 1/15/2017 Conference
# 9 Amanda 2001 2/4/2017 Purchased
#10 Kyle 2002 1/19/2017 Intro
#11 Kyle 2002 1/20/2017 Email
#12 Kyle 2002 1/21/2017 PhoneCall
#13 Sharon 2006 1/8/2017 Intro
#14 Sharon 2006 1/10/2017 Meeting
#15 Sharon 2006 1/19/2017 Purchased
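To see why the filter condition above keeps the first purchase and drops what follows, it can help to spell out the intermediate values for one customer. The sketch below rebuilds Sharon's rows (customer 2006) by hand. The column names is_purchase, prev_purchase, n_before and keep are illustrative and not part of the original answer. The arrange() step is an added assumption: lag() and cumsum() work in row order, so if the real data is not already sorted by date within each customer, sorting first may be needed.

```r
library(dplyr)

# Sharon's rows from the sample data (customer 2006), rebuilt by hand
sharon <- data.frame(
  SalesPerson    = "Sharon",
  CustomerID     = 2006L,
  Date           = c("1/8/2017", "1/10/2017", "1/19/2017", "1/30/2017", "2/10/2017"),
  CustomerStatus = c("Intro", "Meeting", "Purchased", "Conference", "Purchased"),
  stringsAsFactors = FALSE
)

steps <- sharon %>%
  group_by(SalesPerson, CustomerID) %>%
  # Assumption: Date is an m/d/Y string; sort within each group before flagging
  arrange(as.Date(Date, format = "%m/%d/%Y"), .by_group = TRUE) %>%
  mutate(
    is_purchase   = CustomerStatus == "Purchased",      # TRUE on purchase rows
    prev_purchase = lag(is_purchase, default = FALSE),  # was the previous row a purchase?
    n_before      = cumsum(prev_purchase),              # purchases seen on earlier rows
    keep          = n_before < 1                        # the condition used in filter()
  ) %>%
  ungroup()

print(steps[, c("Date", "CustomerStatus", "is_purchase", "prev_purchase", "n_before", "keep")])
```

For Sharon, n_before is 0 on the first three rows and 1 on the last two, so keep is TRUE up to and including the 1/19/2017 purchase and FALSE afterwards. Customers who never purchase have n_before equal to 0 on every row, so all their rows are kept.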