R - New date column based on other columns

Time: 2018-02-08 05:38:37

Tags: r

I have a data frame of user campaign data like the one below, with the following details:

The columns are: email_address, response_date, campaign_name, state, suburb, postcode, magazine_subs_title, response_type

email_address       response_date        campaign_name      state  suburb    postcode  magazine_subs_title  response_type
jow.wow@gmail.com   1/02/2018 18:01      2018_Beauty_Acq    NSW    Sydney    2000      AGL                  opened
dew.jones@yahoo.id  03/10/2017 14:00:00  2017_Fashion_Show  QLD    Brisbane  4000      MHI                  delivered
dew.jones@yahoo.id  03/10/2017 17:00:00  2017_Fashion_Show  QLD    Brisbane  4000      MHI                  opened
jow.wow@gmail.com   25/01/2018 9:00      2018_Beauty_Acq    NSW    Sydney    2000      AGL                  delivered
jow.wow@gmail.com   14/07/2017 11:00     2017_Fashion_Show  NSW    Sydney    2000      AGL                  delivered

From here, I want to extract the response_date where response_type = 'delivered', separately for each campaign, and end up with the following table:

email_address       response_date           campaign_name      state  suburb    postcode  magazine_subs_title  response_type  delivered_date
jow.wow@gmail.com   1/02/2018 18:01         2018_Beauty_Acq    NSW    Sydney    2000      AGL                  opened         25/01/2018 9:00
dew.jones@yahoo.id  03/10/2017 14:00:00 PM  2017_Fashion_Show  QLD    Brisbane  4000      MHI                  delivered      03/10/2017 14:00:00 PM
dew.jones@yahoo.id  03/10/2017 17:00:00 PM  2017_Fashion_Show  QLD    Brisbane  4000      MHI                  opened         03/10/2017 14:00:00 PM
jow.wow@gmail.com   25/01/2018 9:00         2018_Beauty_Acq    NSW    Sydney    2000      AGL                  delivered      25/01/2018 9:00
jow.wow@gmail.com   14/07/2017 11:00        2017_Fashion_Show  NSW    Sydney    2000      AGL                  delivered      14/07/2017 11:00

Does this make sense?

Does anyone know how to do this kind of operation in R? Thanks

1 answer:

Answer 0: (score: 1)

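One approach could be to use lubridate, tidyr and dplyr.

The approach is to prepare the data first. Read response_date and Time as separate columns, then unite them into a single response_date column. Next, use parse_date_time to convert the combined column to a datetime format; this step is optional (the OP makes no decisions based on the date). Finally, apply ifelse to populate delivered_date on the 'delivered' rows and fill to carry it to the other rows in the same group.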
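To illustrate the mixed-format parsing step without extra packages, here is a minimal base R sketch. It is a swapped-in technique, not how `parse_date_time` works internally: it tries the format with seconds first and falls back to the one without. The helper name `parse_dmy_time` and the UTC time zone are assumptions made for the example.

```r
# Hypothetical helper: parse day/month/year timestamps that may or may not
# include seconds. The UTC time zone is an assumption for this illustration.
parse_dmy_time <- function(x, tz = "UTC") {
  # First attempt: timestamps with seconds, e.g. "03/10/2017 14:00:00"
  parsed <- as.POSIXct(strptime(x, format = "%d/%m/%Y %H:%M:%S", tz = tz))
  # Fallback for entries that failed: no seconds, e.g. "1/02/2018 18:01"
  needs_fallback <- is.na(parsed)
  parsed[needs_fallback] <- as.POSIXct(
    strptime(x[needs_fallback], format = "%d/%m/%Y %H:%M", tz = tz)
  )
  parsed
}

stamps <- c("1/02/2018 18:01", "03/10/2017 14:00:00", "25/01/2018 9:00")
parsed <- parse_dmy_time(stamps)
print(format(parsed, "%Y-%m-%d %H:%M:%S"))
```

Entries that match neither format stay `NA`, which makes bad input easy to spot before filling delivered_date.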

#Data

df <- read.table(text = "
email_address  response_date Time  campaign_name   state   suburb  postcode    magazine_subs_title response_type
jow.wow@gmail.com 1/02/2018 18:01   2018_Beauty_Acq NSW Sydney  2000    AGL opened
dew.jones@yahoo.id 03/10/2017  14:00:00 2017_Fashion_Show   QLD Brisbane    4000    MHI delivered
dew.jones@yahoo.id 03/10/2017  17:00:00 2017_Fashion_Show   QLD Brisbane    4000    MHI opened
jow.wow@gmail.com 25/01/2018 9:00 2018_Beauty_Acq   NSW Sydney  2000    AGL delivered
jow.wow@gmail.com 14/07/2017 11:00  2017_Fashion_Show   NSW Sydney  2000    AGL delivered", header = TRUE, stringsAsFactors = FALSE)

library(lubridate)
library(dplyr)
library(tidyr)

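df %>%
  # Combine the separately read date and time columns into one string
  unite("response_date", c("response_date", "Time"), sep = " ") %>%
  # Parse timestamps given with or without seconds
  mutate(response_date = parse_date_time(response_date, c("dmy HMS", "dmy HM"))) %>%
  # Keep the timestamp only on 'delivered' rows; NA elsewhere
  mutate(delivered_date = ifelse(grepl("delivered", response_type), as.character(response_date), NA)) %>%
  # Carry the delivered timestamp down within each group
  group_by(campaign_name, state, suburb, postcode) %>%
  fill(delivered_date) %>%
  ungroup() %>%
  as.data.frame()
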
Result:
   email_address       response_date     campaign_name state   suburb postcode magazine_subs_title response_type      delivered_date
#1  jow.wow@gmail.com 2017-07-14 11:00:00 2017_Fashion_Show   NSW   Sydney     2000                 AGL     delivered 2017-07-14 11:00:00
#2 dew.jones@yahoo.id 2017-10-03 14:00:00 2017_Fashion_Show   QLD Brisbane     4000                 MHI     delivered 2017-10-03 14:00:00
#3 dew.jones@yahoo.id 2017-10-03 17:00:00 2017_Fashion_Show   QLD Brisbane     4000                 MHI        opened 2017-10-03 14:00:00
#4  jow.wow@gmail.com 2018-02-01 18:01:00   2018_Beauty_Acq   NSW   Sydney     2000                 AGL        opened                <NA>
#5  jow.wow@gmail.com 2018-01-25 09:00:00   2018_Beauty_Acq   NSW   Sydney     2000                 AGL     delivered 2018-01-25 09:00:00
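
In the result above, row 4 is `<NA>` because `fill()` fills downward by default and that 'opened' row comes before its 'delivered' row, whereas the question expects 25/01/2018 9:00 there. One order-independent alternative, sketched here in base R, looks up the delivered timestamp by key. It assumes at most one 'delivered' row per email_address/campaign_name pair (true for the sample, not confirmed for the full data); if there were several, `match` would pick the first. The sample is reduced to the columns the lookup needs. With a recent tidyr, `fill(delivered_date, .direction = "downup")` may be another option.

```r
# Sample rows from the question, reduced to the columns used by the lookup
df <- data.frame(
  email_address = c("jow.wow@gmail.com", "dew.jones@yahoo.id", "dew.jones@yahoo.id",
                    "jow.wow@gmail.com", "jow.wow@gmail.com"),
  response_date = c("1/02/2018 18:01", "03/10/2017 14:00:00", "03/10/2017 17:00:00",
                    "25/01/2018 9:00", "14/07/2017 11:00"),
  campaign_name = c("2018_Beauty_Acq", "2017_Fashion_Show", "2017_Fashion_Show",
                    "2018_Beauty_Acq", "2017_Fashion_Show"),
  response_type = c("opened", "delivered", "opened", "delivered", "delivered"),
  stringsAsFactors = FALSE
)

# Key identifying one recipient within one campaign (assumed grouping)
key <- paste(df$email_address, df$campaign_name, sep = "|")
is_delivered <- df$response_type == "delivered"

# For every row, find the 'delivered' row with the same key and copy its date;
# rows whose key has no 'delivered' entry get NA
df$delivered_date <- df$response_date[is_delivered][match(key, key[is_delivered])]

print(df)
```

Because the lookup does not depend on row order, the 'opened' row for 2018_Beauty_Acq picks up its delivered date even though it appears first in the data.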