Create a new column based on conditions in other columns

Time: 2018-09-10 14:21:20

Tags: r

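I have a B2B customer dataset, and we want to measure our customers' onboarding process once they have been granted access to our web shop. A company can have many users who have been granted access. I want to create another column named "Onboarding", conditioned on "First login date": if any user from a given company has logged in for the first time, we classify that company (customer) as onboarded and the value is "YES", otherwise "NO". A "#" means the user has not logged in yet. I am not sure how to solve this in R. Is there anything that could help me? ^^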

An example is attached as images:

data frame

data frame with new column

2 answers:

Answer 0: (score: 1)

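Here is what you want...

First, some data for a reproducible example:

# One row per user; "#" marks a user who has not logged in yet
dat <- data.frame(Company = c("A","A","A","A","A","B","B","B","B","C","C","C","C","D","D","D","D"),
                  UserID = c("Simon","Hans","Jane","Alex","David","Dan","Sarah","Susan","Bob","Keith",
                             "Harry","Adam","Kenneth","Denial","Henna","John","Dylan"),
                  First_Log_in_Date = c("2018-02-22","#","2018-03-07","2018-04-29","#","#","#",
                                        "2018-05-01","2018-02-27","2018-06-08","2018-07-12",
                                        "2018-02-21","#","#","#","#","#"),
                  stringsAsFactors = FALSE)

To answer your original question, I would simply use the base ifelse() function:

# Per-user flag: "NO" if the user never logged in, "YES" otherwise
dat$Onboarding <- ifelse(dat$First_Log_in_Date == "#", "NO", "YES")

Then, depending on the login date, the resulting "Onboarding" column is filled with "YES" or "NO".

To answer your second question, which is based on a condition, I would simply use functions from the "dplyr" package.

The resulting "Onboarding" column is then filled with "YES" or "NO" depending on the login dates of any of the employees in a company, and not just on whether a single user's value is "#".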

The resulting table looks like this:

[image: resulting table]
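The dplyr code for the company-level condition is not shown in this answer. As a minimal sketch of how it could look, assuming dplyr is installed, the snippet below groups by Company and marks every user of a company "YES" as soon as any user in that company has a login date. The small sample data frame and the name `dat_onboarded` are illustrative assumptions, not part of the original answer.

```r
library(dplyr)

# Small hypothetical sample in the same shape as the question's data
dat <- data.frame(
  Company           = c("A", "A", "B", "B", "D", "D"),
  UserID            = c("Simon", "Hans", "Dan", "Susan", "Denial", "Henna"),
  First_Log_in_Date = c("2018-02-22", "#", "#", "2018-05-01", "#", "#"),
  stringsAsFactors  = FALSE
)

# A company counts as onboarded if at least one of its users has logged in;
# the single per-group value is recycled to every row of that company
dat_onboarded <- dat %>%
  group_by(Company) %>%
  mutate(Onboarding = if (any(First_Log_in_Date != "#")) "YES" else "NO") %>%
  ungroup()

print(dat_onboarded)
```

Hans and Dan never logged in themselves, but their companies still get "YES" because a colleague did. That is the difference from the per-user ifelse() version.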

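Answer 1: (score: 0)

In other words, for a given company, if the "First login date" of every user is "#", then that company has not onboarded yet. Is that correct?

You can solve this kind of problem with the split-apply-combine approach: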

library(magrittr)  # provides the %>% pipe used below

#### Data ####
my_df <- data.frame(Company = c("A","A","A","A","A","B","B","B","B","C","C","C","C","D","D","D","D"),
                    UserID = c("Simon","Hans","Jane","Alex","David","Dan","Sarah","Susan","Bob","Keith",
                               "Harry","Adam","Kenneth","Denial","Henna","John","Dylan"),
                    First_Log_in_Date = c("2018-02-22","#","2018-03-07","2018-04-29","#","#","#",
                                          "2018-05-01","2018-02-27","2018-06-08","2018-07-12",
                                          "2018-02-21","#","#","#","#","#"),
                    stringsAsFactors = FALSE)

#### Split - Apply - Combine ####
# Split into one data frame per company, flag each one, then bind the pieces back together
my_df %>% split(., .$Company) %>% lapply(function(company_df) {
    # TRUE if any user of this company has logged in, FALSE otherwise
    company_df$onboarded <- any(company_df$First_Log_in_Date != "#")
    company_df
}) %>% do.call(rbind, .)
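The same split-apply-combine idea can also be written without any extra packages. The sketch below is an alternative, not the answerer's method: it uses base R's ave(), which applies a function within each group and returns a vector aligned with the original rows. The shortened sample data is an illustrative assumption.

```r
# Small hypothetical sample in the same shape as the question's data
my_df <- data.frame(
  Company           = c("A", "A", "B", "B", "D", "D"),
  UserID            = c("Simon", "Hans", "Dan", "Susan", "Denial", "Henna"),
  First_Log_in_Date = c("2018-02-22", "#", "#", "2018-05-01", "#", "#"),
  stringsAsFactors  = FALSE
)

# Logical vector: TRUE where the individual user has logged in
has_logged_in <- my_df$First_Log_in_Date != "#"

# ave() splits has_logged_in by Company, applies any() to each piece,
# and writes the group result back to every row of that company
my_df$onboarded <- ave(has_logged_in, my_df$Company, FUN = any)

print(my_df)
```

Unlike the split/lapply/rbind version, this keeps the original row order and row names, because nothing is rebuilt with rbind.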