Question

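My data frame holds details for 1000 employees and has the columns Tenure and Month_count. I want to match the entries where the two columns pair up as Month1 = 1, Month2 = 2, Month3 = 3 and Experienced = 4, and keep only the rows that match.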

# Example data: 19 rows, one per employee and month
total <- data.frame(
  name = c(rep("Bob", 4), rep("Dick", 6), rep("Jack", 5), rep("ryan", 4)),
  Tenure = c("Month1", "Month2", "Month3", "Experienced", "Month2", "Month3", "Experienced",
             "Experienced", "Experienced", "Experienced", "Month1", "Month2", "Month3", "Experienced",
             "Experienced", "Experienced", "Experienced", "Experienced", "Experienced"),
  Month_count = c(1:4, 2:7, 1:5, 1:4)
)

The input and the required output are shown below.

I would appreciate any dplyr solution.

Answer 1

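You can put the conditions directly in filter. Since & binds more tightly than |, each Tenure/Month_count pair is evaluated first and the pairs are then OR-ed together: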

library(dplyr)
total %>%
  filter(Tenure == 'Month1' & Month_count == 1 | 
         Tenure == 'Month2' & Month_count == 2 |
         Tenure == 'Month3' & Month_count == 3 |
         Tenure == 'Experienced' & Month_count == 4)

#   name      Tenure Month_count
#1   Bob      Month1           1
#2   Bob      Month2           2
#3   Bob      Month3           3
#4   Bob Experienced           4
#5  Dick      Month2           2
#6  Dick      Month3           3
#7  Dick Experienced           4
#8  Jack      Month1           1
#9  Jack      Month2           2
#10 Jack      Month3           3
#11 Jack Experienced           4
#12 ryan Experienced           4

Or use the same conditions in subset to stay in base R:

subset(total, Tenure == 'Month1' & Month_count == 1 | 
              Tenure == 'Month2' & Month_count == 2 |
              Tenure == 'Month3' & Month_count == 3 |
              Tenure == 'Experienced' & Month_count == 4)
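
If the list of allowed pairs grows, the chain of conditions could also be written as a small lookup table and matched with a join. The following is only a sketch of that idea, not part of the answer above: it rebuilds the question's data so that it runs on its own, and the name `wanted` is made up for illustration. `semi_join` keeps the rows of `total` that have a match in `wanted` without adding any columns.

```r
library(dplyr)

# Rebuild the example data from the question (19 rows).
total <- data.frame(
  name = c(rep("Bob", 4), rep("Dick", 6), rep("Jack", 5), rep("ryan", 4)),
  Tenure = c("Month1", "Month2", "Month3", "Experienced",
             "Month2", "Month3", rep("Experienced", 4),
             "Month1", "Month2", "Month3", rep("Experienced", 2),
             rep("Experienced", 4)),
  Month_count = c(1:4, 2:7, 1:5, 1:4),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per allowed (Tenure, Month_count) pair.
wanted <- data.frame(
  Tenure = c("Month1", "Month2", "Month3", "Experienced"),
  Month_count = 1:4,
  stringsAsFactors = FALSE
)

# Keep only the rows of `total` whose pair appears in `wanted`.
result <- semi_join(total, wanted, by = c("Tenure", "Month_count"))
nrow(result)
```

Adding another allowed pair then means adding a row to `wanted` rather than another line of conditions.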

Answer 2

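We can automate this with Map, which builds one condition per Tenure/Month_count pair, and Reduce, which combines the conditions with |: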

v1 <- c(paste0("Month", 1:3), "Experienced")
v2 <- 1:4
total[Reduce(`|`, Map(function(x, y) with(total,
             Tenure == x & Month_count ==y), v1, v2)),]
#   name      Tenure Month_count
#1   Bob      Month1           1
#2   Bob      Month2           2
#3   Bob      Month3           3
#4   Bob Experienced           4
#5  Dick      Month2           2
#6  Dick      Month3           3
#7  Dick Experienced           4
#11 Jack      Month1           1
#12 Jack      Month2           2
#13 Jack      Month3           3
#14 Jack Experienced           4
#19 ryan Experienced           4
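
To see what `Map` and `Reduce` each contribute, it may help to look at the intermediate values on a tiny input. This is an illustrative sketch only: the four-row data frame `d` and the names `per_pair` and `keep` are made up and do not appear in the answer.

```r
# Made-up miniature data, only for illustration.
d <- data.frame(
  Tenure = c("Month1", "Month2", "Experienced", "Experienced"),
  Month_count = c(1, 3, 4, 5),
  stringsAsFactors = FALSE
)

v1 <- c(paste0("Month", 1:3), "Experienced")
v2 <- 1:4

# Map walks v1 and v2 in parallel and returns one logical vector per pair,
# each with one element per row of d.
per_pair <- Map(function(x, y) d$Tenure == x & d$Month_count == y, v1, v2)

# Reduce folds the list with `|`, so a row is kept if any pair matched it.
keep <- Reduce(`|`, per_pair)

d[keep, ]
```

Here `per_pair` is a list of four logical vectors and `keep` is a single logical vector that selects the first and third rows of `d`.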

Or use the tidyverse, with map2 and reduce from purrr in place of Map and Reduce:

library(dplyr)
library(purrr)
total %>% 
      filter(map2(v1, v2, ~ Tenure == .x & Month_count == .y) %>%
               reduce(`|`))
#   name      Tenure Month_count
#1   Bob      Month1           1
#2   Bob      Month2           2
#3   Bob      Month3           3
#4   Bob Experienced           4
#5  Dick      Month2           2
#6  Dick      Month3           3
#7  Dick Experienced           4
#8  Jack      Month1           1
#9  Jack      Month2           2
#10 Jack      Month3           3
#11 Jack Experienced           4
#12 ryan Experienced           4
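
Another way to express the same pairing, offered here as a hedged sketch rather than as part of the answer, is a named vector that maps each Tenure label to its required Month_count. The data frame `d` and the name `lookup` are made up for illustration. Labels missing from `lookup` yield NA, and `filter` drops rows whose condition is NA.

```r
library(dplyr)

# Made-up miniature data, only for illustration.
d <- data.frame(
  Tenure = c("Month1", "Month2", "Month3", "Experienced", "Experienced"),
  Month_count = c(1, 3, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Hypothetical named lookup vector: label -> required Month_count.
lookup <- c(Month1 = 1, Month2 = 2, Month3 = 3, Experienced = 4)

# as.character() guards against Tenure being a factor, where indexing would
# use the integer codes instead of the labels.
result <- d %>%
  filter(Month_count == unname(lookup[as.character(Tenure)]))

result
```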

How to match two strings of a column and filter the matching column values
