group_by and apply a code to every element of a group using a condition

Time: 2018-12-06 11:46:41

Tags: r loops dplyr aggregate

I have data like this:

  ID  membership   AdultChild    
   1     1           A
   2     1           A 
   3     2           A  
   4     2           C  
   5     2           C
   6     3           A 
   7     3           A 
   :     :           : 

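I want to group by membership, count the values of the AdultChild variable within each group, and then assign a "code" to every row of that group, i.e.:
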
ID membership   AdultChild code
 1    1           A          x1
 2    1           A          x1
 3    2           A          x2
 4    2           C          x2
 5    2           C          x2
 6    3           A          x1
 7    3           A          x1
 :    :           :          :

My conditions look something like this:

count <- function(x){
if(sum(x == "A") == 2 && sum(x == "C") == 0){
  code <<-  x1
}else if (sum(x == "A") == 1 & sum(x == "C") >= 1){
  code <<- x2
}else {
  code <<- X3
}
}

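I tried grouping and mutating with dplyr, using the function above to add a new variable called code. I also considered the aggregate function, but had no luck with either.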

df.2 <-  df %>% group_by(membership) 
         %>% mutate(n = count(AdultChild)) %>% 
         ungroup()

df.2 <-  aggregate.data.frame(df, by = membership, FUN = 
         count(df$AdultChild))

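Basically, I want a new variable whose value is decided by some conditions evaluated per membership group and then applied to every ID in that group.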

Thanks in advance.

2 answers:

Answer 0: (score: 0)

library(dplyr)
df %>% group_by(membership) %>% 
       mutate(code=case_when(
        sum(AdultChild=='A', na.rm = T)==2 & sum(AdultChild=='C', na.rm = T)==0 ~ 'X1',
        sum(AdultChild=='A', na.rm = T)==1 & sum(AdultChild=='C', na.rm = T)>=1 ~ 'X2',
        TRUE ~ 'X3'
))
# A tibble: 7 x 4
# Groups:   membership [3]
     ID membership AdultChild code
  <int>      <int> <fct>      <chr>
1     1          1 A          X1
2     2          1 A          X1
3     3          2 A          X2
4     4          2 C          X2
5     5          2 C          X2
6     6          3 A          X1
7     7          3 A          X1
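
To try this approach end to end, the seven sample rows from the question can be rebuilt as a data frame. The following is only a minimal sketch: the column types are an assumption (the question does not state them), and the helper columns `n_adult` and `n_child` are hypothetical names introduced here so that each count is computed once per group instead of being repeated in every condition.

```r
library(dplyr)

# Sample rows copied from the question; column types are assumed.
df <- data.frame(
  ID         = 1:7,
  membership = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
  AdultChild = c("A", "A", "A", "C", "C", "A", "A"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(membership) %>%
  mutate(
    # Per-group counts, repeated on every row of the group
    n_adult = sum(AdultChild == "A", na.rm = TRUE),
    n_child = sum(AdultChild == "C", na.rm = TRUE),
    # Same conditions as in the answer above
    code = case_when(
      n_adult == 2 & n_child == 0 ~ "X1",
      n_adult == 1 & n_child >= 1 ~ "X2",
      TRUE                        ~ "X3"
    )
  ) %>%
  ungroup() %>%
  select(-n_adult, -n_child)  # drop the helper columns

print(res)
```

Because `sum()` returns a single value per group, `mutate()` recycles it across all rows of that group, which is what lets one condition label every ID with the same membership.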

Answer 1: (score: 0)

# Note: naming this function count masks dplyr::count() while it is defined.
count <- function(x) {
  # Count adults and children once, ignoring missing values
  n_adult <- sum(x == "A", na.rm = TRUE)
  n_child <- sum(x == "C", na.rm = TRUE)
  # Return the code as the value of the if/else chain
  if (n_adult == 2 && n_child == 0) {
    "4"
  } else if (n_adult > 2 && n_child == 0) {
    "5"
  } else if (n_adult == 1 && n_child >= 1) {
    "6"
  } else if (n_adult == 2 && n_child >= 1 && n_child <= 3) {
    "7"
  } else {
    "8"
  }
}

df.2 <-  df %>% group_by(membership) %>% mutate(code = count(AdultChild)) %>% ungroup()
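
Since the question also mentions trying a base R route, the same per-group function can be applied without dplyr through `ave()`, which returns a vector as long as its input. This is a sketch under assumptions rather than part of either answer: `code_for_group` is a hypothetical name chosen here to avoid masking `dplyr::count()`, its conditions are copied from the function above, and the data frame is rebuilt from the question's sample rows with assumed column types.

```r
# Sample rows copied from the question; column types are assumed.
df <- data.frame(
  ID         = 1:7,
  membership = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
  AdultChild = c("A", "A", "A", "C", "C", "A", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one code for a whole group.
code_for_group <- function(x) {
  n_adult <- sum(x == "A", na.rm = TRUE)
  n_child <- sum(x == "C", na.rm = TRUE)
  if (n_adult == 2 && n_child == 0) {
    "4"
  } else if (n_adult > 2 && n_child == 0) {
    "5"
  } else if (n_adult == 1 && n_child >= 1) {
    "6"
  } else if (n_adult == 2 && n_child >= 1 && n_child <= 3) {
    "7"
  } else {
    "8"
  }
}

# ave() splits AdultChild by membership, applies FUN to each piece,
# and puts the results back in the original row order.
df$code <- ave(
  df$AdultChild, df$membership,
  FUN = function(x) rep(code_for_group(x), length(x))
)

print(df)
```

`aggregate()` on its own would return one row per membership, so its result would still have to be merged back onto the original rows; `ave()` skips that step.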