Create a new variable based on multiple conditions

Time: 2021-06-15 15:47:41

Tags: r

I am trying to create a new variable based on some conditions. I have the following data:

df <- data.frame(ID = c("A1","A1","A2","A2","A3","A4","A4"),
                 type = c("small","large","small","large","large","small","large"),
                 code = c("B9", "[0,20]","B9","[20,40]","[0,20]","B9","[40,60]" ))

This gives:

    ID  type   code
1   A1  small   B9
2   A1  large   [0,20]
3   A2  small   B9
4   A2  large   [20,40]
5   A3  large   [0,20]
6   A4  small   B9
7   A4  large   [40,60]

I want to create a new variable (code2) that holds the value of code from the row where type == large, grouped by ID. So ID A1 should have [0,20] as its code2. I want to achieve the following:

    ID  type   code       code2
1   A1  small   B9        [0,20]    
2   A1  large   [0,20]    [0,20] 
3   A2  small   B9        [20,40]
4   A2  large   [20,40]   [20,40]
5   A3  large   [0,20]    [0,20] 
6   A4  small   B9        [40,60]
7   A4  large   [40,60]   [40,60]

I have been trying to do this with dplyr and ifelse, but without success.

2 answers:

Answer 0: (score: 4)

We can use a grouped operation in dplyr: group by 'ID' and extract the 'code' whose 'type' value is "large" (assuming 'type' has no duplicate values within each 'ID').

library(dplyr)
df <- df %>% 
   group_by(ID) %>%
   mutate(code2 = code[type == 'large']) %>%
   ungroup

Output:

df
# A tibble: 7 x 4
  ID    type  code    code2  
  <chr> <chr> <chr>   <chr>  
1 A1    small B9      [0,20] 
2 A1    large [0,20]  [0,20] 
3 A2    small B9      [20,40]
4 A2    large [20,40] [20,40]
5 A3    large [0,20]  [0,20] 
6 A4    small B9      [40,60]
7 A4    large [40,60] [40,60]

If there are duplicates, use match, which returns the index of the first match only:

df <- df %>%
       group_by(ID) %>%
       mutate(code2 = code[match('large', type)]) %>%
       ungroup
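
To illustrate why the two indexing styles behave differently, here is a minimal base R sketch on plain vectors, outside of any grouping. The vector names (type_dup, code_dup, type_none, code_none) are made up for illustration and do not come from the question's data.

```r
# A single hypothetical group with two "large" rows (duplicates)
type_dup <- c("small", "large", "large")
code_dup <- c("B9", "[0,20]", "[20,40]")

# Logical indexing keeps every matching element
all_large <- code_dup[type_dup == "large"]
print(all_large)

# match() returns only the position of the first "large"
first_large <- code_dup[match("large", type_dup)]
print(first_large)

# A single hypothetical group with no "large" row at all
type_none <- c("small")
code_none <- c("B9")

# Logical indexing yields a zero-length vector
none_logical <- code_none[type_none == "large"]
print(length(none_logical))

# match() returns NA, so indexing yields a single NA
none_match <- code_none[match("large", type_none)]
print(none_match)
```

Inside a grouped mutate(), the result generally has to be of length 1 or of the group's size, so the match() form is the safer choice whenever a group might contain zero or several "large" rows: it always produces exactly one value (possibly NA).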

Answer 1: (score: 1)

A data.table option:

> setDT(df)[, code2 := code[type == "large"], ID][]
   ID  type    code   code2
1: A1 small      B9  [0,20]
2: A1 large  [0,20]  [0,20]
3: A2 small      B9 [20,40]
4: A2 large [20,40] [20,40]
5: A3 large  [0,20]  [0,20]
6: A4 small      B9 [40,60]
7: A4 large [40,60] [40,60]
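
The same match() idea from the first answer can be combined with data.table. The following is only a sketch, assuming the data.table package is installed; the variable name dt is introduced here for illustration, and as.data.table() is used instead of setDT() so that the original df is left unmodified.

```r
library(data.table)

# Same example data as in the question
df <- data.frame(ID = c("A1","A1","A2","A2","A3","A4","A4"),
                 type = c("small","large","small","large","large","small","large"),
                 code = c("B9", "[0,20]","B9","[20,40]","[0,20]","B9","[40,60]"))

# Work on a data.table copy (dt is a hypothetical name)
dt <- as.data.table(df)

# Per ID, take the code of the first "large" row;
# groups without a "large" row would get NA
dt[, code2 := code[match("large", type)], by = ID]

print(dt)
```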