Conditional statement within a group

Time: 2018-12-03 09:40:04

Tags: r dataframe group-by conditional

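I have a data frame in which I want to create a new column whose values depend on a condition evaluated within each group. For the data frame below, I want to create a new column n_actions that gives:

Condition 1. 2 for the entire group if 6 appears in that group's step column.
Condition 2. 3 for the entire group if 9 appears in that group's step column.
Condition 3. 1 if neither 6 nor 9 appears in the group's step column.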

#dataframe start
dataframe <- data.frame(group = c("A", "A", "A", "B", "B", "B", "B", "B", "B", "C", "C", "C", "D", "D", "D", "D", "D", "D", "D", "D", "D"),
               step = c(1, 2, 3, 1, 2, 3, 4, 5, 6, 1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 8, 9))

# dataframe desired
dataframe$n_actions <- c(rep(1, 3), rep(2, 6), rep(1, 3), rep(3, 9))

3 answers:

Answer 0 (score: 2)

Try:

library(dplyr)
dataframe %>% 
        group_by(group) %>%
        mutate(n_actions = ifelse(9 %in% step, 3, 
                                  ifelse(6 %in% step, 2, 1)))
# A tibble: 21 x 3
# Groups:   group [4]
    group  step n_actions
   <fctr> <dbl>     <dbl>
 1      A     1         1
 2      A     2         1
 3      A     3         1
 4      B     1         2
 5      B     2         2
 6      B     3         2
 7      B     4         2
 8      B     5         2
 9      B     6         2
10      C     1         1
# ... with 11 more rows
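
This works because `9 %in% step` is evaluated once per group and yields a single TRUE or FALSE, so the nested `ifelse()` returns one value that `mutate()` recycles across the group's rows. For readers without dplyr, the same per-group logic can be sketched in base R. The names `df` and `count_actions` are illustrative assumptions, not part of the original answer:

```r
# Rebuild the sample data from the question.
df <- data.frame(
  group = rep(c("A", "B", "C", "D"), times = c(3, 6, 3, 9)),
  step  = as.numeric(c(1:3, 1:6, 1:3, 1:9))
)

# Hypothetical helper: maps one group's steps to a single value.
# 9 is tested first so that a group containing both 6 and 9 gets 3.
count_actions <- function(steps) {
  if (9 %in% steps) 3 else if (6 %in% steps) 2 else 1
}

# ave() applies the helper to each group and recycles the scalar
# result across that group's rows.
df$n_actions <- ave(df$step, df$group, FUN = count_actions)
```

The order of the tests matters: group D contains both 6 and 9, and the question expects 3 for it.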

Answer 1 (score: 1)

It looks like you can take each group's maximum step and integer-divide it by 3 (%/% 3):

dataframe <- transform(dataframe,
                       n_actions2 = ave(step, group, FUN = function(x) max(x) %/% 3))
dataframe
#   group step n_actions n_actions2
#1      A    1         1          1
#2      A    2         1          1
#3      A    3         1          1
#4      B    1         2          2
#5      B    2         2          2
#6      B    3         2          2
#7      B    4         2          2
#8      B    5         2          2
#9      B    6         2          2
#10     C    1         1          1
#11     C    2         1          1
#12     C    3         1          1
#13     D    1         3          3
#14     D    2         3          3
#15     D    3         3          3
#16     D    4         3          3
#17     D    5         3          3
#18     D    6         3          3
#19     D    7         3          3
#20     D    8         3          3
#21     D    9         3          3
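
Note that the integer-division shortcut matches the question's conditions here only because each group's steps run consecutively up to 3, 6, or 9. Below is a small sketch of data where the two diverge. The data frame `other` and its groups E and F are made up for illustration:

```r
# Hypothetical data: group E reaches step 7 without ever containing 6,
# and group F contains 9 but goes on to step 12.
other <- data.frame(
  group = c("E", "E", "E", "F", "F"),
  step  = c(1, 2, 7, 9, 12)
)

# Shortcut: integer-divide each group's maximum step by 3.
other$shortcut <- ave(other$step, other$group,
                      FUN = function(x) max(x) %/% 3)

# Explicit version of the three conditions stated in the question.
other$explicit <- ave(other$step, other$group,
                      FUN = function(x) if (9 %in% x) 3 else if (6 %in% x) 2 else 1)

# shortcut is 2 for E and 4 for F; explicit is 1 for E and 3 for F.
```

If the real data can contain gaps or steps beyond 9, the explicit membership tests are the safer choice.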

Answer 2 (score: 1)

Another way, using dplyr's case_when:

library(dplyr)

dataframe %>% 
  group_by(group) %>%
  mutate(
    n_actions1 = case_when(
      9 %in% step ~ 3,
      6 %in% step ~ 2,
      TRUE ~ 1
    )
  )
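
To check that this produces the column the question asks for, one could compare it against the desired `n_actions` values. This is a minimal sketch, assuming dplyr is installed; the variable `result` and the trailing `ungroup()` are additions for illustration:

```r
library(dplyr)

# Sample data and desired column, as given in the question.
dataframe <- data.frame(
  group = rep(c("A", "B", "C", "D"), times = c(3, 6, 3, 9)),
  step  = as.numeric(c(1:3, 1:6, 1:3, 1:9))
)
dataframe$n_actions <- c(rep(1, 3), rep(2, 6), rep(1, 3), rep(3, 9))

result <- dataframe %>%
  group_by(group) %>%
  mutate(
    # Conditions are tried in order; the first one that holds wins,
    # so the test for 9 must come before the test for 6.
    n_actions1 = case_when(
      9 %in% step ~ 3,
      6 %in% step ~ 2,
      TRUE ~ 1
    )
  ) %>%
  ungroup()  # drop the grouping so later operations act on all rows

all(result$n_actions1 == result$n_actions)
# TRUE
```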