Question

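I want to use dplyr::mutate to fill a column conditionally. One level of the new variable should indicate that a certain value occurs in an existing column, and the other level should cover the "else" case.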

I have a data frame:

         group     piece      answer         agreement
        group1     A          noise       good 
        group1     A          silence     good
        group1     A          silence     good
        group1     B          silence     bad
        group1     B          loud_noise  bad
        group1     B          noise       bad
        group1     B          loud_noise  bad
        group1     B          noise       bad
        group2     C          silence     good
        group2     C          silence     good

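I want to create a new variable by group: if 'bad' occurs anywhere in agreement within the group, the value should be 'inconsistent', but if all values of agreement are 'good', the value should be 'consistent'.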

        group     piece      answer     agreement   new_agreement
        group1     A          noise       good       bad
        group1     A          silence     good       bad
        group1     A          silence     good       bad
        group1     B          silence     bad        bad
        group1     B          loud_noise  bad        bad
        group1     B          noise       bad        bad
        group1     B          loud_noise  bad        bad
        group1     B          noise       bad        bad
        group2     C          silence     good       good
        group2     C          silence     good       good

But case_when doesn't do that; it just copies the same variable again:

   newdf <- df %>%
    group_by(group) %>%
    mutate(new_agreement = case_when(agreement == 'bad' ~
    "inconsistent", agreement =='good' ~ "consistent")) %>%
    as.data.frame()

Answer 1

Just add any(agreement == 'bad'):

df %>%
  group_by(group) %>%
  mutate(new_agreement = case_when(any(agreement == 'bad') ~"inconsistent",
                                   agreement =='good' ~ "consistent"))
    # A tibble: 10 x 5
    # Groups:   group [2]
       group  piece answer     agreement new_agreement
       <fct>  <fct> <fct>      <fct>     <chr>        
     1 group1 A     noise      good      inconsistent 
     2 group1 A     silence    good      inconsistent 
     3 group1 A     silence    good      inconsistent 
     4 group1 B     silence    bad       inconsistent 
     5 group1 B     loud_noise bad       inconsistent 
     6 group1 B     noise      bad       inconsistent 
     7 group1 B     loud_noise bad       inconsistent 
     8 group1 B     noise      bad       inconsistent 
     9 group2 C     silence    good      consistent   
    10 group2 C     silence    good      consistent

You can even use if_else together with any:

df %>% 
  group_by(group) %>% 
  mutate(new_agreement= if_else(any(agreement=="bad"), "inconsistent", "consistent") )
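
To see why any makes the difference, here is a minimal base R sketch using only the agreement values of group1 from the question. The variable names row_wise and group_wise are made up for illustration. The comparison on its own yields one result per row, which is why the original case_when simply mirrored the column, whereas any collapses it to a single value that mutate then recycles across the whole group.

```r
# Values of `agreement` within group1, copied from the question's data
agreement <- c("good", "good", "good", "bad", "bad", "bad", "bad", "bad")

# Row-wise test: one TRUE/FALSE per row, so the result mirrors the column
row_wise <- agreement == "bad"

# Group-wise test: any() collapses the vector to a single TRUE/FALSE
group_wise <- any(agreement == "bad")

print(row_wise)
print(group_wise)
```

Inside a grouped mutate, the single value from any applies to every row of the group, which is the behaviour the question asks for.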

Answer 2

With case_when, use any.

library(dplyr)

df %>%
  group_by(group) %>%
  mutate(new_agreement = case_when(
    any(agreement == 'bad') ~ 'inconsistent',
    TRUE ~ 'consistent'))
## A tibble: 10 x 5
## Groups:   group [2]
#   group  piece answer     agreement new_agreement
#   <fct>  <fct> <fct>      <fct>     <chr>        
# 1 group1 A     noise      good      inconsistent 
# 2 group1 A     silence    good      inconsistent 
# 3 group1 A     silence    good      inconsistent 
# 4 group1 B     silence    bad       inconsistent 
# 5 group1 B     loud_noise bad       inconsistent 
# 6 group1 B     noise      bad       inconsistent 
# 7 group1 B     loud_noise bad       inconsistent 
# 8 group1 B     noise      bad       inconsistent 
# 9 group2 C     silence    good      consistent   
#10 group2 C     silence    good      consistent
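
The two case_when variants in this thread give the same result on the question's data, but they may differ once agreement holds a value other than 'good' or 'bad'. The sketch below uses a made-up data frame (toy, with a hypothetical value "unsure" that is not in the original data) to show the difference: the explicit agreement == 'good' condition leaves unmatched rows as NA, while the TRUE fallback labels them 'consistent'.

```r
library(dplyr)

# Hypothetical data: "unsure" does not occur in the original question
toy <- data.frame(
  group = c("g1", "g1", "g2", "g2"),
  agreement = c("good", "bad", "good", "unsure"),
  stringsAsFactors = FALSE
)

res <- toy %>%
  group_by(group) %>%
  mutate(
    # Second condition is row-wise, so "unsure" matches nothing and becomes NA
    explicit = case_when(any(agreement == "bad") ~ "inconsistent",
                         agreement == "good" ~ "consistent"),
    # TRUE catches every group that has no "bad" value
    fallback = case_when(any(agreement == "bad") ~ "inconsistent",
                         TRUE ~ "consistent")
  ) %>%
  ungroup()

print(res)
```

Which behaviour is preferable depends on whether unexpected values should be flagged as missing or silently treated as consistent.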

Data in dput format.

df <-
structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L), .Label = c("group1", "group2"), 
class = "factor"), piece = structure(c(1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L), .Label = c("A", "B", "C"), 
class = "factor"), answer = structure(c(2L, 3L, 3L, 
3L, 1L, 2L, 1L, 2L, 3L, 3L), .Label = c("loud_noise", 
"noise", "silence"), class = "factor"), agreement = 
structure(c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L), 
.Label = c("bad", "good"), class = "factor")), 
class = "data.frame", row.names = c(NA, -10L))
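
For anyone who finds the dput output hard to read, the following is a rough end-to-end sketch that types out the same ten rows with data.frame (assuming stringsAsFactors = TRUE to reproduce the factor columns) and then applies the if_else approach from the first answer. The final table call is only a quick sanity check, not part of either answer.

```r
library(dplyr)

# Same ten rows as the dput() output, typed out column by column
df <- data.frame(
  group = c(rep("group1", 8), rep("group2", 2)),
  piece = c(rep("A", 3), rep("B", 5), rep("C", 2)),
  answer = c("noise", "silence", "silence", "silence", "loud_noise",
             "noise", "loud_noise", "noise", "silence", "silence"),
  agreement = c(rep("good", 3), rep("bad", 5), rep("good", 2)),
  stringsAsFactors = TRUE
)

# One label per group: "inconsistent" if any row in the group is "bad"
newdf <- df %>%
  group_by(group) %>%
  mutate(new_agreement = if_else(any(agreement == "bad"),
                                 "inconsistent", "consistent")) %>%
  ungroup() %>%
  as.data.frame()

# Cross-tabulate group against the new label as a quick check
print(table(newdf$group, newdf$new_agreement))
```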

Conditionally fill a column if a previous column contains a value?