Applying a simple function to data

Time: 2018-04-17 22:49:58

Tags: r if-statement dplyr

Hello, I am trying to apply a simple function to my data to create a sub_id for each group.

test = data.frame(gr=gl(2,4), id =rep(c("Good","bad","ugly","dirty"),2),
                        count=c(175,1,13,11, 10,165,10,2))  


> test
  gr    id count
1  1  Good   175
2  1   bad     1
3  1  ugly    13
4  1 dirty    11
5  2  Good    10
6  2   bad   165
7  2  ugly    10
8  2 dirty     2

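The condition for sub_id is as follows:

If the group number equals the min of count where id == "bad", the sub_id for that group is "red flag"; every other group (those that do not meet this condition) gets "green flag".

So I wrote this function: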

  sub_id <- function(gr,count,id){
    if (gr==min(count)&id=="bad"){
      "red flag"

    }
    else
    "green flag"  
  }

and tried

library(dplyr)

  test%>%
    group_by(gr)%>%
    mutate(color=sub_id(gr,count,id))

which gave me

# A tibble: 8 x 4
# Groups:   gr [2]
      gr     id count      color
  <fctr> <fctr> <dbl>      <chr>
1      1   Good   175 green flag
2      1    bad     1 green flag
3      1   ugly    13 green flag
4      1  dirty    11 green flag
5      2   Good    10 green flag
6      2    bad   165 green flag
7      2   ugly    10 green flag
8      2  dirty     2 green flag
Warning messages:
1: In if (gr == min(count) & id == "Bad") { :
  the condition has length > 1 and only the first element will be used
2: In if (gr == min(count) & id == "Bad") { :
  the condition has length > 1 and only the first element will be used

Expected output

      gr     id count      color
  <fctr> <fctr> <dbl>      <chr>
1      1   Good   175   red flag
2      1    bad     1   red flag
3      1   ugly    13   red flag
4      1  dirty    11   red flag
5      2   Good    10 green flag
6      2    bad   165 green flag
7      2   ugly    10 green flag
8      2  dirty     2 green flag

1 Answer:

Answer 0: (score: 2)

The following reproduces your expected output.

test %>%
    group_by(gr) %>%
    mutate(colour = case_when(
        any(id == "bad" & gr == pmin(count)) ~ "red flag",
        TRUE ~ "green flag"
    ))
## A tibble: 8 x 4
## Groups:   gr [2]
#  gr    id    count colour
#  <fct> <fct> <dbl> <chr>
#1 1     Good   175. red flag
#2 1     bad      1. red flag
#3 1     ugly    13. red flag
#4 1     dirty   11. red flag
#5 2     Good    10. green flag
#6 2     bad    165. green flag
#7 2     ugly    10. green flag
#8 2     dirty    2. green flag

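Explanation: we group by gr and then use case_when to mark every entry of a group as "red flag" if, anywhere in that group, a row has both id == "bad" and gr == pmin(count); all other groups get "green flag".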

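Note that we need the element-wise pmin here (rather than min, which returns a single value). Called with one argument, pmin(count) returns count unchanged, so each row's count is compared with gr; min(count) would compare gr only with the group's smallest count.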
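To make the pmin/min distinction concrete, here is a small base R sketch. It assumes only the count and id values from the question's data and writes the group numbers as plain numbers 1 and 2 rather than as the factor gr; the variable names are illustrative.

```r
# count values for group 1 and group 2, copied from the question's data
count_g1 <- c(175, 1, 13, 11)
count_g2 <- c(10, 165, 10, 2)
id <- c("Good", "bad", "ugly", "dirty")

# With a single argument, pmin() keeps one value per row ...
pmin(count_g2)
# ... whereas min() collapses the group to one value
min(count_g2)

# Row-wise test: is there a "bad" row whose count equals the group number?
red_g1 <- any(id == "bad" & 1 == pmin(count_g1))   # group 1
red_g2 <- any(id == "bad" & 2 == pmin(count_g2))   # group 2

# Same test with min(): the group number is compared with the group's
# smallest count, which is recycled across every row
red_g2_min <- any(id == "bad" & 2 == min(count_g2))

c(red_g1 = red_g1, red_g2 = red_g2, red_g2_min = red_g2_min)
```

With min(), group 2 would also be flagged, because its smallest count happens to be 2; that is not the expected output shown in the question.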

Update

Using a user-defined function:

sub_id <- function(gr, count, id) {
    ifelse(any(gr == pmin(count) & id == "bad"), "red flag", "green flag")
}
test %>%
    group_by(gr) %>%
    mutate(colour = sub_id(gr, count, id))
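Because any() returns a single TRUE or FALSE per group, a plain if/else should also work in place of ifelse, without the "condition has length > 1" warning from the question. The sketch below is one possible variant, not part of the original answer: sub_id_scalar is a hypothetical name, the factor gr is converted to a number explicitly, and the per-group step uses base R split/sapply so that it runs without dplyr.

```r
test <- data.frame(gr = gl(2, 4),
                   id = rep(c("Good", "bad", "ugly", "dirty"), 2),
                   count = c(175, 1, 13, 11, 10, 165, 10, 2))

# Hypothetical variant of sub_id: the condition is reduced to one
# TRUE/FALSE with any(), so if/else receives a length-one condition
sub_id_scalar <- function(gr, count, id) {
    gr_num <- as.numeric(as.character(gr))  # factor level "1" -> 1
    if (any(gr_num == count & id == "bad")) "red flag" else "green flag"
}

# One flag per group: split the rows by gr and evaluate each piece
flags <- sapply(split(test, test$gr),
                function(d) sub_id_scalar(d$gr, d$count, d$id))

# Copy each group's flag back onto its rows
test$colour <- unname(flags[as.character(test$gr)])
test
```

The same function can be passed to mutate after group_by(gr), exactly as sub_id is used above.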