Adding a column with mutate using nested ifelse conditions

Time: 2019-08-05 07:44:00

Tags: r dplyr mutate

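I am trying to create a new column, source2, in my long-format dataset using mutate from dplyr, based on multiple conditions.

To get the new value, I look at the number of factor levels of distance within each group. If there is only one level, that level should be used. However, if the group contains a combination of distance values, the new value should be set accordingly: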

distance == "b20" & distance == "b5"  =>  "buffer",
distance == "PA" & distance == "b5"   =>  "pa_buff20",
distance == "PA" & distance == "b20"  =>  "pa_buff500"

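I thought nested ifelse statements should do this, but my combinations do not seem to work. Is it because I check the number of levels first? (That first ifelse on the number of levels per group is also why I probably cannot simply use case_when.)

My dummy dataset: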

# Dummy data: one row per observation, with its distance class and site
df<- data.frame(year = c(1,1,2,1,5,5,10),
                distance = c("b20", "b5", "b20", "b20", "PA", "b5", "PA"),
                site     = c("a", "a", "b", "c", "d", "d", "e"))


# Create new columns based on number of levels in `distance`
df %>% 
  group_by(site) %>% 
  mutate(source = ifelse(n_distinct(distance) == 1,   # create source column based of number of factors
                       as.character(distance[1]), 'unclear')) %>% 
  mutate(source2 = ifelse(n_distinct(distance) == 1,   # create source column based of number of factors
                          as.character(distance[1]), 
                          ifelse(distance == "b20" & distance == "b5"), "buffer",
                          ifelse(distance == "PA" & distance == "b5"), "pa_buff20",
                          ifelse(distance == "PA" & distance == "b20"), "pa_buff500")) %>% 
  print()

I get this error: Error in ifelse(n_distinct(distance) == 1, as.character(distance[1]), : unused arguments ("buffer", ifelse(distance == "PA" & distance == "b5"), "pa_buff20", ifelse(distance == "PA" & distance == "b20"), "pa_buff500")

How can I correct this ifelse statement?

Expected output:

   year distance site  source  source2
  <dbl> <fct>    <fct> <chr>   <chr> 
1     1 b20      a     unclear buffer
2     1 b5       a     unclear buffer
3     2 b20      b     b20     b20
4     1 b20      c     b20     b20
5     5 PA       d     unclear pa_buff20
6     5 b5       d     unclear pa_buff20
7    10 PA       e     PA      PA

1 answer:

Answer 0 (score: 4)

We can use case_when instead of nested ifelse:

library(dplyr)

df %>%
  mutate(distance = as.character(distance)) %>%
  group_by(site) %>%
  mutate(source2 = case_when(all(c("b20", "b5") %in% distance) ~ "buffer", 
                             all(c("PA", "b5") %in% distance) ~ "pa_buff20",
                             all(c("PA", "b20") %in% distance) ~ "pa_buff500",
                             n_distinct(distance) == 1 ~ distance, 
                             TRUE ~ NA_character_))


#   year distance site  source2  
#  <dbl> <chr>    <fct> <chr>    
#1     1 b20      a     buffer   
#2     1 b5       a     buffer   
#3     2 b20      b     b20      
#4     1 b20      c     b20      
#5     5 PA       d     pa_buff20
#6     5 b5       d     pa_buff20
#7    10 PA       e     PA     
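
One likely reason the combinations in the question never match is that distance == "PA" & distance == "b5" is evaluated element by element, and a single value cannot equal both strings at once. The group-level test used above asks instead whether both values occur somewhere in the group. A minimal base R sketch of the difference, using a hypothetical two-element vector that stands in for the distance values of site "d":

```r
# Hypothetical stand-in for the distance values of one group (site "d")
distance <- c("PA", "b5")

# Element-wise test: each element is compared against both strings,
# so no element can satisfy both comparisons at the same time
elementwise <- distance == "PA" & distance == "b5"

# Group-level test: do both values occur anywhere in the vector?
groupwise <- all(c("PA", "b5") %in% distance)

print(elementwise)  # FALSE FALSE
print(groupwise)    # TRUE
```

Because the group-level test returns a single TRUE or FALSE per group, the resulting label is recycled to every row of that group.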

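As mentioned, case_when is an alternative to multiple nested ifelse statements, where the left-hand side (LHS) of each formula is the condition to check and the right-hand side (RHS) is the value to return. The conditions are evaluated sequentially, so the first one that matches determines the result. If no condition matches, NA is returned by default; here that is stated explicitly with the final TRUE condition.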
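
If a nested ifelse is still preferred, the original attempt can be repaired as well. The "unused arguments" error arises because each inner ifelse( is closed right after its test, so the remaining values are passed to the outer call instead. The following is only a sketch under two assumptions: the group-level all(... %in% ...) tests from above replace the element-wise comparisons, and distance is stored as character (stringsAsFactors = FALSE is added to the dummy data for that purpose).

```r
library(dplyr)

# Dummy data from the question, with distance kept as character (assumption)
df <- data.frame(year = c(1, 1, 2, 1, 5, 5, 10),
                 distance = c("b20", "b5", "b20", "b20", "PA", "b5", "PA"),
                 site = c("a", "a", "b", "c", "d", "d", "e"),
                 stringsAsFactors = FALSE)

res <- df %>%
  group_by(site) %>%
  # Every test below is a single TRUE/FALSE per group, so each ifelse()
  # returns one value, which mutate() recycles to all rows of the group
  mutate(source2 = ifelse(n_distinct(distance) == 1,
                          distance[1],
                   ifelse(all(c("b20", "b5") %in% distance), "buffer",
                   ifelse(all(c("PA", "b5") %in% distance), "pa_buff20",
                   ifelse(all(c("PA", "b20") %in% distance), "pa_buff500",
                          NA_character_)))))

print(res)
```

For the dummy data this gives the same source2 values as the case_when version. Note that, as with case_when, the first matching test wins, so a group containing all three distance values would be labelled "buffer".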