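I am trying to create a new column, source2, in my long-format dataset using dplyr's mutate, based on multiple conditions.
To get the new value, I look at the number of distinct factor levels of distance within each group. If there is only one level, use that level. If there is a combination of distance values, set the value as follows: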
distance == "b20" & distance == "b5" => "buffer",
distance == "PA" & distance == "b5" => "pa_buff20",
distance == "PA" & distance == "b20" => "pa_buff500"
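I thought nested ifelse statements should do this, but my combinations do not seem to work. Is it because I check the number of levels first?
(That first ifelse on the number of levels is also why I probably cannot simply use case_when.)
My dummy dataset: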
# how to find a year when a lag value overpass the certain threshold
df<- data.frame(year = c(1,1,2,1,5,5,10),
distance = c("b20", "b5", "b20", "b20", "PA", "b5", "PA"),
site = c("a", "a", "b", "c", "d", "d", "e"))
# Create new columns based on number of levels in `distance`
df %>%
group_by(site) %>%
mutate(source = ifelse(n_distinct(distance) == 1, # create source column based of number of factors
as.character(distance[1]), 'unclear')) %>%
mutate(source2 = ifelse(n_distinct(distance) == 1, # create source column based of number of factors
as.character(distance[1]),
ifelse(distance == "b20" & distance == "b5"), "buffer",
ifelse(distance == "PA" & distance == "b5"), "pa_buff20",
ifelse(distance == "PA" & distance == "b20"), "pa_buff500")) %>%
print()
I get:
Error in ifelse(n_distinct(distance) == 1, as.character(distance[1]), :
unused arguments ("buffer", ifelse(distance == "PA" & distance == "b5"), "pa_buff20", ifelse(distance == "PA" & distance == "b20"), "pa_buff500")
How can I correct this ifelse statement?
Expected output:
year distance site source source2
<dbl> <fct> <fct> <chr> <chr>
1 1 b20 a unclear buffer
2 1 b5 a unclear buffer
3 2 b20 b b20 b20
4 1 b20 c b20 b20
5 5 PA d unclear pa_buff20
6 5 b5 d unclear pa_buff20
7 10 PA e PA PA
Answer 0 (score: 4)
We can use case_when instead of nested ifelse.
library(dplyr)
df %>%
mutate(distance = as.character(distance)) %>%
group_by(site) %>%
mutate(source2 = case_when(all(c("b20", "b5") %in% distance) ~ "buffer",
all(c("PA", "b5") %in% distance) ~ "pa_buff20",
all(c("PA", "b20") %in% distance) ~ "pa_buff500",
n_distinct(distance) == 1 ~ distance,
TRUE ~ NA_character_))
# year distance site source2
# <dbl> <chr> <fct> <chr>
#1 1 b20 a buffer
#2 1 b5 a buffer
#3 2 b20 b b20
#4 1 b20 c b20
#5 5 PA d pa_buff20
#6 5 b5 d pa_buff20
#7 10 PA e PA
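The original attempt fails for two reasons. First, each inner ifelse( is closed right after its test, so "buffer" and everything after it are passed to the outer call as extra arguments, which is what the "unused arguments" error reports. Second, a test such as distance == "b20" & distance == "b5" is evaluated element by element, and no single value can equal both strings. A minimal base-R sketch of the second point, where x is a hypothetical stand-in for the distance values of one site:

```r
# Hypothetical stand-in for the `distance` values of one site (site "a").
x <- c("b20", "b5")

# Element-wise test: no single element equals both "b20" and "b5",
# so every position is FALSE.
elementwise <- x == "b20" & x == "b5"
print(elementwise)

# Group-level test: are both values present somewhere in the group?
both_present <- all(c("b20", "b5") %in% x)
print(both_present)
```

This is why the answer tests membership with %in% and all() rather than combining equality tests with &.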
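As mentioned, case_when is an alternative to multiple nested ifelse statements, where the LHS is the condition to check and the RHS is the value to return. The conditions are evaluated sequentially, so the first one that matches determines the result. If no condition matches, NA is returned by default; here that is stated explicitly with the TRUE condition.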
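Because every condition here describes a whole group rather than a single row, the same first-match logic can also be sketched as a plain if / else if chain that returns one label per group. The helper name classify_distance is hypothetical and not part of the original answer, and ave() from base R is used only to keep the sketch free of package dependencies:

```r
# Hypothetical helper (an assumption, not from the original answer):
# returns a single label for one group of distance values,
# checking the conditions in the same order as the case_when call.
classify_distance <- function(d) {
  d <- as.character(d)
  if (all(c("b20", "b5") %in% d)) {
    "buffer"
  } else if (all(c("PA", "b5") %in% d)) {
    "pa_buff20"
  } else if (all(c("PA", "b20") %in% d)) {
    "pa_buff500"
  } else if (length(unique(d)) == 1) {
    d[1]
  } else {
    NA_character_
  }
}

df <- data.frame(year = c(1, 1, 2, 1, 5, 5, 10),
                 distance = c("b20", "b5", "b20", "b20", "PA", "b5", "PA"),
                 site = c("a", "a", "b", "c", "d", "d", "e"))

# ave() applies the helper to each site and recycles the single
# returned label across all rows of that site.
df$source2 <- ave(as.character(df$distance), df$site, FUN = classify_distance)
print(df)
```

Inside a dplyr pipeline, the same helper could presumably be called as mutate(source2 = classify_distance(distance)) after group_by(site), since mutate recycles a length-one result within each group.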