I have a nested if_else statement inside mutate. In my example data frame:
tmp_df2 <- data.frame(a = c(1,1,2), b = c(T,F,T), c = c(1,2,3))
  a     b c
1 1  TRUE 1
2 1 FALSE 2
3 2  TRUE 3
I want to group by a and then perform an operation depending on whether the group has one row or two. I thought this nested if_else would be enough:
tmp_df2 %>%
  group_by(a) %>%
  mutate(tmp_check = n() == 1) %>%
  mutate(d = if_else(tmp_check, # check for number of entries in group
                     0,
                     if_else(b, sum(c)/c[b == T], sum(c)/c[which(b != T)])
                     )
         )
But this raises an error:
Error in eval(substitute(expr), envir, enclos) :
`false` is length 2 not 1 or 1.
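With the example set up this way, when the condition of the first if_else (n() == 1) evaluates to TRUE, a single element is returned, but when it evaluates to FALSE, a vector of two elements is returned, which I assume is what causes the error. Logically, though, the statement seems sound to me.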
The following two statements produce the (desired) result:
> tmp_df2 %>%
+ group_by(a) %>%
+ mutate(d = ifelse(rep(n() == 1, n()), # avoid undesired recycling
+ 0,
+ if_else(b, sum(c)/c[b == T], sum(c)/c[which(b != T)])
+ )
+ )
Source: local data frame [3 x 4]
Groups: a [2]
a b c d
<dbl> <lgl> <dbl> <dbl>
1 1 TRUE 1 3.0
2 1 FALSE 2 1.5
3 2 TRUE 3 0.0
or simply filter so that only the groups with two rows remain:
> tmp_df2 %>%
+ group_by(a) %>%
+ filter(n() == 2) %>%
+ mutate(d = if_else(b, sum(c)/c[b == T], sum(c)/c[which(b != T)]))
Source: local data frame [2 x 4]
Groups: a [1]
a b c d
<dbl> <lgl> <dbl> <dbl>
1 1 TRUE 1 3.0
2 1 FALSE 2 1.5
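I have the following questions.
How does dplyr know that the second output is invalid when, because of the logical condition, it should not be evaluated at all?
How do I get the desired behaviour in dplyr (without using ifelse)?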
Edit: Either drop the temporary tmp_check column and use an if ... else construct, or use the following code, which works but produces warnings:
library(dplyr)
tmp_df2 %>%
  group_by(a) %>%
  mutate(tmp_check = n() == 1) %>%
  mutate(d = if (tmp_check) # check for number of entries in group
               0 else
               if_else(b, sum(c)/c[b == T], sum(c)/c[which(b != T)])
         )
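The warnings presumably come from if receiving the whole tmp_check vector for a two-row group, while if expects a single TRUE or FALSE. Below is a minimal base R sketch of that behaviour; cond is a made-up stand-in for the tmp_check values of a two-row group, not something from the code above. Note that R 4.2.0 and later turn this warning into an error.

```r
# Hypothetical stand-in for tmp_check in a group with two rows
cond <- c(FALSE, FALSE)

# Older R versions warn and use only the first element;
# R >= 4.2.0 raises an error instead.
res <- tryCatch(
  if (cond) "one row" else "two rows",
  warning = function(w) "warning",
  error   = function(e) "error"
)
res

# Testing a single value avoids the problem on any R version
if (cond[1]) "one row" else "two rows"
```

This is why testing n() == 1 directly, which is always a single value per group, is the safer form of the condition.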
Answer 0 (score: 4)
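dplyr "knows" because if_else checks the values that are to be used for the true and false cases. Both are ordinary vector arguments, so both are evaluated in full regardless of the condition. This is stated in ?if_else, and the source shows how it is done: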
if_else
# function (condition, true, false, missing = NULL)
# {
# if (!is.logical(condition)) {
# stop("`condition` must be logical", call. = FALSE)
# }
# out <- true[rep(NA_integer_, length(condition))]
# out <- replace_with(out, condition & !is.na(condition), true,
# "`true`")
# out <- replace_with(out, !condition & !is.na(condition),
# false, "`false`")
# out <- replace_with(out, is.na(condition), missing, "`missing`")
# out
# }
# <environment: namespace:dplyr>
Looking at the source of replace_with:
dplyr:::replace_with
# function (x, i, val, name)
# {
# if (is.null(val)) {
# return(x)
# }
# check_length(val, x, name)
# check_type(val, x, name)
# check_class(val, x, name)
# if (length(val) == 1L) {
# x[i] <- val
# }
# else {
# x[i] <- val[i]
# }
# x
# }
# <environment: namespace:dplyr>
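So the lengths of the values for the true and false cases are checked against the length of the condition, whatever the condition evaluates to.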
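To see the length check in isolation, here is a small sketch, assuming dplyr is installed. The exact wording of the error differs between dplyr versions, so the example only records that an error occurred. For contrast, base ifelse does not perform this check.

```r
library(dplyr)

# The condition has length 1 but `false` has length 2, so if_else()
# rejects the call even though the condition is TRUE and `false`
# would never be used.
res_if_else <- tryCatch(
  if_else(TRUE, 0, c(1, 2)),
  error = function(e) "length check failed"
)
res_if_else

# Base ifelse() has no such check: the result takes the shape of the
# test, and `no` is never consulted when every test value is TRUE.
res_ifelse <- ifelse(TRUE, 0, c(1, 2))
res_ifelse
```

That difference would explain why the ifelse version in the question runs while the if_else version does not.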
To get the desired behaviour, you can use if ... else, as another SO user suggested on your previous question:
tmp_df2 %>%
  group_by(a) %>%
  mutate(d = if (n() == 1) 0 else if_else(b, sum(c)/c[b == T], sum(c)/c[which(b != T)])
         )
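For completeness, here is a self-contained sketch of the same approach that can be run as is, assuming a reasonably recent dplyr. It rebuilds the example data frame from the question and writes c[b] and c[!b] in place of c[b == T] and c[which(b != T)]; that shorthand is my own substitution and is equivalent here because b contains no NA values.

```r
library(dplyr)

# Example data from the question
tmp_df2 <- data.frame(a = c(1, 1, 2), b = c(TRUE, FALSE, TRUE), c = c(1, 2, 3))

res <- tmp_df2 %>%
  group_by(a) %>%
  # n() == 1 is a single value per group, so a plain if ... else works:
  # one-row groups get 0, two-row groups get the vectorised if_else() result
  mutate(d = if (n() == 1) 0 else if_else(b, sum(c) / c[b], sum(c) / c[!b])) %>%
  ungroup()

res$d
```

Because if ... else only evaluates the branch it takes, the inner if_else is never run for the one-row group, which is the behaviour the question was after.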