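Create a new column based on a condition from another column per group, using tidy evaluation

Time: 2019-05-11 03:36:14

Tags: r dplyr mutate tidyeval

Similar to this question, but I want to use tidy evaluation.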

df = data.frame(group = c(1,1,1,2,2,2,3,3,3), 
                date  = c(1,2,3,4,5,6,7,8,9),
                speed = c(3,4,3,4,5,6,6,4,9))
> df
  group date speed
1     1    1     3
2     1    2     4
3     1    3     3
4     2    4     4
5     2    5     5
6     2    6     6
7     3    7     6
8     3    8     4
9     3    9     9

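The task is to create a new column (newValue) whose value, within each group, equals the value of the date column in the row where speed == 4. Example: group 1 has a newValue of 2, because date[speed==4] = 2 in that group.

It works without tidy evaluation, but gives an error with tidy evaluation.

Expected output: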

    group date speed newValue
1     1    1     3        2
2     1    2     4        2
3     1    3     3        2
4     2    4     4        4
5     2    5     5        4
6     2    6     6        4
7     3    7     6        8
8     3    8     4        8
9     3    9     9        8
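
For reference, the version without tidy evaluation might look like the sketch below. The original snippet is not preserved here, so this is a reconstruction that assumes `date` is indexed directly inside a grouped `mutate()`; the name `result` is only illustrative.

```r
library(dplyr)

df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                 date  = c(1,2,3,4,5,6,7,8,9),
                 speed = c(3,4,3,4,5,6,6,4,9))

# Assumed reconstruction: within each group, take the date of the row
# where speed == 4; mutate() recycles that single value over the group.
result <- df %>%
  group_by(group) %>%
  mutate(newValue = date[speed == 4]) %>%
  ungroup()

print(result)
```

This relies on every group having exactly one row with speed == 4. A group with no match, or with several, would make `mutate()` fail because the result could not be recycled to the group size.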

Thanks.

2 Answers:

Answer 0: (score: 4)

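We can wrap the unquoting in parentheses. Otherwise, it may try to evaluate the whole expression (filter_var[speed==4L]) instead of filter_var alone.

library(rlang)
library(dplyr)

my_fu <- function(df, filter_var){
  # Turn the column name (a string) into a symbol
  filter_var <- sym(filter_var)
  df %>%
    group_by(group) %>%
    # Parentheses limit !! to filter_var, so the subsetting happens inside mutate()
    mutate(newValue = (!!filter_var)[speed == 4L])
}

my_fu(df, "date")
# A tibble: 9 x 4
# Groups:   group [3]
#  group  date speed newValue
#  <dbl> <dbl> <dbl>    <dbl>
#1     1     1     3        2
#2     1     2     4        2
#3     1     3     3        2
#4     2     4     4        4
#5     2     5     5        4
#6     2     6     6        4
#7     3     7     6        8
#8     3     8     4        8
#9     3     9     9        8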
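
The precedence issue described in this answer can be checked outside of dplyr with `rlang::expr()`. The sketch below is only an illustration: the names `with_parens` and `without_parens` are made up for it, and it assumes no object called `speed` exists in the calling environment.

```r
library(rlang)

filter_var <- sym("date")

# Parenthesized: only `filter_var` is unquoted; the `[speed == 4L]`
# subsetting stays unevaluated inside the captured expression.
with_parens <- expr((!!filter_var)[speed == 4L])
print(with_parens)

# Unparenthesized: `[` binds tighter than `!!`, so rlang tries to evaluate
# `filter_var[speed == 4L]` immediately, outside any data frame, and fails.
without_parens <- tryCatch(
  expr(!!filter_var[speed == 4L]),
  error = function(e) e
)
print(inherits(without_parens, "error"))
```

In the parenthesized form the subsetting is deferred until `mutate()` evaluates it per group, where `speed` is available as a column.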


Answer 1: (score: 2)

You can also use sqldf, joining df with a filtered copy of itself (the rows where speed = 4):

library(sqldf)
df = data.frame(group = c(1,1,1,2,2,2,3,3,3), 
            date  = c(1,2,3,4,5,6,7,8,9),
            speed = c(3,4,3,4,5,6,6,4,9))

sqldf("SELECT df_origin.*, df4.`date` new_value FROM 
       df df_origin join (SELECT `group`, `date` FROM df WHERE speed = 4) df4 
                    on (df_origin.`group` = df4.`group`)")
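
To make the logic of the query easier to follow, here is a rough dplyr equivalent of the same self-join. It is not part of the original answer; `df4` mirrors the alias used in the SQL, and `joined` is an illustrative name.

```r
library(dplyr)

df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                 date  = c(1,2,3,4,5,6,7,8,9),
                 speed = c(3,4,3,4,5,6,6,4,9))

# Subquery: rows where speed == 4, keeping the group key and
# renaming date to new_value.
df4 <- df %>%
  filter(speed == 4) %>%
  select(group, new_value = date)

# Inner join back onto the full data on the group key.
joined <- inner_join(df, df4, by = "group")

print(joined)
```

As with the SQL version, a group without any speed == 4 row would drop out of the result because the join is an inner join.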