Is there a way to specify that dplyr::distinct should use all column names without resorting to non-standard evaluation?
df <- data.frame(a=c(1,1,2),b=c(1,1,3))
df %>% distinct(a,b,.keep_all=FALSE) # behavior I'd like to replicate
vs.
df %>% distinct(everything(),.keep_all=FALSE) # with syntax of this form
Answer 0 (score: 2)
You can use the following code to get the distinct rows across all columns.
library(dplyr)
library(data.table)
df <- tibble(  # tibble() replaces the deprecated data_frame()
id = c(1, 1, 2, 2, 3, 3),
value = c("a", "a", "b", "c", "d", "d")
)
# A tibble: 6 × 2
# id value
# <dbl> <chr>
# 1 1 a
# 2 1 a
# 3 2 b
# 4 2 c
# 5 3 d
# 6 3 d
# distinct with Non-Standard Evaluation
df %>% distinct()
# distinct with Standard Evaluation (the underscore verbs such as distinct_() are deprecated in current dplyr)
df %>% distinct_()
# Also, you can set the column names with .dots.
df %>% distinct_(.dots = names(.))
# A tibble: 4 × 2
# id value
# <dbl> <chr>
# 1 1 a
# 2 2 b
# 3 2 c
# 4 3 d
# distinct with data.table
unique(as.data.table(df))
# id value
# 1: 1 a
# 2: 2 b
# 3: 2 c
# 4: 3 d
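Because `distinct_()` belongs to the older underscore-verb interface, the same idea can be written with a character vector of column names in current dplyr. The following is only a minimal sketch, assuming dplyr 1.0.0 or later (for `across()`) and using `cols` as an illustrative variable name that does not appear in the original answer:

```r
library(dplyr)

df <- tibble(
  id = c(1, 1, 2, 2, 3, 3),
  value = c("a", "a", "b", "c", "d", "d")
)

# Column names held as plain strings (hypothetical variable name)
cols <- names(df)

# all_of() selects exactly the columns named in the character vector,
# so no bare column names have to be typed
result <- df %>% distinct(across(all_of(cols)))

# With no arguments, distinct() already considers every column
result_all <- df %>% distinct()
```

In dplyr 1.1.0 and later, `pick(all_of(cols))` can play the same role as `across(all_of(cols))` here.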
Answer 1 (score: 0)
As of dplyr version 1.0.5, the following two options produce the same output.
df <- data.frame(a = c(1, 1, 2),
b = c(1, 1, 3))
df %>% distinct(a, b)
#   a b
# 1 1 1
# 2 2 3
df %>% distinct(across(everything()))
#   a b
# 1 1 1
# 2 2 3
There is no reason to specify the .keep_all = FALSE argument, since that is the default.
You can also use tibble() instead of data.frame().