How to get the unique elements for a given id (foo) from multiple columns in R
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape")
)
df
#> foo c1 c2 c3
#> 1: x1 apple banana orange
#> 2: x1 orange apple apple
#> 3: y1 banana pear banana
#> 4: y1 apple grape grape
Desired output:
#> x1 apple banana orange
#> y1 apple grape pear banana
Answer 0 (score: 3)
Two approaches:
by(df[2:4], df$foo, function(a) unique(unlist(a, use.names=FALSE)))
# df$foo: x1
# [1] apple orange banana
# Levels: apple banana orange grape pear
# ------------------------------------------------------------
# df$foo: y1
# [1] banana apple pear grape
# Levels: apple banana orange grape pear
Or, with dplyr and tidyr:
library(dplyr)
library(tidyr)
df %>% tidyr::gather(k, v, -foo) %>% distinct(foo, v) %>% arrange(foo, v)
# Warning: attributes are not identical across measure variables;
# they will be dropped
# foo v
# 1 x1 apple
# 2 x1 banana
# 3 x1 orange
# 4 y1 apple
# 5 y1 banana
# 6 y1 grape
# 7 y1 pear
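On current tidyr releases, gather() is superseded by pivot_longer(). A minimal sketch of the same pipeline, assuming tidyr >= 1.0.0 and character (not factor) columns, so the attributes warning should not appear. The column names k and v are carried over from the call above, and out is a name made up for illustration:

```r
library(dplyr)
library(tidyr)

# Same data as in the question, with strings kept as character vectors
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape"),
  stringsAsFactors = FALSE
)

out <- df %>%
  pivot_longer(-foo, names_to = "k", values_to = "v") %>%  # wide to long
  distinct(foo, v) %>%                                     # one row per (id, value) pair
  arrange(foo, v)                                          # sort within each id

out
```

The result has one row per distinct (foo, v) pair, which is the same long layout as the gather() version.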
Answer 1 (score: 2)
Here is another base R option, using split:
lapply(split(as.matrix(df[-1]), df$foo), unique)
#$x1
#[1] "apple" "orange" "banana"
#$y1
#[1] "banana" "apple" "pear" "grape"
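To get closer to the one-line-per-id layout asked for in the question, the list returned by split/lapply can be collapsed with paste. A minimal sketch, assuming the columns are character vectors. The names vals and res are made up for illustration:

```r
# Same data as in the question, with strings kept as character vectors
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape"),
  stringsAsFactors = FALSE
)

# as.matrix() flattens the value columns column by column; split() recycles
# df$foo along that vector, so each value lands in the group of its own row
vals <- lapply(split(as.matrix(df[-1]), df$foo), unique)

# Collapse each group's unique values into a single space-separated string
res <- sapply(vals, paste, collapse = " ")
res
```

Note that the recycling of df$foo lines up only because the matrix is stored column by column and every column has one entry per row of df.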
Or using the tidyverse:
library(tidyverse)
df %>%
  group_by(foo) %>%
  nest(.key = out) %>%
  mutate(out = map(out, ~ sort(unique(unlist(.)))))
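The .key argument reflects the nest() interface of the tidyr version that was current when this answer was written. A rough equivalent, assuming tidyr >= 1.0.0, where the nested column is named directly in the nest() call. Here res is a hypothetical name and the columns are assumed to be character vectors:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same data as in the question, with strings kept as character vectors
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape"),
  stringsAsFactors = FALSE
)

res <- df %>%
  nest(out = -foo) %>%  # nest every column except foo into a list-column named out
  mutate(out = map(out, ~ sort(unique(unlist(.x, use.names = FALSE)))))

res$out  # a list with one sorted character vector per id
```

This keeps one row per id, with the unique values stored in the list-column out.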