Get the unique elements for a given id from multiple columns in R

Time: 2018-10-13 23:35:03

Tags: r

How can I get the unique elements for a given id (foo) across multiple columns in R?

df <- data.frame(
  foo = c("x1","x1","y1","y1"),
  c1 = c("apple","orange","banana","apple"),
  c2 = c("banana","apple","pear","grape"),
  c3 = c("orange","apple","banana","grape")
)

df
#>   foo     c1     c2     c3
#> 1  x1  apple banana orange
#> 2  x1 orange  apple  apple
#> 3  y1 banana   pear banana
#> 4  y1  apple  grape  grape

Desired output:

#> x1 apple banana orange
#> y1 apple grape pear banana

2 Answers:

Answer 0 (score: 3)

Two approaches: base R's by(), or reshaping to long format with tidyr and de-duplicating with dplyr.

by(df[2:4], df$foo, function(a) unique(unlist(a, use.names=FALSE)))
# df$foo: x1
# [1] apple  orange banana
# Levels: apple banana orange grape pear
# ------------------------------------------------------------ 
# df$foo: y1
# [1] banana apple  pear   grape 
# Levels: apple banana orange grape pear

library(dplyr)
library(tidyr)
df %>% tidyr::gather(k, v, -foo) %>% distinct(foo, v) %>% arrange(foo, v)
# Warning: attributes are not identical across measure variables;
# they will be dropped
#   foo      v
# 1  x1  apple
# 2  x1 banana
# 3  x1 orange
# 4  y1  apple
# 5  y1 banana
# 6  y1  grape
# 7  y1   pear
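
In current tidyr releases, gather() is superseded by pivot_longer(). The following is a sketch of the same reshaping step with it, not part of the original answer. It assumes tidyr 1.0 or later and builds the data frame with stringsAsFactors = FALSE so the columns are plain character vectors; the names `k`, `v`, and `long` are only illustrative.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, with character (not factor) columns
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape"),
  stringsAsFactors = FALSE
)

long <- df %>%
  # Stack c1..c3 into one value column `v`; `k` records the source column
  pivot_longer(-foo, names_to = "k", values_to = "v") %>%
  # Keep one row per (foo, v) pair, then sort for readability
  distinct(foo, v) %>%
  arrange(foo, v)

long
```

Because the columns are character here, the attribute warning that gather() printed for factor columns should not come up.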

Answer 1 (score: 2)

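Here is another base R option, using split. The value columns are converted to a matrix, which split treats as a single vector in column-major order, recycling df$foo across the columns: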

lapply(split(as.matrix(df[-1]), df$foo), unique)
#$x1
#[1] "apple"  "orange" "banana"

#$y1
#[1] "banana" "apple"  "pear"   "grape" 
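
If one line per id is wanted, as in the question's desired output, the split result can be collapsed into strings. This is a minimal sketch on top of the split idea, not part of the original answer; `groups` and `res` are illustrative names, and the data frame is built with stringsAsFactors = FALSE as an assumption.

```r
# Same data as in the question, with character (not factor) columns
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape"),
  stringsAsFactors = FALSE
)

# The matrix is split as one long vector; df$foo is recycled over the columns
groups <- split(as.matrix(df[-1]), df$foo)

# Collapse each group's unique values into a single space-separated string
res <- vapply(groups, function(v) paste(unique(v), collapse = " "), character(1))
res
```

The values in each string appear in first-occurrence order, so the order within a line can differ from the listing in the question even though the set of elements is the same.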

Or using the tidyverse:

library(tidyverse)
df %>% 
   group_by(foo) %>%
   nest(.key = out) %>%
   mutate(out = map(out, ~ sort(unique(unlist(.)))))
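
The `.key` argument of nest() changed behaviour across tidyr versions, so the call above may warn or fail depending on the installed release. As a hedged alternative, here is a sketch that relies on the default list-column name `data`, which nest() produces on a grouped data frame in tidyr 1.0 and later. The version requirement and the name `res` are assumptions, not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same data as in the question, with character (not factor) columns
df <- data.frame(
  foo = c("x1", "x1", "y1", "y1"),
  c1 = c("apple", "orange", "banana", "apple"),
  c2 = c("banana", "apple", "pear", "grape"),
  c3 = c("orange", "apple", "banana", "grape"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(foo) %>%
  # Nest the non-grouping columns into a list-column named `data`
  nest() %>%
  # Flatten each nested tibble to a vector, de-duplicate, and sort
  mutate(out = map(data, ~ sort(unique(unlist(.x))))) %>%
  select(-data) %>%
  ungroup()

res$out
```

The result keeps one row per `foo`, with the unique values stored as a character vector in the list-column `out`.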