How to extract unique levels from 2 columns of a data frame in R

Time: 2012-07-22 13:58:46

Tags: r unique dataframe

I have a data.frame:

df <- data.frame("Site.1" = c("A", "B", "C"),
                 "Site.2" = c("D", "B", "B"),
                 "Tsim" = c(2, 4, 7),
                 "Jaccard" = c(5, 7, 1),
                 stringsAsFactors = TRUE)  # the default before R 4.0.0; needed for the factor output below

#    Site.1 Site.2 Tsim Jaccard
#  1      A      D    2       5
#  2      B      B    4       7
#  3      C      B    7       1

I can get the unique levels of each column (here for the first two rows) using:

top.x<-unique(df[1:2,c("Site.1")])
top.x

# [1] A B
# Levels: A B C

top.y<-unique(df[1:2,c("Site.2")])
top.y

# [1] D B
# Levels: B D

How can I get the unique levels across both columns and convert them to a vector, i.e.:

v <- c("A", "B", "D")
v
# [1] "A" "B" "D"

3 Answers:

Answer 0: (score: 2)

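# Select only the two site columns; unlisting all four columns would mix
# factors with numerics and return integer codes instead of levels.
top.xy <- unique(unlist(df[1:2, 1:2]))
top.xy

# [1] A B D
# Levels: A B C D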
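
The result above is still a factor, while the question asks for a plain character vector. A minimal sketch of one way to get there, assuming the site columns are factors (`stringsAsFactors = TRUE` is set explicitly here as an assumption, since it stopped being the default in R 4.0.0):

```r
# Rebuild the example data; stringsAsFactors = TRUE is an assumption that
# mirrors the factor output shown in the question.
df <- data.frame(Site.1 = c("A", "B", "C"),
                 Site.2 = c("D", "B", "B"),
                 Tsim = c(2, 4, 7),
                 Jaccard = c(5, 7, 1),
                 stringsAsFactors = TRUE)

# Keep only the two site columns, flatten them into one factor,
# deduplicate, then convert the factor to character.
v <- as.character(unique(unlist(df[1:2, c("Site.1", "Site.2")])))
v
# [1] "A" "B" "D"
```

Selecting the columns by name rather than by position keeps the call from silently picking up the numeric columns if the column order changes.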

Answer 1: (score: 2)

Try union:

union(top.x, top.y)
# [1] "A" "B" "D"
union(unique(df[1:2, c("Site.1")]), 
      unique(df[1:2, c("Site.2")]))
# [1] "A" "B" "D"
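
If more than two columns are involved, the same idea can probably be extended by folding `union` over the columns with `Reduce`. This is only a sketch; `cols` is a hypothetical name for the columns of interest and is not part of the original question:

```r
df <- data.frame(Site.1 = c("A", "B", "C"),
                 Site.2 = c("D", "B", "B"),
                 Tsim = c(2, 4, 7),
                 Jaccard = c(5, 7, 1))

cols <- c("Site.1", "Site.2")  # hypothetical: the columns to combine

# Convert each column to character first, so the result does not depend on
# whether the columns are factors, then fold union() across the list.
v <- Reduce(union, lapply(df[1:2, cols], as.character))
v
# [1] "A" "B" "D"
```

Adding another column name to `cols` is enough to include it in the result.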

Answer 2: (score: 0)

You can get the unique levels of the first two columns:

de<- apply(df[,1:2],2,unique)  
de

# $Site.1  
# [1] "A" "B" "C"  

# $Site.2  
# [1] "D" "B"  

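Then you can get the symmetric difference of the two sets (the values that appear in exactly one of the two columns):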

union(setdiff(de$Site.1,de$Site.2), setdiff(de$Site.2,de$Site.1))  
# [1] "A" "C" "D"

If you are only interested in the first two rows (as in your example):

de<- apply(df[1:2,1:2],2,unique)  
de  
#      Site.1 Site.2  
# [1,] "A"    "D"   
# [2,] "B"    "B"  
union(de[,1],de[,2])  
# [1] "A" "B" "D"
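
Note that `apply` returned a list in the first case and a matrix in the second, because the per-column results happened to have equal length for the first two rows. A sketch of a variant that always returns a list, using `lapply` instead (a swapped-in alternative, not what this answer originally proposed):

```r
df <- data.frame(Site.1 = c("A", "B", "C"),
                 Site.2 = c("D", "B", "B"),
                 Tsim = c(2, 4, 7),
                 Jaccard = c(5, 7, 1))

# lapply() iterates over the columns of the data frame and always returns a
# list, so de$Site.1 works regardless of how many unique values each column has.
de <- lapply(df[1:2, c("Site.1", "Site.2")],
             function(x) unique(as.character(x)))

v <- union(de$Site.1, de$Site.2)
v
# [1] "A" "B" "D"
```

Dropping the `1:2` row index gives the same structure for the full data frame, so the follow-up `union` call does not need to change.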