I have a list that looks like this:
$`264`
[1] "CHAMP1" "MAP1S" "PRRC1" "TUT1" "CDK12"
$`265`
[1] "TUT1" "PRRC1" "CHAMP1" "MAP1S"
$`266`
[1] "REPS1" "CHAMP1" "PRRC1" "TUT1" "MAP1S"
$`267`
[1] "G3BP1" "TUT1" "PRRC1" "CHAMP1" "MAP1S"
$`268`
[1] "TUT1" "CHAMP1" "PRRC1" "MAP1S"
$`269`
[1] "DDB1" "CHAMP1" "TUT1" "PRRC1" "MAP1S"
Is there a package or function for computing the similarity between the different components of the list?
Many thanks.
Answer 0 (score: 1)
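I don't know of a package for this, but the following implements your own metric (from your comment), which is the size of the intersection divided by the size of the union, i.e. the Jaccard index: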
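# Similarity between components x and y of `lst`
# (`lst` is assumed to hold the list shown in the question)
siml <- function(x, y) {
  length(intersect(lst[[x]], lst[[y]])) / length(union(lst[[x]], lst[[y]]))
}
# All index pairs (x, y), including x == y
z <- expand.grid(x = seq_along(lst), y = seq_along(lst))
result <- mapply(siml, z$x, z$y)
# Reshape the flat vector into a square similarity matrix
dim(result) <- c(length(lst), length(lst))
round(result, 3)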
#       [,1] [,2]  [,3]  [,4] [,5]  [,6]
# [1,] 1.000  0.8 0.667 0.667  0.8 0.667
# [2,] 0.800  1.0 0.800 0.800  1.0 0.800
# [3,] 0.667  0.8 1.000 0.667  0.8 0.667
# [4,] 0.667  0.8 0.667 1.000  0.8 0.667
# [5,] 0.800  1.0 0.800 0.800  1.0 0.800
# [6,] 0.667  0.8 0.667 0.667  0.8 1.000
Here is a (slightly) more efficient way to do the same thing:
result <- sapply(lst, function(x)
  sapply(lst, function(y, x) length(intersect(x, y)) / length(union(x, y)), x))
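As a further alternative (my own sketch, not part of the approaches above), the same intersection-over-union values can be computed with matrix algebra instead of nested loops. The idea is to build a 0/1 incidence matrix (one row per list component, one column per distinct item), get all pairwise intersection sizes from a single matrix product, and derive the union sizes from |A ∪ B| = |A| + |B| − |A ∩ B|. The names `items`, `inc`, `inter`, `sizes`, `uni` and `jaccard` are made up for this illustration, and the list is rebuilt from the question so the snippet runs on its own.

```r
# The list from the question, rebuilt so the example is self-contained.
lst <- list(
  `264` = c("CHAMP1", "MAP1S", "PRRC1", "TUT1", "CDK12"),
  `265` = c("TUT1", "PRRC1", "CHAMP1", "MAP1S"),
  `266` = c("REPS1", "CHAMP1", "PRRC1", "TUT1", "MAP1S"),
  `267` = c("G3BP1", "TUT1", "PRRC1", "CHAMP1", "MAP1S"),
  `268` = c("TUT1", "CHAMP1", "PRRC1", "MAP1S"),
  `269` = c("DDB1", "CHAMP1", "TUT1", "PRRC1", "MAP1S")
)

# 0/1 incidence matrix: rows are list components, columns are distinct items.
items <- unique(unlist(lst))
inc <- t(sapply(lst, function(v) as.numeric(items %in% v)))
colnames(inc) <- items

# inc %*% t(inc): entry [i, j] counts the items shared by components i and j.
inter <- tcrossprod(inc)

# The diagonal holds each component's own size; use it to get union sizes.
sizes <- diag(inter)
uni <- outer(sizes, sizes, "+") - inter

# Intersection over union for every pair of components.
jaccard <- inter / uni
round(jaccard, 3)
```

This should give the same values as the matrix shown earlier, with rows and columns labelled by the component names (`264` to `269`). For a list of six short vectors the speed difference is negligible; the matrix form is mainly worth considering when the list has many components. Note that an empty component would produce a 0/0 division (`NaN`), which this sketch does not handle.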