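I want to compute the Tanimoto coefficient (the size of the intersection of two gene sets divided by the size of their union) for pairs of diseases. Sample data for a single disease pair is shown below, where Disease1 is NK cell defects and Disease2 is adenylosuccinate lyase deficiency.
Set 1 is Disease1 (NK cell defects), which contains all the genes in the Gene1 column.
Set 2 is Disease2 (adenylosuccinate lyase deficiency), which contains all the genes in the Gene2 column.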
| Gene1 | Gene2 | Disease1 | Disease2 |
|--------|---------|-----------------|-----------------------------------|
| IMPDH1 | XDH | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | ADA | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | NPR1 | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | IMPDH1 | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | IMPDH2 | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | PPP3R2 | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | RRM1 | NK cell defects | Adenylosuccinate lyase deficiency |
| NPR1 | POLA1 | NK cell defects | Adenylosuccinate lyase deficiency |
| PPP3R2 | ITGAL | NK cell defects | Adenylosuccinate lyase deficiency |
| ITGAL | NPR1 | NK cell defects | Adenylosuccinate lyase deficiency |
| CASP3 | NPR1 | NK cell defects | Adenylosuccinate lyase deficiency |
| PTK2B | NPR1 | NK cell defects | Adenylosuccinate lyase deficiency |
| TNF | GUCY1A2 | NK cell defects | Adenylosuccinate lyase deficiency |
| PTK2B | GUCY1A2 | NK cell defects | Adenylosuccinate lyase deficiency |
Any suggestions on how to do this in MySQL or R?
Thanks,
Rohan
Answer 0 (score: 0)
Learn to search:
install.packages("sos")
library("sos")
findFn("Tanimoto")
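getGeneSim {GOSim} R Documentation
Compute functional similarity for genes
Description
Calculate the pairwise functional similarities for a list of genes using different strategies.
Usage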
getGeneSim(genelist1, genelist2=NULL, similarity="funSimMax", similarityTerm="relevance",
normalization="Tanimoto", method="sqrt", avg=(similarity=="OA"), verbose=FALSE)
Answer 1 (score: 0)
Random input data:
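library(data.table)
# Toy data: one disease pair (A, B) with gene IDs 1-5 and 3-7
DT = data.table(
  G1 = 1:5,
  G2 = 3:7,
  D1 = "A",
  D2 = "B"
)
# Per disease pair: intersection size, union size, and their ratio
# (Tanimoto = intersection / union, so it always lies between 0 and 1)
DT[,
  list(
    intersectG = length(intersect(G1, G2)),
    unionG     = length(union(G1, G2)),
    Tanimoto   = length(intersect(G1, G2)) / length(union(G1, G2))
  ),
  by = c('D1', 'D2')]
Output:
   D1 D2 intersectG unionG  Tanimoto
1:  A  B          3      7 0.4285714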
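For the rows posted in the question, the same calculation can be sketched in base R alone. This is only an illustration: the helper name `tanimoto` and the vectors `gene1` and `gene2` are my own, typed in by hand from the Gene1 and Gene2 columns, and I am assuming that repeated genes within a column should count once.

```r
# Tanimoto coefficient of two gene vectors.
# intersect() and union() both work on unique elements,
# so repeated genes in a column are counted only once.
tanimoto <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Gene1 and Gene2 columns copied from the sample rows in the question
gene1 <- c("IMPDH1", "PPP3R2", "PPP3R2", "PPP3R2", "PPP3R2", "PPP3R2", "PPP3R2",
           "NPR1", "PPP3R2", "ITGAL", "CASP3", "PTK2B", "TNF", "PTK2B")
gene2 <- c("XDH", "ADA", "NPR1", "IMPDH1", "IMPDH2", "PPP3R2", "RRM1",
           "POLA1", "ITGAL", "NPR1", "NPR1", "NPR1", "GUCY1A2", "GUCY1A2")

shared <- intersect(gene1, gene2)  # genes present in both columns
pooled <- union(gene1, gene2)      # distinct genes across both columns
score  <- tanimoto(gene1, gene2)   # length(shared) / length(pooled)

print(shared)
print(score)
```

With these rows, four genes are shared (IMPDH1, PPP3R2, NPR1, ITGAL) out of 13 distinct genes, so the coefficient is 4/13, roughly 0.31. For many disease pairs, the grouped data.table call above is likely the more convenient route.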