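I have a set of genes. For each gene I compute the frequencies of a set of motifs (substrings), and each motif has a weight that corresponds to the correlation between that motif's frequency and gene abundance or expression.
I want to compute the similarity between these genes based on the weighted motif frequencies. In other words, I want a distance score that can be computed between the frequency vectors and that also accounts for the weight/importance of each motif's frequency.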
Gene1 = c(11.26971112,0.26609964,0.23126772,0.27460510,0.03523694,0.02430134,
0.01336574,0.01093560,0.01458080,0.06439854)
Gene2 = c(10.363039630,0.219438117,0.300683371,0.231586940,0.022779043,0.020501139,
0.009111617,0.018223235,0.002277904,0.029612756)
Weights = c(0.19942123,0.06118163,0.19720853,0.14889052,0.19409911,0.0904736,
0.04931893,0.06494762,0.04148397,0.07337904)
I want to compute the distance between Gene1 and Gene2, where Weights gives the weight of each frequency (column) in the genes.
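To make the goal concrete, one candidate is a weighted Euclidean distance, d(x, y) = sqrt(sum_i w_i * (x_i - y_i)^2). The sketch below is only an illustration under stated assumptions: the weights are taken to be non-negative and are used as given (they are not rescaled to sum to 1), and `weighted_dist` is a hypothetical helper name, not a function from any package. Whether this is the appropriate measure for motif frequencies is exactly what is being asked.

```r
Gene1 <- c(11.26971112, 0.26609964, 0.23126772, 0.27460510, 0.03523694, 0.02430134,
           0.01336574, 0.01093560, 0.01458080, 0.06439854)
Gene2 <- c(10.363039630, 0.219438117, 0.300683371, 0.231586940, 0.022779043, 0.020501139,
           0.009111617, 0.018223235, 0.002277904, 0.029612756)
Weights <- c(0.19942123, 0.06118163, 0.19720853, 0.14889052, 0.19409911, 0.0904736,
             0.04931893, 0.06494762, 0.04148397, 0.07337904)

# Hypothetical helper: weighted Euclidean distance between two frequency vectors.
# Assumes w is non-negative and has one entry per motif (column).
weighted_dist <- function(x, y, w) {
  stopifnot(length(x) == length(y), length(x) == length(w), all(w >= 0))
  sqrt(sum(w * (x - y)^2))
}

d12 <- weighted_dist(Gene1, Gene2, Weights)

# For many genes at once: scale each column by sqrt(w), then use base R's dist().
# This gives the same weighted Euclidean distance for every pair of rows.
freq <- rbind(Gene1 = Gene1, Gene2 = Gene2)
d_all <- dist(sweep(freq, 2, sqrt(Weights), "*"))
```

With these example vectors the first motif has a much larger frequency than the others, so it dominates the result unless the columns are standardised beforehand.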