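I am computing pairwise Euclidean distances between the elements of a vector, using the pairwise_distances function from the sklearn package. However, the resulting matrix is only approximately symmetric for some elements: in one example, two entries that should be equal agree only to about 15 digits after the decimal point.
I noticed this because a downstream analysis that assumes a symmetric input matrix raised an error. I know I could round the values, but what causes this in the first place?
This is the vector I am computing pairwise distances for (it is a column of a pandas DataFrame):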
lag_measure_data[['bios_level']].values
array([[ 0.76881030949999995538490793478558771312236785888671875 ],
[ 0. ],
[ 0.67783090619999997183953155399649403989315032958984375 ],
[ 0.3228176074999999922710003374959342181682586669921875 ],
[ 0.75822395549999999087020796650904230773448944091796875 ],
[ 0.469808621599999975959605080788605846464633941650390625],
[ 0.989529862699999984698706612107343971729278564453125 ],
[ 0. ],
[ 0.5575436799999999859522858969285152852535247802734375 ],
[ 0.9756440299999999954394525047973729670047760009765625 ],
[ 0.66511863289999995085821637985645793378353118896484375 ],
[ 0.978062709200000046649847718072123825550079345703125 ],
[ 0.473957179800000016900440868994337506592273712158203125],
[ 0.82409385540000001935112550199846737086772918701171875 ],
[ 0.56548685279999999497846374651999212801456451416015625 ],
[ 0.399505730399999980928527065771049819886684417724609375],
[ 0.474232963900000026313819034839980304241180419921875 ],
[ 0.34276307189999999369689476225175894796848297119140625 ],
[ 0.9985316859999999739017084721126593649387359619140625 ],
[ 0.9063241512999999915933813099400140345096588134765625 ],
[ 0. ]])
This is the command I use to get the distance matrix:
d_matrix_lag = pairwise_distances(lag_measure_data[['bios_level']].values)
I am not printing the full distance matrix because it is too cluttered, but as an example, the value in row 1, column 4 is
0.445992701999999907602756366031826473772525787353515625
while the value in row 4, column 1 is
0.4459927019999998520916051347739994525909423828125
Answer 0 (score: 4)
I can reproduce your problem: the symmetry test below prints False. Try scipy.spatial.distance instead. It returns the pairwise distances as a distance vector, which you can convert to a square distance matrix with squareform().
import numpy as np
a = np.array([[ 0.76881030949999995538490793478558771312236785888671875 ],
[ 0. ],
[ 0.67783090619999997183953155399649403989315032958984375 ],
[ 0.3228176074999999922710003374959342181682586669921875 ],
[ 0.75822395549999999087020796650904230773448944091796875 ],
[ 0.469808621599999975959605080788605846464633941650390625],
[ 0.989529862699999984698706612107343971729278564453125 ],
[ 0. ],
[ 0.5575436799999999859522858969285152852535247802734375 ],
[ 0.9756440299999999954394525047973729670047760009765625 ],
[ 0.66511863289999995085821637985645793378353118896484375 ],
[ 0.978062709200000046649847718072123825550079345703125 ],
[ 0.473957179800000016900440868994337506592273712158203125],
[ 0.82409385540000001935112550199846737086772918701171875 ],
[ 0.56548685279999999497846374651999212801456451416015625 ],
[ 0.399505730399999980928527065771049819886684417724609375],
[ 0.474232963900000026313819034839980304241180419921875 ],
[ 0.34276307189999999369689476225175894796848297119140625 ],
[ 0.9985316859999999739017084721126593649387359619140625 ],
[ 0.9063241512999999915933813099400140345096588134765625 ],
[ 0. ]])
from sklearn.metrics.pairwise import pairwise_distances
dist_sklearn = pairwise_distances(a)
print((dist_sklearn.transpose() == dist_sklearn).all())
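Computing the distances with scipy.spatial.distance and squareform() instead gave me a symmetric matrix. Hope this helps.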
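For reference, here is a minimal sketch of the scipy route, assuming pdist from scipy.spatial.distance is used to produce the distance vector (the answer only names the module and squareform()). The array is a shortened stand-in for the question's data, and symmetrize is a hypothetical helper, not part of either library, for cases where you want to keep the sklearn result and force it to be symmetric.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Shortened stand-in for the question's column vector (shape: n_samples x 1).
a = np.array([[0.7688103095],
              [0.0],
              [0.6778309062],
              [0.3228176075],
              [0.7582239555]])

# pdist returns the condensed vector of the n*(n-1)/2 pairwise distances.
condensed = pdist(a, metric="euclidean")

# squareform copies that vector into both triangles of an n x n matrix,
# so entries (i, j) and (j, i) come from the same stored number.
dist_scipy = squareform(condensed)

# Same exact-equality symmetry test as in the answer.
is_symmetric = bool((dist_scipy == dist_scipy.T).all())
print(is_symmetric)


def symmetrize(d):
    """Hypothetical helper: average a nearly symmetric matrix with its transpose."""
    return (d + d.T) / 2.0
```

If the downstream code only needs symmetry up to rounding error, comparing with np.allclose(d, d.T) rather than == may also be enough.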