Setting precision when scaling a vector with preprocessing in scikit-learn

Time: 2016-05-02 10:28:19

Tags: python numpy scipy scikit-learn

I need to compute the Euclidean distance between two vectors, and the vectors must be scaled before the distance is computed.

import numpy as np

sample_A = np.array([1,1,1,0,0,1,0,0,1,1,0,0,0,0,0,0.008624,-0.002894,0.006471,0.000961,0.007407,-0.004442,-0.00966,-0.003026,0.010202,0.008907,-0.003031,-0.002724,0.002302,0.002171,-0.011219,0.006802,0.004588,0.030068,0.016608,0.021235,0.015706,0.102711,0.053489,0.006902,-0.010042,0.002647,0.036403,-0.010567,0.040207,0.065626,-0.010786,-0.010131,0.080007,-0.046524,-0.08577,0.120587,0.159285,0.058588,0.112184,0.011561])
sample_B = np.array([18,1,1,0,0,1,0,0,1,0,1,0,0,0,0,1.921413,-1.350259,-0.549294,-0.829648,-0.271365,-2.267258,-0.043207,-0.127863,0.46472,0.106202,-0.363018,-0.863932,-1.041068,0.944935,-0.269358,-0.705195,-0.505604,-0.721329,0.603105,-0.619679,-0.461518,0.595048,-0.097054,-1.602379,-0.373747,-0.253988,-0.476779,1.108103,1.428308,1.12896,1.296803,-0.086155,-0.555077,0.347556,0.202161,0.289031,0.676664,-0.318146,0.193779,0.841483])

According to the requirements, the expected distance between these two points is 7.296226771.

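from scipy.spatial import distance
from sklearn import preprocessing

# Standardize each vector to zero mean and unit variance, then measure the distance
A_scaled = preprocessing.scale(sample_A)
B_scaled = preprocessing.scale(sample_B)
distance.euclidean(A_scaled, B_scaled)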

The value I get is 7.713635264892224.

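My understanding is that the difference arises because the standard deviation and mean are computed with higher precision. Is there a way to pass a precision as an input to the scaling function, or do I have to write a custom scaling function?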
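One way to test the precision hypothesis is to standardize the same data in both float32 and float64 and see how far apart the results are. The snippet below is only a sketch: the data is randomly generated for illustration (it is not the data from the question), and `standardize` is a hypothetical helper, not a scikit-learn function. If the largest difference turns out to be many orders of magnitude smaller than the gap between the two distances, precision alone is unlikely to explain it.

```python
import numpy as np

# Illustrative data only (NOT the vectors from the question).
rng = np.random.default_rng(0)
x64 = rng.normal(size=55)          # float64 by default
x32 = x64.astype(np.float32)       # same values stored in single precision


def standardize(x):
    """Hypothetical helper: zero mean, unit (population) standard deviation."""
    return (x - x.mean()) / x.std()


# Largest element-wise disagreement between the two precisions.
diff = np.max(np.abs(standardize(x64) - standardize(x32).astype(np.float64)))
print(diff)
```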

If so, how do I write a custom scaling function that works on an entire numpy array?
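As far as I know, `preprocessing.scale` does not take a precision argument, so a custom function is probably the simplest route. The sketch below is one possible shape for it, not a drop-in replacement: the name `scale_vector` and the parameters `dtype`, `ddof`, and `decimals` are all assumptions made for illustration. It operates on the whole array at once through numpy's vectorized arithmetic.

```python
import numpy as np


def scale_vector(x, dtype=np.float64, ddof=0, decimals=None):
    """Hypothetical helper: standardize a 1-D array to zero mean, unit std.

    dtype    -- floating-point type used for the computation
    ddof     -- 0 for population std (numpy's default), 1 for sample std
    decimals -- if given, round the mean and std to this many decimal places
                before scaling (an assumed way of "limiting precision")
    """
    x = np.asarray(x, dtype=dtype)
    mean = x.mean(dtype=dtype)
    std = x.std(dtype=dtype, ddof=ddof)
    if decimals is not None:
        mean = np.round(mean, decimals)
        std = np.round(std, decimals)
    if std == 0:
        # Constant input: only center it, to avoid dividing by zero.
        return x - mean
    return (x - mean) / std
```

For example, `scale_vector(sample_A, ddof=1)` would use the sample standard deviation, and `scale_vector(sample_A, dtype=np.float32)` would do the arithmetic in single precision. Trying a few such variants against the expected value may help narrow down which convention the reference result was computed with.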

0 answers:

No answers