Calculating the Euclidean distance to three means

Time: 2019-10-25 14:43:07

Tags: python pandas dataframe

I can't get my Euclidean distance calculation to work.

When I call the function, it gives me this error:

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I need it for a hand-coded K-means algorithm.

def euclideanDist(df,pointIDX,mean_1,mean_2,mean_3):

    point = df.iloc[pointIDX][['Shoe_Size','Height']].values
    mean_1 = mean_1[['Shoe_Size','Height']].values
    mean_2 = mean_2[['Shoe_Size','Height']].values
    mean_3 = mean_3[['Shoe_Size','Height']].values

    dist_Total_1 = sum([a-b for a,b in zip(point,mean_1)])**2
    dist_Total_2 = sum([a-b for a,b in zip(point,mean_2)])**2
    dist_Total_3 = sum([a-b for a,b in zip(point,mean_3)])**2

    if dist_Total_1 < dist_Total_2 & dist_Total_3: 
        df.loc[pointIDX,'class'] = 1

    elif dist_Total_2 < dist_Total_3 > dist_Total_1:
        df.loc[pointIDX, "class"] = 2

    else:
        df.loc[pointIDX,'class'] = 3

    return df

1 answer:

Answer 0 (score: 3)

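You have some problems in these conditions. The & operator is bitwise AND, which is not defined for floats (hence the TypeError), and dist_Total_2 < dist_Total_3 > dist_Total_1 compares dist_Total_3 against the other two values rather than checking whether dist_Total_2 is the smallest: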

if dist_Total_1 < dist_Total_2 & dist_Total_3: 
    df.loc[pointIDX,'class'] = 1

elif dist_Total_2 < dist_Total_3 > dist_Total_1:
    df.loc[pointIDX, "class"] = 2

I believe what you really want is

if dist_Total_1 < dist_Total_2 and dist_Total_1 < dist_Total_3: 
    df.loc[pointIDX,'class'] = 1

elif dist_Total_2 < dist_Total_3 and dist_Total_2 < dist_Total_1:
    df.loc[pointIDX, "class"] = 2
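The difference between the two versions of the conditions can be checked in isolation. The snippet below is only a small sketch with made-up distance values (d1, d2, d3 are hypothetical stand-ins for dist_Total_1, dist_Total_2, dist_Total_3), assuming the distances are NumPy floats as they would be when computed from .values:

```python
import numpy as np

# Hypothetical distances, for illustration only
d1, d2, d3 = np.float64(1.0), np.float64(4.0), np.float64(9.0)

# `&` binds more tightly than `<`, so `d1 < d2 & d3` is parsed as
# `d1 < (d2 & d3)`, and bitwise AND is not defined for floats.
try:
    d1 < d2 & d3
    raised = False
except TypeError:
    raised = True

# A chained comparison `d2 < d3 > d1` means `(d2 < d3) and (d3 > d1)`,
# which does not test whether d2 is the smallest value.
chained = d2 < d3 > d1
intended = d2 < d3 and d2 < d1

print(raised, bool(chained), bool(intended))
```

With these values the chained form is true even though d2 is not the smallest distance, which is why each comparison should be spelled out and joined with and.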

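Your distance calculation also doesn't match my understanding of Euclidean distance: each difference should be squared before summing, and the square root taken at the end. Perhaps this instead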

dist_Total_1 = sum([(a-b)**2 for a,b in zip(point,mean_1)])**0.5

and likewise for dist_Total_2 and dist_Total_3.
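Putting both fixes together, one possible sketch of the whole step is shown below. It is not the asker's exact function: the names euclidean, assign_nearest_mean and COLS are hypothetical, the sample data is invented, and it assumes the DataFrame index labels are used to address rows and that each mean is a Series holding Shoe_Size and Height. It also replaces the if/elif chain with np.argmin, which picks the smallest distance directly.

```python
import numpy as np
import pandas as pd

COLS = ['Shoe_Size', 'Height']

def euclidean(p, q):
    """Square each coordinate difference, sum, then take the square root."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sqrt(np.sum((p - q) ** 2)))

def assign_nearest_mean(df, point_idx, means):
    """Write the 1-based number of the closest mean into df['class'].

    Assumes `point_idx` is an index label of `df` and each mean is a
    Series containing the columns in COLS.
    """
    point = df.loc[point_idx, COLS]
    dists = [euclidean(point, m[COLS]) for m in means]
    df.loc[point_idx, 'class'] = int(np.argmin(dists)) + 1
    return df

# Offsets that cancel out: the question's formula sum(a - b)**2 yields 0
# here, although the two points are clearly apart.
original_formula = sum(a - b for a, b in zip((0, 0), (3, -3))) ** 2
corrected = euclidean((0, 0), (3, -3))

# Invented sample data, for illustration only
df = pd.DataFrame({'Shoe_Size': [6.0, 9.0, 12.0],
                   'Height': [150.0, 170.0, 190.0]})
means = [pd.Series({'Shoe_Size': 6.5, 'Height': 152.0}),
         pd.Series({'Shoe_Size': 9.0, 'Height': 171.0}),
         pd.Series({'Shoe_Size': 11.5, 'Height': 188.0})]
for idx in df.index:
    assign_nearest_mean(df, idx, means)
print(df)
```

Passing the means as a list keeps the function unchanged if the number of clusters is later changed from three.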