Can k-means return NaN values?

Asked: 2019-12-23 11:20:55

Tags: python nan k-means

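I recently came across a k-means tutorial that looks a little different from the algorithm as I remember it, but since it is still k-means it should do the same thing. So I tried it on some data, using the following code: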

# Assignment Stage:

def assignment(data, centroids):
    for i in centroids.keys():
        # sqrt((x1-x2)^2 + (y1-y2)^2 + etc)
        data['distance_from_{}'.format(i)] = np.sqrt(
            (data['soloRatio'] - centroids[i][0])**2
            + (data['secStatus'] - centroids[i][1])**2
            + (data['shipsDestroyed'] - centroids[i][2])**2
            + (data['combatShipsLost'] - centroids[i][3])**2
            + (data['miningShipsLost'] - centroids[i][4])**2
            + (data['exploShipsLost'] - centroids[i][5])**2
            + (data['otherShipsLost'] - centroids[i][6])**2
        )

    print(data['distance_from_{}'.format(i)])
    centroid_distance_cols = ['distance_from_{}'.format(i) for i in centroids.keys()]

    data['closest'] = data.loc[:, centroid_distance_cols].idxmin(axis=1)
    data['closest'] = data['closest'].astype(str).str.replace('\D+', '')
    return data


data = assignment(data, centroids)

and:

#Update stage:


import copy

old_centroids = copy.deepcopy(centroids)

def update(k):
    for i in centroids.keys():
        centroids[i][0]=np.mean(data[data['closest']==i]['soloRatio'])
        centroids[i][1]=np.mean(data[data['closest']==i]['secStatus'])
        centroids[i][2]=np.mean(data[data['closest']==i]['shipsDestroyed'])
        centroids[i][3]=np.mean(data[data['closest']==i]['combatShipsLost'])
        centroids[i][4]=np.mean(data[data['closest']==i]['miningShipsLost'])
        centroids[i][5]=np.mean(data[data['closest']==i]['exploShipsLost'])
        centroids[i][6]=np.mean(data[data['closest']==i]['otherShipsLost'])
    return k


#TODO: add graphical representation?

while True:
    closest_centroids = data['closest'].copy(deep=True)
    centroids = update(centroids)
    data = assignment(data,centroids)
    if(closest_centroids.equals(data['closest'])):
        break

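When I run the initial assignment stage it returns distances, but once I run the update stage every distance value becomes NaN, and I can't work out why or at what point this happens. Perhaps I've introduced a bug that I can't find?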

Here is an excerpt of the data I am using:

 Unnamed: 0  characterID  combatShipsLost  exploShipsLost  miningShipsLost  \
0           0   90000654.0              8.0             4.0              5.0   
1           1   90001581.0             97.0             5.0              1.0   
2           2   90001595.0             61.0             0.0              0.0   
3           3   90002023.0             22.0             1.0              0.0   
4           4   90002030.0             74.0             0.0              1.0   

   otherShipsLost  secStatus  shipsDestroyed  soloRatio  
0             0.0   5.003100             1.0       10.0  
1             0.0   2.817807          6251.0        6.0  
2             0.0  -2.015310           752.0        0.0  
3             4.0   5.002769            43.0        5.0  
4             1.0   3.090204           301.0        7.0 
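
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the assignment stage stores `closest` as text (via `.astype(str)`), while the update stage compares it with the centroid keys. If those keys are integers (an assumption, since the `centroids` dict is not shown), `data['closest'] == i` matches no rows, the mean of an empty selection is NaN, and the next distance computation propagates that NaN. The toy frame and labels below are hypothetical and are not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for one feature column.
data = pd.DataFrame({"soloRatio": [10.0, 6.0, 0.0, 5.0]})

# Labels shaped like the output of idxmin over the distance columns.
# dtype=object mirrors the plain Python strings pandas used at the time.
labels = pd.Series(
    ["distance_from_1", "distance_from_2", "distance_from_1", "distance_from_2"],
    dtype=object,
)

# Strip the non-digit prefix; the result is still text ("1", "2", ...).
as_text = labels.str.replace(r"\D+", "", regex=True)

# Comparing text labels with an integer key selects no rows at all.
empty_selection = data[as_text == 1]["soloRatio"]
mean_of_empty = empty_selection.mean()  # mean of zero rows is NaN

# Casting the labels to int makes the comparison match as intended.
as_int = as_text.astype(int)
mean_of_cluster_1 = data[as_int == 1]["soloRatio"].mean()

print(len(empty_selection), mean_of_empty, mean_of_cluster_1)
```

If this is the cause, casting `closest` to `int` at the end of `assignment` (or using string keys in `centroids`) should keep the centroids finite. Note also that recent pandas versions treat the pattern in `str.replace` literally unless `regex=True` is passed, in which case the `distance_from_` prefix would not be stripped at all. Independently of this, k-means can produce NaN centroids whenever a cluster ends up with no points, so an empty-cluster check in the update stage is worth considering.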

0 answers:

No answers.