Assigning a numpy array to a masked subset of a pandas DataFrame

Time: 2019-07-01 15:38:53

Tags: python pandas mask

I am working on a subset of a pandas DataFrame selected with a boolean mask:

pdxy = pd.DataFrame(data,columns=['X','Y','C','CC'])
mask = pdxy[:]['Y']==8

print("pdxy[mask]")
print(pdxy[mask][:10])

pdxy[mask]
       X  Y  C  CC
17    17  8  0   0
18    18  8  0   0
48    48  8  0   0
56    56  8  0   0
63    63  8  0   0
66    66  8  0   0
73    73  8  0   0
87    87  8  0   0
103  103  8  0   0
116  116  8  0   0

kmeans = KMeans(n_clusters=5,random_state=0).fit(pdxy[mask][['X','Y']])

After that, I want to write the results (the cluster labels and the cluster centers) into columns of the pandas DataFrame:

pdxy.loc[mask]['C']  = np.array(kmeans.labels_)
pdxy.loc[mask]['CC'] = np.array(kmeans.cluster_centers_[kmeans.labels_])[:,0]

Unfortunately, the DataFrame is not modified; it looks exactly as it did before the assignment:

print("pdxy[mask] labeled")
print(pdxy[mask][:10]) 

pdxy[mask] labeled
       X  Y  C  CC
17    17  8  0   0
18    18  8  0   0
48    48  8  0   0
56    56  8  0   0
63    63  8  0   0
66    66  8  0   0
73    73  8  0   0
87    87  8  0   0
103  103  8  0   0
116  116  8  0   0

What should I do?

1 answer:

Answer 0: (score: 2)

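With .loc, rows and columns are selected in a single call, separated by a comma: [row, col], not [row][col]. The chained form pdxy.loc[mask]['C'] = ... first builds a copy with pdxy.loc[mask] and then assigns into that copy, so pdxy itself never changes.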

Try this:

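import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

pdxy = pd.DataFrame(data, columns=['X', 'Y', 'C', 'CC'])
mask = pdxy['Y'] == 8

# Select the feature columns with a list of names; ['X','Y'] alone raises a KeyError
kmeans = KMeans(n_clusters=5, random_state=0).fit(pdxy.loc[mask, ['X', 'Y']])

# Rows and column go in one .loc call, so the assignment writes to pdxy itself
pdxy.loc[mask, 'C']  = np.array(kmeans.labels_)
pdxy.loc[mask, 'CC'] = np.array(kmeans.cluster_centers_[kmeans.labels_])[:,0]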

print("pdxy[mask] labeled")
print(pdxy[mask][:10])
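
To see the difference without scikit-learn or the original data, here is a minimal sketch. The toy DataFrame and the `labels` array are made up for illustration: `labels` only stands in for `kmeans.labels_`, and the values are arbitrary. The chained write is wrapped in a warnings filter because pandas usually warns about it; the exact warning depends on the pandas version.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the question's `data`.
pdxy = pd.DataFrame({
    "X": [0, 1, 2, 3, 4, 5],
    "Y": [8, 3, 8, 8, 1, 8],
    "C": 0,
    "CC": 0.0,
})
mask = pdxy["Y"] == 8

# Stand-in for kmeans.labels_: one value per row selected by the mask.
labels = np.array([2, 0, 1, 2])

# Chained indexing: pdxy.loc[mask] is a copy, so this write is lost.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    pdxy.loc[mask]["C"] = labels
chained_changed = bool((pdxy["C"] != 0).any())

# Single .loc call with [rows, column]: this write lands in pdxy.
pdxy.loc[mask, "C"] = labels
loc_changed = bool((pdxy["C"] != 0).any())

print("chained assignment changed pdxy:", chained_changed)
print("single .loc assignment changed pdxy:", loc_changed)
print(pdxy)
```

The array on the right-hand side must have exactly one element per row where the mask is True, in row order, which is what `kmeans.labels_` provides when the model was fitted on `pdxy.loc[mask, ['X', 'Y']]`.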