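My data frame looks like the sample below; the real one has many more rows.
I want to run clustering separately for each location.
Specifically, I want to write a loop that first extracts location "A" and then clusters on the "capacity" and "sale" columns.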
location  name          capacity  sale
A         Stonecrystal  50        3.915434493
A         Valtown       200       5.410339205
A         Fairley       200       6.793002265
I tried the following code, but it only handles location "A", and I had to select that location by hand.
import numpy as np
import pandas as pd
from sklearn.cluster import AffinityPropagation

dataset = pd.read_excel('ClusterData.xlsx')
# Build a dict mapping each location to its sub-frame
locations = {k: v for k, v in dataset.groupby('location')}
# Manually pick location "A"
location1_ = locations['A']
# Columns 2 and 3 are 'capacity' and 'sale'
location1 = location1_.iloc[:, 2:4].values
af = AffinityPropagation().fit(location1)
labels = af.labels_
centers_indices = af.cluster_centers_indices_
centers = af.cluster_centers_
num_clusters = len(centers)
How can I write a loop that performs this process for every location? Any help would be greatly appreciated.
Answer 0 (score: 0)
I think I've figured it out:
import pandas as pd
from sklearn.cluster import AffinityPropagation

dataset = pd.read_excel('ClusterData.xlsx')
# Build a dict mapping each location to its sub-frame
districts = {k: v for k, v in dataset.groupby('location')}
district = dataset['location'].unique()
# Created once, outside the loop, so the feature arrays accumulate per location
dis = {}
for j in district:
    # Columns 2 and 3 are 'capacity' and 'sale'
    dis[j] = districts[j].iloc[:, 2:4].values
    af = AffinityPropagation().fit(dis[j])
    labels = af.labels_
    centers_indices = af.cluster_centers_indices_
    centers = af.cluster_centers_
    print(j)
    print('Number of Clusters: ', len(centers))
    print('Labels: ', labels)
    print('Centers: ', centers)
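
If you want to keep the results instead of only printing them, here is a minimal sketch of the same per-location loop that iterates over `groupby` directly and stores each location's output in a dict. The function name `cluster_by_location`, the `feature_cols` parameter, and the rows for location "B" are my own assumptions for illustration; only the three "A" rows come from the question, and the data frame is built inline rather than read from `ClusterData.xlsx`.

```python
import pandas as pd
from sklearn.cluster import AffinityPropagation


def cluster_by_location(df, feature_cols=("capacity", "sale")):
    """Fit one AffinityPropagation model per location (hypothetical helper)."""
    results = {}
    # groupby yields (location, sub-frame) pairs, so no separate lookup dict is needed
    for location, group in df.groupby("location"):
        # Select features by column name rather than by position
        features = group[list(feature_cols)].to_numpy()
        af = AffinityPropagation(random_state=0).fit(features)
        results[location] = {
            "labels": af.labels_,
            "centers": af.cluster_centers_,
            "n_clusters": len(af.cluster_centers_indices_),
        }
    return results


# Rows for "A" are taken from the question; rows for "B" are made up for illustration.
sample = pd.DataFrame(
    {
        "location": ["A", "A", "A", "B", "B", "B"],
        "name": ["Stonecrystal", "Valtown", "Fairley", "Town1", "Town2", "Town3"],
        "capacity": [50, 200, 200, 100, 120, 300],
        "sale": [3.915434493, 5.410339205, 6.793002265, 4.0, 4.5, 7.0],
    }
)

results = cluster_by_location(sample)
for location, res in results.items():
    print(location)
    print("Number of Clusters: ", res["n_clusters"])
    print("Labels: ", res["labels"])
```

Selecting the columns by name means the loop keeps working if the column order in the spreadsheet changes, which `iloc[:, 2:4]` would not guarantee.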