How to perform clustering in a loop

Time: 2019-07-13 08:19:47

Tags: python python-3.x

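My dataframe looks like the sample below (the real one has more rows). I want to perform clustering separately for each location.
I want to write a loop that first extracts location "A" and then performs clustering on the "capacity" and "sale" columns.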

location    name            capacity    sale
A           Stonecrystal    50          3.915434493
A           Valtown         200         5.410339205
A           Fairley         200         6.793002265

I tried the following code, but it only runs the process for location "A", which I selected by hand.

import numpy as np
import pandas as pd

dataset = pd.read_excel('ClusterData.xlsx')

locations = {k:v for k,v in dataset.groupby('location')}


location1_ = locations['A']
location1 = location1_.iloc[: , 2:4].values


from sklearn.cluster import AffinityPropagation

af = AffinityPropagation().fit(location1)
labels = af.labels_
centers_indices = af.cluster_centers_indices_
centers = af.cluster_centers_
num_clusters = len(centers)

How can I write a loop that performs this process for every location? Any help would be greatly appreciated.

1 Answer:

Answer 0: (score: 0)

I think I've figured it out:

import pandas as pd
from sklearn.cluster import AffinityPropagation

dataset = pd.read_excel('ClusterData.xlsx')

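# Split the frame into one sub-frame per location.
districts = {k: v for k, v in dataset.groupby('location')}
district = dataset['location'].unique()

# Created once, outside the loop, so it keeps the features of every location.
dis = {}
for j in district:
    # Columns 2 and 3 are 'capacity' and 'sale'.
    dis[j] = districts[j].iloc[:, 2:4].values
    af = AffinityPropagation().fit(dis[j])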
    labels = af.labels_
    centers_indices = af.cluster_centers_indices_
    centers = af.cluster_centers_
    print(j)
    print('Number of Clusters: ', len(centers))
    print('Labels: ', labels)
    print('Centers: ', centers)
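
If you also need the labels attached to the original rows rather than just printed, something along these lines might work. This is only a sketch: the function name cluster_by_location, the 'cluster' column, and the sample rows are made up for illustration, and it assumes the columns are named 'location', 'capacity' and 'sale' as in the question.

```python
import pandas as pd
from sklearn.cluster import AffinityPropagation


def cluster_by_location(df, feature_cols=("capacity", "sale")):
    """Fit one AffinityPropagation model per location and label every row.

    Hypothetical helper, not part of pandas or scikit-learn.
    """
    result = df.copy()
    result["cluster"] = -1  # placeholder, overwritten group by group
    n_clusters = {}
    for location, group in df.groupby("location"):
        features = group[list(feature_cols)].to_numpy()
        af = AffinityPropagation().fit(features)
        # Write labels back through the group's index so rows stay aligned.
        result.loc[group.index, "cluster"] = af.labels_
        n_clusters[location] = len(af.cluster_centers_indices_)
    return result, n_clusters


# Made-up rows for illustration only; this is not the asker's data.
sample = pd.DataFrame({
    "location": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "name": ["n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8"],
    "capacity": [50, 55, 200, 210, 100, 105, 400, 390],
    "sale": [3.9, 4.1, 6.8, 6.5, 2.0, 2.2, 9.0, 9.1],
})

labelled, counts = cluster_by_location(sample)
print(labelled)
print(counts)
```

Note that the labels are only meaningful within a location: cluster 0 in "A" has nothing to do with cluster 0 in "B". If the algorithm does not converge for a group, scikit-learn emits a warning and labels every row in that group -1.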