Question

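My data frame looks like the sample below; the real one has many more rows. I want to run clustering separately for each location.
I'd like to write a loop that first extracts location "A" and then clusters on the "capacity" and "sale" columns.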

location    name            capacity    sale
A           Stonecrystal    50          3.915434493
A           Valtown         200         5.410339205
A           Fairley         200         6.793002265

I tried the following code, but it only handles location "A", which I select by hand.

import pandas as pd
from sklearn.cluster import AffinityPropagation

dataset = pd.read_excel('ClusterData.xlsx')

# Split the frame into one sub-frame per location.
locations = {k: v for k, v in dataset.groupby('location')}

# Manually pick location 'A' and keep the 'capacity' and 'sale' columns.
location1_ = locations['A']
location1 = location1_.iloc[:, 2:4].values

af = AffinityPropagation().fit(location1)
labels = af.labels_
centers_indices = af.cluster_centers_indices_
centers = af.cluster_centers_
num_clusters = len(centers)

How can I write a loop that performs this process for every location? Any help would be greatly appreciated.

Answer 1

I think I've figured it out:

import pandas as pd
from sklearn.cluster import AffinityPropagation

dataset = pd.read_excel('ClusterData.xlsx')

# Keep the fitted model for every location so the results survive the loop
# (creating the dict inside the loop would reset it on each iteration).
models = {}

# groupby yields one (location, sub-frame) pair per distinct location.
for location, group in dataset.groupby('location'):
    # Columns 2 and 3 are 'capacity' and 'sale'.
    features = group.iloc[:, 2:4].values
    af = AffinityPropagation().fit(features)
    models[location] = af

    labels = af.labels_
    centers_indices = af.cluster_centers_indices_
    centers = af.cluster_centers_
    print(location)
    print('Number of Clusters: ', len(centers))
    print('Labels: ', labels)
    print('Centers: ', centers)
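
If you want the labels attached to the rows rather than just printed, something along these lines might work. This is only a sketch and not part of the original answer: the function name `cluster_by_location`, the `features` and `random_state` parameters, and the output column name `cluster` are all made up for illustration, and it assumes the column names `location`, `capacity` and `sale` from the question.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AffinityPropagation


def cluster_by_location(df, features=("capacity", "sale"), random_state=0):
    """Return a copy of df with a per-location 'cluster' column (hypothetical helper)."""
    # Start with -1 everywhere; every row is overwritten in the loop below.
    labels = np.full(len(df), -1)

    # GroupBy.indices maps each location to the integer positions of its rows,
    # so this works even if the frame's index is not unique.
    for location, positions in df.groupby("location").indices.items():
        X = df.iloc[positions][list(features)].to_numpy()
        model = AffinityPropagation(random_state=random_state).fit(X)
        # Labels are only meaningful within one location: cluster 0 in "A"
        # has nothing to do with cluster 0 in "B".
        labels[positions] = model.labels_

    # assign() returns a new frame and leaves the input untouched.
    return df.assign(cluster=labels)
```

Usage would be `clustered = cluster_by_location(dataset)`, after which `clustered.groupby(['location', 'cluster']).size()` shows how many rows landed in each cluster. Note that AffinityPropagation can fail to converge on very small groups, in which case scikit-learn emits a warning and labels those rows -1.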
