I want to combine StandardScaler() and KMeans() in a pipeline and check the k-means inertia_, because I want to find out which number of clusters is best.
The code is as follows:
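import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X_pca and rng are assumed to be defined earlier (not shown here)
ks = range(3, 5)
inertias = []
inertias_temp = 9999.0
for k in ks:
    scaler = StandardScaler()
    kmeans = KMeans(n_clusters=k, random_state=rng)
    pipeline = make_pipeline(scaler, kmeans)
    pipeline.fit(X_pca)
    labels = pipeline.predict(X_pca)
    np.round(kmeans.cluster_centers_, decimals=3)
    inertias.append(kmeans.inertia_)
    if kmeans.inertia_ < inertias_temp:
        n_clusters_min = k
        kmeans_min = kmeans
        inertias_temp = kmeans.inertia_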
However, I think the value of kmeans.inertia_ may be incorrect, because it should be obtained after pipeline.predict(). But I cannot get this value after pipeline.predict(). Can anyone help me?
Answer 0 (score: 3)
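You can read the inertia of the clustering directly from the make_pipeline instance. There is no need to call .predict() to get the distances to the centroids, because inertia_ is computed during fit(). To access the inertia value in your case, you can write:
pipeline.named_steps['kmeans'].inertia_
Then process it however you like!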
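To illustrate the point about fit() versus predict(), here is a minimal sketch. It assumes a synthetic array X_demo as a stand-in for X_pca (which is not shown in the question), and the n_init and random_state values are arbitrary choices for the demo.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for X_pca (assumption: the real data is a 2-D float array).
demo_rng = np.random.RandomState(0)
X_demo = demo_rng.normal(size=(200, 2))

pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
pipeline.fit(X_demo)

# inertia_ is set by fit(); make_pipeline names the KMeans step "kmeans".
inertia_after_fit = pipeline.named_steps["kmeans"].inertia_

# predict() only assigns points to the already fitted centroids; it does not refit.
labels = pipeline.predict(X_demo)
inertia_after_predict = pipeline.named_steps["kmeans"].inertia_

# Recompute the inertia by hand in the scaled space that KMeans actually saw.
X_scaled = pipeline.named_steps["standardscaler"].transform(X_demo)
centers = pipeline.named_steps["kmeans"].cluster_centers_
manual_inertia = ((X_scaled - centers[labels]) ** 2).sum()

print(inertia_after_fit == inertia_after_predict)
print(np.isclose(manual_inertia, inertia_after_fit))
```

Note that the inertia is measured on the standardized data, since KMeans only ever sees the output of the scaler.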
Also, I had some free time, so I rewrote your code a bit to make it more interesting:
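from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
cluster = KMeans(random_state=1337)
pipe = make_pipeline(scaler, cluster)  # steps are auto-named 'standardscaler' and 'kmeans'
centroids = []
inertias = []
min_ks = []
inertia_temp = 9999.0
for k in range(3, 5):
    pipe.set_params(kmeans__n_clusters=k)
    pipe.fit(X_pca)
    centroid = pipe.named_steps['kmeans'].cluster_centers_
    inertia = pipe.named_steps['kmeans'].inertia_
    centroids.append(centroid)
    inertias.append(inertia)
    if inertia < inertia_temp:
        min_ks.append(k)
        inertia_temp = inertia  # track the running minimum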
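A note on step names, as a hedged sketch: make_pipeline derives each step name from the lowercased class name, so a KMeans step is addressed as 'kmeans'. A custom name such as 'cluster' only works if the pipeline is built with an explicit Pipeline. The step names "scaler" and "cluster" below are illustrative choices, not something make_pipeline produces.

```python
from sklearn.cluster import KMeans
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# make_pipeline: names are generated automatically from the class names.
auto_named = make_pipeline(StandardScaler(), KMeans(n_init=10, random_state=1337))
print(list(auto_named.named_steps))  # ['standardscaler', 'kmeans']

# Pipeline: names are chosen explicitly (here "scaler" and "cluster").
custom_named = Pipeline([
    ("scaler", StandardScaler()),
    ("cluster", KMeans(n_init=10, random_state=1337)),
])

# With the custom name, the parameter prefix becomes "cluster__".
custom_named.set_params(cluster__n_clusters=4)
print(custom_named.named_steps["cluster"].n_clusters)  # 4
```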
Thanks for the question!