K-means for-loop help

Time: 2019-11-30 18:08:36

Tags: python k-means

To re-run the code below for several values of k, would I use a for loop like this?

for x in range(2, 11):
kmeans = KMeans().setK(x).setSeed(1)
model = kmeans.fit(dataset)

Here is the rest of my code. I'm genuinely confused at this point.

from pyspark.ml.clustering import KMeans
from pyspark.ml.evaluation import ClusteringEvaluator

dataset = spark.read.format("libsvm").load("/FileStore/tables/colon_cancer-ecfbf.txt")

for x in range(2, 11):
kmeans = KMeans().setK(x).setSeed(1)
model = kmeans.fit(dataset)

predictions = model.transform(dataset)

evaluator = ClusteringEvaluator()

silhouette = evaluator.evaluate(predictions)
print("Silhouette with squared euclidean distance = " + str(silhouette))

centers = model.clusterCenters()
print("Cluster Centers: ")
for center in centers:
    print(center)

1 answer:

Answer 0 (score: 0)

Below is the indentation fix I made. It should fix your code and let it run, iterating over the x values 2, 3, 4, 5, 6, 7, 8, 9, 10:

from pyspark.ml.clustering import KMeans
from pyspark.ml.evaluation import ClusteringEvaluator

dataset = spark.read.format("libsvm").load("/FileStore/tables/colon_cancer-ecfbf.txt")

for x in range(2, 11):
    kmeans = KMeans().setK(x).setSeed(1)
    model = kmeans.fit(dataset)

    predictions = model.transform(dataset)

    evaluator = ClusteringEvaluator()

    silhouette = evaluator.evaluate(predictions)
    print("Silhouette with squared euclidean distance = " + str(silhouette))

    centers = model.clusterCenters()
    print("Cluster Centers: ")
    for center in centers:
        print(center)
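The loop above prints each silhouette score but does not keep it, so comparing values of k afterwards means reading the output by eye. A minimal sketch of one way to record the scores and pick the highest one follows. The names `sweep_k`, `score_for_k`, and `toy_score` are hypothetical and not part of the original code, and `toy_score` is only a stand-in so the sketch runs without Spark; its values are not real silhouette scores.

```python
def sweep_k(k_values, score_for_k):
    """Score every k and return (best_k, scores); a higher score counts as better."""
    scores = {k: score_for_k(k) for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# With Spark available, score_for_k could wrap the loop body, for example:
#   def score_for_k(k):
#       model = KMeans().setK(k).setSeed(1).fit(dataset)
#       return ClusteringEvaluator().evaluate(model.transform(dataset))

# Toy stand-in so the sketch runs without Spark; NOT real silhouette values.
def toy_score(k):
    return -abs(k - 4)


best_k, scores = sweep_k(range(2, 11), toy_score)
print(best_k)
```

Passing the scoring step in as a function keeps the sweep itself independent of Spark, so the same helper would work with any evaluator where a larger value is better.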

In the corrected code, indenting the block under the for loop makes the following lines:

kmeans = KMeans().setK(x).setSeed(1)
model = kmeans.fit(dataset)

predictions = model.transform(dataset)

evaluator = ClusteringEvaluator()

silhouette = evaluator.evaluate(predictions)
print("Silhouette with squared euclidean distance = " + str(silhouette))

centers = model.clusterCenters()
print("Cluster Centers: ")
for center in centers:
    print(center)

run once for each value of x in the list above.
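To illustrate why the indentation matters, here is a minimal pure-Python sketch with no Spark involved. `fake_fit` is a hypothetical stand-in for `kmeans.fit(dataset)`, and the sketch assumes the case where only the fitting step sits inside the loop while the evaluation is left outside it.

```python
def fake_fit(k):
    """Hypothetical stand-in for KMeans().setK(k).setSeed(1).fit(dataset)."""
    return {"k": k}


# Evaluation left outside the loop: it runs once, on the last model only.
evaluated_outside = []
for k in range(2, 11):
    model = fake_fit(k)
evaluated_outside.append(model["k"])

# Evaluation indented into the loop body: it runs once per value of k.
evaluated_inside = []
for k in range(2, 11):
    model = fake_fit(k)
    evaluated_inside.append(model["k"])

print(evaluated_outside)  # [10]
print(evaluated_inside)   # [2, 3, 4, 5, 6, 7, 8, 9, 10]
```

In the first loop, `model` is overwritten on every pass, so any code after the loop only ever sees the model for k = 10.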