Limiting the distance in the K-means algorithm in Python

Time: 2020-04-20 14:52:33

Tags: python k-means

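I have the following code, but I don't know how to limit the distance. For example, how can I group only those points that lie within a radius of 2 km?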

On the other hand, I have also used pandas.

from sklearn.cluster import KMeans
from sklearn import metrics
import numpy as np

v1=[3, 1, 1, 2, 1, 6, 6, 6, 5, 6, 7, 8, 9, 8, 9, 9, 8]
v2=[5, 4, 6, 6, 5, 8, 6, 7, 6, 7, 1, 2, 1, 2, 3, 2, 3]

x1 = np.array(v1)
x2 = np.array(v2)

X = np.array(list(zip(x1, x2))).reshape(len(x1), 2)
print(X)

import matplotlib.pyplot as plt
plt.plot(v1, v2, 'ro')
plt.axis([1, 9, 1, 8])  # x-axis: from 1 to 9; y-axis: from 1 to 8
plt.show()

K = 3 
kmeans_model = KMeans(n_clusters=K).fit(X)

# Print each point together with the cluster label K-means assigned to it
print("(x1,x2) -> Class")
for i, label in enumerate(kmeans_model.labels_):
    print("({0},{1}) -> {2}".format(x1[i], x2[i], label))

1 answer:

Answer 0 (score: 0)

In this case, why not use pandas and assign the groups by condition?

Something like this:

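import pandas as pd

# `data` and `column2` are placeholders for your own data and its distance column
df = pd.DataFrame(data)
df1 = df[df.column2 > 2]   # rows whose value is above the threshold of 2
df2 = df[df.column2 <= 2]  # rows whose value is at or below the threshold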
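
To make the condition-based idea a bit more concrete, here is a minimal sketch in plain Python that splits the question's sample points by their distance to a reference point. It is only an illustration under some assumptions: the coordinates are treated as kilometres on a flat plane, and `group_within_radius` and `center` are hypothetical names that do not come from the question. For real latitude/longitude data you would need a great-circle distance (for example the haversine formula) instead of `math.dist`.

```python
import math

def group_within_radius(points, center, radius):
    """Split points into those within `radius` of `center` and the rest."""
    inside, outside = [], []
    for p in points:
        # Euclidean distance; assumes coordinates share the unit of `radius`
        if math.dist(p, center) <= radius:
            inside.append(p)
        else:
            outside.append(p)
    return inside, outside

# Sample coordinates from the question, treated here as kilometres (an assumption)
v1 = [3, 1, 1, 2, 1, 6, 6, 6, 5, 6, 7, 8, 9, 8, 9, 9, 8]
v2 = [5, 4, 6, 6, 5, 8, 6, 7, 6, 7, 1, 2, 1, 2, 3, 2, 3]
points = list(zip(v1, v2))

center = (6, 7)  # hypothetical reference point
near, far = group_within_radius(points, center, radius=2)
print("within 2 km:", near)
print("farther away:", far)
```

The same check could be applied after K-means by using each cluster centre from `kmeans_model.cluster_centers_` as the reference point and treating the points that fall outside the radius as unassigned.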