Could not convert string to float

Time: 2019-11-27 06:17:10

Tags: python-3.x pandas

    Sales       Discount    Profit      Product ID
0   0.050090    0.000000    0.262335    FUR-ADV-10000002
1   0.110793    0.000000    0.260662    FUR-ADV-10000108
2   0.309561    0.864121    0.241432    FUR-ADV-10000183
3   0.039217    0.591474    0.260687    FUR-ADV-10000188
4   0.070205    0.000000    0.263628    FUR-ADV-10000190
5   0.697873    0.000000    0.281162    FUR-ADV-10000571
6   0.064918    0.000000    0.261285    FUR-ADV-10000600
7   0.091950    0.000000    0.262946    FUR-ADV-10000847
8   0.056013    0.318384    0.257952    FUR-ADV-10001283
9   0.304472    0.318384    0.265739    FUR-ADV-10001440
10  0.046234    0.318384    0.261058    FUR-ADV-10001659

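I am using the K-means elbow method to find the right number of clusters.

Using the elbow method to find the optimal number of clusters: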

    import matplotlib.pyplot as plt

    def kelbow(final_df,k):
        from sklearn.cluster import KMeans
        x = []
        for i in range(1,k):
            kmeans = KMeans(n_clusters = i)
            kmeans.fit(final_df)
            x.append(kmeans.inertia_)

        # Plot the collected inertia values (WCSS) against the number of clusters.
        plt.plot(range(1,k), x)
        plt.title('The elbow method')
        plt.xlabel('The number of clusters')
        plt.ylabel('WCSS')
        plt.show()
        return x

I call the function with kelbow(final_df, 30),

but the code raises ValueError: could not convert string to float: 'TEC-STA-10004927'. How can I find the clusters?

2 Answers:

Answer 0: (score: 0)

Set up dummy variables.

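    import pandas as pd
    # One-hot encode the string column (named 'Product ID' in the question's table).
    # get_dummies(columns=...) already removes the original column, so a separate drop() would raise KeyError.
    final_df = pd.get_dummies(final_df, columns=['Product ID'], dtype='int64')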
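To see what the encoding does, here is a minimal sketch on a made-up three-row frame. The column names follow the question's table, but the values are invented for illustration, so treat it as a demonstration rather than the asker's actual data.

```python
import pandas as pd

# Toy data: values are invented, only the column names mirror the question.
df = pd.DataFrame({
    "Sales": [0.05, 0.11, 0.31],
    "Profit": [0.26, 0.26, 0.24],
    "Product ID": ["FUR-ADV-10000002", "FUR-ADV-10000108", "FUR-ADV-10000002"],
})

# Replace the string column with one 0/1 column per distinct ID.
encoded = pd.get_dummies(df, columns=["Product ID"], dtype="int64")

# The original string column is gone and every remaining column is numeric.
print(list(encoded.columns))
print(encoded.dtypes)
```

After this step the frame contains only numeric columns, which is what KMeans.fit expects.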

Answer 1: (score: 0)

This should work for you:

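    import matplotlib.pyplot as plt
    import pandas as pd
    from sklearn.cluster import KMeans

    def kelbow(df, k):
      x = []
      # One-hot encode every string (object) column so KMeans only sees numbers.
      final_df = pd.get_dummies(df, columns=df.select_dtypes(['object']).columns)

      # Fit KMeans for 1 .. k-1 clusters and record the inertia (WCSS) of each fit.
      for i in range(1,k):
        kmeans = KMeans(n_clusters = i)
        kmeans.fit(final_df)
        x.append(kmeans.inertia_)

      # Plot the inertia values, not a constant, against the number of clusters.
      plt.plot(range(1,k), x)
      plt.title('The elbow method')
      plt.xlabel('The number of clusters')
      plt.ylabel('WCSS')
      plt.show()

      return x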
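One caveat: one-hot encoding an identifier column with many distinct values adds one column per value, and those columns can dominate the distance computation. If the product ID is only a label, a possible alternative (my assumption, not something either answer proposes) is to cluster on the numeric columns alone. The helper name `elbow_inertias` and the toy values below are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans

def elbow_inertias(df, k_max, random_state=0):
    """Return the WCSS (inertia) for k = 1 .. k_max-1, using numeric columns only."""
    numeric = df.select_dtypes(include="number")  # silently skips string columns
    inertias = []
    for k in range(1, k_max):
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        model.fit(numeric)
        inertias.append(model.inertia_)
    return inertias

# Toy data with two well-separated groups; values are invented for illustration.
toy = pd.DataFrame({
    "Sales": [0.05, 0.06, 0.07, 0.90, 0.92, 0.95],
    "Profit": [0.26, 0.25, 0.27, 0.80, 0.82, 0.81],
    "Product ID": ["A-1", "A-2", "A-3", "B-1", "B-2", "B-3"],
})

wcss = elbow_inertias(toy, 4)
print(wcss)
```

The returned list can be passed to plt.plot(range(1, k_max), wcss) to draw the same elbow curve as above.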