Question

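I am currently working with some scientific data and trying to run a clustering task on it, but the format of the data causes a ValueError. The data are two Pandas DataFrames of [170 rows x 7 columns].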

I have tried transposing the data and formatting it as lists and as numpy arrays. The format shown in my code comes from the solution found here: ValueError: cannot copy sequence with size 5 to array axis with dimension 2

#x is the y distance
x = np.empty(7, dtype = object)
x[:] = [distance_lC, distance_fC]

#y is the speed.
y = np.empty(7, dtype = object)
y[:] = [speed_lC, speed_fC]

cell_kmeans = KMeans(n_clusters = 4).fit_predict(y)

fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.scatterplot(cell_kmeans)
plt.show()

The output should show the clusters. Instead I get the following error: "ValueError: setting an array element with a sequence."

Answer 1

Use pandas.concat to concatenate the DataFrames instead:

y = pandas.concat([speed_lC, speed_fC])
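
To illustrate why this helps: KMeans expects a 2-D numeric array of shape (n_samples, n_features), while an object array holding whole DataFrames is not one. The sketch below is only an illustration under assumptions: the real speed_lC and speed_fC are not shown in the question, so random 170 x 7 frames with made-up column names stand in for them, and n_init and random_state are picked just to make the run repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-ins for the question's speed_lC and speed_fC:
# two DataFrames of 170 rows x 7 columns filled with random numbers.
rng = np.random.default_rng(0)
columns = [f"c{i}" for i in range(7)]  # made-up column names
speed_lC = pd.DataFrame(rng.normal(size=(170, 7)), columns=columns)
speed_fC = pd.DataFrame(rng.normal(size=(170, 7)), columns=columns)

# Stack the two frames row-wise; ignore_index avoids duplicate row labels.
y = pd.concat([speed_lC, speed_fC], ignore_index=True)

# The result is a plain 2-D table, which KMeans accepts directly.
# fit_predict returns one cluster label per row.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(y)

print(y.shape)       # (340, 7)
print(labels.shape)  # (340,)
```

For the plot, note that matplotlib axes provide a scatter method rather than scatterplot, so something like ax.scatter(y.iloc[:, 0], y.iloc[:, 1], c=labels) would colour two of the columns by cluster; which columns are meaningful to plot depends on the data.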
