Merging gives me more rows in my DataFrame

Time: 2019-07-26 13:00:23

Tags: python pandas

Update: as pointed out in the comments, my index was not unique. Solved it with pivot_table.
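To illustrate the non-unique key issue mentioned in the update, here is a minimal sketch on made-up toy data (the values and the names `left`, `right`, and `right_unique` are hypothetical, not from the original data). It shows how merging on a key that repeats in both frames multiplies rows, and how collapsing one side to one row per key with `pivot_table` restores the expected row count. The choice of `aggfunc="first"` is an assumption; any aggregation that yields one row per `SKU_NR` would work.

```python
import pandas as pd

# Toy data (hypothetical values): SKU_NR repeats in both frames.
left = pd.DataFrame({"SKU_NR": [1, 1, 2, 2], "WEEKNR": [1, 2, 1, 2]})
right = pd.DataFrame({"SKU_NR": [1, 1, 2, 2], "Classification": [0, 0, 1, 1]})

# Many-to-many merge: each SKU contributes 2 x 2 rows.
inflated = left.merge(right, on="SKU_NR")
print(len(inflated))  # 8

# Collapse the right side to one row per SKU_NR before merging.
right_unique = right.pivot_table(
    index="SKU_NR", values="Classification", aggfunc="first"
).reset_index()

merged = left.merge(right_unique, on="SKU_NR")
print(len(merged))  # 4
```

With real data, comparing `len(df)` before and after the merge is a quick way to see whether the key was duplicated on both sides.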

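I have the following code to perform clustering on a df. This df has about 80K rows (it is called 'Kmeans'). I also have another df, named 'Historie', with slightly fewer than 80K rows, that shares a column with 'Kmeans' (namely 'SKU_NR'). I want to merge 'Kmeans' with 'Historie', but when I do, I end up with 2 million rows. I have done this before and it worked. What is wrong with the code?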

# Load libraries
import warnings

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

pd.options.mode.chained_assignment = None
warnings.simplefilter(action='ignore', category=FutureWarning)

# Load and prepare data
Historie = pd.read_excel("file.xlsx")

Kmeans = Historie[['SKU_NR', 'ORDER_ADV_CONS_UNITS_WK_PICK']]
Kmeans = Kmeans.dropna()

# Cluster the rows and attach the labels
km = KMeans(n_clusters=3)
km.fit(Kmeans)
km.predict(Kmeans)
labels = km.labels_
Kmeans["Classification"] = labels
Kmeans = Kmeans[["SKU_NR", "Classification"]]

# Merge the labels back onto the history
# (this is the step that produces the extra rows)
Historie = Historie[['SKU_NR', 'WEEKNR', 'ORDER_ADV_CONS_UNITS_WK_PICK',
                     'FORECAST_NEC_STOCK_BASE']]
Historie = Historie.merge(Kmeans, on="SKU_NR")
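One way to catch this kind of row inflation early is the `validate` argument of `DataFrame.merge`, which raises an error when the key is not unique on the side that should be unique. The sketch below is only an illustration under stated assumptions: the small `Historie` frame is a hypothetical stand-in for the Excel file, the per-SKU mean is an assumed aggregation, and the hard-coded labels stand in for the output of `KMeans`.

```python
import pandas as pd

# Hypothetical stand-in for Historie: one row per SKU per week.
Historie = pd.DataFrame({
    "SKU_NR": [10, 10, 10, 20, 20],
    "WEEKNR": [1, 2, 3, 1, 2],
    "ORDER_ADV_CONS_UNITS_WK_PICK": [5.0, 7.0, 6.0, 50.0, 54.0],
})

# One row per SKU: mean weekly units (the aggregation is an assumption).
per_sku = Historie.groupby("SKU_NR", as_index=False)[
    "ORDER_ADV_CONS_UNITS_WK_PICK"
].mean()

# Placeholder for the cluster labels that KMeans would produce on per_sku.
per_sku["Classification"] = [0, 1]

# validate="many_to_one" raises pandas.errors.MergeError
# if SKU_NR is not unique in the right-hand frame.
result = Historie.merge(
    per_sku[["SKU_NR", "Classification"]],
    on="SKU_NR",
    validate="many_to_one",
)
print(len(result) == len(Historie))  # True
```

Because the right-hand frame has exactly one row per `SKU_NR`, the merged result keeps the same number of rows as `Historie`.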

0 answers:

No answers yet