Question

I am trying to resolve this error:

ValueError: can not merge DataFrame with instance of type <class 'pandas.core.groupby.DataFrameGroupBy'>

I want to merge two DataFrames created by agg, as described below.

First, I created two sets of grouped data from the main df:

resi_all_nooutliers_bysector = df_resi_rawdata_nooutliers.groupby(['postcode_sector'])

resi_flats_nooutliers_bysector = df_resi_rawdata_nooutliers.loc[df_resi_rawdata_nooutliers['propertytype']=='F'].groupby(['postcode_sector'])

Then I ran the statistics I wanted:

resi_flats_nooutliers_bysector['updatedprice_calculated'].\
agg([np.mean,np.median,np.max,'count'])

resi_all_nooutliers_bysector['updatedprice_calculated'].\
agg([np.mean,np.median,np.max,'count'])

Then I tried to merge them like this:

df_resi_nooutliers_bysector = pd.merge(resi_all_nooutliers_bysector, 
                                       resi_flats_nooutliers_bysector,
                                       on=['postcode_sector'],how='left', 
                                       suffixes=('_allprop', '_flats'))

This raises the ValueError shown above.

Answer 1

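This worked for me: save the agg output into DataFrames, making sure they stay indexed by the original group key (the postcode_sector column). The merge in the question fails because it is given the GroupBy objects themselves rather than the aggregated results.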

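# as_index is a groupby() option, not an agg() argument, so it is dropped here;
# the result is indexed by postcode_sector by default
df1 = resi_flats_nooutliers_bysector['updatedprice_calculated'].\
agg([np.mean,np.median,np.max,'count'])

df2 = resi_all_nooutliers_bysector['updatedprice_calculated'].\
agg([np.mean,np.median,np.max,'count'])
type(resi_flats_nooutliers_bysector)  # still a GroupBy object, not a DataFrame
df1.head(10)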
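To see the difference between the GroupBy object and the aggregated result, here is a minimal sketch on made-up data. The toy DataFrame `df` and its values are assumptions standing in for `df_resi_rawdata_nooutliers`, and the aggregations are passed as strings rather than NumPy functions.

```python
import pandas as pd

# Hypothetical toy data standing in for df_resi_rawdata_nooutliers
df = pd.DataFrame({
    "postcode_sector": ["AB1 1", "AB1 1", "AB1 2", "AB1 2", "AB1 2"],
    "propertytype": ["F", "D", "F", "F", "S"],
    "updatedprice_calculated": [100.0, 300.0, 150.0, 250.0, 400.0],
})

# groupby() alone returns a lazy GroupBy object, which pd.merge cannot accept
grouped = df.groupby("postcode_sector")

# agg() materialises a regular DataFrame indexed by the group key
stats = grouped["updatedprice_calculated"].agg(["mean", "median", "max", "count"])

print(type(grouped).__name__)  # name of the GroupBy class
print(type(stats).__name__)    # name of the aggregated result's class
print(stats.index.name)        # the group key becomes the index
```

Only `stats` is something that merge can work with; `grouped` is what triggered the error in the question.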

Then join on the index:

merge_test = df2.merge(df1, left_index=True, right_index=True,
                       suffixes=('_allprop', '_flats'))
merge_test.head(10)
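Note that the merge above uses the default inner join, while the question asked for how='left'. A self-contained sketch of the left-join variant follows; the toy data and the names `all_stats`, `flat_stats`, and `merged` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data; sector "AB1 3" has no flats
df = pd.DataFrame({
    "postcode_sector": ["AB1 1", "AB1 1", "AB1 2", "AB1 3"],
    "propertytype": ["F", "D", "F", "D"],
    "updatedprice_calculated": [100.0, 300.0, 150.0, 400.0],
})
funcs = ["mean", "median", "max", "count"]
col = "updatedprice_calculated"

# Aggregate first, so both operands of the merge are DataFrames
all_stats = df.groupby("postcode_sector")[col].agg(funcs)
flat_stats = df[df["propertytype"] == "F"].groupby("postcode_sector")[col].agg(funcs)

# Left join on the shared postcode_sector index keeps every sector,
# leaving NaN in the flat columns where a sector has no flats
merged = all_stats.merge(
    flat_stats,
    how="left",
    left_index=True,
    right_index=True,
    suffixes=("_allprop", "_flats"),
)
print(merged)
```

With how='left', sectors that contain no flats are kept with missing values in the `_flats` columns instead of being dropped.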

