Question

How can I turn the following input data (a pandas DataFrame loaded from an Excel file):

ID      Category                    Speaker     Price
334014  Real Estate Perspectives    Tom Smith   100
334014  E&E                         Tom Smith   200
334014  Real Estate Perspectives    Janet Brown 100
334014  E&E                         Janet Brown 200

into this:

ID      Category                    Speaker                 Price
334014  Real Estate Perspectives    Tom Smith, Janet Brown  100
334014  E&E                         Tom Smith, Janet Brown  200

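So I basically want to group by Category and concatenate the Speakers, but not sum the Price.

I tried different approaches with pandas dataframe.groupby() and .agg(), but to no avail. Maybe there is a simpler pure-Python solution?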

Answer 1

There are two possible solutions. Group by multiple columns and aggregate Speaker with join:

dataframe.groupby(['ID', 'Category', 'Price'])['Speaker'].apply(', '.join).reset_index()
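As a runnable illustration of the multi-column approach, here is a minimal sketch that rebuilds the sample data from the question by hand (the question loads it from Excel; the literal `DataFrame` below is only a stand-in). Passing `sort=False` and reordering the columns at the end are my additions to match the desired output layout, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (stand-in for the Excel import)
df = pd.DataFrame({
    "ID": [334014, 334014, 334014, 334014],
    "Category": ["Real Estate Perspectives", "E&E",
                 "Real Estate Perspectives", "E&E"],
    "Speaker": ["Tom Smith", "Tom Smith", "Janet Brown", "Janet Brown"],
    "Price": [100, 200, 100, 200],
})

# Group on every column that must stay unchanged, then join the speakers.
# sort=False keeps the groups in order of first appearance.
result = (
    df.groupby(["ID", "Category", "Price"], sort=False)["Speaker"]
      .apply(", ".join)
      .reset_index()
)

# Restore the column order used in the question
result = result[["ID", "Category", "Speaker", "Price"]]
print(result)
```

Because Price is part of the grouping key, it is never aggregated; rows with the same Category but different prices would simply end up in separate groups.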

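Or group by the Category column only; then every other column has to be aggregated explicitly, for example with first or last:

dataframe.groupby('Category').agg({'Speaker': ', '.join, 'ID': 'first', 'Price': 'first'}).reset_index()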
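The single-key variant can also be sketched with named aggregation, which makes the output column names and their order explicit. This assumes grouping on Category, as the question asks, and again uses a hand-built copy of the sample data; note that `'first'` silently keeps the first ID and Price of each group, so it is only appropriate when those values are constant within a category.

```python
import pandas as pd

# Sample data copied from the question (stand-in for the Excel import)
df = pd.DataFrame({
    "ID": [334014, 334014, 334014, 334014],
    "Category": ["Real Estate Perspectives", "E&E",
                 "Real Estate Perspectives", "E&E"],
    "Speaker": ["Tom Smith", "Tom Smith", "Janet Brown", "Janet Brown"],
    "Price": [100, 200, 100, 200],
})

# Named aggregation: output_column=(input_column, aggregation)
result = (
    df.groupby("Category", sort=False)
      .agg(
          ID=("ID", "first"),               # assumed constant per category
          Speaker=("Speaker", ", ".join),   # concatenate the speaker names
          Price=("Price", "first"),         # assumed constant per category
      )
      .reset_index()
)
print(result)
```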

Answer 2

Try this:

df.groupby(['ID', 'Category'], as_index=False).agg(lambda x: x.iloc[0] if x.dtype == 'int64' else ', '.join(x))
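A slightly more general sketch of the same idea, using a named helper instead of a lambda. The helper name `join_or_first` is hypothetical, and checking with `pd.api.types.is_numeric_dtype` rather than comparing against `'int64'` is my substitution so that float prices are handled too. An aggregation function must return a single value per group, which is why numeric columns are reduced to their first element.

```python
import pandas as pd

# Sample data copied from the question (stand-in for the Excel import)
df = pd.DataFrame({
    "ID": [334014, 334014, 334014, 334014],
    "Category": ["Real Estate Perspectives", "E&E",
                 "Real Estate Perspectives", "E&E"],
    "Speaker": ["Tom Smith", "Tom Smith", "Janet Brown", "Janet Brown"],
    "Price": [100, 200, 100, 200],
})

def join_or_first(col):
    """Hypothetical helper: reduce one column of one group to a scalar."""
    if pd.api.types.is_numeric_dtype(col):
        return col.iloc[0]      # keep the first numeric value, do not sum
    return ", ".join(col)       # concatenate string values

# The helper is called once per non-key column in each group
result = df.groupby(["ID", "Category"], as_index=False, sort=False).agg(join_or_first)
print(result)
```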

Pandas DataFrame: group by one column, but concatenate and aggregate by other columns
