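I have two DataFrames. I want to run a groupby on the second DataFrame and then merge the two on the 'Company Name' column. The problem is that my groupby statement loses the 'Company Name' column.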
import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Location': ['Somewhere', 'Somewhere', 'Somewhere', 'Somewhere', 'Somewhere', 'Somewhere'],
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Sales': [12345, 12345, 12345, 12345, 12345, 12345],
        'Company Type': ['Software', 'Software', 'Software', 'Software', 'Software', 'Software'],
    }
)

df = df.groupby(['Company Name']).sum()
pd.merge(df1, df, how="inner", on="Company Name")
Because df no longer has a 'Company Name' column to join on, the merge fails with an error message.
Answer 0 (score: 0)
Replace this line:
df = df.groupby(['Company Name']).sum()
with:
df = df.groupby('Company Name', as_index=False).sum()
Your code will then work as expected and return:
  Company Name   Location  Sales
0       Google  Somewhere  24690
1       Google  Somewhere  24690
2    Microsoft  Somewhere  24690
3    Microsoft  Somewhere  24690
4       Amazon  Somewhere  24690
5       Amazon  Somewhere  24690
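
For context, by default groupby moves the grouping key into the index of the result, which is why 'Company Name' is no longer an ordinary column after the original line. The following is a minimal sketch of the fix using the data from the question, not the only way to do it. Two things in it are my own additions rather than part of the answer above: it selects 'Sales' explicitly so that the string column 'Company Type' is not aggregated whatever the pandas version, and it shows reset_index() as an alternative to as_index=False. The variable names sales, merged, sales_alt and merged_alt are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
    'Location': ['Somewhere'] * 6,
})
df = pd.DataFrame({
    'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
    'Sales': [12345] * 6,
    'Company Type': ['Software'] * 6,
})

# Default behaviour: the grouping key becomes the index, not a column.
indexed = df.groupby('Company Name')['Sales'].sum()
print(indexed.index.name)  # Company Name

# Option 1: keep the key as a regular column with as_index=False.
sales = df.groupby('Company Name', as_index=False)['Sales'].sum()
merged = pd.merge(df1, sales, how='inner', on='Company Name')

# Option 2: group normally, then turn the index back into a column.
sales_alt = df.groupby('Company Name')['Sales'].sum().reset_index()
merged_alt = pd.merge(df1, sales_alt, how='inner', on='Company Name')

print(merged)
```

Both options give the merge an ordinary 'Company Name' column on each side, so which one to use is mostly a matter of taste.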