Question

I have a data frame that looks like this:

Id     Country     amount
1       AT           10
2       BE           20
3       DE           30
1       AT           10
1       BE           20
3       DK           30

What I want to do is sum amount by Id and Country, so that my df looks like this:

Id     Country     amount    AT_amount   BE_amount    DE_amount    DK_amount
1       AT           10       20          20            0           0
2       BE           20       0           20            0           0
3       DE           30       0           0             30          30
1       AT           10       20          20            0           0
1       BE           20       20          20            0           0
3       DK           30       0           0             30          30

I tried using groupby, but using:

df['AT_amount'] = df.groupby(['Id', 'Country'])['amount'].sum()

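does not work, because I do not get the value on every row with Id == 1, only on the rows where Id == 1 and the country matches, whereas I want it to give me a value regardless of the country.

I could do this first, then set the value to 0 where Country != AT, and then take the maximum per group, but that seems a bit long.

To get these values for all countries it seems I would have to write a loop. Or is there a quick way to create new variables for all the subgroup countries?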

Answer 1

I think you can use pivot_table, add_suffix, and finally merge:

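import pandas as pd

# One row per Id, one column per Country, summing amount.
# fill_value must be the number 0, not the string '0'.
df1 = df.pivot_table(index='Id',
                     columns='Country',
                     values='amount',
                     fill_value=0,
                     aggfunc='sum').add_suffix('_amount').reset_index()

print(df1)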

Country  Id AT_amount BE_amount DE_amount DK_amount
0         1        20        20         0         0
1         2         0        20         0         0
2         3         0         0        30        30

print(pd.merge(df, df1, on='Id', how='left'))

   Id Country  amount AT_amount BE_amount DE_amount DK_amount
0   1      AT      10        20        20         0         0
1   2      BE      20         0        20         0         0
2   3      DE      30         0         0        30        30
3   1      AT      10        20        20         0         0
4   1      BE      20        20        20         0         0
5   3      DK      30         0         0        30        30
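
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question; the names `totals` and `result` are just illustrative, and clearing `columns.name` is an optional cosmetic step that is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 1, 1, 3],
    "Country": ["AT", "BE", "DE", "AT", "BE", "DK"],
    "amount": [10, 20, 30, 10, 20, 30],
})

# Sum amount per (Id, Country), spread countries into columns,
# and fill Id/Country pairs that never occur with 0.
totals = (
    df.pivot_table(index="Id", columns="Country", values="amount",
                   aggfunc="sum", fill_value=0)
      .add_suffix("_amount")
      .reset_index()
)
totals.columns.name = None  # optional: drop the leftover "Country" label

# A left merge keeps every original row and repeats the per-Id totals on it.
result = df.merge(totals, on="Id", how="left")
print(result)
```

The left merge is what broadcasts the per-Id totals back onto each original row, so rows with the same Id end up with identical `*_amount` columns.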

Answer 2

print(df.join(df.pivot_table('amount', 'Id', 'Country', aggfunc='sum', fill_value=0).add_suffix('_amount'), on='Id'))
   Id Country  amount  AT_amount  BE_amount  DE_amount  DK_amount
0   1      AT      10         20         20          0          0
1   2      BE      20          0         20          0          0
2   3      DE      30          0          0         30         30
3   1      AT      10         20         20          0          0
4   1      BE      20         20         20          0          0
5   3      DK      30          0          0         30         30
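
Since the question started from groupby, the same table can probably also be built that way. The following is a sketch of an alternative, not something from the answers above: it assumes the sample data from the question, and `per_country` and `result` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 1, 1, 3],
    "Country": ["AT", "BE", "DE", "AT", "BE", "DK"],
    "amount": [10, 20, 30, 10, 20, 30],
})

# Sum per (Id, Country), then move Country from the index into columns.
# unstack(fill_value=0) puts 0 where an Id has no rows for a country.
per_country = (
    df.groupby(["Id", "Country"])["amount"].sum()
      .unstack(fill_value=0)
      .add_suffix("_amount")
)

# per_country is indexed by Id, so join(on="Id") lines it up with df's Id column.
result = df.join(per_country, on="Id")
print(result)
```

Using join with on='Id' avoids the reset_index step, because the wide table keeps Id as its index.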

Aggregate by group and subgroup