Question

I have some simple data in a DataFrame with three columns [id, country, volume], where the index is 'id'.

I can perform simple operations such as:

df_vol.groupby('country').sum()

This works as expected. When I try to use rank(), it does not work as expected: the result is an empty DataFrame.

df_vol.groupby('country').rank()

The results are inconsistent; in some cases it does work. The following also works as expected:

df_vol.rank()

I would like to get back something like this:

vols = []
for _, df in df_vol.groupby('country'):
    vols.append(df['volume'].rank())
pd.concat(vols)

Any ideas as to why would be much appreciated!

Answer 1

You can select the column with [] so that the function is called only on the volume column:

df_vol.groupby('country')['volume'].rank()

Sample:

import pandas as pd

# Keys are listed in the same order as the columns in the printed output below
df_vol = pd.DataFrame({'country':['en','us','us','en','en'],
                   'id':[1,1,1,2,2],
                   'volume':[10,10,30,20,50]})
print(df_vol)
  country  id  volume
0      en   1      10
1      us   1      10
2      us   1      30
3      en   2      20
4      en   2      50

df_vol['r'] = df_vol.groupby('country')['volume'].rank()
print(df_vol)
  country  id  volume    r
0      en   1      10  1.0
1      us   1      10  1.0
2      us   1      30  2.0
3      en   2      20  2.0
4      en   2      50  3.0
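
As a rough check, the sketch below compares the column-level groupby rank with the loop from the question on the same sample data. The variable names (grouped_rank, loop_rank, same, desc_rank) are only illustrative, and the last line is an optional variation using the method and ascending parameters of rank, not something the question asked for.

```python
import pandas as pd

df_vol = pd.DataFrame({
    'country': ['en', 'us', 'us', 'en', 'en'],
    'id': [1, 1, 1, 2, 2],
    'volume': [10, 10, 30, 20, 50],
})

# Rank 'volume' within each country; the result keeps the original row index.
grouped_rank = df_vol.groupby('country')['volume'].rank()

# The loop from the question, for comparison.
vols = []
for _, df in df_vol.groupby('country'):
    vols.append(df['volume'].rank())
loop_rank = pd.concat(vols)

# concat returns the rows grouped by country, so sort by index before comparing.
same = (grouped_rank.sort_index() == loop_rank.sort_index()).all()

# Optional variation: rank from largest to smallest within each country,
# with ties (if any) sharing the lowest rank.
desc_rank = df_vol.groupby('country')['volume'].rank(method='min', ascending=False)
```

Because the grouped rank is aligned to the original index, it can be assigned straight back as a new column, which the loop-and-concat version only achieves through index alignment.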
