Question

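I have a dataset that uses groupby to sum pmt_units by Year, Month, msa and company.

For each company in each month, I want the top 10 MSAs with the most pmt_units.

This is the code used to get the full ranked list: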

#For each month for each builder, provide the pmt_units for the top ten cities 
#Group by Month and MSA
SFU_grouped = SFU_2.groupby(['uyear','umonth','msa','stock_ticker']).agg({'pmt_units': 'sum'}).reset_index()

Sort chronologically, then by company, and order the MSAs by pmt_units from largest to smallest. Then add a column that ranks each MSA by pmt_units within each company.

SFU_ordered=SFU_grouped.sort_values(['uyear','umonth','company','pmt_units'],
ascending =[True, True, True, False])
SFU_ordered['city_rank']=SFU_ordered.groupby(['company','umonth','uyear'])['pmt_units'].rank(method = 'dense', ascending = False).astype(int)

I have tried

SFU_ordered.groupby('company').apply(lambda x: x.nlargest(10,'pmt_units')).reset_index(drop=True)

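But this gives me, for each company, the ten months with the highest permit counts of all time.

How can I get the top ten ranks for each company by month?

Edit: I have clarified the role of MSA here. This is a sample table: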

[Image: sample table]

Edit: I solved my problem with the following:

SFU_year_rank = SFU_year_ordered.set_index('msa').groupby('company')['pmt_units'].nlargest(10).reset_index()

Answer 1

Try this:

SFU_ordered['city_rank']=SFU_ordered.groupby(['umonth','company','uyear'])['pmt_units'].rank(method = 'dense', ascending = False).astype(int)
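
The rank column on its own only labels the rows; to keep just the top entries, one would still need to filter on it. A minimal sketch, assuming the column names from the question and a small made-up toy frame (the values are illustrative only, and top_n is set to 2 to keep the example short):

```python
import pandas as pd

# Toy data, made up for illustration; column names follow the question.
SFU_ordered = pd.DataFrame({
    'uyear':     [2020] * 6,
    'umonth':    [1, 1, 1, 1, 1, 1],
    'company':   ['A', 'A', 'A', 'B', 'B', 'B'],
    'msa':       ['X', 'Y', 'Z', 'X', 'Y', 'Z'],
    'pmt_units': [30, 10, 20, 5, 15, 25],
})

top_n = 2  # the question asks for 10

# Rank MSAs by pmt_units within each (year, month, company) group.
SFU_ordered['city_rank'] = (
    SFU_ordered
    .groupby(['uyear', 'umonth', 'company'])['pmt_units']
    .rank(method='dense', ascending=False)
    .astype(int)
)

# Keep only the rows whose rank falls within the top_n.
top_ranked = SFU_ordered[SFU_ordered['city_rank'] <= top_n]
```

Note that with method='dense', tied values share a rank, so a group can return more than top_n rows; method='first' breaks ties by order of appearance if a hard cap is wanted.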

Answer 2

Have you tried

    top_n = 10
    top_ten_cities = (SFU_ordered
        .groupby(['umonth','company','uyear'])['pmt_units']
        .apply(lambda x: x.sort_values(ascending=False).head(top_n)))

head(n) returns the first n rows of the series.
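
Applying head to the pmt_units series alone leaves out the other columns, such as msa. One possible alternative, sketched here under the assumption that the frame has the column names used in the question (the toy data is made up, and top_n is 2 for brevity), is to sort the whole frame first and then take the first top_n rows of each group, so every column is kept:

```python
import pandas as pd

# Toy data, made up for illustration; column names follow the question.
toy = pd.DataFrame({
    'uyear':     [2020] * 6,
    'umonth':    [1, 1, 1, 2, 2, 2],
    'company':   ['A'] * 6,
    'msa':       ['X', 'Y', 'Z', 'X', 'Y', 'Z'],
    'pmt_units': [30, 10, 20, 5, 50, 40],
})

top_n = 2  # the question asks for 10

top_cities = (
    toy
    # Largest pmt_units first within each year / month / company.
    .sort_values(['uyear', 'umonth', 'company', 'pmt_units'],
                 ascending=[True, True, True, False])
    # GroupBy.head keeps the first top_n rows of every group, all columns intact.
    .groupby(['uyear', 'umonth', 'company'])
    .head(top_n)
    .reset_index(drop=True)
)
```

Unlike the rank-based filter, this always returns at most top_n rows per group, with ties cut off by sort order.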

Top 10 ranking for each company in each month