Python: new DataFrame from group_by, with new column names, in one line

Time: 2018-02-07 13:06:15

Tags: python pandas dataframe group-by

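I'm new to Python and would like to write more efficient code.

Currently I have the following code, which counts the number of shops per city, counting only shops that are still open: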

top_20_cities = pd.DataFrame(data = shops[shops.Open == 'Open'].groupby('city').size().sort_values(ascending=False).head(20)).reset_index()
top_20_cities.columns = ['City', 'Count']

Is it possible to combine these two lines into one? I tried this, but got an error:

top_20_cities = pd.DataFrame(data = shops[shops.Open == 'Open'].groupby('city').size().sort_values(ascending=False).head(20), columns = ['City', 'Count']).reset_index()

THX

1 answer:

Answer 0 (score: 1)

Use rename_axis to name the index and reset_index(name=...) to name the count column:

top_20_cities = (shops[shops.Open == 'Open']
                   .groupby('city')
                   .size()
                   .sort_values(ascending=False)
                   .head(20)
                   .rename_axis('City')
                   .reset_index(name='Count'))
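
To try the chain without the real data, here is a minimal sketch on a small made-up shops frame. The sample rows are an assumption for illustration; only the column names city and Open come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
shops = pd.DataFrame({
    "city": ["Berlin", "Berlin", "Berlin", "Paris", "Paris", "Rome", "Rome"],
    "Open": ["Open", "Open", "Open", "Open", "Open", "Open", "Closed"],
})

top_20_cities = (shops[shops.Open == "Open"]     # keep only open shops
                 .groupby("city")
                 .size()                          # Series: index=city, values=counts
                 .sort_values(ascending=False)
                 .head(20)
                 .rename_axis("City")             # names the index, which becomes a column
                 .reset_index(name="Count"))      # names the values column

print(top_20_cities)
```

rename_axis sets the name of the index before reset_index turns it into a column, and the name argument of reset_index labels the column of counts, so no separate assignment to .columns is needed.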

Another solution uses value_counts, which sorts in descending order by default:

top_20_cities = (shops.loc[shops.Open == 'Open', 'city']
                      .value_counts()
                      .head(20)
                      .rename_axis('City')
                      .reset_index(name='Count'))
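
As a quick check, the sketch below runs both variants on made-up data and compares the results. The sample rows are hypothetical and were chosen so that no two cities have the same count, since the order of tied cities may differ between the two approaches.

```python
import pandas as pd

# Hypothetical sample data with distinct counts per city (no ties).
shops = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Oslo", "Lima", "Lima", "Kyiv", "Kyiv", "Kyiv"],
    "Open": ["Open", "Open", "Open", "Open", "Open", "Open", "Closed", "Closed"],
})

# Variant 1: groupby + size, sorted explicitly.
by_groupby = (shops[shops.Open == "Open"]
              .groupby("city")
              .size()
              .sort_values(ascending=False)
              .head(20)
              .rename_axis("City")
              .reset_index(name="Count"))

# Variant 2: value_counts on the filtered column, already sorted descending.
by_value_counts = (shops.loc[shops.Open == "Open", "city"]
                   .value_counts()
                   .head(20)
                   .rename_axis("City")
                   .reset_index(name="Count"))

# Compare the plain values of both results.
same_cities = by_groupby["City"].tolist() == by_value_counts["City"].tolist()
same_counts = by_groupby["Count"].tolist() == by_value_counts["Count"].tolist()
print(same_cities and same_counts)
```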