I have a DataFrame grouped by year and genre:
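# Select columns with a list (double brackets) and count the rows in each group
new_df = df1.groupby(['release_year', 'genres'])[['release_year', 'genres', 'budget_adj']].agg(['count'])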
new_df.info()
My current table looks like this:
                        release_year genres budget_adj
                               count  count      count
release_year genres
1960         Action                7      7          7
             Adventure             5      5          5
             Comedy                7      7          7
             Crime                 2      2          2
1961         Action                7      7          7
             Adventure             6      6          6
             Animation             1      1          1
             Comedy                8      8          8
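and so on.
I want to find the genre with the most films in each year. How can I write a pandas query for this?
Answer 0 (score: 0)
You can do it like this: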
df.loc[df.loc[:,('genres','count')].groupby(level=0)
.rank(ascending=False, method='dense')
.loc[lambda x: x==1].index]
Output:
                     release_year genres budget_adj
                            count  count      count
release_year genres
1960         Action             7      7          7
             Comedy             7      7          7
1961         Comedy             8      8          8
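For anyone who wants to try the ranking idea end to end, here is a minimal sketch. It assumes a raw frame named df1 with one row per film and genre; the sample rows are made up so that they reproduce the counts in the question, and the budget_adj values are placeholders. It ranks a plain Series of counts rather than the multi-level column frame used in this answer, so treat it as a variation on the same technique.

```python
import pandas as pd

# Hypothetical sample data, sized to match the counts shown in the question.
counts = {
    (1960, "Action"): 7, (1960, "Adventure"): 5,
    (1960, "Comedy"): 7, (1960, "Crime"): 2,
    (1961, "Action"): 7, (1961, "Adventure"): 6,
    (1961, "Animation"): 1, (1961, "Comedy"): 8,
}
rows = [
    {"release_year": year, "genres": genre, "budget_adj": 1.0}  # placeholder budget
    for (year, genre), n in counts.items()
    for _ in range(n)
]
df1 = pd.DataFrame(rows)

# Count films per (year, genre); size() returns a Series with a MultiIndex.
genre_counts = df1.groupby(["release_year", "genres"]).size().rename("count")

# Dense-rank the counts within each year, highest first.
ranks = genre_counts.groupby(level="release_year").rank(ascending=False, method="dense")

# Keep every genre that holds rank 1 in its year, so ties are preserved.
top_genres = genre_counts[ranks == 1]
print(top_genres)
```

Because method='dense' gives tied counts the same rank, both Action and Comedy are kept for 1960.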
Answer 1 (score: 0)
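# Flatten the two-level column names, e.g. ('genres', 'count') -> 'genres_count'
df.columns = df.columns.map('_'.join)
# Group by year and keep the rows whose genre count equals that year's maximum
df.reset_index().groupby(['release_year']).apply(lambda x: x[x.genres_count == x.genres_count.max()])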
Output:
                release_year  genres  release_year_count  genres_count  budget_adj_count
release_year
1960         0          1960  Action                   7             7                 7
             2          1960  Comedy                   7             7                 7
1961         7          1961   Crime                   8             8                 8
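The same per-year filter can also be sketched without apply, using transform('max'). This swaps in a different technique from the one in this answer. The frame name table is made up for the example, and its counts are copied from the table in the question; the column names follow the flattened names above.

```python
import pandas as pd

# Counts copied from the table in the question (already flattened).
table = pd.DataFrame(
    {
        "release_year": [1960, 1960, 1960, 1960, 1961, 1961, 1961, 1961],
        "genres": ["Action", "Adventure", "Comedy", "Crime",
                   "Action", "Adventure", "Animation", "Comedy"],
        "genres_count": [7, 5, 7, 2, 7, 6, 1, 8],
    }
)

# Broadcast each year's maximum count back onto every row of that year.
yearly_max = table.groupby("release_year")["genres_count"].transform("max")

# Keep every row that reaches its year's maximum, so ties are preserved.
top_with_ties = table[table["genres_count"] == yearly_max]

# If one genre per year is enough, idxmax returns the first maximum per year.
top_single = table.loc[table.groupby("release_year")["genres_count"].idxmax()]

print(top_with_ties)
print(top_single)
```

The transform version keeps the original flat index, which tends to be easier to work with than the nested index that apply produces. Use the idxmax variant only if dropping tied genres is acceptable.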