I have a DataFrame with the following columns:
Date | Origin | Destination | Service | Demand
April 4 | Chicago | Toronto | Ground | 250
April 4 | Chicago | Tampa | Ground | 250
April 5 | Chicago | Orlando | Air | 100
April 5 | Chicago | Seattle | Air | 400
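I want to write a Python function, or use a pandas function, to get a column that gives each row's Demand as a fraction of the total Demand for its "Date" and "Origin".
So, if I run the following groupby: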
df.groupby(['Date','Origin'])['Demand'].sum().reset_index()
it gives me the following:
Date | Origin | Demand
April 4 | Chicago | 500
April 5 | Chicago | 500
The output I want is:
Date | Origin | Destination | Service | Demand | Percentage
April 4 | Chicago | Toronto | Ground | 250 | 0.5
April 4 | Chicago | Tampa | Ground | 250 | 0.5
April 5 | Chicago | Orlando | Air | 100 | 0.2
April 5 | Chicago | Seattle | Air | 400 | 0.8
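How can I write something that gives me the Percentage column?
Answer 0 (score: 1)
Use transform. It broadcasts each group's sum back onto the original rows, so you can divide Demand by it directly: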
df['Pct'] = df['Demand'] / df.groupby(['Date', 'Origin'])['Demand'].transform('sum')
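For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same transform idea. The intermediate name group_total is only illustrative, and the column is named Percentage here (rather than Pct) to match the desired output in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["April 4", "April 4", "April 5", "April 5"],
    "Origin": ["Chicago", "Chicago", "Chicago", "Chicago"],
    "Destination": ["Toronto", "Tampa", "Orlando", "Seattle"],
    "Service": ["Ground", "Ground", "Air", "Air"],
    "Demand": [250, 250, 100, 400],
})

# transform('sum') returns a Series with the same index and length as df,
# where every row holds the total Demand of its (Date, Origin) group.
# group_total is an illustrative name, not part of the original answer.
group_total = df.groupby(["Date", "Origin"])["Demand"].transform("sum")

# Element-wise division gives each row's share of its group total.
df["Percentage"] = df["Demand"] / group_total

print(df)
```

Unlike sum() followed by reset_index(), which collapses the frame to one row per group, transform keeps the original row count, so no merge back onto df is needed.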