Given data on dishes, the restaurant locations, and their sales:
Now, I want to do the following:
>>> import pandas
>>> df1 = pandas.DataFrame({"dish" : ["fish", "chicken", "fish", "chicken", "chicken"],
... "location" : ["central", "central", "north", "north", "south"],
... "sales" : [1,3,5,2,4]})
>>> df1
      dish location  sales
0     fish  central      1
1  chicken  central      3
2     fish    north      5
3  chicken    north      2
4  chicken    south      4
>>> df2 = df1[["dish", "location"]].copy()  # copy so the new column below does not trigger SettingWithCopyWarning
>>> df2["sales_contrib"] = 0.0
>>> df2
      dish location  sales_contrib
0     fish  central            0.0
1  chicken  central            0.0
2     fish    north            0.0
3  chicken    north            0.0
4  chicken    south            0.0
The desired df2 is:
      dish location  sales_contrib
0     fish  central          16.67
1  chicken  central          33.33
2     fish    north          83.33
3  chicken    north          22.22
4  chicken    south          44.45
I tried to compute this, but could not get the result.
Answer 0 (score: 4)
You can use the power of pandas to do this:
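# Sum only the sales column; summing the whole frame would also try to aggregate the text "location" column
dish_totals = df1.groupby(by="dish")["sales"].sum()
# Each row's sales as a percentage of the total sales for that row's dish
df2["sales_contrib"] = df1.apply((lambda row: 100*row["sales"]/dish_totals.loc[row["dish"]]), axis=1)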
print(df2)
Output:
      dish location  sales_contrib
0     fish  central      16.666667
1  chicken  central      33.333333
2     fish    north      83.333333
3  chicken    north      22.222222
4  chicken    south      44.444444
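As a possible vectorized alternative to the row-wise apply, here is a minimal sketch that uses groupby(...).transform("sum"). It rebuilds df1 from the question so that it runs on its own; the name dish_total and the rounding to two decimals are my own choices, not part of the answer above.

```python
import pandas as pd

# Same data as df1 in the question
df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

# transform("sum") returns a Series aligned with df1's rows,
# holding the total sales of each row's dish
dish_total = df1.groupby("dish")["sales"].transform("sum")

# Copy the two key columns, then add the percentage contribution
df2 = df1[["dish", "location"]].copy()
df2["sales_contrib"] = (100 * df1["sales"] / dish_total).round(2)
print(df2)
```

Because the division works on whole columns, this avoids calling a Python lambda once per row, which tends to matter on larger frames.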
Answer 1 (score: 3)
Try
(df1.groupby(['dish', 'location']).sales.sum().div(df1.groupby('dish').sales.sum()) * 100).round(2).reset_index()
      dish location  sales
0  chicken  central  33.33
1  chicken    north  22.22
2  chicken    south  44.44
3     fish  central  16.67
4     fish    north  83.33
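Note that the result above comes back sorted by dish rather than in the original row order. If the original order and the sales_contrib column name from the question are wanted, one option is to merge the percentages back onto df1. This is only a sketch: the names per_location, per_dish, and contrib are hypothetical, and passing level="dish" explicitly is my addition to make the alignment visible.

```python
import pandas as pd

# Same data as df1 in the question
df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

per_location = df1.groupby(["dish", "location"])["sales"].sum()
per_dish = df1.groupby("dish")["sales"].sum()

# level="dish" broadcasts each dish total across that dish's locations
contrib = (
    per_location.div(per_dish, level="dish")
    .mul(100)
    .round(2)
    .rename("sales_contrib")
)

# A left merge keeps the rows in df1's original order
df2 = df1[["dish", "location"]].merge(
    contrib.reset_index(), on=["dish", "location"], how="left"
)
print(df2)
```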