Question

Given data about dishes, the restaurant locations, and their sales:

>>> import pandas
>>> df1 = pandas.DataFrame({"dish" : ["fish", "chicken", "fish", "chicken", "chicken"],
...                         "location" : ["central", "central", "north", "north", "south"],
...                         "sales" : [1,3,5,2,4]})
>>> df1
      dish location  sales
0     fish  central      1
1  chicken  central      3
2     fish    north      5
3  chicken    north      2
4  chicken    south      4
>>> df2 = df1[["dish", "location"]]
>>> df2["sales_contrib"] = 0.0
>>> df2
      dish location  sales_contrib
0     fish  central            0.0
1  chicken  central            0.0
2     fish    north            0.0
3  chicken    north            0.0
4  chicken    south            0.0

Now, I want to do the following:

1. Iterate over df2
2. Calculate the sales contribution of that dish's location. So, for fish, central contributes 1/6 of the total sales (16.67%) and north contributes the remaining 83.33%.

The resulting df is

      dish location  sales_contrib
0     fish  central          16.67
1  chicken  central          33.33
2     fish    north          83.33
3  chicken    north          22.22
4  chicken    south          44.45

I tried to do this but could not get the result.

Answer 1

You can use the power of pandas to do this...

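# Total sales per dish; select "sales" so only the numeric column is summed
dish_totals = df1.groupby(by="dish")["sales"].sum()
# Each row's sales as a percentage of its dish's total
df2["sales_contrib"] = df1.apply(lambda row: 100 * row["sales"] / dish_totals.loc[row["dish"]], axis=1)
print(df2)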

Output:

      dish location  sales_contrib
0     fish  central      16.666667
1  chicken  central      33.333333
2     fish    north      83.333333
3  chicken    north      22.222222
4  chicken    south      44.444444
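If the row-wise apply gets slow on a larger frame, a vectorized variant might look like the sketch below. It assumes the same df1 as in the question and uses groupby(...).transform("sum") to broadcast each dish's total back onto its rows. The name dish_total and the .copy() call are my own additions, not part of the original code.

```python
import pandas as pd

df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

# Total sales per dish, repeated on every row of df1 (same index as df1).
dish_total = df1.groupby("dish")["sales"].transform("sum")

# .copy() so that df2 is an independent frame rather than a slice of df1.
df2 = df1[["dish", "location"]].copy()

# Each row's sales as a percentage of its dish's total, rounded to 2 places.
df2["sales_contrib"] = (100 * df1["sales"] / dish_total).round(2)
print(df2)
```

Because transform returns a Series aligned with df1's index, the row order of the question is preserved without any loop.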

Answer 2

Try

(df1.groupby(['dish', 'location']).sales.sum().div(df1.groupby('dish').sales.sum()) * 100).round(2).reset_index()

    dish    location    sales
0   chicken central     33.33
1   chicken north       22.22
2   chicken south       44.44
3   fish    central     16.67
4   fish    north       83.33
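Note that this result is sorted by dish and the column is still called sales. If you need it in the original row order with a sales_contrib column, one possible sketch (assuming the df1 from the question) is to rename the Series and left-merge it back onto the key columns. The name contrib and the explicit level="dish" argument are my additions for clarity.

```python
import pandas as pd

df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

contrib = (
    df1.groupby(["dish", "location"])["sales"].sum()
    # Divide by the per-dish totals, matching on the "dish" index level.
    .div(df1.groupby("dish")["sales"].sum(), level="dish")
    .mul(100)
    .round(2)
    .rename("sales_contrib")
    .reset_index()
)

# A left merge keeps the row order of the left frame (df1's order).
df2 = df1[["dish", "location"]].merge(contrib, on=["dish", "location"], how="left")
print(df2)
```

This also sums duplicate (dish, location) pairs before dividing, which the row-wise approach does not do.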

Updating the values of a DataFrame
