Updating the values of a DataFrame

Date: 2017-12-14 16:38:59

Tags: python pandas dataframe

Given data on dishes, the locations of the restaurants, and their sales:

    >>> import pandas
    >>> df1 = pandas.DataFrame({"dish" : ["fish", "chicken", "fish", "chicken", "chicken"],
    ...                         "location" : ["central", "central", "north", "north", "south"],
    ...                         "sales" : [1,3,5,2,4]})
    >>> df1
          dish location  sales
    0     fish  central      1
    1  chicken  central      3
    2     fish    north      5
    3  chicken    north      2
    4  chicken    south      4
    >>> df2 = df1[["dish", "location"]].copy()  # copy avoids SettingWithCopyWarning
    >>> df2["sales_contrib"] = 0.0
    >>> df2
          dish location  sales_contrib
    0     fish  central            0.0
    1  chicken  central            0.0
    2     fish    north            0.0
    3  chicken    north            0.0
    4  chicken    south            0.0

Now, I want to do the following:

  1. Iterate over each row of df2.
  2. Compute the share of that dish's sales contributed by that location. So for fish, central contributes 1/6 (16.67%) of total sales and north contributes the remaining 83.33%.
  3. The resulting df would be:

          dish location  sales_contrib
    0     fish  central          16.67
    1  chicken  central          33.33
    2     fish    north          83.33
    3  chicken    north          22.22
    4  chicken    south          44.44

I tried to do this myself but could not get the result.

2 answers:

Answer 0 (score: 4)

You can use the power of pandas to do this:

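# Total sales per dish, as a Series indexed by dish
dish_totals = df1.groupby(by="dish")["sales"].sum()
# For each row, express its sales as a percentage of that dish's total
df2["sales_contrib"] = df1.apply(lambda row: 100 * row["sales"] / dish_totals.loc[row["dish"]], axis=1)
print(df2)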

Output:

      dish location  sales_contrib
0     fish  central      16.666667
1  chicken  central      33.333333
2     fish    north      83.333333
3  chicken    north      22.222222
4  chicken    south      44.444444
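
The same percentages can also be computed without a row-wise `apply`. The following is a minimal sketch of that alternative, not the answer author's code: it assumes the `df1` from the question and uses `groupby(...).transform("sum")` to broadcast each dish's total back onto its rows. Rounding to two decimals is an added assumption, chosen to match the format in the question.

```python
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

# transform("sum") returns one value per original row:
# the total sales of the dish that row belongs to
dish_totals = df1.groupby("dish")["sales"].transform("sum")

# Work on a copy so the new column is not written to a view of df1
df2 = df1[["dish", "location"]].copy()
df2["sales_contrib"] = (100 * df1["sales"] / dish_totals).round(2)
print(df2)
```

Because `transform` keeps the original index, the result lines up with `df1` row by row and the original row order is preserved.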

Answer 1 (score: 3)

Try:

(df1.groupby(['dish', 'location']).sales.sum().div(df1.groupby('dish').sales.sum()) * 100).round(2).reset_index()

      dish location  sales
0  chicken  central  33.33
1  chicken    north  22.22
2  chicken    south  44.44
3     fish  central  16.67
4     fish    north  83.33
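
The one-liner relies on pandas aligning the per-dish totals with the `dish` level of the grouped index. A hedged sketch of the same idea, split into steps and with that alignment made explicit through the `level` argument of `Series.div`, is given here. The intermediate names `by_dish_location` and `by_dish`, and renaming the result column to `sales_contrib`, are illustrative choices rather than part of the answer.

```python
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

# Sales per (dish, location) pair, and total sales per dish
by_dish_location = df1.groupby(["dish", "location"])["sales"].sum()
by_dish = df1.groupby("dish")["sales"].sum()

result = (
    by_dish_location.div(by_dish, level="dish")  # divide by the matching dish total
    .mul(100)
    .round(2)
    .rename("sales_contrib")  # name the column after what it holds
    .reset_index()
)
print(result)
```

Note that `groupby` sorts the group keys, so the rows come back ordered by dish and location rather than in the original order of `df1`.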