Question

I have a dictionary of values:

{'Spanish Omlette': -0.20000000000000284,
 'Crumbed Chicken Salad': -1.2999999999999972,
 'Chocolate Bomb': 0.0,
 'Seed Nut Muesli': -3.8999999999999915,
 'Fruit': -1.2999999999999972,
 'Frikerdels Salad': -1.2000000000000028,
 'Seed Nut Cheese Biscuits': 0.4000000000000057,
 'Chorizo Pasta': -2.0,
 'No carbs Ice Cream': 0.4000000000000057,
 'Veg Stew': 0.4000000000000057,
 'Bulgar spinach Salad': 0.10000000000000853,
 'Mango Cheese': 0.10000000000000853,
 'Crumbed Calamari chips': 0.10000000000000853,
 'Slaw Salad': 0.20000000000000284,
 'Mango': -1.2000000000000028,
 'Rice & Fish': 0.20000000000000284,
 'Almonds Cheese': -0.09999999999999432,
 'Nectarine': -1.7000000000000028,
 'Banana Cheese': 0.7000000000000028,
 'Mediteranean Salad': 0.7000000000000028,
 'Almonds': -4.099999999999994}

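I am trying to use Pandas to get the sum of the values for each food item in the dictionary:

import pandas as pd

fooddata = (pd.DataFrame(list(foodWeight.items()), columns=['food', 'weight'])
              .groupby('food')['weight']
              .agg(['sum'])
              .sort_values(by='sum', ascending=False))

The code above gives the correct output: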

                           sum
food                          
Banana Cheese              0.7
Mediteranean Salad         0.7
Seed Nut Cheese Biscuits   0.4
Veg Stew                   0.4
No carbs Ice Cream         0.4
Slaw Salad                 0.2
Rice & Fish                0.2
Almonds Mango              0.1
Bulgar spinach Salad       0.1
Crumbed Calamari chips     0.1
Frikkadels Salad           0.1
Mango Cheese               0.1
Chocolate Bomb             0.0
Burrito Salad              0.0
Fried Eggs Cheese Avocado  0.0
Burger and Chips          -0.1
Traditional Breakfast     -0.1
Almonds Cheese            -0.1

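However, I need the output as 2 columns, not in the form Pandas gives me.

How can I convert the output into a format I can plot, i.e. with the label and the value as separate values?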

Answer 1

You can pass as_index=False to groupby and aggregate with sum:

fooddata = pd.DataFrame(list(foodWeight.items()), columns=['food','weight'])

print(fooddata.groupby('food', as_index=False)['weight']
              .sum()
              .sort_values(by='weight', ascending=False))
                        food  weight
2              Banana Cheese     0.7
12        Mediteranean Salad     0.7
20                  Veg Stew     0.4
14        No carbs Ice Cream     0.4
16  Seed Nut Cheese Biscuits     0.4
18                Slaw Salad     0.2
15               Rice & Fish     0.2
3       Bulgar spinach Salad     0.1
6     Crumbed Calamari chips     0.1
11              Mango Cheese     0.1
4             Chocolate Bomb     0.0
1             Almonds Cheese    -0.1
19           Spanish Omlette    -0.2
10                     Mango    -1.2
8           Frikerdels Salad    -1.2
9                      Fruit    -1.3
7      Crumbed Chicken Salad    -1.3
13                 Nectarine    -1.7
5              Chorizo Pasta    -2.0
17           Seed Nut Muesli    -3.9
0                    Almonds    -4.1

Another solution is to add reset_index:

print(fooddata.groupby('food')['weight']
              .sum()
              .sort_values(ascending=False)
              .reset_index(name='sum'))
                        food  sum
0              Banana Cheese  0.7
1         Mediteranean Salad  0.7
2                   Veg Stew  0.4
3   Seed Nut Cheese Biscuits  0.4
4         No carbs Ice Cream  0.4
5                 Slaw Salad  0.2
6                Rice & Fish  0.2
7     Crumbed Calamari chips  0.1
8               Mango Cheese  0.1
9       Bulgar spinach Salad  0.1
10            Chocolate Bomb  0.0
11            Almonds Cheese -0.1
12           Spanish Omlette -0.2
13                     Mango -1.2
14          Frikerdels Salad -1.2
15     Crumbed Chicken Salad -1.3
16                     Fruit -1.3
17                 Nectarine -1.7
18             Chorizo Pasta -2.0
19           Seed Nut Muesli -3.9
20                   Almonds -4.1

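For plotting it is better not to reset the index, because the index values then form the x axis. Use plot:

fooddata.groupby('food')['weight'].sum().sort_values(ascending=False).plot()

Or, if you need a horizontal bar plot, use plot.barh:

fooddata.groupby('food')['weight'].sum().sort_values(ascending=False).plot.barh()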
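Since the question asks for the label and the value as separate values, here is a minimal sketch of pulling both out of the summed Series. The three-item foodWeight dictionary is a shortened, rounded stand-in for the question's data, and the names totals, labels and values are only illustrative.

```python
import pandas as pd

# Hypothetical subset of the question's data (values rounded for brevity).
foodWeight = {"Banana Cheese": 0.7, "Veg Stew": 0.4, "Almonds": -4.1}

totals = (pd.DataFrame(list(foodWeight.items()), columns=["food", "weight"])
            .groupby("food")["weight"]
            .sum()
            .sort_values(ascending=False))

# The index holds the labels; the Series values hold the sums.
labels = totals.index.tolist()
values = totals.tolist()

print(labels)
print(values)
```

The two lists can then be handed to any plotting call that expects separate sequences for labels and values.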

Answer 2

Set as_index=False when calling groupby:

fooddata = pd.DataFrame(list(foodWeight.items()), columns=['food','weight']).groupby('food', as_index=False).agg({"weight": "sum"}).sort_values(by='weight', ascending=False)
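If the summed column should carry its own name (such as the 'sum' header in the question's output), named aggregation is one possible variation on the dict form above. This is only a sketch: it assumes pandas 0.25 or later, and the rows list and the column name total are made up for illustration.

```python
import pandas as pd

# Hypothetical rows; 'Fruit' appears twice so there is something to add up.
rows = [("Fruit", 1.0), ("Veg Stew", 0.5), ("Fruit", 2.0)]
df = pd.DataFrame(rows, columns=["food", "weight"])

# Named aggregation: output_column=(input_column, function).
result = (df.groupby("food", as_index=False)
            .agg(total=("weight", "sum"))
            .sort_values(by="total", ascending=False))

print(result)
```

The result still has two ordinary columns, here called food and total, so no reset_index call is needed.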

Answer 3

After grouping, you need to either reset the index or pass as_index=False when calling groupby. To paraphrase this post: by default, aggregation functions do not return the groups you are aggregating over if they are named columns. Instead, the grouped columns become the index of the returned object. Passing as_index=False to groupby, or calling reset_index afterwards, returns the groups you aggregated over as named columns.
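The difference between the default behaviour and the two fixes can be sketched side by side. A dictionary has unique keys, so in the question every food forms a group of one; the records list below is a hypothetical sample that repeats a key to make the summing visible.

```python
import pandas as pd

# Hypothetical sample with a repeated key ('Mango').
records = [("Mango", 1.5), ("Almonds", -4.0), ("Mango", 0.5)]
df = pd.DataFrame(records, columns=["food", "weight"])

# Default: the grouping column becomes the index of the result.
indexed = df.groupby("food")["weight"].sum()

# as_index=False: 'food' stays an ordinary column.
flat_a = df.groupby("food", as_index=False)["weight"].sum()

# reset_index() after aggregating gives the same two-column layout.
flat_b = df.groupby("food")["weight"].sum().reset_index()

print(indexed.index.name)    # food
print(list(flat_a.columns))  # ['food', 'weight']
print(list(flat_b.columns))  # ['food', 'weight']
```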

See below for my attempt to turn your result into a meaningful chart:

[code not preserved]

This results in:

[figure not preserved]

Pandas DataFrame groupby statement output as 2 columns
