Combine all other rows of a DataFrame into a single row

Time: 2019-05-22 11:10:14

Tags: python pandas dataframe

I have a dataset

      Item Type  market_share
Office Supplies            10
      Baby Food            20
     Vegetables            10
           Meat            30
  Personal Care            10
      Household            20
   Household            20

I want to aggregate all rows except the "Baby Food" row, so that my dataset looks like

Item Type  market_share
   Others            80
Baby Food            20

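How can I do this? Basically, I want to combine all the other rows, sum their values, and put the total in an "Others" row.

4 answers:

Answer 0: (score: 5)

You can use: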

df.groupby(df['Item Type'].eq('Baby Food').map({True:'Baby Food',False:'Others'})).sum()

            market_share
Item Type              
Baby Food            20
Others               80
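
For reference, here is a self-contained sketch of the same idea that also returns the result in the layout the question asks for ("Others" first, plain columns). The sample frame is rebuilt from the question; the names `labels` and `result` are only illustrative, and `Series.where` is swapped in for the `eq(...).map(...)` step above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item Type": ["Office Supplies", "Baby Food", "Vegetables",
                  "Meat", "Personal Care", "Household"],
    "market_share": [10, 20, 10, 30, 10, 20],
})

# Keep 'Baby Food' as is and relabel every other row as 'Others'.
labels = df["Item Type"].where(df["Item Type"].eq("Baby Food"), "Others")

# Sum market_share per label. sort=False keeps the order in which the
# labels first appear, so 'Others' (row 0) comes before 'Baby Food'.
result = (
    df.groupby(labels, sort=False)["market_share"]
      .sum()
      .reset_index()
)
print(result)
```

Dropping `sort=False` gives the alphabetically sorted order shown in the output above instead.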

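Answer 1: (score: 2)

Create an array with np.where based on a condition, or a Series with Series.map (unmatched values become NaN, which fillna replaces with 'Others'), then group by it and aggregate with sum: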

import numpy as np

s = np.where(df['Item Type'] == 'Baby Food', 'Baby Food', 'Others')
print (s)
['Others' 'Baby Food' 'Others' 'Others' 'Others' 'Others']

s = df['Item Type'].map({'Baby Food':'Baby Food'}).fillna('Others')
print (s)
0       Others
1    Baby Food
2       Others
3       Others
4       Others
5       Others
Name: Item Type, dtype: object

df = df.groupby(s)['market_share'].sum().rename_axis('Item Type').reset_index()

print (df)
   Item Type  market_share
0  Baby Food            20
1     Others            80

Answer 2: (score: 0)

Use np.where:

df['market_share_2'] = np.where(df['Item Type'].values=='Baby Food', 'Baby Food', 'Others')

Output

         Item Type  market_share market_share_2
0  Office Supplies            10         Others
1        Baby Food            20      Baby Food
2       Vegetables            10         Others
3             Meat            30         Others
4    Personal Care            10         Others
5        Household            20         Others

Then use value_counts():

df['market_share_2'].value_counts()

Others       5
Baby Food    1
Name: market_share_2, dtype: int64

TL;DR:

pd.Series(np.where(df['Item Type'].values=='Baby Food', 'Baby Food', 'Others')).value_counts()
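
Note that `value_counts()` reports how many rows carry each label (5 and 1), not the summed `market_share` the question asks for. A minimal sketch of both side by side, assuming the question's sample data; the names `group`, `counts` and `totals` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item Type": ["Office Supplies", "Baby Food", "Vegetables",
                  "Meat", "Personal Care", "Household"],
    "market_share": [10, 20, 10, 30, 10, 20],
})

# One label per row: 'Baby Food' or 'Others'.
group = np.where(df["Item Type"] == "Baby Food", "Baby Food", "Others")

# Number of rows per label (what value_counts gives).
counts = pd.Series(group).value_counts()

# Summed market_share per label (what the question asks for).
totals = df.groupby(group)["market_share"].sum()

print(counts)
print(totals)
```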

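Answer 3: (score: 0)

You can use the not-equal operator != and the equality operator == on the 'Item Type' column:

df.loc[df['Item Type'] != 'Baby Food', 'market_share'].sum()

df.loc[df['Item Type'] == 'Baby Food', 'market_share'].sum()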
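
The two sums can then be assembled into the two-row frame from the question. This is only a sketch under the assumption that the filter is applied to the `Item Type` column and that the data matches the question's sample; `is_baby` and `result` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item Type": ["Office Supplies", "Baby Food", "Vegetables",
                  "Meat", "Personal Care", "Household"],
    "market_share": [10, 20, 10, 30, 10, 20],
})

# Boolean mask: True only for the 'Baby Food' row.
is_baby = df["Item Type"] == "Baby Food"

# Build the summary frame from the two filtered sums.
result = pd.DataFrame({
    "Item Type": ["Others", "Baby Food"],
    "market_share": [
        df.loc[~is_baby, "market_share"].sum(),  # everything else
        df.loc[is_baby, "market_share"].sum(),   # Baby Food only
    ],
})
print(result)
```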