I have a DataFrame that looks like this:
dashboard = pd.DataFrame({
'id':[1,2,3,4],
'category': ['a', 'b', 'a', 'c'],
'price': [123, 151, 21, 24],
'description': ['IT related', 'IT related', 'Marketing','']
})
I need to add a row that shows the sum and the count for only certain categories, like this:
pd.DataFrame({
'id': [3],
'category': ['a&b'],
'price': [295],
'description': ['']
})
Answer 0: (score: 1)
An option using .agg:
import pandas as pd
dashboard = pd.DataFrame({
'id': [1, 2, 3, 4],
'category': ['a', 'b', 'a', 'c'],
'price': [123, 151, 21, 24],
'description': ['IT related', 'IT related', 'Marketing', '']
})
# Count the rows in categories 'a' and 'b' and sum their prices
a_b = dashboard[dashboard['category'].isin(['a', 'b'])].agg({'id': 'count', 'price': 'sum'})
df = pd.DataFrame({'a&b': a_b})
This yields:
       a&b
id       3
price  295
You can then .transpose() it and merge it into the existing DataFrame as needed, or build a separate DataFrame of summary results, and so on.
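As a rough sketch of that last step, the aggregated Series can be turned into a one-row frame and concatenated onto the original. The names `summary_row` and `result` are illustrative only and do not come from the answer above; filling `description` with an empty string is an assumption based on the desired output in the question.

```python
import pandas as pd

dashboard = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'category': ['a', 'b', 'a', 'c'],
    'price': [123, 151, 21, 24],
    'description': ['IT related', 'IT related', 'Marketing', ''],
})

# Aggregate only the rows whose category is 'a' or 'b'.
a_b = dashboard[dashboard['category'].isin(['a', 'b'])].agg({'id': 'count', 'price': 'sum'})

# Turn the Series into a one-row DataFrame and fill in the remaining columns
# (hypothetical choices: combined label and empty description).
summary_row = a_b.to_frame().T
summary_row['category'] = 'a&b'
summary_row['description'] = ''

# Append the summary row below the original rows, keeping the column order.
result = pd.concat([dashboard, summary_row[dashboard.columns]], ignore_index=True)
print(result)
```

Using `pd.concat` rather than `merge` keeps the original rows untouched and simply stacks the summary underneath them.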
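Answer 1: (score: 0)
I precompute the price sum for every category, then for each pair I add the two sums, build the combined category name, and append the new row.
Try this: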
import pandas as pd
dashboard = pd.DataFrame({
'id': [1, 2, 3, 4],
'category': ['a', 'b', 'a', 'c'],
'price': [123, 151, 21, 24],
'description': ['IT related', 'IT related', 'Marketing', '']
})
pairs = [('a', 'b')]
groups = dashboard.groupby("category")['price'].sum()
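for c1, c2 in pairs:
    # Number of rows that belong to either category
    new_id = ((dashboard['category'] == c1) | (dashboard['category'] == c2)).sum()
    name = '{}&{}'.format(c1, c2)
    price_sum = groups[c1] + groups[c2]
    new_row = pd.DataFrame({'id': [new_id], 'category': [name], 'price': [price_sum], 'description': ['']})
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    dashboard = pd.concat([dashboard, new_row], ignore_index=True)
print(dashboard)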
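If more than two categories need to be combined, the same idea might be wrapped in a small helper. This is only a sketch: `add_group_total` is a hypothetical name, and it assumes the frame has the `id`, `category`, `price` and `description` columns from the question.

```python
import pandas as pd

def add_group_total(df, categories):
    """Return df plus one row holding the row count (in 'id') and price sum."""
    # Boolean mask of rows whose category is in the requested group.
    mask = df['category'].isin(categories)
    row = pd.DataFrame({
        'id': [int(mask.sum())],
        'category': ['&'.join(categories)],
        'price': [df.loc[mask, 'price'].sum()],
        'description': [''],
    })
    return pd.concat([df, row], ignore_index=True)

dashboard = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'category': ['a', 'b', 'a', 'c'],
    'price': [123, 151, 21, 24],
    'description': ['IT related', 'IT related', 'Marketing', ''],
})

result = add_group_total(dashboard, ['a', 'b'])
print(result)
```

The helper returns a new frame instead of modifying `dashboard`, so it can be called repeatedly for different groups without the totals feeding into each other.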
Answer 2: (score: 0)
import numpy as np
import pandas as pd
dashboard = pd.DataFrame({
'id':[1,2,3,4],
'category': ['a', 'b', 'a', 'c'],
'price': [123, 151, 21, 24],
'description': ['IT related', 'IT related', 'Marketing','']
})
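selection = ['a', 'b']
selection_row = '&'.join(selection)
# Rows become 'id' and 'price'; columns become 'count' and 'sum' (missing cells filled with 0)
df2 = dashboard[dashboard['category'].isin(selection)].agg({'id': ['count'], 'price': ['sum']}).fillna(0).T
# Each row has only one non-zero cell, so adding the columns collapses them into one value
df2['summary'] = df2['count'].add(df2['sum'])
df2.loc['description'] = np.nan
df2.loc['category'] = selection_row
final_df = df2['summary']
final_df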
id               3
price          295
description    NaN
category       a&b
Name: summary, dtype: object