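Apologies for the noob question as I try to learn Python. I'm looking forward to getting up to speed and giving back.
Suppose I have the following data: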
YEAR      SECTOR      PROFIT  STARTMVYEAR  TOTALPROFIT  STARTMV
IBM       TECHNOLOGY    -500         2500          500     1500
APPLE     TECHNOLOGY     800         4000          300     4500
GM        INDUSTRIAL     250         1000            0     1250
CHRYSLER  INDUSTRIAL     600         3000          100     3500
I want to create a summary like the following:
SECTOR      PROFITYEAR  TOTALPROFIT
TECHNOLOGY        .046         .133
INDUSTRIAL        .213         .021
where, for each sector, PROFITYEAR is sum(PROFIT)/sum(STARTMVYEAR)
and TOTALPROFIT is sum(TOTALPROFIT)/sum(STARTMV).
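To make the arithmetic concrete, here is a small plain-Python check of the TECHNOLOGY row. It uses only the numbers from the table above; the variable names are just for illustration.

```python
# Numbers copied from the two TECHNOLOGY rows of the table (IBM, APPLE)
profit = [-500, 800]
start_mv_year = [2500, 4000]
total_profit = [500, 300]
start_mv = [1500, 4500]

# Ratio of the sums, not the mean of the per-row ratios
profit_year = sum(profit) / sum(start_mv_year)    # 300 / 6500
total_ratio = sum(total_profit) / sum(start_mv)   # 800 / 6000

print(round(profit_year, 3), round(total_ratio, 3))  # 0.046 0.133
```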
If I only wanted the first ratio, I could do
by_profit_totals = df.groupby('SECTOR')['PROFIT'].sum() / df.groupby('SECTOR')['STARTMVYEAR'].sum()
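But how do I do this for both ratios at once? Also, is there a simple function I could use that takes, say, PROFIT and STARTMVYEAR and returns the aggregated value?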
Answer 0: (score: 1)
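You can use groupby and aggregate with the Cython-optimized sum, then div by the values of the denominator columns. Using values gives a NumPy array, so the division is done by position rather than aligned on column names: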
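# Sum every numeric column within each sector
g = df.groupby('SECTOR').sum(numeric_only=True)
# Divide numerators by denominators by position (.values drops the column labels)
print(g[['PROFIT', 'TOTALPROFIT']].div(g[['STARTMVYEAR', 'STARTMV']].values).reset_index())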
SECTOR PROFIT TOTALPROFIT
0 INDUSTRIAL 0.212500 0.021053
1 TECHNOLOGY 0.046154 0.133333
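On the second part of the question (a simple function that takes a numerator and a denominator column), one way this could be wrapped up is sketched below. Note that ratio_of_sums is a hypothetical helper name, not a pandas function. The DataFrame is rebuilt from the question's table, with the company-name column left out because it does not enter the calculation.

```python
import pandas as pd

# Sample data rebuilt from the question's table (company-name column omitted)
df = pd.DataFrame({
    "SECTOR": ["TECHNOLOGY", "TECHNOLOGY", "INDUSTRIAL", "INDUSTRIAL"],
    "PROFIT": [-500, 800, 250, 600],
    "STARTMVYEAR": [2500, 4000, 1000, 3000],
    "TOTALPROFIT": [500, 300, 0, 100],
    "STARTMV": [1500, 4500, 1250, 3500],
})


def ratio_of_sums(frame, by, pairs):
    """Hypothetical helper: per group, compute sum(numerator) / sum(denominator).

    `pairs` is a list of (numerator_column, denominator_column) tuples;
    each result column is named after its numerator.
    """
    sums = frame.groupby(by).sum(numeric_only=True)
    out = pd.DataFrame(index=sums.index)
    for num, den in pairs:
        out[num] = sums[num] / sums[den]
    return out.reset_index()


summary = ratio_of_sums(
    df, "SECTOR", [("PROFIT", "STARTMVYEAR"), ("TOTALPROFIT", "STARTMV")]
)
print(summary)
```

Dividing column by column keeps each numerator explicitly paired with its denominator, so the result does not depend on the order of the columns the way the positional div by values does.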