Sum by common values in the columns of one of two tables

Time: 2019-03-05 19:30:03

Tags: python pandas dataframe

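d holds data, in hours, about people's planned and actual working time as an owner or as a manager. Note that a person can be both an owner and a manager, but does not have to be.

I need to rearrange d so that all names end up in one column and the other columns hold the planned and actual hours for each role.

The code below does the job, but it is not pretty.

How can I get the same result with built-in pandas functionality and less typing?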

import pandas as pd

d = {
    'owner': ['mike', 'john', 'jake', 'lucy', 'mary', 'hans'],
    'owner planned': [54, 67, 52, 19, 87, 45],
    'owner actual': [12, 54, 3, 67, 84, 22],
    'manager': ['andrew', 'tom', 'john', 'mike', 'hans', 'paul'],
    'manager planned': [13, 432, 453, 765, 432, 234], 
    'manager actual': [22, 33, 44, 55, 66, 77],
}

df = pd.DataFrame(d)
names = list(set(df['owner'].tolist() + df['manager'].tolist()))
output = {}

for name in names:
    op = df[df['owner'] == name]['owner planned'].sum()
    oa = df[df['owner'] == name]['owner actual'].sum()
    mp = df[df['manager'] == name]['manager planned'].sum()
    ma = df[df['manager'] == name]['manager actual'].sum()

    output.setdefault('owner_planned', []).append(op)
    output.setdefault('owner_actual', []).append(oa)
    output.setdefault('manager_planned', []).append(mp)
    output.setdefault('manager_actual', []).append(ma)
    output.setdefault('names', []).append(name)

df2 = pd.DataFrame(output)
print(df2)

2 answers:

Answer 0 (score: 1)

z = pd.concat([df.iloc[:,:3], df.iloc[:,3:]], sort=True)
z['name'] = z[['owner', 'manager']].mode(1)[0]
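# numeric_only=True leaves the string columns 'owner' and 'manager' out of the sum
z.groupby('name').sum(numeric_only=True)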

Out:

        manager actual  manager planned  owner actual  owner planned
name
andrew            22.0             13.0           0.0            0.0
hans              66.0            432.0          22.0           45.0
jake               0.0              0.0           3.0           52.0
john              44.0            453.0          54.0           67.0
lucy               0.0              0.0          67.0           19.0
mary               0.0              0.0          84.0           87.0
mike              55.0            765.0          12.0           54.0
paul              77.0            234.0           0.0            0.0
tom               33.0            432.0           0.0            0.0
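
To see why stacking the two halves works, it may help to inspect the intermediate frame. The sketch below rebuilds it from the question's data: after the concat, every row has a name in exactly one of the two role columns and NaN in the other. Note that `ignore_index=True`, the `names_per_row` check, and the use of `fillna` in place of `mode(1)[0]` are my own additions for illustration, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'owner': ['mike', 'john', 'jake', 'lucy', 'mary', 'hans'],
    'owner planned': [54, 67, 52, 19, 87, 45],
    'owner actual': [12, 54, 3, 67, 84, 22],
    'manager': ['andrew', 'tom', 'john', 'mike', 'hans', 'paul'],
    'manager planned': [13, 432, 453, 765, 432, 234],
    'manager actual': [22, 33, 44, 55, 66, 77],
})

# Stack the three owner columns on top of the three manager columns.
# Columns missing from one half are filled with NaN.
# ignore_index=True (my addition) gives the 12 stacked rows unique labels.
z = pd.concat([df.iloc[:, :3], df.iloc[:, 3:]], sort=True, ignore_index=True)

# Sanity check: each row carries a name in exactly one role column.
names_per_row = z[['owner', 'manager']].notna().sum(axis=1)

# Swapped-in alternative to mode(1)[0]: take the owner name and
# fall back to the manager name where the owner name is NaN.
z['name'] = z['owner'].fillna(z['manager'])

value_cols = ['manager actual', 'manager planned', 'owner actual', 'owner planned']
result = z.groupby('name')[value_cols].sum()
print(result)
```

Selecting the four value columns before calling `sum` keeps the string columns out of the aggregation. Because the missing half of each row is NaN, the summed columns come out as floats, with 0.0 for a role the person never held.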

Answer 1 (score: 1)

Use filter, concat, and DataFrameGroupBy.sum:

u = df.filter(like='owner').rename({'owner':'names'}, axis=1)
v = df.filter(like='manager').rename({'manager':'names'}, axis=1)

pd.concat([u,v], sort=False).groupby('names').sum()

        owner planned  owner actual  manager planned  manager actual
names                                                               
andrew            0.0           0.0             13.0            22.0
hans             45.0          22.0            432.0            66.0
jake             52.0           3.0              0.0             0.0
john             67.0          54.0            453.0            44.0
lucy             19.0          67.0              0.0             0.0
mary             87.0          84.0              0.0             0.0
mike             54.0          12.0            765.0            55.0
paul              0.0           0.0            234.0            77.0
tom               0.0           0.0            432.0            33.0
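
If more roles might be added later, the same idea can be written as a loop over role names. This is only a sketch under assumptions: the `roles` list and the final cast to `int` are my additions, and it relies on every role having a name column called exactly like the role plus value columns whose labels contain the role name, as in the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    'owner': ['mike', 'john', 'jake', 'lucy', 'mary', 'hans'],
    'owner planned': [54, 67, 52, 19, 87, 45],
    'owner actual': [12, 54, 3, 67, 84, 22],
    'manager': ['andrew', 'tom', 'john', 'mike', 'hans', 'paul'],
    'manager planned': [13, 432, 453, 765, 432, 234],
    'manager actual': [22, 33, 44, 55, 66, 77],
})

# Hypothetical list of roles; extend it if the frame gains more role columns.
roles = ['owner', 'manager']

# One frame per role, with the role's name column renamed to 'names'.
parts = [df.filter(like=role).rename(columns={role: 'names'}) for role in roles]

# Stack the parts and sum per person; a role the person never held sums to 0.
summed = pd.concat(parts, sort=False).groupby('names').sum()

# Optional: the sample hours are whole numbers, so cast the floats back to int.
summed = summed.astype(int)
print(summed)
```

The cast is safe here only because the grouped sums contain no NaN and the sample hours are integers; with fractional hours it should be dropped.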