I'm using pandas and trying to add a new column, "Total", that holds the sum of the vehicle numbers for that year.
From this:
type year number
Private cars 2005 401638
Motorcycles 2005 138588
Off peak cars 2005 12947
Motorcycles 2005 846
To something like this:
type year number Total
Private cars 2005 401638 554019
Motorcycles 2005 138588
Off peak cars 2005 12947
Motorcycles 2005 846
Answer 0 (score: 2)
Use GroupBy + transform with sum:
df['Year_Total'] = df.groupby('year')['number'].transform('sum')
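Note that this gives you the yearly total on every row. If you want the total to be "blank" on some rows, you need to specify the exact logic for that.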
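For reference, here is a minimal self-contained sketch of the same approach, using the figures from the question. The DataFrame construction is an assumption about how the data is loaded; only the groupby/transform line comes from the answer.

```python
import pandas as pd

# Sample data copied from the question (all rows belong to 2005)
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars", "Motorcycles"],
    "year": [2005, 2005, 2005, 2005],
    "number": [401638, 138588, 12947, 846],
})

# transform('sum') returns a Series aligned with the original rows,
# so every row receives the total of its own year group
df["Year_Total"] = df.groupby("year")["number"].transform("sum")
print(df)
```

Because transform keeps the original index, the result can be assigned straight back as a column without a merge.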
Answer 1 (score: 2)
Use GroupBy.transform, then replace the duplicated values if necessary:
df['Total'] = df.groupby('year')['number'].transform('sum')
print (df)
type year number Total
0 Private cars 2005 1 3
1 Motorcycles 2005 2 3
2 Off peak cars 2006 5 20
3 Motorcycles 2006 7 20
4 Motorcycles1 2006 8 20
df.loc[df['year'].duplicated(), 'Total'] = np.nan
print (df)
type year number Total
0 Private cars 2005 1 3.0
1 Motorcycles 2005 2 NaN
2 Off peak cars 2006 5 20.0
3 Motorcycles 2006 7 NaN
4 Motorcycles1 2006 8 NaN
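Replacing them with empty strings is also possible, but not recommended: the column then holds a mix of numbers and strings, and some functions will fail on it: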
df.loc[df['year'].duplicated(), 'Total'] = ''
print (df)
type year number Total
0 Private cars 2005 1 3
1 Motorcycles 2005 2
2 Off peak cars 2006 5 20
3 Motorcycles 2006 7
4 Motorcycles1 2006 8
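If the blanks are only needed for display, one possible alternative (a sketch, not part of the original answer) is to keep NaN in the data and hide it when printing. The sample values mirror the output above; Series.where and the na_rep argument of DataFrame.to_string are standard pandas, while the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars",
             "Motorcycles", "Motorcycles1"],
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

totals = df.groupby("year")["number"].transform("sum")

# Keep the total only on the first row of each year; the rest become NaN,
# so the column stays numeric (float64) instead of turning into strings
df["Total"] = totals.where(~df["year"].duplicated())

# Numeric operations still work because NaN is skipped by default
print(df["Total"].sum())

# Show blanks instead of NaN at print time only
print(df.to_string(na_rep=""))
```

This keeps the column usable for further arithmetic, which is the concern raised about mixing numbers and strings.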
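Answer 2 (score: 0)
This gives a similar DataFrame (note that it uses the grand total over all rows, which equals the yearly total only when the data contains a single year):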
total = df['number'].sum()
df['Total'] = np.ones_like(df['number'].values) * total
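To show how a grand total differs from a per-year total once more than one year is present, here is a small sketch. The two-year sample data and the column names Grand_Total and Year_Total are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

# Grand total: one value for the whole column, repeated on every row
grand_total = df["number"].sum()
df["Grand_Total"] = np.ones_like(df["number"].values) * grand_total

# Per-year total: each row gets the sum of its own year group
df["Year_Total"] = df.groupby("year")["number"].transform("sum")
print(df)
```

With a single year the two columns are identical, which is why the grand-total approach appears to work on the data in the question.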