Question

I'm using pandas and trying to add a new column, "Total", that holds the sum of the number of all vehicles for that year.

From this:

type             year    number
Private cars     2005    401638
Motorcycles      2005    138588
Off peak cars    2005    12947
Motorcycles      2005    846

To something like this:

type             year    number    Total
Private cars     2005    401638    554019
Motorcycles      2005    138588
Off peak cars    2005    12947
Motorcycles      2005    846

Answer 1

Use GroupBy + transform with sum:

df['Year_Total'] = df.groupby('year')['number'].transform('sum')

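Note that this gives you the yearly total on every row. If you want the total left "blank" on certain rows, you need to specify the exact logic for that.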
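
As a minimal, self-contained sketch of the one-liner above, the following rebuilds the question's sample data and applies it. The DataFrame construction is an assumption about how the data is stored; only the groupby/transform line comes from the answer.

```python
import pandas as pd

# Sample rows copied from the question; building them this way is illustrative.
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars", "Motorcycles"],
    "year": [2005, 2005, 2005, 2005],
    "number": [401638, 138588, 12947, 846],
})

# transform('sum') returns one value per original row, aligned to df's index,
# so the result can be assigned directly as a new column.
df["Year_Total"] = df.groupby("year")["number"].transform("sum")

print(df)
```

Every 2005 row should carry the same value, 554019, which is the total shown in the question.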

Answer 2

Use GroupBy.transform, then replace the duplicated values if necessary:

import numpy as np

# Per-year sum, repeated on every row of that year
df['Total'] = df.groupby('year')['number'].transform('sum')
print(df)
            type  year  number  Total
0   Private cars  2005       1      3
1    Motorcycles  2005       2      3
2  Off peak cars  2006       5     20
3    Motorcycles  2006       7     20
4   Motorcycles1  2006       8     20

# Keep the total only on the first row of each year
df.loc[df['year'].duplicated(), 'Total'] = np.nan
print(df)
            type  year  number  Total
0   Private cars  2005       1    3.0
1    Motorcycles  2005       2    NaN
2  Off peak cars  2006       5   20.0
3    Motorcycles  2006       7    NaN
4   Motorcycles1  2006       8    NaN
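
For reference, here is a runnable sketch of the two steps above using the same toy data as the printed output. The DataFrame construction is assumed, since the answer does not show it. It also swaps the in-place `.loc` assignment for `Series.mask`, which builds the NaN-containing column in one step; newer pandas versions may warn when NaN is assigned into an integer column in place.

```python
import pandas as pd

# Toy data matching the printed output in this answer (construction is assumed).
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars",
             "Motorcycles", "Motorcycles1"],
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

# Step 1: per-year total, one value per row.
totals = df.groupby("year")["number"].transform("sum")

# Step 2: duplicated() is False for the first row of each year and True for
# the rest; mask() replaces the True positions with NaN (the column becomes float).
df["Total"] = totals.mask(df["year"].duplicated())

print(df)
```

Because `duplicated()` flags every occurrence after the first, the total stays on the first row of each year even if the rows are not sorted by year.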

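You can also replace the duplicates with empty strings, but this is not recommended: the column then holds a mix of numbers and strings, and some functions will fail on it: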

df.loc[df['year'].duplicated(), 'Total'] = ''
print(df)
            type  year  number Total
0   Private cars  2005       1     3
1    Motorcycles  2005       2      
2  Off peak cars  2006       5    20
3    Motorcycles  2006       7      
4   Motorcycles1  2006       8
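
If the appeal of empty strings is mainly to avoid floats such as 3.0 in the output, one alternative (not part of the original answer) is pandas' nullable integer dtype, which keeps the column numeric and displays missing entries as `<NA>`. A sketch under the same assumed toy data:

```python
import pandas as pd

# Same toy data as above (construction is assumed, not shown in the answer).
df = pd.DataFrame({
    "type": ["Private cars", "Motorcycles", "Off peak cars",
             "Motorcycles", "Motorcycles1"],
    "year": [2005, 2005, 2006, 2006, 2006],
    "number": [1, 2, 5, 7, 8],
})

totals = df.groupby("year")["number"].transform("sum")

# Blank out repeated years, then convert to the nullable "Int64" dtype so the
# remaining totals stay integers and the blanks become <NA>.
df["Total"] = totals.mask(df["year"].duplicated()).astype("Int64")

print(df)

# The column is still numeric, so aggregations keep working (missing values
# are skipped by default).
print(df["Total"].sum())
```

The trade-off is that `<NA>` is still a visible placeholder; if truly empty cells are needed, it is usually safer to apply that formatting only when exporting or displaying the data.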

Answer 3

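This gives a similar DataFrame, provided all rows belong to the same year (it uses the sum over the whole column):

import numpy as np

total = df['number'].sum()
df['Total'] = np.ones_like(df['number'].values) * total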
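
To see how a whole-column sum differs from a per-year sum once more than one year is present, here is a small sketch with made-up numbers. The column names `Overall_Total` and `Year_Total` are illustrative, and the scalar assignment is a simpler stand-in for the array multiplication in this answer.

```python
import pandas as pd

# Made-up data spanning two years.
df = pd.DataFrame({
    "year": [2005, 2005, 2006],
    "number": [1, 2, 5],
})

# Assigning a scalar broadcasts it to every row: this is the grand total.
df["Overall_Total"] = df["number"].sum()

# The groupby/transform approach computes the sum separately for each year.
df["Year_Total"] = df.groupby("year")["number"].transform("sum")

print(df)
```

With a single year the two columns coincide; with several years only the grouped version matches what the question asks for.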
