Question

I am trying to combine my columns on the datetime column date while taking the mean of the yearly cost columns. My df looks like this:

        date        yearly_cost_x  yearly_cost_y  yearly_cost
0     2009-01-01        5               7              3
1     2009-01-02        8               7              4
2     2009-01-03        23              6              6

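I want to combine the df by 'date' and, once combined, take the mean of the 3 values to create a single value for each row in a column called Yearly_Cost. I feel like this should be easy, but I am struggling and getting some errors.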

I would like my df output to look like this:

        date        Yearly_Cost
0     2009-01-01        5
1     2009-01-02        6.33
2     2009-01-03        11.66

Any help would be greatly appreciated!

Addendum:

I have a date column in which the same dates appear more than once, along with a Yearly_Cost column. It looks like this:

        date        Yearly_Cost
0     2009-01-01        5
1     2009-01-02        6
2     2009-01-03        11
3     2009-01-01        12
4     2009-01-02        45
5     2009-01-03        32

I would like it to look like this:

        date        Yearly_Cost
0     2009-01-01        8.5
1     2009-01-02        25.5
2     2009-01-03        21.5

Answer 1

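Use DataFrame.set_index to move date into the index, take the mean across each row with axis=1, and finally convert the resulting Series back to a DataFrame with Series.reset_index: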

df1 = df.set_index('date').mean(axis=1).reset_index(name='Yearly_Cost')
print (df1)
         date  Yearly_Cost
0  2009-01-01     5.000000
1  2009-01-02     6.333333
2  2009-01-03    11.666667

If the DataFrame may contain other columns that are not yearly_ columns, select the relevant columns first with DataFrame.filter:

df1 = df.set_index('date').filter(like='yearly_').mean(axis=1).reset_index(name='Yearly_Cost')
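
For the long-format data in the addendum, where each date appears on several rows, a row-wise mean no longer applies. A minimal sketch using groupby, assuming the column names and sample values shown in the question (this is not part of the original answer):

```python
import pandas as pd

# Sample data copied from the addendum in the question (long format:
# each date appears on more than one row).
df = pd.DataFrame({
    'date': pd.to_datetime(['2009-01-01', '2009-01-02', '2009-01-03',
                            '2009-01-01', '2009-01-02', '2009-01-03']),
    'Yearly_Cost': [5, 6, 11, 12, 45, 32],
})

# Group the rows that share a date and average their costs;
# as_index=False keeps 'date' as a regular column instead of the index.
df2 = df.groupby('date', as_index=False)['Yearly_Cost'].mean()
print(df2)
```

With this sample data the three dates come out as 8.5, 25.5 and 21.5, which matches the desired output in the addendum.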

Answer 2

Try this:

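# Apply row-wise (axis=1) on the whole DataFrame so each x is a row with all three cost columns
df['Yearly_Cost'] = df.apply(lambda x: (x['yearly_cost_x'] + x['yearly_cost_y'] + x['yearly_cost']) / 3, axis=1)
# drop returns a new DataFrame, so assign the result back
df = df.drop(['yearly_cost_x', 'yearly_cost_y', 'yearly_cost'], axis=1)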

How to combine values on a datetime column while taking the mean
