Python pandas: df.sum() across multiple columns raises "unexpected keyword argument 'axis'"

Time: 2016-11-18 01:56:51

Tags: python pandas dataframe typeerror

I have a pandas DataFrame df, such as:

     A    B    C
 0   1    2    3
 1   1    2    3
 3   1    2    3

Then I have a list

x=['B','C']

I want the row-wise sum of the values in columns B and C, so I wrote:

df[x].sum(axis=1).values

However, I get this error:

TypeError: f() got an unexpected keyword argument 'axis'

I don't understand why this fails. My code runs in an IPython notebook. Do you have any suggestions? Thanks.

Update: the real df looks like:

                  Date      Ayotte  Hassan
Date                                 
2016-06-29     2016-06-29    46.8    45.3
2016-06-30     2016-06-30    46.8    45.3
2016-07-01     2016-07-01    46.8    45.3
2016-07-02     2016-07-02    46.8    45.3
2016-07-03     2016-07-03    46.8    45.3
2016-07-04     2016-07-04    46.8    45.3
2016-07-20     2016-07-20    45.8    45.2
2016-07-21     2016-07-21    45.8    45.2
2016-07-22     2016-07-22    45.8    45.2
   ...            ...         ...     ...
2016-10-09     2016-10-09    48.0    44.5
2016-10-10     2016-10-10    48.0    44.5
2016-10-11     2016-10-11    46.7    44.7
2016-10-16     2016-10-16    46.3    44.0
2016-10-17     2016-10-17    46.3    44.0
2016-10-18     2016-10-18    46.0    44.3
2016-10-19     2016-10-19    45.7    45.3
2016-10-20     2016-10-20    44.0    46.0
2016-10-21     2016-10-21    44.0    46.0
2016-10-22     2016-10-22    44.0    46.0
2016-10-23     2016-10-23    44.0    46.0

The dtypes of df are

Date      datetime64[ns]
Ayotte           float64
Hassan           float64
dtype: object

Then I did

df = df.resample('D')

The df shown above is the data before resampling. The list x is

x=['Ayotte','Hassan']

The error then appears when running this code:

print df[x].sum(axis=1).values

2 answers:

Answer 0: (score: 1)

It is hard to diagnose this without a self-contained example that reproduces the error.

If I start with:

import numpy
import pandas
df = pandas.DataFrame(
    numpy.arange(9).reshape(3, 3),
    index=['a', 'b', 'c'],
    columns=['X', 'Y', 'Z']
)
print(df)

which gives:

   X  Y  Z
a  0  1  2
b  3  4  5
c  6  7  8

I can do this:

df[['X', 'Y']].sum(axis=1)

and get:

a     1
b     7
c    13
dtype: int64
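
One possible explanation, given the resample call in the question's update: in pandas 0.18 and later, df.resample('D') does not return a DataFrame. It returns a deferred Resampler object, so df[x].sum(...) afterwards calls the resampler's own sum, which does not take an axis argument. The sketch below uses made-up data shaped like the question's frame. Aggregating with mean() is an assumption, since the question does not say which aggregation was intended.

```python
import pandas as pd

# Hypothetical data shaped like the question's frame (values chosen for illustration).
idx = pd.to_datetime(['2016-06-29', '2016-06-30', '2016-07-02'])
df = pd.DataFrame(
    {'Ayotte': [46.8, 46.8, 45.8], 'Hassan': [45.3, 45.3, 45.2]},
    index=idx,
)
x = ['Ayotte', 'Hassan']

# On pandas >= 0.18 this is a Resampler, not a DataFrame.
resampler = df.resample('D')
is_frame = isinstance(resampler, pd.DataFrame)
print(type(resampler).__name__, is_frame)

# Applying an aggregation (mean() is assumed here) yields a real DataFrame,
# after which a row-wise sum with axis=1 works as usual.
daily = df.resample('D').mean()
row_totals = daily[x].sum(axis=1)
print(row_totals)
```

Days with no source rows (2016-07-01 here) show up as NaN rows in daily, so it may be worth deciding how to fill or drop them before summing.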

Answer 1: (score: 0)

Input:

,Date,Ayotte,Hassan
2016-06-29,2016-06-29,46.8,45.3
2016-06-30,2016-06-30,46.8,45.3
2016-07-01,2016-07-01,46.8,45.3
2016-07-02,2016-07-02,46.8,45.3
2016-07-03,2016-07-03,46.8,45.3
2016-07-04,2016-07-04,46.8,45.3
2016-07-20,2016-07-20,45.8,45.2

Code:

import pandas as pd

df = pd.read_csv('file.txt', index_col=0)
df['Total'] = df[['Ayotte', 'Hassan']].sum(axis=1)

print(df)

Output:

                  Date  Ayotte  Hassan  Total
2016-06-29  2016-06-29    46.8    45.3   92.1
2016-06-30  2016-06-30    46.8    45.3   92.1
2016-07-01  2016-07-01    46.8    45.3   92.1
2016-07-02  2016-07-02    46.8    45.3   92.1
2016-07-03  2016-07-03    46.8    45.3   92.1
2016-07-04  2016-07-04    46.8    45.3   92.1
2016-07-20  2016-07-20    45.8    45.2   91.0
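
For readers without file.txt on disk, here is a minimal self-contained variant of the same idea. It assumes the CSV text is inlined through io.StringIO, and it adds parse_dates=True so the index becomes a DatetimeIndex. Both choices are illustrative and not part of the original answer.

```python
import io

import pandas as pd

# A few rows copied from the input above, inlined instead of read from file.txt.
csv_text = """,Date,Ayotte,Hassan
2016-06-29,2016-06-29,46.8,45.3
2016-06-30,2016-06-30,46.8,45.3
2016-07-20,2016-07-20,45.8,45.2
"""

# index_col=0 uses the unnamed first column as the index;
# parse_dates=True asks pandas to parse that index as dates.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Row-wise sum over the two numeric columns only.
df['Total'] = df[['Ayotte', 'Hassan']].sum(axis=1)

print(df)
```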