python pandas .apply() function index error

Asked: 2017-01-24 16:02:49

Tags: python pandas

I have the following DataFrame:

                              P     N  ID  Year  Month
TS
2016-06-26 19:30:00  263.600006   5.4   5  2016      6
2016-06-26 20:00:00  404.700012   5.6   5  2016      6
2016-06-26 21:10:00  438.600006   6.0   5  2016      6
2016-06-26 21:20:00  218.600006   5.6   5  2016      6
2016-07-02 16:10:00  285.300049  15.1   5  2016      7

I am trying to add a new column based on the values of the columns Year and Month, as follows:

import calendar

def exp_records(row):
    return calendar.monthrange(row['Year'], row['Month'])[1]

df['exp_counts'] = df.apply(exp_records, axis=1)

But I get the following error:

TypeError: ('integer argument expected, got float', 'occurred at index 2016-06-26 19:30:00')

If I reset_index() to an integer index, the .apply() call above works fine. Is this the expected behavior?

I am using pandas 0.19.1 and Python 3.4.

Code to recreate the DataFrame:
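The asker's snippet is not preserved in this copy. As a stand-in, here is a minimal sketch that rebuilds the frame from the printed table and reproduces the failure. The constructor call is an assumption and not the asker's code, and the wording of the TypeError message depends on the Python version.

```python
import calendar

import pandas as pd

# Hypothetical reconstruction from the printed table, not the original snippet.
df = pd.DataFrame(
    {
        "P": [263.600006, 404.700012, 438.600006, 218.600006, 285.300049],
        "N": [5.4, 5.6, 6.0, 5.6, 15.1],
        "ID": [5, 5, 5, 5, 5],
        "Year": [2016, 2016, 2016, 2016, 2016],
        "Month": [6, 6, 6, 6, 7],
    },
    index=pd.DatetimeIndex(
        [
            "2016-06-26 19:30:00",
            "2016-06-26 20:00:00",
            "2016-06-26 21:10:00",
            "2016-06-26 21:20:00",
            "2016-07-02 16:10:00",
        ],
        name="TS",
    ),
)


def exp_records(row):
    # row mixes float and integer columns, so Year and Month arrive as floats
    return calendar.monthrange(row["Year"], row["Month"])[1]


error = None
try:
    df.apply(exp_records, axis=1)
except TypeError as exc:
    # calendar.monthrange rejects float arguments
    error = exc
    print("TypeError:", exc)
```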

1 Answer:

Answer 0 (score: 2)

Solution

Apply the function to df[['Year', 'Month']] only:

df['exp_counts'] = df[['Year', 'Month']].apply(exp_records, axis=1)

Result:

                              P     N  ID  Year  Month  exp_counts
TS                                                                
2016-06-26 19:30:00  263.600006   5.4   5  2016      6          30
2016-06-26 20:00:00  404.700012   5.6   5  2016      6          30
2016-06-26 21:10:00  438.600006   6.0   5  2016      6          30
2016-06-26 21:20:00  218.600006   5.6   5  2016      6          30
2016-07-02 16:10:00  285.300049  15.1   5  2016      7          31
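Two other ways around the float conversion, sketched under the assumption that Year and Month always hold whole numbers: cast inside the function, or iterate over the two columns directly so that no mixed-type row is ever built. The shortened frame and the names exp_records_int, via_cast and via_zip are made up for illustration and are not part of the original answer.

```python
import calendar

import pandas as pd

# Shortened, hypothetical frame: one float column next to the two integer columns.
df = pd.DataFrame(
    {"P": [263.600006, 285.300049], "Year": [2016, 2016], "Month": [6, 7]},
    index=pd.DatetimeIndex(
        ["2016-06-26 19:30:00", "2016-07-02 16:10:00"], name="TS"
    ),
)


def exp_records_int(row):
    # Cast explicitly, because a mixed int/float row arrives as float64.
    return calendar.monthrange(int(row["Year"]), int(row["Month"]))[1]


via_cast = df.apply(exp_records_int, axis=1)

# Iterate over the two integer columns instead of over whole rows.
via_zip = [
    calendar.monthrange(int(year), int(month))[1]
    for year, month in zip(df["Year"], df["Month"])
]

print(via_cast.tolist())  # [30, 31]
print(via_zip)            # [30, 31]
```

The cast silently truncates if a column ever contains a non-integral value, so selecting the integer columns as in the answer is the safer default.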

Reason

Although your Year and Month columns are integers:

df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 5 entries, 2016-06-26 19:30:00 to 2016-07-02 16:10:00
Data columns (total 5 columns):
P        5 non-null float64
N        5 non-null float64
ID       5 non-null int64
Year     5 non-null int64
Month    5 non-null int64
dtypes: float64(2), int64(3)
memory usage: 240.0 bytes

you access them row by row, which turns them into floats:

df.T.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, P to Month
Data columns (total 5 columns):
2016-06-26 19:30:00    5 non-null float64
2016-06-26 20:00:00    5 non-null float64
2016-06-26 21:10:00    5 non-null float64
2016-06-26 21:20:00    5 non-null float64
2016-07-02 16:10:00    5 non-null float64
dtypes: float64(5)
memory usage: 240.0+ bytes

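Because df.apply(exp_records, axis=1) works row by row, each row is converted to a Series with a single dtype. A row that mixes float and integer columns is upcast to float64, which is essentially what the transpose above shows.

This is what exp_records receives as row: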

P         263.600006
N           5.400000
ID          5.000000
Year     2016.000000
Month       6.000000
Name: 2016-06-26T19:30:00.000000000, dtype: float64
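The upcast can also be checked directly on a one-row frame. This is only a sketch: the frame is rebuilt from the first row of the table, and mixed_row, int_row and seen_by_apply are illustrative names.

```python
import pandas as pd

# Hypothetical one-row frame taken from the first line of the table.
df = pd.DataFrame(
    {"P": [263.600006], "N": [5.4], "ID": [5], "Year": [2016], "Month": [6]},
    index=pd.DatetimeIndex(["2016-06-26 19:30:00"], name="TS"),
)

# A row that spans float and integer columns gets one common dtype.
mixed_row = df.iloc[0]
print(mixed_row.dtype)  # float64

# A row taken from integer columns only keeps its integer dtype.
int_row = df[["Year", "Month"]].iloc[0]
print(int_row.dtype)  # int64

# The dtype of the row object that apply hands to the function.
seen_by_apply = df.apply(lambda row: str(row.dtype), axis=1)
print(seen_by_apply.iloc[0])  # float64
```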

Creating a DataFrame from the columns Year and Month only does not cause a conversion to float, because both columns are integers:

df[['Year', 'Month']].T.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2 entries, Year to Month
Data columns (total 5 columns):
2016-06-26 19:30:00    2 non-null int64
2016-06-26 20:00:00    2 non-null int64
2016-06-26 21:10:00    2 non-null int64
2016-06-26 21:20:00    2 non-null int64
2016-07-02 16:10:00    2 non-null int64
dtypes: int64(5)
memory usage: 96.0+ bytes