My DataFrame is
time NTCS001G002 NTCS001W005
0 2013-05-30 23:00:00 NaN NaN
1 2013-06-30 23:00:00 249 60
2 2013-07-31 23:00:00 161 2
3 2013-09-01 23:00:00 151 11
4 2013-09-04 23:00:00 14 0
5 2013-10-01 23:00:00 162 64
6 2013-11-01 00:00:00 281 175
7 2013-12-03 00:00:00 482 168
8 2014-01-02 00:00:00 378 NaN
9 2014-01-03 00:00:00 NaN NaN
10 2014-02-03 00:00:00 NaN 167
11 2014-03-03 00:00:00 502 167
[12 rows x 3 columns]
When I iterate over the rows with
for index, row in diffs.iterrows():
    print "err", row.tolist()
I get
err [Timestamp('2013-05-30 23:00:00', tz=None), NaT, NaT]
err [Timestamp('2013-06-30 23:00:00', tz=None), 249.0, 60.0]
err [Timestamp('2013-07-31 23:00:00', tz=None), 161.0, 2.0]
err [Timestamp('2013-09-01 23:00:00', tz=None), 151.0, 11.0]
err [Timestamp('2013-09-04 23:00:00', tz=None), 14.0, 0.0]
err [Timestamp('2013-10-01 23:00:00', tz=None), 162.0, 64.0]
err [Timestamp('2013-11-01 00:00:00', tz=None), 281.0, 175.0]
err [Timestamp('2013-12-03 00:00:00', tz=None), 482.0, 168.0]
err [Timestamp('2014-01-02 00:00:00', tz=None), 378.0, nan]
err [Timestamp('2014-01-03 00:00:00', tz=None), NaT, NaT]
err [Timestamp('2014-02-03 00:00:00', tz=None), nan, 167.0]
err [Timestamp('2014-03-03 00:00:00', tz=None), 502.0, 167.0]
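I'm not sure whether those NaT values are a bug. I think they should be NaN. Is it possible to make pandas not return NaT? If not, how can I check for them, since I have to replace them in the list?
Thanks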
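Answer 0: (score: 2)
The reason is that iterrows turns each row into a Series, and because the row's first element is a Timestamp, the whole Series is coerced to datetime64, so the NaN values become NaT: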
In [11]: pd.Series([pd.Timestamp('2014-01-03 00:00:00', tz=None), np.nan, np.nan])
Out[11]:
0 2014-01-03
1 NaT
2 NaT
dtype: datetime64[ns]
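To answer the "how can I check for them" part: a minimal sketch, assuming a small stand-in for `diffs` with the same three columns. `clean_row` and its `fill` parameter are hypothetical names introduced here for illustration. `pd.isnull` returns True for both NaN and NaT, so the same check works whether or not a given pandas version coerces the row to datetime64.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's `diffs` frame (assumed shape, 3 rows only).
diffs = pd.DataFrame({
    "time": pd.to_datetime(["2013-05-30 23:00:00",
                            "2013-06-30 23:00:00",
                            "2014-01-02 00:00:00"]),
    "NTCS001G002": [np.nan, 249.0, 378.0],
    "NTCS001W005": [np.nan, 60.0, np.nan],
})

def clean_row(values, fill=None):
    """Replace every missing entry (NaN or NaT) in a list with `fill`."""
    return [fill if pd.isnull(v) else v for v in values]

# Apply the check to each row's list, as in the question's loop.
cleaned = [clean_row(row.tolist()) for _, row in diffs.iterrows()]
for values in cleaned:
    print("err", values)
```

Passing `fill=np.nan` instead of the default would turn any NaT back into NaN, which is what the question asks for.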
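Answer 1: (score: 1)
The value NaT means "Not a Time" and is the equivalent of nan for timestamp values.
Can you tell us the dtypes of the data frame? Try converting the columns to float values.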
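The suggestion above could be sketched roughly as follows, again assuming a small stand-in for `diffs`; `value_cols` and `rows` are hypothetical names. Iterating with `itertuples` is an alternative not mentioned in the thread: it does not build a Series per row, so the float columns should not be coerced by the timestamp column.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's `diffs` frame (assumed shape, 3 rows only).
diffs = pd.DataFrame({
    "time": pd.to_datetime(["2013-05-30 23:00:00",
                            "2013-06-30 23:00:00",
                            "2014-01-02 00:00:00"]),
    "NTCS001G002": [np.nan, 249.0, 378.0],
    "NTCS001W005": [np.nan, 60.0, np.nan],
})

# Inspect the column dtypes first.
print(diffs.dtypes)

# Make sure the value columns are floats (a no-op if they already are).
value_cols = ["NTCS001G002", "NTCS001W005"]
diffs[value_cols] = diffs[value_cols].astype(float)

# itertuples yields plain tuples instead of a Series per row,
# so each value keeps the type of its own column.
rows = [list(t) for t in diffs.itertuples(index=False)]
for values in rows:
    print("err", values)
```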