Finding the difference between an object field and a time field in pandas

Time: 2015-04-28 18:51:16

Tags: python python-2.7 python-3.x pandas

I have a pandas DataFrame "df_OUT" and am using Python 2.7. Its column dtypes are as follows -

>>> df_OUT.dtypes
TRX_DATE              datetime64[ns]
ACTUAL_DATE_CLOSED            object

Now I want to find the difference between "TRX_DATE" and "ACTUAL_DATE_CLOSED" in number of days.

The values in the DataFrame look like this -

>>> df_OUT.head(5)
    TRX_DATE   ACTUAL_DATE_CLOSED
0 1995-09-08  4712-12-31 00:00:00
2 2003-06-30  4712-12-31 00:00:00
3 2003-06-30  4712-12-31 00:00:00
4 2003-06-30  4712-12-31 00:00:00
6 1999-08-31  2099-08-31 00:00:00

I tried the following, but it gives me an error -

df_FINAL_RESULTS['TRX_DATE']-df_FINAL_RESULTS['ACTUAL_DATE_CLOSED'].map(lambda x: x.strftime('%Y-%m-%d'))

Can you please guide me?

Thanks.

1 answer:

Answer 0 (score: 1)

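Your problem is that a pandas Timestamp cannot represent dates beyond April 2262 (pandas.Timestamp.max), so a value such as 4712-12-31 is out of range. We need to use the Python datetime.date construct instead.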
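As a quick check of that limit, the sketch below prints the Timestamp bounds and compares the upper bound with the placeholder close date from the question. The variable names `placeholder` and `is_out_of_range` are illustrative only, and the bounds shown are those of the default nanosecond resolution.

```python
import datetime as dt
import pandas as pd

# Bounds of the nanosecond-resolution pandas Timestamp
print(pd.Timestamp.min)
print(pd.Timestamp.max)

# The placeholder close date from the question, as a plain Python date
placeholder = dt.date(4712, 12, 31)

# Compare calendar dates only; Timestamp.date() drops the time part
is_out_of_range = placeholder > pd.Timestamp.max.date()
print(is_out_of_range)
```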

# this is not nice data - well past pandas.Timestamp.max
# let's get it as strings into a pandas DataFrame
data = """index, TRX_DATE,   ACTUAL_DATE_CLOSED
0,   1995-09-08,  4712-12-31 00:00:00
2,   2003-06-30,  4712-12-31 00:00:00
3,   2003-06-30,  4712-12-31 00:00:00
4,   2003-06-30,  4712-12-31 00:00:00
6,   1999-08-31,  2099-08-31 00:00:00
"""
import pandas as pd
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3
df = pd.read_csv(StringIO(data), header=0, sep=',', index_col=0, 
    skipinitialspace=True, dtype={'ACTUAL_DATE_CLOSED': object})

# convert to python datetime.date - will do in new columns
import datetime as dt
df['closed'] = [dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S').date()
    for x in df['ACTUAL_DATE_CLOSED']]
df['transaction'] = [dt.datetime.strptime(x, '%Y-%m-%d').date()
    for x in df['TRX_DATE']]

# find the difference between the two dates
df['difference'] = df['closed'] - df['transaction']
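
Since the question asks for the difference in number of days, one possible follow-up is to take the `.days` attribute of each `datetime.timedelta`. The sketch below is self-contained and uses two rows from the question. The column name `difference_days` is an assumption for illustration, and the subtraction is done in plain Python so that no value ever has to fit into a pandas Timestamp.

```python
import datetime as dt
import pandas as pd

# Two sample rows taken from the question, kept as strings
df = pd.DataFrame(
    {
        "TRX_DATE": ["1995-09-08", "1999-08-31"],
        "ACTUAL_DATE_CLOSED": ["4712-12-31 00:00:00", "2099-08-31 00:00:00"],
    },
    index=[0, 6],
)

# Parse both columns into plain datetime.date objects
closed = [dt.datetime.strptime(x, "%Y-%m-%d %H:%M:%S").date()
          for x in df["ACTUAL_DATE_CLOSED"]]
transaction = [dt.datetime.strptime(x, "%Y-%m-%d").date()
               for x in df["TRX_DATE"]]

# Subtracting two dates gives a datetime.timedelta; .days is the whole-day count
df["difference_days"] = [(c - t).days for c, t in zip(closed, transaction)]
print(df)
```

Swapping the operands in `c - t` gives the sign convention of the original attempt, TRX_DATE minus ACTUAL_DATE_CLOSED.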