Subtracting a fixed date from a column in Pandas

Time: 2016-07-23 05:26:19

Tags: python pandas

Consider:

In [99]: d = pd.to_datetime({'year':[2016], 'month':[6], 'day':[1]})
In [100]: d1 = pd.to_datetime({'year':[2016], 'month':[1], 'day':[1]})

In [101]: d - d1
Out[101]: 
0   152 days
dtype: timedelta64[ns]

But when I try to do this for an entire column, it gives me trouble. Consider:

df['Age'] = map(lambda x: x - pd.to_datetime({'year':[2016], 'month':[6], 'day':[1]}), df['Manager_DoB'])

df['Manager_DoB'] is a column of datetime objects. It raises the following error:

TypeError: can only operate on a datetime with a rhs of a timedelta/DateOffset for addition and subtraction, but the operator [__rsub__] was passed

1 answer:

Answer 0 (score: 2)

You don't need to use map*; you can subtract a Timestamp directly from a datetime column/Series:

In [11]: d = pd.to_datetime({'year':[2016], 'month':[6], 'day':[1]})

In [12]: d
Out[12]:
0   2016-06-01
dtype: datetime64[ns]

In [13]: d[0]  # This is the Timestamp you are actually interested in subtracting
Out[13]: Timestamp('2016-06-01 00:00:00')

In [14]: dates = pd.date_range(start="2016-01-01", periods=4)

In [15]: dates - d[0]
Out[15]: TimedeltaIndex(['-152 days', '-151 days', '-150 days', '-149 days'], dtype='timedelta64[ns]', freq=None)
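
Applied to a DataFrame column like the one in the question, this might look roughly as follows. This is only a sketch: the DataFrame and its dates are made up for illustration, and only the column name Manager_DoB comes from the question. The key step is pulling the scalar Timestamp out of the one-element Series before subtracting.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame (dates are invented)
df = pd.DataFrame(
    {"Manager_DoB": pd.to_datetime(["2016-01-01", "2016-01-02", "2016-01-03"])}
)

# to_datetime on a dict of year/month/day lists returns a one-element Series ...
ref_series = pd.to_datetime({"year": [2016], "month": [6], "day": [1]})
# ... so take the scalar Timestamp out of it before subtracting
ref = ref_series.iloc[0]

# Vectorized subtraction over the whole column, no map needed
df["Age"] = df["Manager_DoB"] - ref
print(df)
```

The subtraction direction (column minus fixed date) follows the question, so the resulting timedeltas are negative for dates before 2016-06-01.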

You can get the Timestamp more directly using its constructor:

In [21]: pd.Timestamp("2016-06-01")
Out[21]: Timestamp('2016-06-01 00:00:00')
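
Combining the constructor with a datetime Series, a minimal sketch could look like this. The variable names (dob, ref, delta, days) are illustrative only; the .dt.days step is an optional extra, assuming you want plain integer day counts rather than timedeltas.

```python
import pandas as pd

# A Series of datetimes, using the same dates as the example above
dob = pd.Series(pd.date_range(start="2016-01-01", periods=4))

# Build the fixed date directly as a scalar Timestamp
ref = pd.Timestamp("2016-06-01")

# Subtracting a scalar Timestamp from a datetime Series gives timedeltas
delta = dob - ref

# Optionally reduce the timedeltas to whole days as integers
days = delta.dt.days
print(days.tolist())  # [-152, -151, -150, -149]
```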

* You should never use Python's map with pandas; prefer .apply instead.
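
To illustrate the footnote, here is a small sketch comparing .apply with the plain vectorized subtraction. The data is invented for the example. Both should give the same values, so for this particular task the vectorized form is usually the simpler choice and .apply is only needed when no vectorized operation exists.

```python
import pandas as pd

s = pd.Series(pd.date_range(start="2016-01-01", periods=4))
ref = pd.Timestamp("2016-06-01")

# Vectorized: one operation over the whole Series
vectorized = s - ref

# Element-wise via .apply: the lambda receives one Timestamp at a time
via_apply = s.apply(lambda x: x - ref)

# Compare element by element
same = list(vectorized) == list(via_apply)
print(same)
```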