Convert 'yyyy-mm-dd hh:mm:ss' date strings to an integer (pandas, Python)

Time: 2013-12-19 18:32:33

Tags: python datetime pandas

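I have two strings in the format yyyy-mm-dd hh:mm:ss, each representing a datetime, and I need the difference between them as an integer. Since I want to do this for every row of a DataFrame object (built with pandas), I need a built-in function that does something like:

data['difference'] = somefunc(data['date1'],data['date2'])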

Does such a function exist? If I write my own function, how do I apply it to the DataFrame columns?

Thanks in advance!

2 answers:

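Answer 0: (score: 0)

See this link: http://docs.python.org/2/library/time.html?highlight=strptime Basically, you can parse each string into a struct_time value and then read its fields through attributes (tm_hour, tm_min ...).

Check the example for time.strptime.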
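As a minimal sketch of that approach (not from the linked docs), the snippet below parses two strings with time.strptime, reads a couple of struct_time fields, and turns the pair into an integer number of seconds. The helper name seconds_between and the sample timestamps are made up for illustration, and calendar.timegm is my own choice here: it treats both values as UTC, so the result ignores local time zones and daylight saving.

```python
import calendar
import time

FMT = "%Y-%m-%d %H:%M:%S"  # matches 'yyyy-mm-dd hh:mm:ss'


def seconds_between(start, end, fmt=FMT):
    """Hypothetical helper: whole seconds from `start` to `end`."""
    start_struct = time.strptime(start, fmt)
    end_struct = time.strptime(end, fmt)
    # timegm interprets each struct_time as UTC and returns an int.
    return calendar.timegm(end_struct) - calendar.timegm(start_struct)


parsed = time.strptime("2013-12-19 18:32:33", FMT)
hour, minute = parsed.tm_hour, parsed.tm_min  # individual fields

diff = seconds_between("2013-12-19 18:32:33", "2013-12-20 18:32:34")
print(hour, minute, diff)
```

This works one pair of strings at a time, so on a DataFrame it would have to be applied row by row, which is slower than a vectorized pandas operation.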

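Answer 1: (score: 0)

This requires numpy >= 1.7 and is written for pandas 0.13 (to be released soon). See the docs. The session below assumes `import pandas as pd` and `from pandas import DataFrame, Timestamp`.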

In [3]: df = DataFrame(dict(A = Timestamp('20130101'), B = Timestamp('20130101')+ pd.to_timedelta(list(range(5)),unit='D')))

In [4]: df
Out[4]: 
                    A                   B
0 2013-01-01 00:00:00 2013-01-01 00:00:00
1 2013-01-01 00:00:00 2013-01-02 00:00:00
2 2013-01-01 00:00:00 2013-01-03 00:00:00
3 2013-01-01 00:00:00 2013-01-04 00:00:00
4 2013-01-01 00:00:00 2013-01-05 00:00:00

[5 rows x 2 columns]

In [5]: df.dtypes
Out[5]: 
A    datetime64[ns]
B    datetime64[ns]
dtype: object

In [6]: df['C'] = df['B']-df['A']

In [7]: df
Out[7]: 
                    A                   B                C
0 2013-01-01 00:00:00 2013-01-01 00:00:00         00:00:00
1 2013-01-01 00:00:00 2013-01-02 00:00:00 1 days, 00:00:00
2 2013-01-01 00:00:00 2013-01-03 00:00:00 2 days, 00:00:00
3 2013-01-01 00:00:00 2013-01-04 00:00:00 3 days, 00:00:00
4 2013-01-01 00:00:00 2013-01-05 00:00:00 4 days, 00:00:00

[5 rows x 3 columns]

In [8]: df.dtypes
Out[8]: 
A     datetime64[ns]
B     datetime64[ns]
C    timedelta64[ns]
dtype: object

In [9]: df['C'].astype('timedelta64[s]')
Out[9]: 
0         0
1     86400
2    172800
3    259200
4    345600
Name: C, dtype: float64

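In 0.12 you can do this (assuming `import numpy as np`, `from datetime import timedelta`, and `from pandas import DataFrame, Series, Timestamp`):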

In [1]: df = DataFrame(dict(A = Timestamp('20130101'), B = [Timestamp('20130101')+timedelta(days=i) for i in range(5) ]))

In [2]: df['C'] = df['B']-df['A']

In [3]: Series(df['C'].values / np.timedelta64(1,'s'))
Out[3]: 
0         0
1     86400
2    172800
3    259200
4    345600
dtype: float64
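The sessions above start from Timestamp columns, while the question starts from strings. A rough sketch of the full path, assuming a reasonably recent pandas where pd.to_datetime and the .dt.total_seconds() accessor are available, might look like the following. The column names date1, date2 and difference come from the question; the sample values and the diff_seconds helper are invented for illustration.

```python
import pandas as pd

# Sample data: made-up strings in 'yyyy-mm-dd hh:mm:ss' format.
data = pd.DataFrame({
    "date1": ["2013-01-01 00:00:00", "2013-01-01 12:00:00"],
    "date2": ["2013-01-02 00:00:00", "2013-01-01 12:00:30"],
})

fmt = "%Y-%m-%d %H:%M:%S"
start = pd.to_datetime(data["date1"], format=fmt)
end = pd.to_datetime(data["date2"], format=fmt)

# Vectorized: subtracting gives a timedelta column, then convert to int seconds.
data["difference"] = (end - start).dt.total_seconds().astype("int64")


def diff_seconds(row):
    """Hypothetical custom function applied to one row at a time."""
    delta = pd.Timestamp(row["date2"]) - pd.Timestamp(row["date1"])
    return int(delta.total_seconds())


# Row-wise alternative, answering "how do I apply my own function?".
data["difference_apply"] = data.apply(diff_seconds, axis=1)

print(data)
```

The vectorized version is usually the better choice; the apply variant is shown only because the question asks how to use a custom function on DataFrame columns.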