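I need to convert the difference between two strings in the format yyyy-mm-dd hh:mm:ss, which represent datetimes, into an integer. Because I want to do this for every index (row) of a DataFrame object (built with pandas), I need a built-in function that does something like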
data['difference'] = somefunc(data['date1'],data['date2'])
Does such a function exist? If I build my own function, how do I apply it to the DataFrame columns?
Thanks in advance!
Answer 0: (score: 0)
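See this link: http://docs.python.org/2/library/time.html?highlight=strptime Basically, you can parse each string into a struct_time variable and then access the values through its attributes (tm_hour, tm_min, ...).
Check the examples for time.strptime.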
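As a minimal sketch of that suggestion, the snippet below parses two strings with time.strptime, reads a few struct_time attributes, and turns the pair into a difference in whole seconds. The helper name seconds_between and the use of calendar.timegm (which treats both values as UTC) are assumptions made for illustration; they are not part of the answer above.

```python
import calendar
import time

FMT = "%Y-%m-%d %H:%M:%S"  # matches strings like yyyy-mm-dd hh:mm:ss


def seconds_between(start, end, fmt=FMT):
    """Hypothetical helper: return (end - start) in whole seconds."""
    t_start = time.strptime(start, fmt)
    t_end = time.strptime(end, fmt)
    # calendar.timegm interprets the struct_time as UTC, so no
    # daylight-saving shift is applied to either value.
    return calendar.timegm(t_end) - calendar.timegm(t_start)


parsed = time.strptime("2013-01-02 03:04:05", FMT)
print(parsed.tm_hour, parsed.tm_min, parsed.tm_sec)  # 3 4 5
print(seconds_between("2013-01-01 00:00:00", "2013-01-02 00:00:00"))  # 86400
```

This works one pair of strings at a time, so on a large DataFrame it would be called once per row; a vectorized pandas operation is likely to be faster.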
Answer 1: (score: 0)
This requires numpy >= 1.7 and is written for pandas 0.13 (to be released soon). See the documentation.
In [1]: import pandas as pd
In [2]: from pandas import DataFrame, Timestamp
In [3]: df = DataFrame(dict(A = Timestamp('20130101'), B = Timestamp('20130101')+ pd.to_timedelta(list(range(5)),unit='D')))
In [4]: df
Out[4]:
                    A                   B
0 2013-01-01 00:00:00 2013-01-01 00:00:00
1 2013-01-01 00:00:00 2013-01-02 00:00:00
2 2013-01-01 00:00:00 2013-01-03 00:00:00
3 2013-01-01 00:00:00 2013-01-04 00:00:00
4 2013-01-01 00:00:00 2013-01-05 00:00:00
[5 rows x 2 columns]
In [5]: df.dtypes
Out[5]:
A datetime64[ns]
B datetime64[ns]
dtype: object
In [6]: df['C'] = df['B']-df['A']
In [7]: df
Out[7]:
                    A                   B                C
0 2013-01-01 00:00:00 2013-01-01 00:00:00         00:00:00
1 2013-01-01 00:00:00 2013-01-02 00:00:00 1 days, 00:00:00
2 2013-01-01 00:00:00 2013-01-03 00:00:00 2 days, 00:00:00
3 2013-01-01 00:00:00 2013-01-04 00:00:00 3 days, 00:00:00
4 2013-01-01 00:00:00 2013-01-05 00:00:00 4 days, 00:00:00
[5 rows x 3 columns]
In [8]: df.dtypes
Out[8]:
A datetime64[ns]
B datetime64[ns]
C timedelta64[ns]
dtype: object
In [9]: df['C'].astype('timedelta64[s]')
Out[9]:
0 0
1 86400
2 172800
3 259200
4 345600
Name: C, dtype: float64
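The session above reflects pandas 0.13. In more recent pandas releases the astype('timedelta64[s]') call may return a timedelta column rather than floats, so here is a hedged sketch of the same computation using the .dt.total_seconds() accessor instead. It assumes a reasonably current pandas and builds column B with pd.date_range purely for brevity.

```python
import pandas as pd

# Same shape of data as in the session above: a fixed start date in A
# and five consecutive days in B.
df = pd.DataFrame({
    "A": pd.Timestamp("2013-01-01"),
    "B": pd.date_range("2013-01-01", periods=5, freq="D"),
})

# Subtracting two datetime columns yields a timedelta column.
df["C"] = df["B"] - df["A"]

# total_seconds() converts each timedelta to a float number of seconds.
seconds = df["C"].dt.total_seconds()
print(seconds.tolist())  # [0.0, 86400.0, 172800.0, 259200.0, 345600.0]
```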
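In 0.12 you can do this (assuming `from datetime import timedelta`, `import numpy as np` and `from pandas import DataFrame, Series, Timestamp` have already been run):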
In [1]: df = DataFrame(dict(A = Timestamp('20130101'), B = [Timestamp('20130101')+timedelta(days=i) for i in range(5) ]))
In [2]: df['C'] = df['B']-df['A']
In [3]: Series(df['C'].values / np.timedelta64(1,'s'))
Out[3]:
0 0
1 86400
2 172800
3 259200
4 345600
dtype: float64
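Both sessions start from Timestamp columns, whereas the question has strings. As a sketch of how the somefunc from the question might look, the helper below parses the two string columns with pd.to_datetime and then reuses the division by np.timedelta64(1, 's') shown above. The name seconds_between_columns, the sample data, and the final cast to int64 are assumptions for illustration, and a recent pandas is assumed.

```python
import numpy as np
import pandas as pd


def seconds_between_columns(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Hypothetical helper: element-wise (end - start) in whole seconds."""
    delta = pd.to_datetime(end, format=fmt) - pd.to_datetime(start, format=fmt)
    # Dividing by a one-second timedelta gives a float column of seconds;
    # the cast to int64 assumes there are no missing values.
    return (delta / np.timedelta64(1, "s")).astype("int64")


# Made-up sample data using the column names from the question.
data = pd.DataFrame({
    "date1": ["2013-01-01 00:00:00", "2013-01-01 12:00:00"],
    "date2": ["2013-01-02 00:00:00", "2013-01-01 12:30:15"],
})
data["difference"] = seconds_between_columns(data["date1"], data["date2"])
print(data["difference"].tolist())  # [86400, 1815]
```

Because the whole column is processed at once, no explicit apply over the rows is needed. If the columns can contain unparseable or missing dates, the integer cast would fail and the float result is the safer thing to keep.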