I have a pandas DataFrame with a column like this:
In [96]: data['difference']
Out[96]:
0 NaT
1 1 days 21:34:30
2 0 days 16:57:36
3 0 days 00:16:51
4 0 days 15:52:38
5 0 days 14:19:34
6 0 days 02:54:46
7 1 days 04:21:28
8 0 days 01:58:55
9 0 days 10:30:35
10 0 days 07:53:04
....
Name: difference, dtype: timedelta64[ns]
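I want to create an integer column next to it that holds the number of days from each value in this column.
Answer 0 (score: 8)
This converts your timedelta64[ns] column to float64 values representing days: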
data['difference'].astype('timedelta64[D]')
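Newer pandas releases may reject a cast to day resolution, so here is a hedged alternative sketch that reaches a similar float64 result by dividing by a one-day Timedelta and flooring. The sample values are copied from the question. The `days` column name is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Small sample mirroring the question's column (None becomes NaT).
data = pd.DataFrame({
    "difference": pd.to_timedelta([None, "1 days 21:34:30", "0 days 16:57:36"])
})

# True division by a one-day Timedelta gives fractional days as float64;
# NaT turns into NaN.
fractional_days = data["difference"] / pd.Timedelta(days=1)

# Floor to whole days. The column stays float64 because NaN cannot be
# stored in a plain integer column.
data["days"] = np.floor(fractional_days)
```

The result is still a float column. The missing value is what prevents a plain int64 dtype here.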
Answer 1 (score: 4)
You can use dt.days to extract the days from the Series,
df.difference
Out[117]:
0 -1 days +00:00:05
1 NaT
2 -1 days +00:00:05
3 1 days 00:00:00
dtype: timedelta64[ns]
df.difference.dt.days
Out[118]:
0 -1
1 NaN
2 -1
3 1
dtype: float64
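Since the question asks for an integer column and NaT forces dt.days to float64, one possible way to get integers is pandas' nullable "Int64" dtype. This is a sketch under that assumption, not part of the original answer. The `days` column name is hypothetical.

```python
import pandas as pd

# Sample values similar to the session above, including a missing entry.
df = pd.DataFrame({
    "difference": pd.Series([
        pd.Timedelta(days=-1, seconds=5),
        pd.NaT,
        pd.Timedelta(days=1),
    ])
})

# dt.days gives the day component; casting to the nullable Int64 dtype
# keeps the missing entry as <NA> instead of forcing floats.
df["days"] = df["difference"].dt.days.astype("Int64")
```

If the column is known to contain no NaT, a plain `.astype(int)` on `dt.days` should also work.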
and dt.components to extract all the other components,
dr
Out[93]:
0 -1 days +00:00:05
1 NaT
2 1 days 02:04:05
3 1 days 00:00:00
dtype: timedelta64[ns]
dr.dt.components
Out[95]:
days hours minutes seconds milliseconds microseconds nanoseconds
0 -1 0 0 5 0 0 0
1 NaN NaN NaN NaN NaN NaN NaN
2 1 2 4 5 0 0 0
3 1 0 0 0 0 0 0
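As a small illustration, dt.components returns an ordinary DataFrame, so a single component can be picked out by column name. The variable names `parts` and `hours` below are made up for this sketch.

```python
import pandas as pd

# Two sample values: one full timedelta and one missing entry.
dr = pd.Series([
    pd.Timedelta(days=1, hours=2, minutes=4, seconds=5),
    pd.NaT,
])

# One row per element, one column per component.
parts = dr.dt.components

# Select a single component like any other DataFrame column.
hours = parts["hours"]
```

Rows that were NaT come back as NaN in every component column.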
Answer 2 (score: 1)
If you can get the output as a string, Python's re module should help.
import re
','.join(re.findall(r'(-?[0-9]+) days', output))
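The string-based idea can be sketched as follows, assuming `output` holds the printed Series as text (the literal below is copied from the display earlier in this thread). Capturing the whole number, including an optional minus sign, keeps multi-digit and negative day counts intact.

```python
import re

# Printed form of a timedelta Series, assumed to be available as text.
output = "0 -1 days +00:00:05\n1 NaT\n2 1 days 02:04:05"

# Capture the (possibly negative) number immediately before " days".
day_counts = [int(m) for m in re.findall(r"(-?\d+) days", output)]
```

NaT rows produce no match, so the resulting list can be shorter than the Series. Working on the timedelta values directly is usually more robust than parsing their text.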