How do I split date and time from a string?

Time: 2018-02-06 07:23:51

Tags: python python-3.x pandas data-analysis

I have a dataset that I imported from Excel with pandas. One column holds the date and time as strings:

16-MAR-16 11.35.27.000000000 AM
05-APR-16 05.21.14.000000000 PM
16-FEB-16 09.56.36.000000000 AM
16-MAR-16 11.35.27.000000000 AM
16-MAR-16 09.28.11.000000000 AM
19-MAY-16 03.50.38.000000000 PM

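I want to split this data into separate date and time values. I have gone through several similar questions but could not find an answer.

This is the code I tried: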

(1)df["timestamp"] = pd.to_datetime(df['Invoice Date'],dayfirst = True)

(Error)File "C:\Users\admin\Anaconda3\lib\site-packages\dateutil\parser.py", line 559, in parse
raise ValueError("Unknown string format")
ValueError: Unknown string format


(2)from datetime import datetime

df["timestamp"] = df["Invoice Date"].apply(lambda x: datetime.strptime(x,"dd-mmm-yy hh.mm.ss.%f aa"))

(Error)  ValueError: time data '16-MAR-16 11.35.27.000000000 AM' does not match format 'dd-mmm-yy hh.mm.ss.%f aa'

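Please help me.

1 answer:

Answer 0: (score: 4)

Use the format parameter (reference). The format string has to be built from strptime directives such as %d and %b; a pattern like dd-mmm-yy is not recognised, which is why the second attempt failed.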

# %I (12-hour clock) is needed for %p to take effect; with %H the AM/PM marker is ignored
df["timestamp"] = pd.to_datetime(df['Invoice Date'], format='%d-%b-%y %I.%M.%S.%f %p')
print(df)

                      Invoice Date           timestamp
0  16-MAR-16 11.35.27.000000000 AM 2016-03-16 11:35:27
1  05-APR-16 05.21.14.000000000 PM 2016-04-05 17:21:14
2  16-FEB-16 09.56.36.000000000 AM 2016-02-16 09:56:36
3  16-MAR-16 11.35.27.000000000 AM 2016-03-16 11:35:27
4  16-MAR-16 09.28.11.000000000 AM 2016-03-16 09:28:11
5  19-MAY-16 03.50.38.000000000 PM 2016-05-19 15:50:38
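
To check how the hour directive interacts with %p, here is a small self-contained sketch that parses one of the PM rows twice, once with %I and once with %H. The names sample, with_I and with_H are only for illustration and are not part of the original data.

```python
import pandas as pd

# One PM row from the question, used as a hypothetical test value
sample = pd.Series(["05-APR-16 05.21.14.000000000 PM"])

# %I = hour on a 12-hour clock, so %p (AM/PM) shifts the hour
with_I = pd.to_datetime(sample, format="%d-%b-%y %I.%M.%S.%f %p")

# %H = hour on a 24-hour clock, so %p is parsed but does not change the hour
with_H = pd.to_datetime(sample, format="%d-%b-%y %H.%M.%S.%f %p")

print(with_I.dt.hour.iloc[0])  # 17
print(with_H.dt.hour.iloc[0])  # 5
```

If the afternoon rows in your result still show morning hours, the hour directive is the first thing to check.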

If you want two separate columns for the date and the time:

# %I (12-hour clock) is needed for %p to take effect; with %H the AM/PM marker is ignored
d = pd.to_datetime(df['Invoice Date'], format='%d-%b-%y %I.%M.%S.%f %p')
df['date'] = d.dt.date  # column of datetime.date objects
df['time'] = d.dt.time  # column of datetime.time objects
print(df)

                      Invoice Date        date      time
0  16-MAR-16 11.35.27.000000000 AM  2016-03-16  11:35:27
1  05-APR-16 05.21.14.000000000 PM  2016-04-05  17:21:14
2  16-FEB-16 09.56.36.000000000 AM  2016-02-16  09:56:36
3  16-MAR-16 11.35.27.000000000 AM  2016-03-16  11:35:27
4  16-MAR-16 09.28.11.000000000 AM  2016-03-16  09:28:11
5  19-MAY-16 03.50.38.000000000 PM  2016-05-19  15:50:38
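
Note that dt.date and dt.time give object columns holding plain Python date and time objects. If you would rather have strings, or keep a datetime64 column for the day, something like the following sketch may be more convenient. The column names date_str, time_str and day are made up for this example, and the two-row frame is just a stand-in for the real data.

```python
import pandas as pd

# Two rows from the question as stand-in data
df = pd.DataFrame({
    "Invoice Date": [
        "16-MAR-16 11.35.27.000000000 AM",
        "05-APR-16 05.21.14.000000000 PM",
    ]
})

# Parse once, using the 12-hour directive %I together with %p
d = pd.to_datetime(df["Invoice Date"], format="%d-%b-%y %I.%M.%S.%f %p")

# String columns, handy for export or display
df["date_str"] = d.dt.strftime("%Y-%m-%d")
df["time_str"] = d.dt.strftime("%H:%M:%S")

# datetime64 column with the time part reset to midnight,
# which still supports date arithmetic and grouping
df["day"] = d.dt.normalize()

print(df[["date_str", "time_str", "day"]])
```

Which form is best depends on what you do next: strings for writing back to Excel as text, the normalized column for grouping by day.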