I have dates in these formats:
Thursday, September 22, 2016 at 11:04am UTC+02
Monday, January 22, 2018 at 6:46pm CST
...
I want to convert them to UNIX timestamps. This pattern works, but it ignores the time zone:
timestamp = pd.to_datetime(date, format='%A, %B %d, %Y at %H:%M%p', exact=False)
I don't know how to take the time zone into account ("UTC+02", "CST").
This does not work:
timestamp = pd.to_datetime(date, format='%A, %B %d, %Y at %H:%M%p %Z')
# ValueError: unconverted data remains: +02
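A side note on the hour directive in both patterns: in Python's own strptime, %p only changes the hour when the hour is read with %I, so %H combined with %p leaves "6:46pm" at hour 6. The sketch below shows this with the standard library only. pandas.to_datetime uses the same directive names, so the same pitfall likely applies there, but that is an assumption that has not been checked here.

```python
from datetime import datetime

text = "Monday, January 22, 2018 at 6:46pm"

# %H reads a 24-hour value, so the "pm" matched by %p does not shift the hour.
with_24h = datetime.strptime(text, "%A, %B %d, %Y at %H:%M%p")

# %I reads a 12-hour value and lets %p select morning or afternoon.
with_12h = datetime.strptime(text, "%A, %B %d, %Y at %I:%M%p")

print(with_24h.hour, with_12h.hour)  # 6 18
```

Both calls assume an English locale, since %A and %B match locale-specific day and month names.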
Answer 0 (score: 0)
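The error
# ValueError: unconverted data remains: +02
occurs because strptime has to consume the entire date string, and your format leaves out the %z part that would match the +02 offset. However, %z will not parse this value either: it is not supported by strptime on Python 2 (see ISO to datetime object: 'z' is a bad directive), and on Python 3 it expects an offset that includes minutes, such as +0200, not +02.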
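One way around this on Python 3 is to rewrite the trailing zone token into the +HHMM form that %z accepts before calling strptime. The following is a minimal sketch, not a definitive implementation: normalize_zone, to_unix and the ABBREVIATION_OFFSETS table are hypothetical names introduced for illustration, the table uses fixed offsets that ignore daylight saving time, and "UTC+02" is assumed to mean two hours ahead of UTC.

```python
import re
from datetime import datetime

# Hypothetical abbreviation table: fixed offsets, no daylight saving time.
ABBREVIATION_OFFSETS = {"UTC": "+0000", "CST": "-0600", "EST": "-0500"}

# %I (12-hour clock) is used so that %p actually selects am or pm.
FORMAT = "%A, %B %d, %Y at %I:%M%p %z"


def normalize_zone(text):
    """Rewrite the trailing zone token into the +HHMM form that %z accepts."""
    head, zone = text.rsplit(" ", 1)
    match = re.fullmatch(r"(?:UTC|GMT)([+-])(\d{1,2})", zone)
    if match:
        sign, hours = match.groups()
        # Assumption: "UTC+02" means two hours ahead of UTC.
        return f"{head} {sign}{int(hours):02d}00"
    # Raises KeyError for abbreviations missing from the table.
    return f"{head} {ABBREVIATION_OFFSETS[zone]}"


def to_unix(text):
    """Parse the full string, offset included, and return a UNIX timestamp."""
    return int(datetime.strptime(normalize_zone(text), FORMAT).timestamp())


print(to_unix("Thursday, September 22, 2016 at 11:04am UTC+02"))  # 1474535040
print(to_unix("Monday, January 22, 2018 at 6:46pm CST"))          # 1516668360
```

Because the rewritten string carries an explicit offset, strptime returns a timezone-aware datetime and timestamp() does not depend on the machine's local zone.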
So perhaps you could map a parser over your data instead:
# requires: import dateutil.parser
timestamp = date.map(lambda x: dateutil.parser.parse(x))
Answer 1 (score: 0)
I know you asked for a Pandas solution, but dateutil parses your strings, including the time zone part:
import dateutil.parser
from dateutil.tz import gettz

samples = ['Thursday, September 22, 2016 at 11:04am UTC+02',
           'Monday, January 22, 2018 at 6:46pm CST']

# American time zone abbreviations mapped to IANA zone names
# (IANA names use underscores, e.g. 'America/Los_Angeles')
tzinfos = {'HAST': gettz('Pacific/Honolulu'),
           'AKST': gettz('America/Anchorage'),
           'PST': gettz('America/Los_Angeles'),
           'MST': gettz('America/Phoenix'),
           'CST': gettz('America/Chicago'),
           'EST': gettz('America/New_York'),
           }

for s in samples:
    # fuzzy=True lets the parser skip tokens it does not recognize
    parsed = dateutil.parser.parse(s, fuzzy=True, tzinfos=tzinfos)
    print(s, '->', parsed)
Output:
Thursday, September 22, 2016 at 11:04am UTC+02 -> 2016-09-22 11:04:00-02:00
Monday, January 22, 2018 at 6:46pm CST -> 2018-01-22 18:46:00-06:00
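Since the question asks for UNIX timestamps, the aware datetimes still need a final conversion with timestamp(). The sketch below does this with the standard library, re-reading the two printed values through datetime.fromisoformat (Python 3.7+) so that it runs without dateutil. It also highlights a detail of the output above: the first result carries the offset -02:00 although the input says UTC+02. If UTC+02 was meant as two hours ahead of UTC, which is an assumption about the data, the resulting timestamp is off by four hours and the sign needs to be flipped before converting.

```python
from datetime import datetime

# The two values printed above, copied verbatim.
printed = ["2016-09-22 11:04:00-02:00", "2018-01-22 18:46:00-06:00"]

# timestamp() on an aware datetime is independent of the local time zone.
timestamps = [int(datetime.fromisoformat(p).timestamp()) for p in printed]
print(timestamps)  # [1474549440, 1516668360]

# The same wall-clock time read as two hours ahead of UTC instead.
intended = int(datetime.fromisoformat("2016-09-22 11:04:00+02:00").timestamp())
print(timestamps[0] - intended)  # 14400 seconds, i.e. four hours
```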