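I have a series of CSV files that contain a column of strings in Python datetime tuple format. While parsing the CSV files (which can be tens of thousands of lines long), I want to convert the date column from strings into actual datetime objects.
Example CSV row: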
['0', '(2011, 12, 11, 15, 45, 20)', 'Arduino/libraries/dallas-temperature-control/'],
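As you can see, the date is stored in the CSV in the same form as datetime constructor arguments, but as a string.
I'm looking for a fast way to construct the datetime object without running it through datetime.strptime(row[1], "(%Y, %m, %d, %H, %M, %S)") - it seems counterintuitive to interpret the date with strptime format directives when it is ready to be dropped in as is.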
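Answer 0: (score: 4)
As @jonrhsarpe says in his answer, you can use ast.literal_eval to convert the string to a tuple and then unpack it into the datetime constructor.
But based on the tests below, the faster approach still seems to be datetime.datetime.strptime(). Example -
Code -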
import datetime
import ast

def func1(datestring):
    return datetime.datetime(*ast.literal_eval(datestring))

def func2(datestring):
    return datetime.datetime.strptime(datestring, '(%Y, %m, %d, %H, %M, %S)')
Timing information -
In [39]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 30.1 µs per loop
In [40]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 26.9 µs per loop
In [41]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 38.6 µs per loop
In [42]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 28.8 µs per loop
In [43]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 31.2 µs per loop
In [44]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 29.5 µs per loop
In [45]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
The slowest run took 5.51 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 32.6 µs per loop
In [46]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
The slowest run took 15.42 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 27.5 µs per loop
In [47]: %timeit func1("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 49.2 µs per loop
In [48]: %timeit func2("(2011, 12, 11, 15, 45, 20)")
10000 loops, best of 3: 24.4 µs per loop
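The figures above come from IPython's %timeit and vary noticeably between runs. To repeat the comparison outside IPython, a minimal sketch using the standard timeit module might look like the following. The names FORMAT, SAMPLE and compare are illustrative and not part of the original answer, and the numbers you get will depend on your machine and Python version.

```python
import ast
import datetime
import timeit

# Illustrative constants (assumed names, not from the original answer).
FORMAT = "(%Y, %m, %d, %H, %M, %S)"
SAMPLE = "(2011, 12, 11, 15, 45, 20)"


def func1(datestring):
    # Evaluate the string as a tuple literal, then unpack it into datetime().
    return datetime.datetime(*ast.literal_eval(datestring))


def func2(datestring):
    # Parse the string directly with a format that matches the tuple layout.
    return datetime.datetime.strptime(datestring, FORMAT)


def compare(number=10000, repeat=3):
    """Return the best per-call time in microseconds for each function."""
    results = {}
    for func in (func1, func2):
        timer = timeit.Timer(lambda: func(SAMPLE))
        best = min(timer.repeat(repeat=repeat, number=number))
        results[func.__name__] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, micros in compare().items():
        print("{}: {:.1f} us per call".format(name, micros))
```

Taking the minimum over several repeats mirrors the "best of 3" that %timeit reports.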
I'm not sure where you got the idea that datetime.datetime.strptime() is counterintuitive, but I would say that to parse a string into a datetime object, you should use strptime().
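Applied to the CSV files from the question, the strptime() approach could be sketched as below. This assumes the date is the second field and is quoted in the file so that its embedded commas survive CSV parsing. The helper name read_rows, the date_column parameter and the in-memory sample are hypothetical and only for illustration.

```python
import csv
import datetime
import io

DATE_FORMAT = "(%Y, %m, %d, %H, %M, %S)"


def read_rows(handle, date_column=1):
    """Yield CSV rows with the date column converted to a datetime object."""
    for row in csv.reader(handle):
        row[date_column] = datetime.datetime.strptime(row[date_column], DATE_FORMAT)
        yield row


# In-memory stand-in for a real file (assumed layout: quoted date field).
sample = io.StringIO(
    '0,"(2011, 12, 11, 15, 45, 20)",Arduino/libraries/dallas-temperature-control/\n'
)
rows = list(read_rows(sample))
print(rows[0][1])
```

Because read_rows is a generator, a file with tens of thousands of lines is converted one row at a time rather than loaded all at once.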
Answer 1: (score: 3)
You can use ast.literal_eval to convert the string to a tuple of integers:
>>> import ast
>>> ast.literal_eval('(2011, 12, 11, 15, 45, 20)')
(2011, 12, 11, 15, 45, 20)
You can then unpack this (see e.g. What does ** (double star) and * (star) do for parameters?) directly into the datetime constructor:
>>> import datetime
>>> datetime.datetime(*ast.literal_eval('(2011, 12, 11, 15, 45, 20)'))
datetime.datetime(2011, 12, 11, 15, 45, 20)
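One thing to keep in mind with this approach is that ast.literal_eval accepts any Python literal, not only date tuples, so a malformed cell can fail in several ways. A minimal sketch of a defensive wrapper follows. The function name parse_tuple_datetime and the choice to return None on failure are assumptions made for illustration, not part of the answer.

```python
import ast
import datetime


def parse_tuple_datetime(datestring):
    """Return a datetime built from a tuple-literal string, or None if invalid."""
    try:
        return datetime.datetime(*ast.literal_eval(datestring))
    except (ValueError, SyntaxError, TypeError):
        # ValueError: malformed literal or out-of-range field (e.g. month 13).
        # SyntaxError: the string is not valid Python syntax.
        # TypeError: the literal is not a sequence of integers, or has too few fields.
        return None


print(parse_tuple_datetime('(2011, 12, 11, 15, 45, 20)'))
print(parse_tuple_datetime('(2011, 13, 1)'))
```

Whether to skip, log, or re-raise on a bad row depends on how clean the CSV files are expected to be.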