I used Spyder's profiler on a Python script that processes 700,000 rows of data, and the time.strptime calls take over 60 s (the built-in sort takes only 11 s).
How can I make this faster? Is there a more efficient time-handling module?
The core code is here:
import time

data = []
with open('big_data_out.txt') as fr:
    for line in fr:  # iterate the file directly; no need to load it all with readlines()
        curLine = line.strip().split(',')
        curLine[2] = time.strptime(curLine[2], '%Y-%m-%d-%H:%M:%S')
        curLine[5] = time.strptime(curLine[5], '%Y-%m-%d-%H:%M:%S')
        # print(curLine)
        data.append(curLine)

data.sort(key=lambda l: (l[2], l[5], l[7]))
# print(data)

result = []
for itm in data:
    if itm[2] >= start_time and itm[5] <= end_time and itm[1] == cameraID1 and itm[4] == cameraID2:
        result.append(itm)
Answer 0 (score: 0)

From the answer given here: A faster strptime?
>>> timeit.timeit("time.strptime(\"2015-02-04 04:05:12\", \"%Y-%m-%d %H:%M:%S\")", setup="import time")
17.206257617290248
>>> timeit.timeit("datetime.datetime(*map(int, \"2015-02-04 04:05:12\".replace(\":\", \"-\").replace(\" \", \"-\").split(\"-\")))", setup="import datetime")
4.687687893159023