Searching a text file within a time range - python

Time: 2016-08-12 19:25:24

Tags: python-2.7

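I'm playing around with Python and trying to find a way to search a text file for a specific word within a certain time range. The file has timestamps, but because it is a text file, everything is a string.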

The text file contains the following:

17:14:26.442 words words words words words

17:15:32.533 words words words words words

17:16:26.442 words words words words words

17:17:32.533 words words words words words

17:18:26.442 words words words words words

17:19:32.533 words words words words words

17:20:26.442 words words words words words

17:21:32.533 words words words words words

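What I want to do is search for a word within a time range and return only the words between 17:17:32.533 and 17:20:26.442. However, because it is a text document and everything is a string, I can't use range operations. Does anyone have suggestions on how to do this?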

2 answers:

Answer 0: (score: 1)

Use the datetime module to parse the timestamp strings into datetime objects; you can then use comparisons to check which lines fall within the time range.

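from datetime import datetime as dt

# Parse the range bounds once, outside the loop.
start = dt.strptime('17:17:32.533', '%H:%M:%S.%f')
end = dt.strptime('17:20:26.442', '%H:%M:%S.%f')
word_to_search = 'word'
with open('sample.txt', 'r') as f:
    for line in f:
        # The first whitespace-separated field is the timestamp.
        ts = dt.strptime(line.split()[0], '%H:%M:%S.%f')
        # Strict comparison: lines stamped exactly at start or end are excluded.
        if start < ts < end and word_to_search in line:
            print(line.rstrip())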
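
The same idea can be wrapped in a function so that it works on any iterable of lines and tolerates blank or malformed lines. This is only a sketch: the name search_in_range, the inclusive bounds, and the decision to skip unparseable lines are assumptions made for illustration, not part of the answer above.

```python
from datetime import datetime

TIME_FORMAT = '%H:%M:%S.%f'


def search_in_range(lines, start, end, word):
    """Yield lines containing `word` whose timestamp lies in [start, end]."""
    start_ts = datetime.strptime(start, TIME_FORMAT)
    end_ts = datetime.strptime(end, TIME_FORMAT)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line: nothing to parse
        try:
            ts = datetime.strptime(fields[0], TIME_FORMAT)
        except ValueError:
            continue  # first field is not a timestamp in the expected format
        # Inclusive bounds (an assumption; use < instead of <= to exclude them).
        if start_ts <= ts <= end_ts and word in line:
            yield line.rstrip('\n')


sample = [
    '17:16:26.442 words words words words words\n',
    '\n',
    '17:17:32.533 words words words words words\n',
    '17:18:26.442 words words words words words\n',
    '17:19:32.533 words words words words words\n',
    '17:20:26.442 words words words words words\n',
    '17:21:32.533 words words words words words\n',
]
matches = list(search_in_range(sample, '17:17:32.533', '17:20:26.442', 'words'))
for match in matches:
    print(match)
```

An open file object can be passed as `lines` directly, since iterating over a file yields its lines.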

Answer 1: (score: 0)

If the timestamps are exactly in the format you describe (HH:MM:SS.sss), you can compare them directly as strings:

start = '17:17:32.533'
end = '17:20:26.442'
with open(filename, 'r') as f:
    for line in f:
        if line[:12] >= start and line[:12] <= end:
            print(line)
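
A short illustration of why the direct comparison works: Python compares strings character by character, and for fixed-width, zero-padded timestamps that order coincides with chronological order. The sample values below are made up for the demonstration.

```python
# Fixed-width, zero-padded timestamps sort chronologically as plain strings.
padded = ['17:20:26.442', '09:05:01.000', '17:17:32.533']
sorted_padded = sorted(padded)
print(sorted_padded)

# A chained comparison on strings gives the in-range check.
in_range = '17:17:32.533' <= '17:18:26.442' <= '17:20:26.442'
print(in_range)

# Without zero padding the order breaks: '9' sorts after '1',
# so 9 o'clock appears to come after 17 o'clock.
unpadded_is_earlier = '9:05:01.000' < '17:17:32.533'
print(unpadded_is_earlier)
```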

If this doesn't work because, for example, 01:01:01.000 is written as 1:1:1.0, you have to parse the timestamps first. For example:

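import datetime

# The fourth argument of datetime.time is microseconds, so .533 s is 533000.
start = datetime.time(17, 17, 32, 533000)
end = datetime.time(17, 20, 26, 442000)
with open(filename, 'r') as f:
    for line in f:
        timestamp, words = line.split(None, 1)
        # strptime is a method of the datetime class, not of the module.
        ts = datetime.datetime.strptime(timestamp, "%H:%M:%S.%f").time()
        if start <= ts <= end:
            print(words)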
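
Two details of this approach are easy to get wrong, so here is a small check of each. The fractional part parsed by %f is stored in microseconds, and the fourth positional argument of datetime.time is also microseconds, so hand-built bounds need values like 533000 rather than 533. The variable names are illustrative only.

```python
import datetime

# %f right-pads the fraction to six digits: '.533' becomes 533000 microseconds.
parsed = datetime.datetime.strptime('17:17:32.533', '%H:%M:%S.%f').time()
print(parsed.microsecond)

# A bound built by hand must use the same unit to compare equal.
same_unit = parsed == datetime.time(17, 17, 32, 533000)
wrong_unit = parsed == datetime.time(17, 17, 32, 533)
print(same_unit, wrong_unit)

# Unpadded fields such as '1:1:1.0' are accepted by strptime as well.
loose = datetime.datetime.strptime('1:1:1.0', '%H:%M:%S.%f').time()
print(loose)
```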