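I have a .txt file full of data that I want to filter, because some lines are duplicates whose only difference is a timestamp exactly 2 hours later. The later version of each duplicate should be omitted (for example, the first line in the sample below). All other lines should be kept and written to a new .txt file.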
1_3_IMM 2016-07-19 16:11:56 00:00:40 2 Sensor Check # should go
1_3_IMM 2016-07-19 14:12:40 00:00:33 2 Sensor Check # should stay
1_3_IMM 2016-07-19 14:11:56 00:00:40 2 Sensor Check # should stay
1_3_IMM 2016-07-19 16:12:40 00:00:33 2 Sensor Check # should go
1_4_IMM 2016-07-19 17:23:25 00:00:20 2 Sensor Check # should stay
1_4_IMM 2016-07-19 19:23:25 00:00:20 2 Sensor Check # should go
1_4_IMM 2016-07-19 19:15:24 00:02:21 2 Sensor Check # should stay
1_4_IMM 2016-07-19 19:25:13 00:02:13 2 Sensor Check # should stay
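I tried to write some Python code to do this, but I'm afraid it has been a bit too long since I last coded for me to get it working. Could anyone give me some feedback on this problem? Please see my code below.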
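from datetime import datetime, timedelta

def parse(line):
    # Split a line into machine number, timestamp and the remaining fields
    machine, date, time, rest = line.split(None, 3)
    return machine, datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S"), rest

def filter_file():
    with open("output.txt", "w") as output:
        with open("input.txt", "r") as logger_input:
            # Read everything first so each line can be compared with all others
            lines = [line.strip() for line in logger_input if line.strip()]
        keys = [parse(line) for line in lines]
        for line, (machine, stamp, rest) in zip(lines, keys):
            # DON'T copy the current line to the output file if another line has:
            # 1. the same machine number (e.g. 1_3_IMM) &
            # 2. the same remaining fields &
            # 3. a date and time stamp exactly 02:00:00 earlier
            if (machine, stamp - timedelta(hours=2), rest) in keys:
                continue
            output.write(line + "\n")  # write line to output file

if __name__ == "__main__":
    filter_file()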
Thanks!
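For reference, here is a minimal sketch of one way to apply the rule described above, assuming every line has the layout shown in the sample (machine number, date, time, then the remaining fields) and that the "# should go" / "# should stay" notes are not part of the real file. The names `parse_key` and `drop_late_duplicates` are made up for illustration. Instead of comparing each line with all other lines, it puts every parsed record in a set and checks whether the same record exists exactly two hours earlier. Note that a copy whose two-hour shift crosses midnight would also be treated as a duplicate here.

```python
from datetime import datetime, timedelta

TWO_HOURS = timedelta(hours=2)

def parse_key(line):
    """Return (machine, timestamp, remaining fields) for one log line."""
    machine, date, time, rest = line.split(None, 3)
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return machine, stamp, rest.strip()

def drop_late_duplicates(lines):
    """Keep a line unless the same record exists exactly 2 hours earlier."""
    lines = [line.rstrip("\n") for line in lines if line.strip()]
    keys = [parse_key(line) for line in lines]
    seen = set(keys)  # set lookups avoid rescanning the whole file per line
    return [
        line
        for line, (machine, stamp, rest) in zip(lines, keys)
        if (machine, stamp - TWO_HOURS, rest) not in seen
    ]

# The sample lines from the question, without the annotations
sample = [
    "1_3_IMM 2016-07-19 16:11:56 00:00:40 2 Sensor Check",
    "1_3_IMM 2016-07-19 14:12:40 00:00:33 2 Sensor Check",
    "1_3_IMM 2016-07-19 14:11:56 00:00:40 2 Sensor Check",
    "1_3_IMM 2016-07-19 16:12:40 00:00:33 2 Sensor Check",
    "1_4_IMM 2016-07-19 17:23:25 00:00:20 2 Sensor Check",
    "1_4_IMM 2016-07-19 19:23:25 00:00:20 2 Sensor Check",
    "1_4_IMM 2016-07-19 19:15:24 00:02:21 2 Sensor Check",
    "1_4_IMM 2016-07-19 19:25:13 00:02:13 2 Sensor Check",
]
kept = drop_late_duplicates(sample)
for line in kept:
    print(line)
```

To use it on files, something like `drop_late_duplicates(open("input.txt"))` could feed the result into `output.write(...)`. The order of the kept lines is unchanged, and a line with fewer than four fields would raise a `ValueError`, so malformed lines would need separate handling.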