For a text file like this:
[2018-07-11 20:57:08] SYSTEM RESPONSE: "hello"
[2018-07-11 20:57:19] USER INPUT (xvp_dev-0): "hi! how is it going?"
[2018-07-11 20:57:19] SYSTEM RESPONSE: "It's going pretty good.
How about you?
What's good?
Up to anything new?
After a long time"
[2018-07-12 14:05:20] USER INPUT (xvp_dev-0): I've been doing good too!
Thank you for asking.
Nothing is new so far.
Just working on some projects.
[2018-07-12 20:57:19] SYSTEM RESPONSE: Great!
I want my output to look like:
[2018-07-11 20:57:08] SYSTEM RESPONSE: "hello"
[2018-07-11 20:57:19] USER INPUT (xvp_dev-0): "hi! how is it going?"
[2018-07-11 20:57:19] SYSTEM RESPONSE: "It's going pretty good. How about you?| What's good? Up to anything new?| After a long time"
[2018-07-12 14:05:20] USER INPUT (xvp_dev-0): I've been doing good too! |Thank you for asking. | Nothing is new so far. | Just working on some projects.
[2018-07-12 20:57:19] SYSTEM RESPONSE: Great!
Basically, every line that does not start with a timestamp should be appended to the previous line. So far I have tried:
import re

a, b = text_from_index.split(",")  # so I get the file name and the date from this
with open("/home/Desktop/" + a) as log_fd:
    file = log_fd.readlines()
x = ""
for line in file:
    if b in line:  # b here is the date, e.g. 2018-07-11
        x = x + "//" + line[11:]
    else:
        x = x
x = x.replace("//", "<br /> \n")
x = x.replace("]", "|")
x = re.sub(r'\(.+?\)', '', x)
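So far I can only get lines by searching for the date. Any suggestions would help! Thanks! Feel free to ask me any questions or for further clarification.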
Answer 0 (score: 2)
Store the current line in a variable, say cur_line. If the next line starts with [, write cur_line to the new file; otherwise append the line to cur_line.
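with open('tmp.txt') as in_file, open('out.txt', 'w') as out_file:
    cur_line = ''
    for l in in_file:
        l = l.rstrip('\r\n')
        if not l:
            continue  # skip blank lines
        if l[0] == '[':
            # a new record starts: flush the previous one, if there is one
            if cur_line:
                out_file.write(cur_line + '\n')
            cur_line = l
        else:
            # continuation line: append it to the current record
            cur_line += ' | ' + l
    # flush the last record
    if cur_line:
        out_file.write(cur_line + '\n')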
Answer 1 (score: 1)
You can use a regular expression for this. The regex below matches the timestamp at the start of your lines.
import re

pattern = re.compile(r"\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\]")
# matches your timestamp, so you can skip these lines and concatenate the others
line = '[2018-07-11 20:57:08] SYSTEM RESPONSE: "hello"'  # sample line from your log
pattern.match(line)
The full solution looks like this:
import re

pattern = re.compile(r"\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\]")

with open("test.txt") as log_fd:
    file = log_fd.readlines()

x = ""
for line in file:
    if line not in ['\n', '\r\n']:  # skip blank lines
        if pattern.match(line):
            # timestamped line: start a new output line
            x = x + '\n' + line.strip('\r\n')
        else:
            # continuation line: join it to the previous one
            x = x + ' | ' + line.strip('\r\n')
print(x)
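It will leave an empty line at the beginning of the string, but it handles your sample input and just prints out the result. Definitely not the most elegant.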
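If the empty first line is a problem, one option is to collect the records in a list and join them at the end. This is only a sketch: the function name `merge_continuation_lines` and the `sep` parameter are made up for illustration, and it assumes every record starts with a timestamp of the form `[YYYY-MM-DD HH:MM:SS]`.

```python
import re

# Assumed timestamp shape, e.g. "[2018-07-11 20:57:08]".
TIMESTAMP = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]")


def merge_continuation_lines(lines, sep=" | "):
    """Join each line that lacks a leading timestamp onto the previous record."""
    records = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if TIMESTAMP.match(line) or not records:
            # a timestamped line (or the very first line) starts a new record
            records.append(line)
        else:
            # continuation line: glue it to the previous record
            records[-1] += sep + line
    return "\n".join(records)


sample = [
    '[2018-07-11 20:57:08] SYSTEM RESPONSE: "hello"\n',
    '[2018-07-11 20:57:19] SYSTEM RESPONSE: "It\'s going pretty good.\n',
    "How about you?\n",
    "\n",
    'After a long time"\n',
]
print(merge_continuation_lines(sample))
```

To run it on a file, pass the open file object directly, e.g. `merge_continuation_lines(open("test.txt"))`, since iterating over a file yields its lines.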