I've run into a problem that I'm still stuck on.
I have a large file in the following format:
Line 1: Something/Type2
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Line 1: Something/Type1
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Line 1: Something/Type1
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Line 1: Type1/Type2
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
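I want to read the first line of each block to check whether it is Type1 or Type2. After that, I want to print lines 3 and 8 of each block, and keep doing this until the end of the file.
I tried the following code: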
p = './file.txt'
fin = open(p, 'r')
for i, line in enumerate(fin):
    if i % 11 == 2 or i % 11 == 7:
        print(line)
fin.close()
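After running this code on my large file, I noticed that the lines it prints drift. I can only assume that my blocks are not always exactly 10 lines long (plus one blank line before the next block starts), so this approach is not ideal.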
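If the drift does come from blocks of varying length, one option is to key on the "Line 1:" marker rather than on the line index. The following is only a minimal sketch, assuming every block starts with a line beginning with "Line 1:" and that blank lines are nothing more than separators. The name iter_blocks and the sample data are hypothetical and not part of the original code.

```python
def iter_blocks(lines):
    """Yield each block as a list of lines; a block starts at a 'Line 1:' line."""
    block = []
    for raw in lines:
        line = raw.rstrip("\n")
        # "Line 1:" (with the colon) does not match "Line 10:".
        if line.startswith("Line 1:") and block:
            yield block
            block = []
        if line.strip():  # skip blank separator lines
            block.append(line)
    if block:
        yield block


# Hypothetical sample with blocks of different lengths.
sample = [
    "Line 1: Something/Type2\n",
    "Line 3: Data A\n",
    "Line 8: Data B\n",
    "\n",
    "Line 1: Something/Type1\n",
    "Line 2: Time\n",
    "Line 3: Data C\n",
    "Line 8: Data D\n",
]
blocks = list(iter_blocks(sample))
```

Because the boundary is detected from the content, a block that is shorter or longer than expected no longer shifts every block after it.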
I also tried regular expressions, but I could not store the results in the format I want, for example:
For Type 1
the file output should be: Line 3: Data Line 8: Data
with a single space in between.
This is the next code I tried:
for line in fin:
    if re.match("(Line 1|Line 3|Line 8)", line):
        writeToFile(line)
The writeToFile function does the following:
def writeToFile(filein):
    p = './output.txt'
    fo = open(p, 'a')
    fo.write(filein)
    fo.close()
This is what the output.txt file looks like:
Line 1: Something/Type2
Line 3: Data we need
Line 8: Data we need
Line 1: Something/Type1
Line 3: Data we need
Line 8: Data we need
Line 1: Something/Type1
Line 3: Data we need
Line 8: Data we need
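This is not the desired result. I would not even mind using this output file and checking whether Line 1 is Type1, then taking lines 3 and 8 and putting them on the same line, continuing until Type2 is found, then doing the same for its lines 3 and 8 and storing them in a different output file.
I hope I have not overcomplicated things.
Edit:
Sorry, I was unclear and also made a mistake.
Line 1: I am not interested in the first part, before the /. I am interested in what comes after it, which can be either Type1 or Type2.
Ideally, the output should work like this: look for the type in the first line, and if it is Type2, output: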
Line 1: Type2 Line 3: Data we need Line 8: Data we need
If it is Type1:
Line 1: Type1 Line 3: Data we need Line 8: Data we need
Line 1: Type1 Line 3: Data we need Line 8: Data we need
All blocks with the same type should be grouped together.
Edit: Thanks to user Floris, I now get the output I want when I feed it to my write-to-file function.
def writeToFile(type, outputString):
    p = './output' + type + '.txt'
    fo = open(p, 'a')
    line = '%s %s\n' % (type, outputString)
    fo.write(line)
    fo.close()
This is my result:
Type2 Line 3: Data we need Line 8: Data we need
and
Type1 Line 3: Data we need Line 8: Data we need
Type3 Line 3: Data we need Line 8: Data we need
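Because I build the output path from the type, my writeToFile sorts the results by type.
Thanks
Answer 0 (score: 0)
See if the following code gives you the inspiration you need to solve your problem. I bet it will: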
import re

fin = open("./file.txt")
for line in fin:
    if re.match("Line 1:", line):
        # note we need to match "Line 1:" (including colon) so we don't match "Line 10"
        m = re.match(".*Type(.)", line)
        type = m.group(1)
        # we now know what type this group is
    if re.match("Line 3:", line):
        m = re.match(".*3:(.*)$", line)
        outputString = m.group(1)
        # have first half of output string
    if re.match("Line 8:", line):
        m = re.match(".*8:(.*)$", line)
        outputString += m.group(1)
        # have second half of output string, and we know where it needs to go:
        print("concatenated string of type %s is %s" % (type, outputString))
        # now send it where you want it to go... one of two open files, perhaps?
fin.close()
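To take the last comment one step further, here is a minimal sketch of grouping the results by type before writing them out. It assumes the lines literally start with "Line N:", that the type is the "TypeX" token at the end of Line 1 (after the /), and that a block is complete once both Line 3 and Line 8 have been seen. The names group_by_type and write_groups, the regular expressions, and the sample data are illustrative and not from the question.

```python
import re
from collections import defaultdict

LINE_RE = re.compile(r"^Line (\d+):\s*(.*)$")
TYPE_RE = re.compile(r"(Type\w+)\s*$")  # type token at the end of Line 1


def group_by_type(lines):
    """Return {type: ["Line 3: ... Line 8: ...", ...]} for each complete block."""
    groups = defaultdict(list)
    block_type, parts = None, {}
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # blank or unrecognised line
        number, content = int(m.group(1)), m.group(2)
        if number == 1:
            # New block: remember its type and forget any partial data.
            t = TYPE_RE.search(content)
            block_type, parts = (t.group(1) if t else None), {}
        elif number in (3, 8) and block_type is not None:
            parts[number] = content
            if 3 in parts and 8 in parts:
                groups[block_type].append(
                    "Line 3: %s Line 8: %s" % (parts[3], parts[8]))
                parts = {}
    return groups


def write_groups(groups, prefix="./output"):
    """Write one file per type, e.g. ./outputType1.txt (hypothetical naming)."""
    for block_type, rows in groups.items():
        with open("%s%s.txt" % (prefix, block_type), "w") as fo:
            for row in rows:
                fo.write("%s %s\n" % (block_type, row))


# Hypothetical sample; real input would come from open("./file.txt").
sample = """Line 1: Something/Type2
Line 2: Time
Line 3: Data A
Line 8: Data B
Line 10: Multi_Exit

Line 1: Something/Type1
Line 3: Data C
Line 8: Data D
Line 1: Type1/Type2
Line 3: Data E
Line 8: Data F
""".splitlines()
groups = group_by_type(sample)
```

Calling write_groups(groups) would then produce one file per type. Opening each file once, instead of reopening it in append mode for every line, also means a second run overwrites the old output rather than adding to it.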