How to print the line after the line found by re.compile()

Time: 2010-06-21 18:06:21

Tags: python parsing

Using this code:

import re
file = open('FilePath/OUTPUT.01')
lines = file.read()
file.close()
for match in re.finditer(r"(?m)^\s*-+\s+\S+\s+(\S+)", lines):
    eng = match.group(1)
    open('Tmp.txt', 'w').writelines(eng)
    print match.group(1)

I get a column of data that looks like this:

  

-1.1266E+05
-1.1265E+05
-1.1265E+05
-1.1265E+05
-1.1264E+05
-1.1264E+05
-1.1264E+05
-1.1263E+05
Step
-1.1263E+05
-1.1262E+05
-1.1262E+05
-1.1261E+05
-1.1261E+05
-1.1260E+05
-1.1260E+05
-1.1259E+05
Step
-1.1259E+05
-1.1258E+05
-1.1258E+05
-1.1258E+05
-1.1257E+05
terminated.
eng_tot
-1.1274E+05
3D

     

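How can I write this to a file (Tmp.txt)? At the moment it only writes the last line, '3D'. I would also like to drop every line that is not of the form x.xxxxExxx (that is, keep only the numbers).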

3 Answers:

Answer 0: (score: 2)

You can use a single regular expression:

import re

with open('FilePath/OUTPUT.01') as infile:
    lines = infile.read()
with open("output.txt", "w") as f:
    for match in re.finditer(r"(?m)^\s*-+\s+\S+\s+(-?[\d.]+E[+-]\d+)", lines):
        f.write(match.group(1) + "\n")

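This should write to output.txt the second column of every line that follows a line consisting entirely of dashes, provided that column is a number in scientific notation. Because output.txt is opened once, before the loop, each match is added to the file instead of overwriting the previous one.

This regex assumes that the columns are separated by whitespace and that the first column is never empty.

Explanation: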

(?m)                 # allow ^ to match at start of line, not just start of string
^                    # anchor the search at the start of the line
\s*                  # match any leading whitespace
-+                   # match one or more dashes
\s+                  # match trailing whitespace, including linebreak characters
\S+                  # match a run of non-whitespace characters (we're now one line past the dashes)
\s+                  # match a run of whitespace
(-?[\d.]+E[+-]\d+)   # match a number in scientific notation
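
To see the pattern in action, here is a small self-contained sketch (written for Python 3) that runs it over an in-memory string. The sample text and the helper name `extract_energies` are invented for illustration; the real layout of OUTPUT.01 is not shown in the question, so treat the sample as an assumption.

```python
import re

# Pattern from the answer above: a line of dashes, then the second
# whitespace-separated token of what follows, kept only if it is a
# number in scientific notation.
PATTERN = re.compile(r"(?m)^\s*-+\s+\S+\s+(-?[\d.]+E[+-]\d+)")


def extract_energies(text):
    """Return every captured number (as a string) found in text."""
    return [m.group(1) for m in PATTERN.finditer(text)]


# Hypothetical sample; the real file layout is an assumption.
SAMPLE = """\
 STEP   ENERGY
 ----------------
    1   -1.1266E+05   0.12
    2   -1.1265E+05   0.11
 ----------------
 run   terminated.
 ----------------
 eng_tot   -1.1274E+05
"""

if __name__ == "__main__":
    for value in extract_energies(SAMPLE):
        print(value)
```

On this sample the sketch prints -1.1266E+05 and -1.1274E+05. The dashed line followed by "run   terminated." contributes nothing, because its second column is not in scientific notation.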

Answer 1: (score: 0)

i is the index of line within lines, so i+1 is the next line:

print lines[i+1]

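Make sure the ---- line is not the last line, otherwise this will try to read from a position that does not exist. Also, your regex \s+-+\s+ requires whitespace both before and after the dashes, because \s+ means one or more whitespace characters; you probably meant \s*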
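
A minimal sketch of this index-based idea, written for Python 3. The function name `lines_after_dashes` and the sample list are hypothetical. It uses \s* so the dashes need no surrounding whitespace, and it checks the bound before reading index i + 1.

```python
import re

# \s* (zero or more) instead of \s+, so a bare "----" line also matches.
DASHES = re.compile(r"^\s*-+\s*$")


def lines_after_dashes(lines):
    """Return the line following each all-dash line, skipping a trailing one."""
    found = []
    for i, line in enumerate(lines):
        # Guard: i + 1 must be a valid index, or lines[i + 1] raises IndexError.
        if DASHES.match(line) and i + 1 < len(lines):
            found.append(lines[i + 1])
    return found


# Hypothetical sample data, including a dash line at the very end.
sample = [
    "header",
    "----",
    "  a  -1.1266E+05",
    "  ----  ",
    "  b  -1.1265E+05",
    "----",
]

if __name__ == "__main__":
    print(lines_after_dashes(sample))
```

The final "----" in the sample has no successor, so the bounds check skips it instead of raising an error.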

Answer 2: (score: 0)

I wouldn't bother with regular expressions here. Try the following:

output = file("tmp.txt", "w")        # open a file for writing
flagged = False                      # when 'flagged == True' we will print the line
for line in file("FilePath/OUTPUT.01"):
    if flagged:
        try:
            result = line.split()[1] # python is zero-indexed!
            print>>output, result    # print to output only if the split worked
        except IndexError:           # otherwise do nothing
            pass
        flagged = False              # but reset the flag
    else:
        if set(line.strip()) == set(["-"]): # does the line consist only of '-'?
            flagged = True           # if so, set the flag to print the next line

Here is a version that lets you specify the line offset and the column number:

OFFSET = 3 # the third line after the `----`
COLUMN = 2 # column index 2

output = file("tmp.txt", "w")
counter = 0                           # 0 evaluates as False
for line in file("FilePath/OUTPUT.01"):
    if counter:                       # any non-zero value evaluates as True
        if counter == OFFSET:
            try:
                result = line.split()[COLUMN] 
                print>>output, result # print to output only if the split worked
            except IndexError:        # otherwise do nothing
                pass
            counter = 0               # reset the flag once you've reached the OFFSET line
        else:
            counter += 1
    else:
        if set(line.strip()) == set(["-"]): # does the line consist only of '-'?
            counter = 1
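
The code in this answer relies on file() and print>>, which only exist in Python 2. For readers on Python 3, the same counter logic might look like the sketch below. The names `pick_column` and `write_column` and their parameters are made up here. `pick_column` takes any iterable of lines, so it can be tried on a list without the real file.

```python
def pick_column(lines, offset=1, column=1):
    """Collect field `column` from the line `offset` lines after each dash line."""
    results = []
    counter = 0                          # 0 means "not currently counting"
    for line in lines:
        if counter:
            if counter == offset:
                fields = line.split()
                if column < len(fields):     # skip lines that are too short
                    results.append(fields[column])
                counter = 0                  # reset once the target line is reached
            else:
                counter += 1
        elif set(line.strip()) == {"-"}:     # does the line consist only of '-'?
            counter = 1
    return results


def write_column(in_path, out_path, offset=1, column=1):
    """Write the selected values to out_path, one per line."""
    with open(in_path) as src, open(out_path, "w") as out:
        for value in pick_column(src, offset, column):
            out.write(value + "\n")


# Hypothetical sample used only to exercise pick_column.
sample = ["----", "a 1.0E+01 x", "b 2.0E+01 y", "c 3.0E+01 z", "----", "short"]

if __name__ == "__main__":
    print(pick_column(sample, offset=1, column=1))
    print(pick_column(sample, offset=3, column=2))
```

A call such as write_column("FilePath/OUTPUT.01", "tmp.txt") would then mirror the first snippet of this answer. Using with blocks also closes both files, which the Python 2 code leaves to the interpreter.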