Question

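I'm new to Python. I've gone through a number of Python posts, tutorial sites, and the official documentation trying to solve my problem, but I'm not there yet!

What I want to do: I have a multi-line text file in which I first look for blocks of text that run from one "MARKERSTRING" to the next. "MARKERSTRING" occurs many times throughout the text, but only a few of the blocks contain "TAILSTRING". If a block does, I want to add new lines ("newstring") below the last occurrence of the string "BODY" in that same block.

I want to keep all lines in a new file, with the new strings inserted at the index just after "BODY" (its last occurrence in that block).

The contents of my text file look like this: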

Multiple lines with some other text

MARKERSTRING SOMESTRING SOME OTHER STRING #

BODY A B C
BODY V G H
BODY Y U I

TAILSTRING X1 Y
TAILSTRING X2 Y


MARKERSTRING SOMESTRING SOME OTHER STRING # 

### #Although I want to append this to my file I dont want to process my #function through this as it does not have "TAILSTRING"

BODY B C
BODY V G H J
BODY Y U I

### #But want this block:

MARKERSTRING SOMESTRING SOME OTHER STRING #

BODY B C
BODY V G H J


TAILSTRING X1 Y
TAILSTRING X2 Y


Multiple lines with some other text

END

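My problems are as follows:

The function that gets the index and inserts the new string only returns the first occurrence. This may be related to where the return statement is placed, but if I indent it further, it complains with "UnboundLocalError". If I use "yield" instead, it returns an object. I want to write the new strings inside this function.
The second part, which looks for "MARKERSTRING", appends all lines to a buffer and then calls my function, but it keeps appending the lines over and over without inserting the new string. This happens because I start looking for the desired pattern inside a for loop that pulls every line of the file.

Is there a better way to do this without appending every line inside the for loop?

Something like this: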

import re
from operator import itemgetter
import itertools


### The Function #########
def myfunc(filename):
    highest = None
    for cnt, line in enumerate(filename):

        if line.startswith("BODY "):
            bline = line.split()

            highest = cnt

        if line.startswith("TAIL"):
            lpline = line.split()
            print(lpline)
            newline = "BOND", lpline[2], lpline[4]

            newstring = ' '.join((str(x)) for x in newline)

            bline.insert(highest + 1, newstring) ##This doesnt insert
            return bline

### The "Markerstring" finder snippet: Keeps iterating over all lines #####

filename = open("input.txt").readlines()
outfilename = open("result.txt", 'w+')
buffer = []
keepCurrentSet = True
for line in filename:
    buffer.append(line)
    if line.startswith('MARKERSTRING '):
        if keepCurrentSet:
            outfilename.write("".join(buffer))

            myfunc(filename)

Expected result:

Multiple lines with some other text


MARKERSTRING SOMESTRING SOME OTHER STRING #

BODY A B C
BODY V G H
BODY Y U I
BODY X1 Y     #Inserted line = newstring
BODY X2 Y     #Inserted line = newstring


TAILSTRING X1 Y
TAILSTRING X2 Y


MARKERSTRING SOMESTRING SOME OTHER STRING # 

### #Although I want to append this to my file I dont want to process my #function through this as it does not have "TAILSTRING"

BODY B C
BODY V G H J
BODY Y U I


### #But want this block:

MARKERSTRING SOMESTRING SOME OTHER STRING #


BODY B C
BODY V G H J
BODY X1 Y        #Inserted line = newstring
BODY X2 Y        #Inserted line = newstring

TAILSTRING X1 Y
TAILSTRING X2 Y

Multiple lines with some other text

END

Answer 1

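I can't say why you aren't getting the result you want. Often, changing one or two lines is enough to fix a problem like this.

However, I've come up with a solution that I think works.

Edit: to answer your question from the comments section (below),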

_, params = line.split(maxsplit = 1)

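This splits the line at most once (maxsplit = 1), which yields 2 items. The "_" is a placeholder that receives (and ignores) the first item of the split, TAILSTRING. The second item of the split (X1 Y or X2 Y) is assigned to params.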
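As a small illustration of that call, the sketch below applies it to one of the sample lines from the question. The helper name split_tail is made up for this example and is not part of the solution code.

```python
def split_tail(line):
    """Split a line into its first word and the rest of the line."""
    # maxsplit=1 performs at most one split, so at most two items come back.
    keyword, params = line.split(maxsplit=1)
    return keyword, params


keyword, params = split_tail("TAILSTRING X1 Y")
print(keyword)  # TAILSTRING
print(params)   # X1 Y
```

Note that a line with only one word would not unpack into two names and would raise a ValueError, so this assumes every TAILSTRING line carries at least one parameter.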

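I also want to make sure that there is not already a BODY X1 Y1 in the same MARKERSTRING block I am looking at

To achieve this, the code needs to be modified.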

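# Copy f01.txt to temp.txt, adding a "BODY <params>" line after the last
# BODY line of a block for every TAILSTRING line found in that block.
with open('f01.txt', 'r') as fin, open('temp.txt', 'w') as fout:
    buffer = []          # lines collected since the last MARKERSTRING
    max_body_idx = None  # index in buffer of the last BODY line seen

    for line in fin:
        line = line.rstrip()
        buffer.append(line)
        if line.startswith('MARKERSTRING'):
            # A new block starts: write out everything collected so far
            # (including this marker line) and start a fresh buffer.
            for item in buffer:
                fout.write(item + "\n")
            buffer = []
            max_body_idx = None
        elif line.startswith('BODY'):
            # Use the real position in buffer, which stays correct even
            # after earlier insertions have shifted the lines.
            max_body_idx = len(buffer) - 1
        elif line.startswith('TAILSTRING') and max_body_idx is not None:
            _, params = line.split(maxsplit = 1)
            buffer.insert(max_body_idx + 1, 'BODY ' + params)
            # The next insertion goes below the line just added.
            max_body_idx += 1

    # write out the last block
    for item in buffer:
        fout.write(item + "\n")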
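
The loop above does not itself check whether a matching BODY line is already present in the block. One possible way to add that check is sketched below. It is only an illustration under a few assumptions: the function name add_body_lines is made up, it works on a list of lines instead of open files, and "already present" means an exact text match after stripping trailing whitespace.

```python
def add_body_lines(lines):
    """Return a new list of lines with 'BODY <params>' inserted after the
    last BODY line of each block, skipping lines the block already has."""
    out = []          # lines of all finished blocks
    block = []        # lines of the block currently being read
    last_body = None  # index in block of the last BODY line

    for line in lines:
        line = line.rstrip()
        if line.startswith('MARKERSTRING'):
            # A marker starts a new block: flush the previous one.
            out.extend(block)
            block = [line]
            last_body = None
            continue
        block.append(line)
        if line.startswith('BODY'):
            last_body = len(block) - 1
        elif line.startswith('TAILSTRING') and last_body is not None:
            _, params = line.split(maxsplit=1)
            new_line = 'BODY ' + params
            # Duplicate check (assumption: exact match within this block).
            if new_line not in block:
                block.insert(last_body + 1, new_line)
                last_body += 1

    out.extend(block)  # flush the final block
    return out


sample = [
    "intro",
    "MARKERSTRING a",
    "BODY A B C",
    "BODY X1 Y",
    "",
    "TAILSTRING X1 Y",
    "TAILSTRING X2 Y",
    "MARKERSTRING b",
    "BODY B C",
    "END",
]
for item in add_body_lines(sample):
    print(item)
```

In this sample the first block already contains BODY X1 Y, so only BODY X2 Y is inserted, and the second block is copied unchanged because it has no TAILSTRING line. To use it on files, the lines read from the input file could be passed in and the returned list written to the output file with a newline after each item.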
