Question

I have a txt file that I want Python to read, and I want Python to extract the string that lies between two specific characters. Here is an example:

line a

line b

line c

& TESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTEST!

line d

line e

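What I want is for Python to read the lines and, when it encounters "&", start printing the lines (including the line with the "&") until it encounters "!".

Any suggestions?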

Answer 1

This works:

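data = []
flag = False
with open('/tmp/test.txt', 'r') as f:
    for line in f:
        # Start collecting at the first line that begins with '&'.
        if line.startswith('&'):
            flag = True
        if flag:
            data.append(line)
        # Stop collecting after a line that ends with '!'.
        if line.strip().endswith('!'):
            flag = False

print(''.join(data))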
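The flag approach can also be wrapped in a function that accepts any iterable of lines, which makes it easy to try without a file on disk. This is only a sketch: the name extract_block and its parameters are hypothetical, and unlike the loop above it stops at the first closing marker instead of scanning the rest of the input.

```python
def extract_block(lines, start="&", end="!"):
    """Collect lines from the first one starting with `start`
    through the first later line that ends with `end`."""
    collected = []
    inside = False
    for line in lines:
        if not inside and line.startswith(start):
            inside = True
        if inside:
            collected.append(line)
            if line.strip().endswith(end):
                break  # Stop reading once the closing marker is seen.
    return "".join(collected)


# Hypothetical sample input standing in for the lines of a file.
sample = ["line a\n", "line b\n", "& TEST\n", "TEST!\n", "line d\n"]
print(extract_block(sample), end="")
```

Because an open file object is itself an iterable of lines, the same function can be called as extract_block(f) inside a with open(...) as f: block.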

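If your file is small enough that reading it all into memory is not a problem, and there is no ambiguity about & or ! as the start and end of the string you want, this is easier: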

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

# Search for '!' only after the '&' so an earlier '!' is not picked up.
start = data.index('&')
print(data[start:data.index('!', start) + 1])
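One caveat with str.index is that it raises ValueError when a marker is missing. A minimal sketch of a more forgiving variant, assuming you would rather get None back in that case, uses str.find and looks for the end marker only after the start marker. The helper name slice_between is hypothetical.

```python
def slice_between(text, start="&", end="!"):
    """Return text from the first `start` through the next `end`, or None."""
    begin = text.find(start)
    if begin == -1:
        return None  # Start marker not present.
    stop = text.find(end, begin + len(start))
    if stop == -1:
        return None  # No end marker after the start marker.
    return text[begin:stop + len(end)]


print(slice_between("line a\n& TEST\nTEST!\nline d\n"))
```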

Or, if you want to read the entire file but only use & and ! when they are at the beginning and end of a line, respectively, you can use a regex:

import re

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

# re.M lets ^ match at the start of any line; re.S lets . match newlines.
m = re.search(r'^(&.*!)\s*?\n', data, re.S | re.M)
if m:
    print(m.group(1))
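With re.S, a greedy .* runs as far as it can, so if the file holds more than one & ... ! block, a greedy pattern captures everything from the first & to the last qualifying !. A sketch of one way to get each block separately, assuming that is what you want, is a non-greedy pattern with re.findall. The names BLOCK_RE and find_blocks are hypothetical.

```python
import re

# Non-greedy .*? stops at the first '!' that ends a line;
# [ \t]* tolerates trailing spaces or tabs without consuming newlines.
BLOCK_RE = re.compile(r'^(&.*?!)[ \t]*$', re.S | re.M)


def find_blocks(text):
    """Return every &...! block whose '&' starts a line and whose '!' ends one."""
    return BLOCK_RE.findall(text)


# Hypothetical sample text with two separate blocks.
text = "line a\n& one\ntwo!\nline d\n& three!  \nline e\n"
for block in find_blocks(text):
    print(repr(block))
```

Since $ with re.M also matches at the very end of the string, this pattern does not require a trailing newline after the last block.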

Answer 2

Here is a (very simple!) example.

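def Printer():
    # Print every line from one containing "&" through one containing "!".
    with open("yourfile.txt") as f:
        Pr = False
        for line in f:
            if "&" in line:
                Pr = True
            if Pr:
                # Each line already ends with a newline, so do not add another.
                print(line, end="")
            if "!" in line:
                Pr = False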

Answer 3

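A simple solution is shown below. The code is heavily commented so that each line is easy to follow. A nice feature of the code is that it uses the with statement, which closes resources such as files even when an exception is raised.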

# Path to the input file.
file_path = "input.txt"

# Open the file in read mode; the with statement closes it even if an error occurs.
with open(file_path, "r") as f:
    # Read the whole file into memory. For very large files,
    # prefer iterating over the file object instead.
    content = f.read()

size = len(content)
start = 0
while start < size:
    # Find the next & at or after the current position.
    start = content.find("&", start)
    if start == -1:
        break  # No more & markers, so nothing is left to print.
    # Find the first ! after that &.
    end = content.find("!", start)
    # If there is no closing !, print through the end of the file.
    if end == -1:
        end = size
    # Print the text between & and ! (excluding both markers).
    print(content[start + 1:end])
    # Continue searching after the ! character.
    start = end + 1
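To illustrate the point about the with statement, here is a small sketch showing that the file is closed even when the body of the block raises. It uses a throwaway temporary file rather than input.txt, and the function name file_closed_after_error is hypothetical. Note that with does not swallow the exception; it only guarantees the cleanup.

```python
import os
import tempfile


def file_closed_after_error():
    """Return True if the file was closed even though the with body raised."""
    # Create a throwaway file so the sketch does not depend on input.txt.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        try:
            with open(path, "r") as f:
                raise RuntimeError("simulated failure while processing")
        except RuntimeError:
            pass  # The exception propagated out of the with block and is caught here.
        return f.closed
    finally:
        os.remove(path)


print(file_closed_after_error())
```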

Extracting the string between characters from a txt file in Python
