Question

I have a text file containing the source code of a web page. I want to read that file and extract all the strings between

["null and "]

What would such a script look like? This is what I have so far:

while True:
        p = data[start:].find('["n')

Answer 1

You can use Python's re module, which ships with the standard library. Here is an example:

import re
s = 'randomtext ["null and "] targettext ["null and "] morerandomtext'
# Raw string, so the backslash escapes reach the regex engine unchanged.
m = re.search(r' \["null and "\](.+)\["null and "\]', s)
m[1]

Returns: ' targettext ' (including the surrounding spaces)
If you don't want the spaces:

m = re.search(r' \["null and "\] (.+) \["null and "\]', s)
m[1]

Returns: 'targettext'

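The only trick here is to escape the square brackets, since we want them treated as literal characters, and to use (.+) to capture the target text between the markers. The . matches any character except a newline, the + means one or more of them, and the parentheses capture whatever they enclose as a group.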
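
One caveat with the pattern above: (.+) is greedy, so if the marker occurred more than twice, the group would run all the way to the last marker. Below is a minimal sketch of a non-greedy variant that uses re.findall to collect every piece. It assumes the marker is the literal text ["null and "]; the sample string and the names MARKER and between_markers are made up for illustration.

```python
import re

# Assumed delimiter, taken literally from the question.
MARKER = '["null and "]'

def between_markers(text):
    """Return the stripped text found between consecutive markers."""
    # re.escape makes the brackets literal; (.+?) is non-greedy; the
    # lookahead leaves the closing marker available as the next opener.
    pattern = re.escape(MARKER) + r'(.+?)(?=' + re.escape(MARKER) + r')'
    return [part.strip() for part in re.findall(pattern, text)]

sample = 'x ["null and "] first ["null and "] second ["null and "] y'
print(between_markers(sample))
```

Because . does not match newlines by default, this sketch would need the re.DOTALL flag if the target text can span several lines.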

I'm not sure this is exactly what you are looking for, but either way the re module should cover what you need.

Answer 2

If you have a file "myfile" such as:

ccc["null and"]sss["multiline strin
g"] c
aaa

the following may be what you need:

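import re

with open("myfile", "rt") as f:
    # Capture the text between [" and "]. The character class excludes
    # brackets and quotes but not newlines, so a match may span lines.
    # (re.MULTILINE only changes ^ and $, so it has no effect here.)
    match = re.findall(
        r'\["([^\[\]"]*)"]',
        f.read(),
        re.MULTILINE)
if match:
    print(match)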

This prints the list:

['null and', 'multiline strin\ng']
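
For comparison, the same extraction can be sketched without regular expressions, along the lines of the str.find loop started in the question. This is only an illustration: the function name extract_between and the choice of [" and "] as delimiters are assumptions, and unlike the regex above it does not reject brackets or quotes inside the captured text.

```python
def extract_between(data, opener='["', closer='"]'):
    """Collect every substring found between opener and closer."""
    results = []
    start = 0
    while True:
        # Passing start to find() keeps the returned index absolute,
        # unlike data[start:].find(...), which is relative to the slice.
        begin = data.find(opener, start)
        if begin == -1:
            break  # no more openers
        begin += len(opener)
        end = data.find(closer, begin)
        if end == -1:
            break  # opener without a matching closer
        results.append(data[begin:end])
        start = end + len(closer)
    return results

sample = 'ccc["null and"]sss["multiline strin\ng"] c\naaa'
print(extract_between(sample))
```

On the sample file contents shown above, this yields the same two strings as the re.findall version.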

Extract strings from a text file using Python?
