Here is my code:
import re

def split(content):
    pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)
    print(pattern.finditer(content))
    for m in pattern.finditer(content):
        print("in for loop")
        print("Matched:\n----\n%s\n----\n" % m.group(2))
    print("in split")

def replacement(content):
    split(content)
    pattern = re.compile(r'(?<=\\\\\[-16pt]\n)([\s\S]*?)(?=\\\\\n\\thinhline)')
    content = ' '.join(re.findall(pattern, content))
    print("in replace")
    return content
Here is the output:
<callable-iterator object at 0x2ab2e09cfe10>
in split
in replace
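I tried the algorithm with different strings and it worked fine. I also checked that content is a string, and it is. Why does the program never enter the for loop, even though it does enter split()?
Thanks.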
Answer 0 (score: 1)
See the comments:
import re

def split(content):
    pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)
    # the message you're seeing is correct - this line prints an iterator object -
    # like all iterators, you must actually iterate over it to see the iterator's
    # contents. You're seeing the string representation of an iterator, not the
    # iterator's contents.
    print(pattern.finditer(content))
    # this will iterate over the match objects in the iterator object - but there
    # is no guarantee that any exist
    for m in pattern.finditer(content):
        print("in for loop")
        print("Matched:\n----\n%s\n----\n" % m.group(2))
    # now you're printing this string, which you correctly observed - note that it is
    # outside of the for loop. This means that its execution is not dependent on the
    # regex actually finding any matches.
    print("in split")
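Since "in for loop" is never printed, your regex never matched. I have had a lot of success debugging my regular expressions with the Python Regex Tool website. Try it on some sample text to make sure your regex actually matches where you expect it to.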
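One way to confirm this from within Python, as a minimal sketch: turn the iterator into a list and check how many matches it holds. The helper name `count_matches` and the sample text are illustrative assumptions, not part of the original code.

```python
import re

# Pattern copied from split() in the question.
pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)


def count_matches(compiled, text):
    """Hypothetical helper: consume the finditer() iterator and count its matches."""
    return len(list(compiled.finditer(text)))


# Made-up sample text that contains none of the delimiters.
sample = "plain text without any delimiters"

# finditer() always returns an iterator object, even when it will yield nothing,
# so printing it or testing its truthiness says nothing about whether a match exists.
print(bool(pattern.finditer(sample)))

# Counting the yielded match objects is what actually answers the question.
print(count_matches(pattern, sample))
```

If the count is zero for your real content, the body of the for loop has nothing to run on, which matches the output you posted.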
Your immediate problem is simply that your regex finds no matches.
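A possible cause, offered as a guess because the input text is not shown: the two patterns in the question describe different delimiters. The lookaround pattern in replacement() expects two backslashes before `[-16pt]`, and then two backslashes, a newline, and `\thinhline`. The pattern in split() expects one backslash before `[-16pt]` and ends with a newline, a backslash, another newline, and the literal text `thinhline`. The sketch below runs both shapes against a made-up sample built to fit the replacement() pattern; `sample`, `original`, and `adjusted` are illustrative names, and `adjusted` is only one way the split() pattern might be rewritten.

```python
import re

# Hypothetical sample, assumed from the lookaround pattern in replacement():
# "\\[-16pt]", a newline, some body text, then "\\", a newline, and "\thinhline".
sample = r"\\[-16pt]" + "\n" + "body text" + r"\\" + "\n" + r"\thinhline"

# Pattern exactly as written in split(): its last group needs
# newline + backslash + newline + "thinhline".
original = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)

# Adjusted sketch: same three groups, but with the delimiters that the
# replacement() pattern describes.
adjusted = re.compile(r"""(\\\\\[-16pt]\n)(.*?)(\\\\\n\\thinhline)""", re.X | re.DOTALL)

original_matches = list(original.finditer(sample))
adjusted_matches = list(adjusted.finditer(sample))

print(len(original_matches))
for m in adjusted_matches:
    print("Matched:\n----\n%s\n----\n" % m.group(2))
```

If your real input has a different shape, the adjusted pattern would need to change to follow it.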