Question

Here is my code:

def split(content):
   pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)
   print(pattern.finditer(content))
   for m in pattern.finditer(content):
       print ("in for loop")
       print("Matched:\n----\n%s\n----\n" % m.group(2))
   print ("in split")


def replacement(content):
   split(content)
   pattern = re.compile(r'(?<=\\\\\[-16pt]\n)([\s\S]*?)(?=\\\\\n\\thinhline)')
   content= ' '.join(re.findall(pattern, content))
   print ("in replace")
   return content

Here is the output:

<callable-iterator object at 0x2ab2e09cfe10>
in split
in replace

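I tried the algorithm with different strings and it worked fine. I also checked that content is a string, and it is. Why doesn't the program enter the for loop, even though it does enter split()?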

Thanks.

Answer 1

See the comments in the code:

import re

def split(content):
   pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)

   # The output you're seeing is correct: this line prints the iterator object
   # itself, i.e. its string representation, not its contents. To see the
   # contents you have to actually iterate over it.
   print(pattern.finditer(content))

   # This iterates over the match objects the iterator yields, but there is
   # no guarantee that any exist. With zero matches the loop body never runs.
   for m in pattern.finditer(content):
       print("in for loop")
       print("Matched:\n----\n%s\n----\n" % m.group(2))

   # This print is outside the for loop, so it runs whether or not the regex
   # found any matches. That is why you still see "in split".
   print("in split")

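Since "in for loop" was never printed, your regex never matched. I've had a lot of success debugging my regular expressions with the Python Regex Tool website. Try it on some sample text to make sure your regex actually matches where you expect it to.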
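To make the "loop never runs" behaviour concrete, here is a minimal sketch. The sample text and the pattern are made up for illustration and have nothing to do with your real input; the point is only that `finditer` with no matches gives you an empty iterator, and a `for` loop over an empty iterator skips its body without raising any error.

```python
import re

# Hypothetical sample text, not the input from the question.
sample = "alpha beta gamma"

# A pattern that cannot match the sample.
no_match = re.compile(r"delta")

# Materialise the iterator to inspect what it would yield.
matches = list(no_match.finditer(sample))
print(len(matches))  # 0

# Looping over an empty iterator silently skips the body.
loop_ran = False
for m in no_match.finditer(sample):
    loop_ran = True
print(loop_ran)  # False
```

Wrapping the call in `list(...)` is a quick way to check whether anything matched before you rely on the loop body having run.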

Your immediate problem is simply that your regex isn't finding any matches.
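One thing that may be worth checking: as posted, the pattern in split() is not the same as the one in replacement(). The split() version expects a single backslash before `[-16pt]` and a bare `thinhline` at the start of a line, while the replacement() version expects two backslashes and `\thinhline`. The sketch below compares them on a made-up sample. The sample text and `aligned_pattern` are my assumptions about what the input and the intended pattern look like, since the real input isn't shown.

```python
import re

# Hypothetical sample; the real input is not shown in the question.
# It contains: \\[-16pt] / row one / \\ / \thinhline, one item per line.
sample = "\n".join([r"\\[-16pt]", "row one", r"\\", r"\thinhline"])

# Pattern copied from split(): one backslash before "[",
# and no backslash before "thinhline".
split_pattern = re.compile(
    r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL
)

# Pattern copied from replacement(): two backslashes before "[",
# and "\thinhline" on the line after the closing "\\".
replace_pattern = re.compile(r"(?<=\\\\\[-16pt]\n)([\s\S]*?)(?=\\\\\n\\thinhline)")

# One possible rewrite of the split() pattern that uses the same delimiters
# as replace_pattern (an assumption about what was intended).
aligned_pattern = re.compile(
    r"(\\\\\[-16pt\]\n)(.*?)(\n\\\\\n\\thinhline)", re.DOTALL
)

split_hits = [m.group(2) for m in split_pattern.finditer(sample)]
replace_hits = replace_pattern.findall(sample)
aligned_hits = [m.group(2) for m in aligned_pattern.finditer(sample)]

print(split_hits)    # []
print(replace_hits)  # ['row one\n']
print(aligned_hits)  # ['row one']
```

If your real text uses different delimiters, the same side-by-side comparison on a short excerpt of it should show which part of the pattern fails to line up.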
