re.compile does not match my string

Asked: 2015-01-05 21:28:36

Tags: python regex python-2.7

Here is my code:

import re


def split(content):
   pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)
   print(pattern.finditer(content))
   for m in pattern.finditer(content):
       print("in for loop")
       print("Matched:\n----\n%s\n----\n" % m.group(2))
   print("in split")


def replacement(content):
   split(content)
   pattern = re.compile(r'(?<=\\\\\[-16pt]\n)([\s\S]*?)(?=\\\\\n\\thinhline)')
   content = ' '.join(re.findall(pattern, content))
   print("in replace")
   return content

Here is the output:

<callable-iterator object at 0x2ab2e09cfe10>
in split
in replace

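I tried the algorithm with different strings and it worked fine. I also checked that content is a string, and it is. Why does the program never enter the for loop, even though it does enter split()?

Thanks.

1 Answer:

Answer 0: (score: 1)

See the comments: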

def split(content):
   pattern = re.compile(r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)

   # The message you're seeing is correct: this line prints an iterator object.
   # Like all iterators, you must actually iterate over it to see its contents.
   # You're seeing the string representation of the iterator, not the
   # iterator's contents.
   print(pattern.finditer(content))

   # This iterates over the match objects in the iterator, but there is no
   # guarantee that any exist.
   for m in pattern.finditer(content):
       print("in for loop")
       print("Matched:\n----\n%s\n----\n" % m.group(2))

   # Now you're printing this string, which you correctly observed. Note that
   # it is outside of the for loop, so its execution does not depend on the
   # regex actually finding any matches.
   print("in split")
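To illustrate the comments above, here is a minimal sketch of how the iterator returned by finditer behaves. The pattern and the input strings are hypothetical and chosen only for the demonstration; they are not taken from the question. Wrapping the iterator in list() makes it possible to see whether it is empty, which is exactly the case in which a for loop over it never runs its body.

```python
import re

# Hypothetical pattern, used only to show how finditer behaves.
pattern = re.compile(r"\d+")


def describe(text):
    # finditer returns a lazy iterator; list() consumes it so the
    # matches can be counted and inspected.
    matches = list(pattern.finditer(text))
    if not matches:
        # An empty iterator means a for loop over it would never
        # execute its body.
        return "no matches"
    return ", ".join(m.group(0) for m in matches)


print(describe("a1 b22 c333"))
print(describe("no digits here"))
```

The second call returns "no matches", which corresponds to the situation in the question: the function runs, but the loop body is skipped.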

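Since "in for loop" was never printed, your regex never matched anything. I have had a lot of success debugging my regular expressions with the Python Regex Tool website. Try it on some sample text to make sure your regex actually matches where you expect it to.

Your immediate problem is simply that your regex does not find any matches.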
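The same kind of check can be done locally. The sketch below runs both patterns from the question against a small sample. The sample is an assumption: the real input is not shown in the question, so the LaTeX-like fragment here is guessed from the pattern used in replacement(). The name fixed_pattern is also hypothetical and is only one possible correction under that assumption.

```python
import re

# Assumed sample (the real content is not shown in the question):
# a row start "\\[-16pt]", one row of text ending in "\\", then "\thinhline".
sample = "\n".join([
    r"\\[-16pt]",
    r"cell one & cell two\\",
    r"\thinhline",
])

# Pattern from split(), copied unchanged from the question.
split_pattern = re.compile(
    r"""(\\\[-16pt]\n)(.*?)(\n\\\nthinhline)""", re.X | re.DOTALL)

# Pattern from replacement(), copied unchanged from the question.
replacement_pattern = re.compile(
    r'(?<=\\\\\[-16pt]\n)([\s\S]*?)(?=\\\\\n\\thinhline)')

# Hypothetical correction of the split() pattern for this sample:
# two literal backslashes before "[-16pt]" and before the newline,
# and one literal backslash before "thinhline".
fixed_pattern = re.compile(
    r"(\\\\\[-16pt\]\n)(.*?)(\\\\\n\\thinhline)", re.DOTALL)

# On this sample the split() pattern finds nothing, while the other
# two patterns capture the row text.
print(split_pattern.search(sample))
print(replacement_pattern.findall(sample))
print(fixed_pattern.search(sample).group(2))
```

If the real input has this shape, the pattern in split() fails because its closing group expects a newline, a single backslash, a newline, and then the literal text thinhline, whereas the sample has two backslashes, a newline, and then \thinhline.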