Question

I am developing an application with a search feature, and I want to match search patterns in it. A pattern can take the following forms:

search:'pattern' and search:"pattern" (quoted search)
search:r'pattern' and search:r"pattern" (regex search)
search:pattern (unquoted search)

My regular expressions are:

quoted = re.compile(r'search:(?:\'|")([^"\']+)')
regex = re.compile(r'search:r(?:\'|")([^"\']+)')
unquoted = re.compile(r'search:(?<!r[\'"])([^ \'"]+)')

My test string is:

test_str = "search:foo search:'bar' search:\"baz\" search:r'blah' search:r\"bleh\""

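The quoted and regex patterns match correctly, but the unquoted pattern (which should match only foo) does not: it behaves as if the negative lookbehind were not there. I also tried removing the quotes ([\'"]) from the assertion, but it returns exactly the same result: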

>>> unquoted.findall(test_str)
['foo', 'r', 'r']

I don't understand what I am doing wrong here, so any help is greatly appreciated!

Answer 1

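The lookbehind assertion in 'search:(?<!r[\'"])([^ \'"]+)' looks backward from the position right after search:, so the two characters it inspects are always h:, which can never be r' or r".
Replace it with the negative lookahead (?!r[\'"]), which inspects the characters that follow search: instead.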
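A quick check against the test string from the question makes the difference visible. This is only a minimal sketch; the names lookbehind and lookahead are illustrative and not taken from the question.

```python
import re

test_str = "search:foo search:'bar' search:\"baz\" search:r'blah' search:r\"bleh\""

# The lookbehind sits right after "search:", so the two characters it
# inspects are always "h:". It therefore never rejects anything.
lookbehind = re.compile(r'search:(?<!r[\'"])([^ \'"]+)')

# The lookahead inspects the characters that come after "search:".
lookahead = re.compile(r'search:(?!r[\'"])([^ \'"]+)')

print(lookbehind.findall(test_str))  # ['foo', 'r', 'r']
print(lookahead.findall(test_str))   # ['foo']
```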

But I noticed another problem:

import re

quoted = re.compile(r'search:(?:[\'"])([^"\']+)')
regex = re.compile(r'search:r(?:[\'"])([^"\']+)')
unquoted = re.compile(r'search:(?!r[\'"])([^ \'"]+)')

test_str = "search:foo search:romeo "\
           "search:'bar' search:\"baz\" "\
           "search:r'blah' search:r\"bleh\""\
           "search:isn'it something to catch ?"

"""
•search:'pattern' and search:"pattern" (quoted search)
•search:r'pattern' and search:r"pattern" (regex search)
•search:pattern (unquoted search)

"""
print(quoted.findall(test_str))
print()
print(regex.findall(test_str))
print()
print(unquoted.findall(test_str))

Result

['bar', 'baz']

['blah', 'bleh']

['foo', 'romeo', 'isn']

Don't you want to capture isn'it as a whole?
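If the whole token should be captured, one possible adjustment is sketched here. It is an assumption on my part, not something the question specifies: reject only a token that starts with a quote or with r followed by a quote, then accept any run of non-whitespace characters. The name unquoted_whole is hypothetical.

```python
import re

test_str = ("search:foo search:romeo "
            "search:'bar' search:\"baz\" "
            "search:r'blah' search:r\"bleh\" "
            "search:isn'it something to catch ?")

# Reject tokens that begin with a quote, or with r followed by a quote,
# then take everything up to the next whitespace, quotes included.
unquoted_whole = re.compile(r'search:(?!r?[\'"])(\S+)')

print(unquoted_whole.findall(test_str))  # ['foo', 'romeo', "isn'it"]
```

Note that \S+ also swallows trailing punctuation, so search:foo, would yield foo, with the comma attached. Whether that is acceptable depends on what an unquoted pattern is allowed to contain.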
