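I'm developing an application with a search feature, and I want to match search patterns in it. A pattern can take one of the following forms:
•search:'pattern' and search:"pattern" (quoted search)
•search:r'pattern' and search:r"pattern" (regex search)
•search:pattern (unquoted search)
My regular expressions are: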
quoted = re.compile(r'search:(?:\'|")([^"\']+)')
regex = re.compile(r'search:r(?:\'|")([^"\']+)')
unquoted = re.compile(r'search:(?<!r[\'"])([^ \'"]+)')
My test string is:
test_str = "search:foo search:'bar' search:\"baz\" search:r'blah' search:r\"bleh\""
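The quoted and regex patterns match correctly, but the unquoted pattern (which should match only foo) does not. It behaves as if the negative lookbehind were not there. I also tried removing the quotes ([\'"]) from the assertion, but it returns exactly the same result: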
>>> unquoted.findall(test_str)
['foo', 'r', 'r']
I don't understand what I'm doing wrong here, so any help is much appreciated!
Answer 0 (score: 1)
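The lookbehind assertion in 'search:(?<!r[\'"])([^ \'"]+)' is evaluated at the position right after search: and looks backward from there. The two characters before that position are always h:, so it can never find r' or r" there, and the negative assertion always succeeds. Replace it with the negative lookahead (?!r[\'"]).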
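To make the difference concrete, here is a small side-by-side sketch using the test string from the question. The names lookbehind and lookahead are just illustrative labels for the two variants of the unquoted pattern.

```python
import re

test_str = "search:foo search:'bar' search:\"baz\" search:r'blah' search:r\"bleh\""

# Lookbehind: evaluated right after "search:", looking backward.
# The two preceding characters are always "h:", never r' or r",
# so the negative assertion always succeeds and filters nothing.
lookbehind = re.compile(r'search:(?<!r[\'"])([^ \'"]+)')

# Lookahead: evaluated at the same position, but looking forward,
# so it sees the r' or r" that starts a regex search and rejects it.
lookahead = re.compile(r'search:(?!r[\'"])([^ \'"]+)')

print(lookbehind.findall(test_str))  # ['foo', 'r', 'r']
print(lookahead.findall(test_str))   # ['foo']
```

The quoted forms search:'bar' and search:"baz" are rejected in both cases simply because the capture group requires at least one character that is not a quote or a space.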
But I found another problem:
import re
# search:'pattern' and search:"pattern" (quoted search)
quoted = re.compile(r'search:(?:[\'"])([^"\']+)')
# search:r'pattern' and search:r"pattern" (regex search)
regex = re.compile(r'search:r(?:[\'"])([^"\']+)')
# search:pattern (unquoted search)
unquoted = re.compile(r'search:(?!r[\'"])([^ \'"]+)')
test_str = ("search:foo search:romeo "
            "search:'bar' search:\"baz\" "
            "search:r'blah' search:r\"bleh\" "
            "search:isn'it something to catch ?")
print(quoted.findall(test_str))
print(regex.findall(test_str))
print(unquoted.findall(test_str))
Result:
['bar', 'baz']
['blah', 'bleh']
['foo', 'romeo', 'isn']
Don't you want to catch isn'it?
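If you do want to catch it, one possible approach (a sketch only, not necessarily what your application needs) is to reject an optional r followed by a quote at the start, and then capture everything up to the next whitespace. The name unquoted_loose and the sample string below are made up for illustration.

```python
import re

# Hypothetical variant of the unquoted pattern: the lookahead rejects
# a leading quote or a leading r + quote, and \S+ then captures every
# non-whitespace character, including quotes inside the term.
unquoted_loose = re.compile(r'search:(?!r?[\'"])(\S+)')

sample = "search:foo search:romeo search:'bar' search:r\"bleh\" search:isn'it something"
print(unquoted_loose.findall(sample))  # ['foo', 'romeo', "isn'it"]
```

The trade-off is that a trailing quote, as in search:foo', would also be captured as part of the term, so this only fits if quotes inside unquoted terms are acceptable.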