Question

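This should be simple. This regular expression works fine for finding words that start with a specific character, but I can't get it to match a hash or a question mark.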

This works for matching words that start with a:

r = re.compile(r"\b([a])(\w+)\b")

But these don't match. I tried:

r = re.compile(r"\b([#?])(\w+)\b")
r = re.compile(r"\b([\#\?])(\w+)\b")
r = re.compile( r"([#\?][\w]+)?")

I even tried matching just the hash:

r = re.compile( r"([#][\w]+)?")
r = re.compile( r"([/#][\w]+)?")

text = "this is one #tag and this is ?another tag"
items = r.findall(text)

Expected result:

[('#', 'tag'), ('?', 'another')]

Answer 1

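\b matches the empty string between a \w and a \W (or between a \W and a \w), but there is no \b before the # or ? here: both the space in front of it and the symbol itself are \W characters.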

In other words: remove the first word boundary.

Not

r = re.compile(r"\b([#?])(\w+)\b")

but

r = re.compile(r"([#?])(\w+)\b")
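A minimal sketch comparing the two patterns on the sample text from the question. The variable names `with_boundary` and `without_boundary` are only illustrative.

```python
import re

text = "this is one #tag and this is ?another tag"

# Leading \b: the space before '#' / '?' and the symbol itself are both
# non-word characters, so no word boundary exists there and nothing matches.
with_boundary = re.compile(r"\b([#?])(\w+)\b")
print(with_boundary.findall(text))     # []

# Leading \b removed: the symbol is matched directly.
without_boundary = re.compile(r"([#?])(\w+)\b")
print(without_boundary.findall(text))  # [('#', 'tag'), ('?', 'another')]
```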

Answer 2

You're using Python; regex should be the last thing that comes to mind.

>>> text = "this is one #tag and this is ?another tag"
>>> for word in text.split():
...   if word.startswith("#") or word.startswith("?"):
...     print(word)
...
#tag
?another
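If the (symbol, word) pairs from the question are wanted without a regex, the same idea could be sketched as below. The function name `tagged_words` and its `prefixes` parameter are hypothetical, chosen only for this illustration.

```python
def tagged_words(text, prefixes=("#", "?")):
    """Return (prefix, rest) pairs for whitespace-separated words
    that start with one of the given prefix characters."""
    return [
        (word[0], word[1:])
        for word in text.split()
        # str.startswith accepts a tuple of prefixes; skip a bare '#' or '?'
        if word.startswith(prefixes) and len(word) > 1
    ]

text = "this is one #tag and this is ?another tag"
print(tagged_words(text))  # [('#', 'tag'), ('?', 'another')]
```

Unlike the \w+ in the regex versions, this keeps any trailing punctuation, so "#tag," would yield ('#', 'tag,').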

Answer 3

The first \b does not match before # or ?; use (?:^|\s) instead.

Also, the final \b is unnecessary, because \w+ is a greedy match.

import re

r = re.compile(r"(?:^|\s)([#?])(\w+)")

text = "#head this is one #tag and this is ?another tag, but not this?one"
print(r.findall(text))
# Output: [('#', 'head'), ('#', 'tag'), ('?', 'another')]
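To see why the anchor matters, here is a small sketch that runs three patterns on the same text. The first simply drops the leading \b, the second is the pattern above, and the third uses a negative lookbehind, (?<!\S), which is a swapped-in alternative and not part of this answer. The variable names are illustrative.

```python
import re

text = "#head this is one #tag and this is ?another tag, but not this?one"

# No anchor at all: also picks up the '?' in the middle of "this?one".
no_anchor = re.compile(r"([#?])(\w+)")
print(no_anchor.findall(text))
# [('#', 'head'), ('#', 'tag'), ('?', 'another'), ('?', 'one')]

# Start of string or whitespace required before the symbol.
anchored = re.compile(r"(?:^|\s)([#?])(\w+)")
print(anchored.findall(text))
# [('#', 'head'), ('#', 'tag'), ('?', 'another')]

# Alternative: assert that the previous character is not non-whitespace,
# without consuming it.
lookbehind = re.compile(r"(?<!\S)([#?])(\w+)")
print(lookbehind.findall(text))
# [('#', 'head'), ('#', 'tag'), ('?', 'another')]
```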

How do I match words starting with a hash or a question mark using a Python regular expression?
