Question

In a program I'm writing in Python, I want every word of the form __word__ to stand out. How can I search for such words using a regular expression?

Answer 1

Perhaps something like

\b__(\S+)__\b

>>> import re
>>> re.findall(r"\b__(\S+)__\b","Here __is__ a __test__ sentence")
['is', 'test']    
>>> re.findall(r"\b__(\S+)__\b","__Here__ is a test __sentence__")
['Here', 'sentence']
>>> re.findall(r"\b__(\S+)__\b","__Here's__ a test __sentence__")
["Here's", 'sentence']

Or you can put tags around the word, like this

>>> print(re.sub(r"\b(__)(\S+)(__)\b", r"<b>\2</b>", "__Here__ is a test __sentence__"))
<b>Here</b> is a test <b>sentence</b>
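If you need this in more than one place, the substitution could be wrapped in a small helper. This is only a sketch: the names `highlight` and `tag` are made up for illustration, and it assumes you want ordinary HTML-style closing tags such as `</b>`.

```python
import re

# Hypothetical helper (not from the original answer): wraps every
# __word__ in an HTML-style tag and drops the underscores.
PATTERN = re.compile(r"\b__(\S+)__\b")

def highlight(text, tag="b"):
    # \1 is the text captured between the double underscores.
    return PATTERN.sub(r"<{0}>\1</{0}>".format(tag), text)

print(highlight("__Here__ is a test __sentence__"))
```

Compiling the pattern once keeps the regex in a single place, so changing what counts as a word only means editing `PATTERN`.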

If you need finer-grained control over what counts as a legal word, it is best to be explicit

\b__([a-zA-Z0-9_':]+)__\b  ### count "'" and ":" as part of words

>>> re.findall(r"\b__([a-zA-Z0-9_']+)__\b","__Here's__ a test __sentence:__")
["Here's"]
>>> re.findall(r"\b__([a-zA-Z0-9_':]+)__\b","__Here's__ a test __sentence:__")
["Here's", 'sentence:']
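The same idea can be made configurable. The sketch below assumes a hypothetical helper named `find_marked` whose `extra_chars` argument lists the additional characters to accept inside a word; neither name comes from the answer above.

```python
import re

def find_marked(text, extra_chars=""):
    # Hypothetical helper: extra_chars lists additional characters
    # (for example "'" or ":") that may appear inside a word.
    # re.escape keeps characters such as "-" or "]" from breaking the class.
    allowed = "a-zA-Z0-9_" + re.escape(extra_chars)
    pattern = r"\b__([" + allowed + r"]+)__\b"
    return re.findall(pattern, text)

print(find_marked("__Here's__ a test __sentence:__", "'"))
print(find_marked("__Here's__ a test __sentence:__", "':"))
```

With only "'" allowed the second word is skipped, because ":" is not in the character class; adding ":" picks it up.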

Answer 2

Take a look here: http://docs.python.org/library/re.html

It shows the syntax and examples you can use to build a check for words with two underscores before and after them.

Answer 3

The simplest regular expression is

__.+__

If you want to access the word itself from your code, you should use

__(.+)__
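One thing to watch with these patterns: `.+` is greedy, so when a line contains more than one marked word it will likely swallow everything between the first and the last pair of underscores. A short sketch of the difference, using the non-greedy form `.+?` as one possible alternative (the variable names are just for illustration):

```python
import re

text = "Here __is__ a __test__ sentence"

# Greedy: .+ runs to the last "__" on the line, so two words merge.
greedy = re.findall(r"__(.+)__", text)

# Non-greedy: .+? stops at the first closing "__" it can find.
lazy = re.findall(r"__(.+?)__", text)

print(greedy)  # ['is__ a __test']
print(lazy)    # ['is', 'test']
```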

Answer 4

This will give you a list of all such words

>>> import re
>>> m = re.findall(r"(__\w+__)", "What __word__ you search __for__")
>>> print(m)
['__word__', '__for__']

Answer 5

\b(__\w+__)\b

\b is a word boundary
\w+ is one or more word characters - [a-zA-Z0-9_]
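To see what the `\b` anchors buy you, here is a small comparison with and without them. The sample sentence and the variable names `loose` and `strict` are made up for this sketch. Because `_` counts as a word character, there is no boundary between `foo` and `__bar__`, so the anchored pattern skips that case.

```python
import re

text = "What __word__ you search __for__, not foo__bar__"

# Without \b the pattern also fires inside a longer identifier.
loose = re.findall(r"(__\w+__)", text)

# With \b the match must start and end at a word boundary.
strict = re.findall(r"\b(__\w+__)\b", text)

print(loose)   # ['__word__', '__for__', '__bar__']
print(strict)  # ['__word__', '__for__']
```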

Answer 6

Simple string functions. No regular expressions needed

>>> mystring="blah __word__ blah __word2__"
>>> for item in mystring.split():
...     if item.startswith("__") and item.endswith("__"):
...        print(item)
...
__word__
__word2__
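The loop above only sees whitespace-separated tokens, so a word followed by a comma or wrapped in parentheses would be missed. A possible variation, still without regular expressions, strips surrounding punctuation first. The function name `marked_words` is hypothetical, and the length check is an added assumption so that a bare "____" is not reported.

```python
import string

def marked_words(text):
    # Hypothetical helper: like the loop above, but tolerant of
    # punctuation stuck to a word, e.g. "__word__," or "(__word__)".
    found = []
    for item in text.split():
        # Underscore is itself punctuation, so strip everything except it.
        item = item.strip(string.punctuation.replace("_", ""))
        # Require at least one character between the underscore pairs.
        if len(item) > 4 and item.startswith("__") and item.endswith("__"):
            found.append(item)
    return found

print(marked_words("blah __word__, blah (__word2__) ____"))
```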
