Question

I am writing a regular expression to extract a symbol (# / -) followed by a word. For example, consider the string

s= "the amount is 5/10 of the original. The #2 number should be extracted on the dd/yy"

The regex is

r= re.search(r'(/|#).*\\s+',s)

The output I get from the above is None, whereas I expect it to show

/10 #2 /yy

What is wrong with my regex?

Answer 1

You need to match one or more non-whitespace characters (with \S+) after a # or / (which can be matched with the [/#] character class):

[/#]\S+

See the regex demo.

Tip: if you do not want the leading # or / to be preceded by any word character, add \B (a non-word boundary at the start of the pattern): \B[/#]\S+.

In Python, use re.findall:

import re
s= "the amount is 5/10 of the original. The #2 number should be extracted on the dd/yy"
r = re.findall(r'[/#]\S+',s)
print(r)              # => ['/10', '#2', '/yy']
print(" ".join(r))    # => /10 #2 /yy

See the Python demo.
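The effect of the \B tip is easy to miss, so here is a small sketch that runs both patterns on the question's string. The variable names plain and strict are only illustrative. Note that with \B the matches that directly follow a word character (the / in 5/10 and in dd/yy) are dropped, so this variant does not produce the output the question asks for.

```python
import re

s = "the amount is 5/10 of the original. The #2 number should be extracted on the dd/yy"

# Without \B: every '/' or '#' followed by non-whitespace characters matches.
plain = re.findall(r'[/#]\S+', s)

# With \B: the symbol must not directly follow a word character,
# so '/10' (after '5') and '/yy' (after 'dd') are skipped.
strict = re.findall(r'\B[/#]\S+', s)

print(plain)   # ['/10', '#2', '/yy']
print(strict)  # ['#2']
```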

Answer 2

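import re
s = "the amount is 5/10 of the original. The #2 number should be extracted on the dd/yy"
# \S* also allows a bare '/' or '#'; the outer (...)+ adds nothing here, since
# \S* already consumes everything up to the next whitespace.
r = re.findall(r'([/#]\S*)+', s)
print(r)
# ['/10', '#2', '/yy']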

Regex demo

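"What is wrong with my regex?"

() denotes a capturing group. Use [] to match one character from a set.
In a raw string, \\s matches a literal backslash followed by the letter s, not a whitespace character; use \s for whitespace.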
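The points above can be checked step by step on the question's string. This is only a sketch; the variable names (original, greedy, grouped, fixed) are illustrative. It also shows two further effects: re.search returns a single match with a greedy .*, and re.findall returns only the group's text when the pattern contains a capturing group.

```python
import re

s = "the amount is 5/10 of the original. The #2 number should be extracted on the dd/yy"

# Original pattern: in a raw string, \\s+ needs a literal backslash followed by
# one or more 's' characters, which never occurs in s.
original = re.search(r'(/|#).*\\s+', s)
print(original)  # None

# Fixing only the escape is not enough: re.search returns a single match,
# and the greedy .* runs up to the last whitespace in the string.
greedy = re.search(r'(/|#).*\s+', s)
print(repr(greedy.group(0)))
# '/10 of the original. The #2 number should be extracted on the '

# With a capturing group, re.findall returns only the group's text.
grouped = re.findall(r'(/|#)\S+', s)
print(grouped)  # ['/', '#', '/']

# A character class has no capturing group, so whole matches come back.
fixed = re.findall(r'[/#]\S+', s)
print(fixed)  # ['/10', '#2', '/yy']
```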

Answer 3

As you said:

extract a symbol (# / -) followed by a word.

So you can use a negative lookahead.

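import re

# (?!/w) is a negative lookahead for the literal text "/w"; .+?[^\s] then takes
# the shortest run that ends in a non-whitespace character, and #\d takes one digit.
pattern = r'/(?!/w).+?[^\s]|#\d'
strings = "the amount is 5/10 of the original. The #2 number should be extracted on the dd/yy"
match = re.findall(pattern, strings, re.M)
print(" ".join(match))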

Output:

/10 #2 /yy
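This pattern gives the expected output for the question's string, but it appears to depend on the tokens being short. The sketch below compares it with the character-class pattern [/#]\S+ on a made-up sample string (sample is a hypothetical input, not from the question) that contains longer tokens.

```python
import re

# Hypothetical input with tokens longer than those in the question.
sample = "invoice 12/2018 refers to ticket #42"

# Pattern from this answer: the lazy .+? followed by [^\s] stops after two
# characters here, and #\d takes exactly one digit.
lookahead_matches = re.findall(r'/(?!/w).+?[^\s]|#\d', sample)

# Character-class pattern: the symbol plus all following non-whitespace.
class_matches = re.findall(r'[/#]\S+', sample)

print(lookahead_matches)  # ['/20', '#4']
print(class_matches)      # ['/2018', '#42']
```

If whole tokens are wanted regardless of length, the \S+ form is the safer choice of the two.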

Regex to extract multiple symbols followed by a word in a string (Python)
