Question

I have a string:

This is @lame

Here, I want to extract lame. The problem is that the string above could also be

This is lame

Here, I extract nothing. The string could also be:

This is @lame but that is @not

Here, I extract both lame and not.

So the output I expect in each case is:

 [lame]
 []
 [lame,not]

How can I extract these in a robust way in Python?

Answer 1

Use re.findall() to find all matches of a pattern; in this case, any run of word characters that is preceded by @:

re.findall(r'(?<=@)\w+', inputtext)

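The (?<=..) construct is a positive lookbehind assertion; it only matches if the current position is preceded by an @ character. So the pattern above matches 1 or more word characters (the \w character class) only if those characters are preceded by an @ symbol. The @ itself is not part of the match.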

Demo:

>>> import re
>>> re.findall(r'(?<=@)\w+', 'This is @lame')
['lame']
>>> re.findall(r'(?<=@)\w+', 'This is lame')
[]
>>> re.findall(r'(?<=@)\w+', 'This is @lame but that is @not')
['lame', 'not']
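
As a side note, the lookbehind is not the only way to leave the @ out of the results. The sketch below, which is an illustration and not part of the original answer, compares it with a capturing group: when a pattern contains exactly one group, re.findall() returns the group contents instead of the whole match. The variable names are made up for the example.

```python
import re

text = "This is @lame but that is @not"

# Lookbehind: the @ must precede the match but is not consumed by it.
with_lookbehind = re.findall(r"(?<=@)\w+", text)

# Capturing group: the @ is consumed, but findall returns only group 1.
with_group = re.findall(r"@(\w+)", text)

print(with_lookbehind)
print(with_group)
```

Both calls should produce the same list for this input, so the choice between them is mostly a matter of readability.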

If you plan to reuse the pattern, compile the expression first, then call the .findall() method on the compiled regular expression object:

at_words = re.compile(r'(?<=@)\w+')

at_words.findall(inputtext)

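This saves a cache lookup on every call: the module-level re.findall() has to look the pattern up in the re module's internal cache of compiled expressions each time, while the compiled object is used directly.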
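
One way to package the compiled pattern is a small helper function. This is a minimal sketch; the names AT_WORDS and extract_at_words are hypothetical and chosen only for this example.

```python
import re

# Compiled once at import time and reused by every call below.
AT_WORDS = re.compile(r"(?<=@)\w+")


def extract_at_words(text):
    """Return every run of word characters that directly follows an @."""
    return AT_WORDS.findall(text)


for sample in (
    "This is @lame",
    "This is lame",
    "This is @lame but that is @not",
):
    print(extract_at_words(sample))
```

The three sample strings are the ones from the question, so the printed lists can be compared directly with the expected output listed there.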

Answer 2

This will give the output you requested:

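import re
regex = re.compile(r'(?<=@)\w+')
# print() with parentheses works on Python 3 (and Python 2 for a single argument)
print(regex.findall('This is @lame'))                   # ['lame']
print(regex.findall('This is lame'))                    # []
print(regex.findall('This is @lame but that is @not'))  # ['lame', 'not']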

Answer 3

You should use the re library; here is an example:

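import re
test_case = "This is @lame but that is @not"
# Raw string so that \w reaches the regex engine unchanged
regular = re.compile(r"@[\w]*")
lst = regular.findall(test_case)  # ['@lame', '@not'] (matches keep the leading @)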
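
Note that the pattern in this answer behaves differently from the lookbehind version: it keeps the @ in each match, and because of the * quantifier it also matches a bare @. The sketch below compares the patterns on a made-up input. The last pattern, with a negative lookbehind, is an assumption about what "robust" might mean here (skipping an @ in the middle of a word, such as in an email address); it is not something the question or the answers specify.

```python
import re

# Hypothetical input with an email address and a lone @.
text = "mail me@example.com or ping @lame, also @ alone"

# Pattern from this answer: keeps the @ and accepts zero word characters.
keeps_at = re.findall(r"@[\w]*", text)

# Lookbehind pattern from Answer 1: drops the @, needs at least one character.
after_at = re.findall(r"(?<=@)\w+", text)

# Assumed stricter variant: the @ must not follow a word character.
word_start_only = re.findall(r"(?<!\w)@(\w+)", text)

print(keeps_at)         # ['@example', '@lame', '@']
print(after_at)         # ['example', 'lame']
print(word_start_only)  # ['lame']
```

Which of these is right depends on the data: if the input never contains an @ inside a word, the simpler lookbehind pattern is enough.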

Extracting multiple instances with regex in Python
