I have the following list:
list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology','*ROS Sami']
My code is as follows:
import re
[item2 for item in list12 for item2 in item.split() if not re.match("^[*A-Z]+(0-9){4}$", item2)]
I get output like this:
['First', 'Company', 'limited', 'Apple', 'Technology', 'Sami']
I would like the output to be:
['SOPL', 'ZTE', 'First', 'Company', 'limited', 'Apple', 'Technology', 'ROS', 'Sami']
I'm not good with regular expressions. How can I get the desired result?
Answer 0 (score: 0)
You are looking for
\b([A-Za-z]+)\b
In Python:
import re
list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology','*ROS Sami']
rx = re.compile(r'\b([A-Za-z]+)\b')
result = [word for item in list12 for word in rx.findall(item)]
print(result)
Which yields
['SOPL', 'ZTE', 'First', 'Company', 'limited', 'Apple', 'Technology', 'ROS', 'Sami']
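A note on why this pattern skips FIRS0425: `\b` matches only between a word character and a non-word character, and digits (like underscores) count as word characters. There is no boundary between the S and the 0, so no run of letters inside that token can be closed by `\b`. A minimal sketch that wraps the same regex in a helper (the name `extract_words` is hypothetical and is not part of the answer above):

```python
import re

# Same pattern as in the answer, without the capturing group;
# findall then returns the whole match for each hit.
WORD_RX = re.compile(r'\b[A-Za-z]+\b')

def extract_words(items):
    """Return every purely alphabetic (ASCII) word found in the given strings."""
    return [word for item in items for word in WORD_RX.findall(item)]

list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology', '*ROS Sami']
print(extract_words(list12))
```

Leading asterisks are not word characters, so `*ROS` still yields `ROS`, while tokens that mix letters with digits or underscores are dropped entirely.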
Answer 1 (score: 0)
A non-regex way in Python:
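list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology','*ROS Sami']
joined = " ".join(list12)  # named "joined" to avoid shadowing the built-in str
tokens = joined.split()
# Skip tokens containing '**' (the coded entry), then strip '*' from the rest
res = [k.strip('*') for k in tokens if '**' not in k]
print(res)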
Output:
['SOPL', 'ZTE', 'First', 'Company', 'limited', 'Apple', 'Technology', 'ROS', 'Sami']
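The approach above depends on the unwanted token being marked with `**`. If the real criterion is "keep only tokens made up of letters", a variant could strip the asterisks first and then test each token with `str.isalpha()`. This is only a sketch under that assumption; the helper name `alphabetic_tokens` is hypothetical, and note that `isalpha()` also accepts non-ASCII letters, unlike `[A-Za-z]`.

```python
def alphabetic_tokens(items):
    """Split each string on whitespace, strip '*' and keep letter-only tokens."""
    stripped = (token.strip('*') for item in items for token in item.split())
    # isalpha() is False for empty strings and for tokens containing digits
    return [token for token in stripped if token.isalpha()]

list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology', '*ROS Sami']
print(alphabetic_tokens(list12))
```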