Question

Hello, I'm new to regex and I'm just starting out with Python. I'm stuck on extracting all the words from an English sentence. So far I have:

import re

shop="hello seattle what have you got"
regex = r'(\w*) '
list1=re.findall(regex,shop)
print(list1)

This gives the output:

['hello', 'seattle', 'what', 'have', 'you']

If I replace the regex with

regex = r'(\w*)\W*'

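then the output is:

['hello', 'seattle', 'what', 'have', 'you', 'got', '']

whereas I want this output:

['hello', 'seattle', 'what', 'have', 'you', 'got']

Please point out where I am going wrong.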

Answer 1

Use the word boundary \b:

import re

shop="hello seattle what have you got"
regex = r'\b\w+\b'
list1=re.findall(regex,shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']

Or just \w+ on its own is enough:

import re

shop="hello seattle what have you got"
regex = r'\w+'
list1=re.findall(regex,shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']
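
As for why the two patterns in the question misbehave: r'(\w*) ' requires a space after each word, so the last word (which has no trailing space) is never matched. r'(\w*)\W*' fixes that, but both \w* and \W* are allowed to match nothing, so findall also reports one empty match at the very end of the string. Using \w+ requires at least one word character per match, which rules out the empty result. A minimal sketch comparing the three, written in Python 3 syntax (the variable names with_space, star and plus are only for illustration):

```python
import re

shop = "hello seattle what have you got"

# Needs a trailing space after each word, so the last word is missed.
with_space = re.findall(r'(\w*) ', shop)

# \w* and \W* can both match zero characters, so an empty match
# is found at the end of the string.
star = re.findall(r'(\w*)\W*', shop)

# \w+ needs at least one word character, so empty matches are impossible.
plus = re.findall(r'\w+', shop)

print(with_space)  # ['hello', 'seattle', 'what', 'have', 'you']
print(star)        # ['hello', 'seattle', 'what', 'have', 'you', 'got', '']
print(plus)        # ['hello', 'seattle', 'what', 'have', 'you', 'got']
```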
