Question

I have this regular expression, which works fine at http://regexpal.com/:

[^-:1234567890/.,\s]*

I am trying to find the words in a paragraph full of (, . # "" \n \s ... etc.)

But in my code I don't get the result I am expecting:

def words(lines):
    words_pattern = re.compile(r'[^-:1234567890/.,\s]*')
    li = []
    for m in lines:
        e = words_pattern.search(m)
        if e:
            match = e.group()
            li.append(match)
    return li

li = [u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u'']

Any suggestions? Maybe I am not transferring the regular expression from one place to the other correctly.

Thanks in advance

Edit

To be more precise, I do want: ñ, á, é, í, ó and ú

Thanks

Answer 1

If you only want letters, you can use string.ascii_letters

>>> from string import ascii_letters
>>> import re
>>> s = 'this is 123 some text! that has someñ \n other stuff.'
>>> re.findall('[{}]+'.format(ascii_letters), s)
['this', 'is', 'some', 'text', 'that', 'has', 'some', 'other', 'stuff']

You can also get the same behavior from [A-Za-z] (which is essentially the same as string.ascii_letters)

>>> re.findall('[A-Za-z]+', s)
['this', 'is', 'some', 'text', 'that', 'has', 'some', 'other', 'stuff']
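
Note that both patterns above drop ñ, and the question asks to keep ñ, á, é, í, ó and ú. The empty strings in the question are also likely explained by the `*` quantifier: `search` returns the first match, and `*` is allowed to match zero characters, so a line that starts with an excluded character (a digit, a space, a dash) yields '' right at position 0. The following is only a sketch, assuming Python 3 `str` input; the names `WORDS_PATTERN`, `LETTERS_PATTERN` and the sample lines are made up for illustration and are not from the question.

```python
import re

# The question's character class, with `*` changed to `+` so that every
# match must contain at least one character (no empty-string matches).
WORDS_PATTERN = re.compile(r'[^-:1234567890/.,\s]+')

# Alternative (an assumption about what "words" should mean): letters only.
# `[^\W\d_]` reads as "a word character that is neither a digit nor an
# underscore"; for Python 3 str patterns this includes accented letters.
LETTERS_PATTERN = re.compile(r'[^\W\d_]+')


def words(lines, pattern=WORDS_PATTERN):
    """Return every non-empty match of `pattern` across all lines."""
    found = []
    for line in lines:
        # findall collects all matches in the line, not just the first one
        found.extend(pattern.findall(line))
    return found


# Hypothetical sample input, one string per line
sample = ['123, ñandú: más 45.', '- "está" aquí!']
print(words(sample))                   # quotes and '!' are kept by this pattern
print(words(sample, LETTERS_PATTERN))  # letters only
```

The first pattern keeps anything that is not explicitly excluded, so characters such as " or ! stay attached to the words. The second one keeps only letters, which seems closer to what the edit in the question asks for.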
