Error when using a regular expression. AssertionError: Got <class 'str'>, expected <class 'list'>

Date: 2019-01-24 21:02:11

Tags: python regex nlp

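I am trying to use a regular expression to implement a function that removes tags and returns a list of the strings found in a text file. However, the following error occurs: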

AssertionError: Wrong type for output extracted_words. Got <class 'str'>, expected <class 'list'>

My code is below. Any help would be appreciated.

import re

def get_words(text):
    """
    Extracting words from the text

    The 'text' parameter is the file which contains strings inside

    Objective: To return a list of strings found in the text called 'extracted_words'
    """
    # Implementation
    extracted_words = re.sub('<[^>]*>', '', text)
    return extracted_words

1 answer:

Answer 0 (score: 0)

This worked for me:

def get_words(text):
    # Capture each run of non-tag characters that is directly followed by '<'
    rgxp = re.compile(r'([^<>]+)(?=<)')
    return rgxp.findall(text)
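
For context: re.sub always returns a str, which is why the original function fails the type check, while re.findall returns a list. One caveat with the pattern above is that it only matches text that is followed by a '<', so anything after the last tag is skipped, and it returns whole chunks of text rather than individual words. If "words" means whitespace-separated tokens (that is my assumption, the question does not say), a minimal sketch could keep the question's own tag pattern and split the result. The TAG_PATTERN name and the sample string are made up for illustration.

```python
import re

# Same tag pattern as in the question; the constant name is hypothetical.
TAG_PATTERN = re.compile(r'<[^>]*>')

def get_words(text):
    """Remove tags, then split the remaining text on whitespace."""
    # Replace each tag with a space so words on either side of a tag stay separate.
    without_tags = TAG_PATTERN.sub(' ', text)
    # str.split() with no arguments splits on runs of whitespace and returns a list.
    return without_tags.split()

# Made-up sample input, only to show the return type and shape.
sample = '<p>Hello <b>world</b></p> trailing text'
extracted_words = get_words(sample)
print(type(extracted_words).__name__, extracted_words)
# prints: list ['Hello', 'world', 'trailing', 'text']
```

On the same sample, the lookahead pattern gives ['Hello ', 'world'], so which version fits depends on what the grader expects in extracted_words.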