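I'm trying to use a regular expression to implement a function that removes tags and returns a list of the strings found in a text file. However, I get the following error: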
AssertionError: Wrong type for output extracted_words. Got <class 'str'>, expected <class 'list'>
Here is my code. Any help would be appreciated.
import re

def get_words(text):
    """
    Extracting words from the text
    The 'text' parameter is the file which contains strings inside
    Objective: To return a list of strings found in the text called 'extracted_words'
    """
    # Implementation
    extracted_words = re.sub('<[^>]*>', '', text)
    return extracted_words
Answer 0 (score: 0)
This worked for me (as the body of get_words):
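    rgxp = re.compile(r'([^<>]+)(?=<)')  # runs of non-tag text that are followed by a '<'
    return rgxp.findall(text)  # findall returns a list, which is what the assertion expects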
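For context, the original error comes from re.sub, which always returns a str, while the assertion expects a list. The findall pattern above does return a list, but it yields whole text chunks (whitespace included) and skips any text after the last tag. If what you want is individual words, a minimal sketch could look like the following. It assumes that "words" means whitespace-separated tokens, which the question does not state explicitly, and the TAG_RE name and the sample string are only illustrative.

```python
import re

# Illustrative name, not from the question: matches any <...> tag.
TAG_RE = re.compile(r'<[^>]*>')

def get_words(text):
    """Strip tags, then split the remaining text on whitespace."""
    # Replace each tag with a space so words on either side of a tag stay separate.
    without_tags = TAG_RE.sub(' ', text)
    # str.split() with no arguments splits on runs of whitespace and drops empty strings,
    # so the result is always a list (possibly empty).
    extracted_words = without_tags.split()
    return extracted_words

print(get_words('<p>Hello <b>world</b></p> again'))
# prints ['Hello', 'world', 'again']
```

Replacing tags with a space rather than an empty string is a deliberate choice here: with an empty string, input such as 'one<br>two' would collapse into the single token 'onetwo'.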