I have a list of dictionaries whose values are text, and I want to remove the dictionaries whose text contains certain words.
df = [{'name':'jon','text':'the day is light'},{'name':'betty','text':'good night'},{'name':'shawn','text':'good afternoon'}]
I want to remove the dictionaries whose 'text' value contains the word 'light' or 'night':
words = ['light','night']
pattern = re.compile(r"|".join(words))
Expected result:
df = [{'name':'shawn','text':'good afternoon'}]
Answer 0 (score: 2)
[x for x in df if not any(w in x['text'] for w in words)]
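For reference, here is a minimal self-contained sketch of the same substring check, using the data from the question. The helper name drop_matching is made up for illustration and is not part of the original answer. Note that `in` tests for substrings, so a text such as 'see you tonight' would also be dropped because it contains 'night'.

```python
# Sample data and word list copied from the question.
df = [
    {'name': 'jon', 'text': 'the day is light'},
    {'name': 'betty', 'text': 'good night'},
    {'name': 'shawn', 'text': 'good afternoon'},
]
words = ['light', 'night']


def drop_matching(records, banned):
    """Keep only the records whose 'text' contains none of the banned substrings.

    Hypothetical helper wrapping the list comprehension from this answer.
    """
    return [r for r in records if not any(w in r['text'] for w in banned)]


result = drop_matching(df, words)
print(result)
```

The check is case-sensitive as written; lowercasing both sides would be one way to relax that, if that is what you need.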
Answer 1 (score: 1)
You're close. All you need to do is write a list comprehension and apply the search pattern:
result = [x for x in df if not re.search(pattern, x['text'])]
Full example:
import re
df = [{'name':'jon','text':'the day is light'},{'name':'betty','text':'good night'},{'name':'shawn','text':'good afternoon'}]
words = ['light','night']
pattern = re.compile(r"|".join(words))
result = [x for x in df if not re.search(pattern, x['text'])]
print(result) # => [{'name': 'shawn', 'text': 'good afternoon'}]
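One possible refinement, in case it matters for your data: the pattern above matches anywhere in the text, so 'tonight' or 'lightning' would also count as hits, and any regex metacharacters in the words would be interpreted as regex syntax. The sketch below assumes you want whole-word matches only, which the question does not state; it wraps the alternation in \b word boundaries and escapes each word with re.escape.

```python
import re

# Sample data and word list copied from the question.
df = [
    {'name': 'jon', 'text': 'the day is light'},
    {'name': 'betty', 'text': 'good night'},
    {'name': 'shawn', 'text': 'good afternoon'},
]
words = ['light', 'night']

# re.escape keeps any regex metacharacters in the words literal;
# \b ... \b restricts matches to whole words (an assumption, not a
# requirement from the question).
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b")

# Keep only the dictionaries whose text matches none of the words.
result = [x for x in df if not pattern.search(x['text'])]
print(result)
```

With this pattern, a text like 'see you tonight' is kept, whereas the unanchored pattern would remove it.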
Answer 2 (score: 1)
I found the answer:
[x for x in df if not pattern.search(x['text'])]