I need to remove a list of words from the values of a specific key in a list of dictionaries.
Here is a sample of my data:
words = ['cloves', 'packed']
data = [{'title': 'Simple Enchiladas Verdes',
         'prep_time': '15 min',
         'cook_time': '30 min',
         'ingredients': ['chicken breast', 'tomato sauce', 'garlic cloves', 'fresh packed cilantro'],
         'instructions': ['some text...'],
         'category': 'dessert',
         'cuisine': 'thai',
         'article': ['some text...']
        },
        {...}, {...}]
Desired output:
data = [{'title': 'Simple Enchiladas Verdes',
         'prep_time': '15 min',
         'cook_time': '30 min',
         'ingredients': ['chicken breast', 'tomato sauce', 'garlic', 'fresh cilantro']
        },
        {...}, {...}]
I have tried several approaches:
import re

remove = '|'.join(words)
regex = re.compile(r'\b('+remove+r')\b', flags=re.IGNORECASE)
for dct in data:
    dct['ingredients']= list(filter(lambda x: regex.sub('', x), dct['ingredients']))
But this returns the following error: TypeError: sub() missing 1 required positional argument: 'string'
Other code I have tried:
for dct in data:
    dct['ingredients']= list(filter(lambda x: x != words, dct['ingredients']))
for dct in data:
    dct['ingredients']=[[el for el in string if el in words ] for string in dct['ingredients']]
for dct in data:
    for string in dct['ingredients']:
        dct['ingredients'] = list(filter(lambda x: x not in words, dct['ingredients']))
But none of them solved my problem.
Answer 0 (score: 2)
Why not use a list comprehension nested inside a dict comprehension?
data = [{k:([' '.join([s for s in x.split() if s not in words]) for x in v] if k == 'ingredients' else v) for k, v in i.items()} for i in data]
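For readability, the same idea can be unpacked into two small helpers. This is only a sketch: the names strip_words and clean_record are made up for illustration, the sample data is trimmed to two keys, and it assumes the words to drop are always separated by whitespace.

```python
# Hypothetical helper names; a set makes the membership test cheap.
words = {'cloves', 'packed'}

data = [{'title': 'Simple Enchiladas Verdes',
         'ingredients': ['chicken breast', 'tomato sauce',
                         'garlic cloves', 'fresh packed cilantro']}]


def strip_words(text, unwanted):
    """Drop whole whitespace-separated tokens that appear in `unwanted`."""
    return ' '.join(tok for tok in text.split() if tok not in unwanted)


def clean_record(record, unwanted, key='ingredients'):
    """Return a shallow copy of `record` with `unwanted` removed from record[key]."""
    cleaned = dict(record)
    if key in cleaned:
        cleaned[key] = [strip_words(s, unwanted) for s in cleaned[key]]
    return cleaned


cleaned_data = [clean_record(rec, words) for rec in data]
print(cleaned_data[0]['ingredients'])
# ['chicken breast', 'tomato sauce', 'garlic', 'fresh cilantro']
```

Like the one-liner, this builds new dictionaries instead of modifying the originals, so data itself is left untouched.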
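Answer 1 (score: 0)
In your re.sub approach, you should use map rather than filter: you are not trying to filter elements out of the list, but to replace each whole string with the result of re.sub.
for dct in data:
    dct['ingredients']= list(map(lambda x: regex.sub('', x), dct['ingredients']))
Or, as a list comprehension, which is probably more readable:
for dct in data:
    dct['ingredients'] = [regex.sub("", x) for x in dct['ingredients']]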
However, both will leave some extra spaces behind. If the words are always separated by spaces, you can just use split and join (faster if words is a set):
for dct in data:
    dct['ingredients'] = [' '.join(w for w in string.split() if w not in words)
                          for string in dct['ingredients']]
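If you would rather keep the regex (for example, for case-insensitive matching), one possible way to deal with the leftover spaces is to normalize whitespace after the substitution. This is a minimal sketch, not the only option: remove_words is a hypothetical helper name, the sample data is shortened, and re.escape is added on the assumption that the words might contain regex metacharacters.

```python
import re

words = ['cloves', 'packed']
data = [{'title': 'Simple Enchiladas Verdes',
         'ingredients': ['chicken breast', 'tomato sauce',
                         'Garlic Cloves', 'fresh packed cilantro']}]

# Escape each word so it is matched literally, then match whole words only.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b',
                     flags=re.IGNORECASE)


def remove_words(text):
    """Delete the unwanted words, then collapse runs of whitespace."""
    stripped = pattern.sub('', text)
    return ' '.join(stripped.split())


for dct in data:
    dct['ingredients'] = [remove_words(s) for s in dct['ingredients']]

print(data[0]['ingredients'])
# ['chicken breast', 'tomato sauce', 'Garlic', 'fresh cilantro']
```

Splitting on whitespace and joining with a single space removes both the doubled spaces in the middle of a string and any trailing space at the end.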
Answer 2 (score: 0)
words = ['cloves', 'packed']
data = [{'title': 'Simple Enchiladas Verdes',
         'prep_time': '15 min',
         'cook_time': '30 min',
         'ingredients': ['chicken breast', 'tomato sauce', 'garlic cloves', 'fresh packed cilantro']}
       ]
for d in data:
    # Join the ingredients with a sentinel so they can be processed as one string.
    joined = ' @! '.join(d['ingredients'])
    for k in words:
        joined = joined.replace(k, '')
    # Split on the sentinel and collapse the whitespace left by the removals.
    d['ingredients'] = [' '.join(part.split()) for part in joined.split('@!')]
Output:
[{'title': 'Simple Enchiladas Verdes',
  'prep_time': '15 min',
  'cook_time': '30 min',
  'ingredients': ['chicken breast',
                  'tomato sauce',
                  'garlic',
                  'fresh cilantro']}]
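One caveat with str.replace is that it matches substrings, not whole words, so a word to remove is also cut out of any longer word that contains it. The comparison below is only an illustration: the sample strings and both function names are made up, and the word-boundary variant is one possible alternative.

```python
import re

words = ['cloves', 'packed']
sample = ['garlic cloves', 'unpacked rice']  # hypothetical input


def remove_substrings(text, unwanted):
    """str.replace removes every occurrence, even inside longer words."""
    for w in unwanted:
        text = text.replace(w, '')
    return ' '.join(text.split())


def remove_whole_words(text, unwanted):
    """A \\b-anchored regex only removes the words when they stand alone."""
    pattern = r'\b(?:' + '|'.join(map(re.escape, unwanted)) + r')\b'
    return ' '.join(re.sub(pattern, '', text).split())


print([remove_substrings(s, words) for s in sample])
# ['garlic', 'un rice']
print([remove_whole_words(s, words) for s in sample])
# ['garlic', 'unpacked rice']
```

Whether the substring behavior matters depends on your data; for short ingredient lists like the one in the question it may never come up.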
Answer 3 (score: 0)
words = ['cloves', 'packed']
data = [{'title': 'Simple Enchiladas Verdes',
         'prep_time': '15 min',
         'cook_time': '30 min',
         'ingredients': ['chicken breast', 'tomato sauce', 'garlic cloves', 'fresh packed cilantro']
        },
        {'title': 'Simple Enchiladas Verdes11',
         'prep_time': '15 min11',
         'cook_time': '30 min11',
         'ingredients': ['chicken breast1', '1tomato sauce', '1garlic cloves', '1fresh packed cilantro']}
       ]
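for d in data:
    n = []  # start a fresh list for each record so results do not accumulate
    for item in d['ingredients']:
        for word in words:
            item = item.replace(word, '')
        n.append(' '.join(item.split()))  # collapse the spaces left behind
    d['ingredients'] = n
    print(d)
Output:
{'title': 'Simple Enchiladas Verdes', 'prep_time': '15 min', 'cook_time': '30 min', 'ingredients': ['chicken breast', 'tomato sauce', 'garlic', 'fresh cilantro']}
{'title': 'Simple Enchiladas Verdes11', 'prep_time': '15 min11', 'cook_time': '30 min11', 'ingredients': ['chicken breast1', '1tomato sauce', '1garlic', '1fresh cilantro']}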