I have a predefined dictionary of lists of string triggers:
triggers = {'academic': ['studied at', 'studies at', 'studies', 'studies at'],
'age': ['years old','months old'],
'gender': ['male', 'female'],
'pets': ['dog','cat'],
'location': ['Lived in','Lives in']}
I also have a list of lists of grouped information that is not known in advance, for example:
example_list_of_list = [['Former Teacher of math at'],
['Studies programming at', 'Stackoverflow'],
['Lives in','Chicago'],
['owns','dog', 'cat']]
I want to append each matching list element to a new dictionary under the matching predefined key, for example:
{'academic': ['Former Teacher of math at'],
'age': None, # np.nan or []
'gender': None, # np.nan or []
'pets': ['owns','dog','cat'],
'location': ['Lives in','Chicago']
}
Thanks!
Answer 0 (score: 1)
I think the easiest way to do this is with set semantics:
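result = {}
for group in example_list_of_list:
    for key, triggerset in triggers.items():
        # isdisjoint is a set method, so each group must be a set here
        if not group.isdisjoint(triggerset):
            # list.append returns None, so its result must not be assigned back
            result.setdefault(key, []).append(group)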
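Note a few things, though:
triggers should be a dict of sets rather than lists.
example_list_of_list should be a list of sets rather than lists.
result is a dict of lists, because multiple inputs may match the same key.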
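For anyone who wants to run this against the data from the question, here is a minimal self-contained sketch. The helper name group_by_triggers is my own and not part of the original answer; it converts the lists to sets on the fly so the inputs can stay as plain lists. Keep in mind that set semantics means exact element matching, so 'Studies programming at' does not match the trigger 'studies at'; substring or case-insensitive matching would need a different test.

```python
# Hypothetical helper (name is an assumption, not from the original answer).
def group_by_triggers(triggers, groups):
    # Convert each trigger list to a set once, up front.
    trigger_sets = {key: set(values) for key, values in triggers.items()}
    # Start every key with an empty list so unmatched keys are still present.
    result = {key: [] for key in triggers}
    for group in groups:
        items = set(group)
        for key, trigger_set in trigger_sets.items():
            # A group matches when it shares at least one exact element with the triggers.
            if not items.isdisjoint(trigger_set):
                result[key].append(group)
    return result


triggers = {'academic': ['studied at', 'studies at', 'studies'],
            'age': ['years old', 'months old'],
            'gender': ['male', 'female'],
            'pets': ['dog', 'cat'],
            'location': ['Lived in', 'Lives in']}

example_list_of_list = [['Former Teacher of math at'],
                        ['Studies programming at', 'Stackoverflow'],
                        ['Lives in', 'Chicago'],
                        ['owns', 'dog', 'cat']]

result = group_by_triggers(triggers, example_list_of_list)
print(result)
```

With exact matching, only the 'pets' and 'location' keys pick up a group from this sample; the other keys stay as empty lists.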