Suppose I have templates that are filled in with values from a dict. The templates look like this:
templates = [
    "I have four {fruit} in {place}",
    "I have four {fruit} and {grain} in {place}",
    ...
]
and the dictionary looks like this:
my_dict = {'fruit': ['apple', 'banana', 'mango'],
           'place': ['kitchen', 'living room'],
           'grain': ['wheat', 'rice']
          }
Say I have a sentence like this:
sentence = "I have four apple in kitchen"
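Given this sentence, the templates, and the dictionary, I want to determine that the sentence matches one of the templates and return the values it matched: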
{'fruit': 'apple', 'place': 'kitchen'}
Similarly:
Input: "I have four apple and wheat in kitchen"
Output: {'fruit': 'apple', 'grain': 'wheat', 'place': 'kitchen'}
It would be great if it could also handle this case:
Input: "I have four apple in bedroom"
Output: {'fruit': 'apple'}
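Note that it returns only the fruit and not bedroom, because bedroom is not among the values for place.
Answer 0 (score: 6)
Convert the format strings into regular expressions: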
import re
words = {k: '(?P<{}>{})'.format(k, '|'.join(map(re.escape, v))) for k, v in my_dict.items()}
patterns = [re.compile(template.format(**words)) for template in templates]
This produces patterns of the form I have four (?P<fruit>apple|banana|mango) in (?P<place>kitchen|living room). Matching against these gives you the expected output:
for pattern in patterns:
    match = pattern.match(sentence)
    if match:
        matched_words = match.groupdict()
This is a very fast, O(N) approach to matching sentences exactly:
>>> import re
>>> templates = [
...     "I have four {fruit} in {place}",
...     "I have four {fruit} and {grain} in {place}",
... ]
>>> my_dict = {'fruit': ['apple', 'banana', 'mango'],
...            'place': ['kitchen', 'living room'],
...            'grain': ['wheat', 'rice']
...           }
>>> # Build one named group per key, then compile every template
>>> words = {k: '(?P<{}>{})'.format(k, '|'.join(map(re.escape, v))) for k, v in my_dict.items()}
>>> patterns = [re.compile(template.format(**words)) for template in templates]
>>> def find_matches(sentence):
...     for pattern in patterns:
...         match = pattern.match(sentence)
...         if match:
...             return match.groupdict()
...
>>> find_matches("I have four apple in kitchen")
{'fruit': 'apple', 'place': 'kitchen'}
>>> find_matches("I have four apple and wheat in kitchen")
{'fruit': 'apple', 'grain': 'wheat', 'place': 'kitchen'}
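One caveat on exact matching: pattern.match anchors only at the start of the string, so a sentence with extra trailing text would still be accepted. The following is a minimal sketch, assuming such sentences should be rejected, that swaps in fullmatch; the function name find_exact_match is hypothetical and not part of the answer above.

```python
import re

templates = [
    "I have four {fruit} in {place}",
    "I have four {fruit} and {grain} in {place}",
]
my_dict = {
    'fruit': ['apple', 'banana', 'mango'],
    'place': ['kitchen', 'living room'],
    'grain': ['wheat', 'rice'],
}

# One named group per key, with the allowed words as escaped alternatives.
words = {k: '(?P<{}>{})'.format(k, '|'.join(map(re.escape, v)))
         for k, v in my_dict.items()}
patterns = [re.compile(t.format(**words)) for t in templates]


def find_exact_match(sentence):
    """Return the captured values only if the whole sentence fits a template."""
    for pattern in patterns:
        # fullmatch requires the pattern to cover the entire string,
        # whereas match would also accept trailing text.
        match = pattern.fullmatch(sentence)
        if match:
            return match.groupdict()
    return None
```

With this variant, "I have four apple in kitchen" still yields both values, while "I have four apple in kitchen sink" is rejected instead of matching on its prefix.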
If you need your templates to match partial sentences, wrap the optional parts in (?:...)? groups:
"I have four {fruit} in (?:{place})?"
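Alternatively, add \w+ to the word lists (in addition to the valid words), then validate the groupdict() result against my_dict after matching. For the in bedroom case, \w+ will match the bedroom part, but that value will not be found in the place list of my_dict.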
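Both ideas for partial matches can be sketched as follows. This is an illustration under assumptions rather than a definitive implementation: the optional part is written with the non-capturing syntax (?:...)?, \w+ is placed last among the alternatives so that valid words are tried first, the second variant uses fullmatch so that a valid word cannot match as a mere prefix, and the helper names alternatives, match_optional and match_validated are hypothetical.

```python
import re

my_dict = {
    'fruit': ['apple', 'banana', 'mango'],
    'place': ['kitchen', 'living room'],
    'grain': ['wheat', 'rice'],
}


def alternatives(key, extra=()):
    """Build a named group for key from its valid words plus extra raw patterns."""
    options = [re.escape(word) for word in my_dict[key]] + list(extra)
    return '(?P<{}>{})'.format(key, '|'.join(options))


# Variant 1: make the place part optional with a non-capturing group.
strict = {key: alternatives(key) for key in my_dict}
optional_pattern = re.compile(
    "I have four {fruit} in (?:{place})?".format(**strict))


def match_optional(sentence):
    """Return only the groups that took part in the match."""
    match = optional_pattern.match(sentence)
    if match is None:
        return None
    # Groups that did not participate are reported as None; drop them.
    return {k: v for k, v in match.groupdict().items() if v is not None}


# Variant 2: accept any word via \w+ (tried last), then validate the result.
loose = {key: alternatives(key, extra=[r'\w+']) for key in my_dict}
loose_pattern = re.compile("I have four {fruit} in {place}".format(**loose))


def match_validated(sentence):
    """Match loosely, then keep only values that my_dict lists for their key."""
    match = loose_pattern.fullmatch(sentence)
    if match is None:
        return None
    return {k: v for k, v in match.groupdict().items() if v in my_dict[k]}
```

For "I have four apple in bedroom", both helpers return only the fruit: the first because the optional group does not participate in the match, the second because bedroom is captured by \w+ and then filtered out during validation.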