Question

Suppose I have a defaultdict of the following form:

from collections import defaultdict
theta = defaultdict(float)

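The keys are tuples of strings, i.e. (label, word), and the associated value is the probability that the given word has the given label (part-of-speech tagging).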

For example, the word "stand" can be a noun or a verb. So I can do:

theta[('NOUN', 'stand')] = 0.4
theta[('VERB', 'stand')] = 0.6
theta[('ADJ', 'stand')] = 0.0

and likewise for the remaining part-of-speech tags.

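What I need is for the dictionary to return 1 by default when it is looked up with a word it does not contain and the associated label is "NOUN", and to return 0 for every other label. For example: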

value = theta[('NOUN', 'wordthatdoesntexist')]  # this should be 1
value = theta[('VERB', 'wordthatdoesntexist')]  # this should be 0

How can I do this? Can I use a lambda in the initialization step? Or is there another way?

Answer 1

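defaultdict cannot do this; the default factory has no access to the key. You have to write your own dict subclass that uses the __missing__ hook, which dict calls when you try to access a missing key: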

class SomeAppropriateName(dict):
    def __missing__(self, key):
        # key is a (label, word) tuple; only the label decides the default
        val = 1.0 if key[0] == 'NOUN' else 0.0
        # Uncomment the following line if you want to add the value to the dict
        # self[key] = val
        return val
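To illustrate how such a subclass behaves with the data from the question, here is a minimal sketch. The class name NounDefaultDict is a placeholder chosen for this example, not something from the answer, and the variant shown does not store the default in the dict.

```python
class NounDefaultDict(dict):
    """dict whose missing (label, word) keys default to 1.0 for 'NOUN', else 0.0."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        label, _word = key
        return 1.0 if label == 'NOUN' else 0.0


theta = NounDefaultDict()
theta[('NOUN', 'stand')] = 0.4
theta[('VERB', 'stand')] = 0.6

known = theta[('VERB', 'stand')]                      # stored value
unseen_noun = theta[('NOUN', 'wordthatdoesntexist')]  # default for NOUN
unseen_verb = theta[('VERB', 'wordthatdoesntexist')]  # default for other labels
size_after_lookups = len(theta)                       # defaults were not inserted
via_get = theta.get(('NOUN', 'wordthatdoesntexist'))  # get() bypasses __missing__
```

Note that __missing__ is only consulted by square-bracket lookups. Calls to get() and membership tests with "in" ignore it, so get() on an unseen key still returns None here.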

Answer 2

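You can use the setdefault() method of dict:

d.setdefault(u, int(u[0] == "NOUN"))

If u is found in d, setdefault returns d[u]. Otherwise, u is inserted into the dict with the value given as the second argument, and that value is returned.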
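As a rough sketch of how this could be applied to the question's data, the call can be wrapped in a small helper. The helper name lookup is an assumption made for this example, and float() is used instead of int() only so the default matches the float probabilities already stored.

```python
theta = {('NOUN', 'stand'): 0.4, ('VERB', 'stand'): 0.6}


def lookup(d, key):
    """Return d[key]; if the key is absent, insert the label-based default first."""
    # key is a (label, word) tuple; float(True) is 1.0 and float(False) is 0.0
    return d.setdefault(key, float(key[0] == 'NOUN'))


unseen_noun = lookup(theta, ('NOUN', 'wordthatdoesntexist'))
unseen_verb = lookup(theta, ('VERB', 'wordthatdoesntexist'))
known = lookup(theta, ('NOUN', 'stand'))
size_after_lookups = len(theta)  # the two unseen keys are now stored
```

Unlike a __missing__ method that only returns the default, setdefault always stores the key, so the dict grows with every unseen word that is looked up. Every lookup also has to go through the helper rather than plain square-bracket indexing.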

defaultdict with tuples as keys: how to set the default value when a key is not found
