Question

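def query_RR(postings, qtext):
    # tokenize() is my own helper (not shown); it splits the query into words.
    words = tokenize(qtext)
    # One {doc_id: weight} dictionary per query word.
    allpostings = [postings[w] for w in words]
    for a in allpostings:
        print(list(a.keys()))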

Here is the result of the query:

[0, 2, 3, 4, 6] [1, 4, 5] [0, 2, 4] [4, 5]

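The query takes the terms entered by the user (qtext), tokenizes them, and generates a postings list for each token.

The postings list is a list of nested dictionaries, for example: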

[{0 : 0.68426, 1: 0.26423}, {2: 0.6842332, 0: 0.9823}]

I am trying to find the intersection of these nested dictionaries using their keys.

Could anyone provide a solution? Many thanks!

Answer 1

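Since you clarified in the comments what you want, here is a revised version of the code that gathers the common values into a dictionary of lists. As others have suggested, the trick is to use the built-in set type to determine the intersection of the keys efficiently.

Since my original revision, I have been able to optimize the code further using ideas from the accepted answer to a very similar question: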

How to find common keys in a list of dicts and sort them by value?

Latest version:

def intersect_dicts(dict_list):
    """ Create a dictionary of the keys common to all the dicts in the list,
        each associated with a list of its values from every dict.
    """
    # Gather the values associated with the keys common to all the dicts in the list.
    common_keys = set.intersection(*map(set, dict_list))
    return {key: [d[key] for d in dict_list] for key in common_keys}


if __name__ == '__main__':
    postings_list = [{0 : 0.68426, 1: 0.26423}, {2: 0.6842332, 0: 0.9823}]
    intersection = intersect_dicts(postings_list)
    print(intersection)  # -> {0: [0.68426, 0.9823]}
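
To tie this back to the query function in the question, here is a rough sketch of how the same idea could be applied to the postings of several query words. The name query_intersection, the sample index, and the choice to return an empty result when a word is missing from the index are all my own assumptions, not part of the original code; the document ids are made up to mirror the key lists printed in the question.

```python
def query_intersection(postings, words):
    """Return {doc_id: [weight for each word]} for documents containing every word."""
    # Assumption: a query with no words, or with a word missing from the
    # index, matches no documents at all.
    if not words or any(w not in postings for w in words):
        return {}
    dict_list = [postings[w] for w in words]
    # Iterating a dict yields its keys, so map(set, ...) builds one key set per word.
    common_keys = set.intersection(*map(set, dict_list))
    # sorted() only makes the output order predictable.
    return {key: [d[key] for d in dict_list] for key in sorted(common_keys)}


# Made-up index; the document ids mirror the key lists shown in the question.
postings = {
    "a": {0: 0.1, 2: 0.2, 3: 0.3, 4: 0.4, 6: 0.6},
    "b": {1: 0.1, 4: 0.5, 5: 0.2},
    "c": {0: 0.3, 2: 0.1, 4: 0.6},
    "d": {4: 0.7, 5: 0.9},
}
result = query_intersection(postings, ["a", "b", "c", "d"])
print(result)  # -> {4: [0.4, 0.5, 0.6, 0.7]}
```

Only document 4 appears in all four postings, so it is the only key that survives. The guard at the top matters because set.intersection(*...) raises a TypeError when it is given no sets at all.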
