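I am writing a function to handle multiple queries in a boolean AND search.
I have a dictionary, query_dict, that maps each query term to the documents in which it occurs.
I want the intersection of all the values in query_dict.values():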
query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
'bar': ['doc_one.txt', 'doc_two.txt'],
'foobar': ['doc_two.txt']}
intersect(query_dict)
>> doc_two.txt
I have been reading about intersection, but I am finding it hard to apply to a dictionary.
Thanks for your help!
Answer 0 (score: 9)
In [36]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
'bar': ['doc_one.txt', 'doc_two.txt'],
'foobar': ['doc_two.txt']}
In [37]: reduce(set.intersection, (set(val) for val in query_dict.values()))
Out[37]: set(['doc_two.txt'])
set.intersection(*(set(val) for val in query_dict.values()))
is also a valid solution, though it was slightly slower in the timings below:
In [41]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
'bar': ['doc_one.txt', 'doc_two.txt'],
'foobar': ['doc_two.txt']}
In [42]: %timeit reduce(set.intersection, (set(val) for val in query_dict.values()))
100000 loops, best of 3: 2.78 us per loop
In [43]: %timeit set.intersection(*(set(val) for val in query_dict.values()))
100000 loops, best of 3: 3.28 us per loop
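The session above is Python 2. On Python 3, reduce is no longer a builtin and has to be imported from functools. A minimal sketch of the same idea wrapped in a function, assuming the name intersect from the question and assuming that an empty dictionary should give an empty set (neither choice comes from the answer itself):

```python
from functools import reduce  # reduce is not a builtin on Python 3


def intersect(query_dict):
    """Return the set of documents that appear under every query term."""
    sets = [set(docs) for docs in query_dict.values()]
    if not sets:
        # Assumption: no query terms means no matching documents.
        return set()
    return reduce(set.intersection, sets)


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}
result = intersect(query_dict)
print(result)  # {'doc_two.txt'}
```

The explicit empty check matters because both reduce without an initial value and set.intersection(*[]) raise a TypeError when there are no sets to combine.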
Answer 1 (score: 0)
Another way:
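values = list(query_dict.values())  # list() so that indexing also works on Python 3
first, rest = values[0], values[1:]
# keep each document from the first list that also occurs in every other list
print([t for t in set(first) if all(t in q for q in rest)])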
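In this approach, each t in q test scans a list, which can get slow when the document lists are long. A possible variant converts every list to a set first so that each membership test is a hash lookup. The helper name intersect_all is hypothetical, and returning a sorted list (for a stable order) is my own choice, not part of the answer:

```python
def intersect_all(query_dict):
    """Return a sorted list of documents that occur under every query term."""
    values = [set(docs) for docs in query_dict.values()]
    if not values:
        # Assumption: an empty dictionary yields no documents.
        return []
    first, rest = values[0], values[1:]
    # Membership tests against sets are hash lookups rather than list scans.
    return sorted(t for t in first if all(t in q for q in rest))


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}
print(intersect_all(query_dict))  # ['doc_two.txt']
```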