Using Python combinations on multi-level data

Time: 2014-10-22 14:38:51

Tags: python dictionary combinations

I have a dictionary imageHashes whose keys are paths and whose values are integers, e.g.

imageHashes['/directorya/jim.txt'] = 7
imageHashes['/directorya/nigel.txt'] = 68
imageHashes['/directoryb/ralph.txt'] = 17
imageHashes['/directoryb/baba.txt'] = 43

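Using combinations, I can loop over every pair of keys with:

from itertools import combinations
for keypair in list(combinations(imageHashes, 2)):
    pass  # do something with keypair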

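The problem is that I only want to do something between pairs in different directories, so between

  • jim : ralph
  • nigel : ralph
  • jim : baba
  • nigel : baba

but not between

  • jim : nigel
  • ralph : baba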

I'm a bit of a newbie, so can someone tell me the best way to get started on this?

2 answers:

Answer 0: (score: 3)

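No need to overthink it: just iterate over the combinations and use continue to skip the ones you don't want to process. The only other thing you need to know is how to check whether two files are in the same directory, in other words, whether their directory names match. That is what os.path.dirname gives us.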

from itertools import combinations
from os.path import dirname

imageHashes = {}
imageHashes['/directorya/jim.txt'] = 7
imageHashes['/directorya/nigel.txt'] = 68
imageHashes['/directoryb/ralph.txt'] = 17
imageHashes['/directoryb/baba.txt'] = 43


for path_a, path_b in combinations(imageHashes, 2):
    if dirname(path_a) == dirname(path_b):
        continue

    print("They're different!: {} vs. {}".format(path_a, path_b))

This gives (the exact ordering depends on the dictionary's iteration order):

They're different!: /directoryb/baba.txt vs. /directorya/jim.txt
They're different!: /directoryb/baba.txt vs. /directorya/nigel.txt
They're different!: /directorya/jim.txt vs. /directoryb/ralph.txt
They're different!: /directorya/nigel.txt vs. /directoryb/ralph.txt

Note that there is no need to convert the iterator returned by combinations into a list: that just wastes time and memory.
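To illustrate the point about iterators, here is a minimal sketch that wraps the same filter in a generator, so pairs are produced one at a time and never stored in a list. The function name cross_directory_pairs and the "difference of hashes" step are assumptions made for illustration; they stand in for whatever "do something" means in the question.

```python
from itertools import combinations
from os.path import dirname


def cross_directory_pairs(paths):
    """Lazily yield pairs of paths whose parent directories differ."""
    for path_a, path_b in combinations(paths, 2):
        if dirname(path_a) != dirname(path_b):
            yield path_a, path_b


imageHashes = {
    '/directorya/jim.txt': 7,
    '/directorya/nigel.txt': 68,
    '/directoryb/ralph.txt': 17,
    '/directoryb/baba.txt': 43,
}

# Iterating a dict yields its keys, so the paths can be passed in directly.
for path_a, path_b in cross_directory_pairs(imageHashes):
    # Placeholder for the real work: compare the two integer values.
    difference = abs(imageHashes[path_a] - imageHashes[path_b])
    print(path_a, path_b, difference)
```

Because the generator only advances when the loop asks for the next pair, this scales to large dictionaries without building the full list of combinations first.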

Answer 1: (score: 0)

If you want a general approach, i.e. one that works with any number of directories:

from itertools import product

# get the set of different directories
dirs = {path.rsplit('/', 1)[0] for path in imageHashes}
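# get the list of paths in every directory; compare the directory part exactly,
# because startswith would also match e.g. '/directorya2' for '/directorya'
dirs_iter = [[path for path in imageHashes if path.rsplit('/', 1)[0] == d] for d in dirs]
# and now every possible combination between them
for comb in product(*dirs_iter):
    print(comb)  # do something with comb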

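Now comb will have as many elements as there are distinct directories among the available paths; that is, each comb contains exactly one path per directory.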
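With more than two directories, product over all directories yields tuples with one path per directory rather than pairs. If pairs are still what is wanted, one possible sketch is to group the paths by directory and take the product of every pair of groups. The function name pairs_across_directories and the extra path '/directoryc/zed.txt' are hypothetical, added only to show a third directory.

```python
from collections import defaultdict
from itertools import combinations, product
from os.path import dirname


def pairs_across_directories(paths):
    """Yield every pair of paths that live in two different directories."""
    groups = defaultdict(list)
    for path in paths:
        groups[dirname(path)].append(path)
    # Pick two distinct directories, then pair every file of one with
    # every file of the other.
    for group_a, group_b in combinations(groups.values(), 2):
        yield from product(group_a, group_b)


example_paths = [
    '/directorya/jim.txt',
    '/directorya/nigel.txt',
    '/directoryb/ralph.txt',
    '/directoryb/baba.txt',
    '/directoryc/zed.txt',  # hypothetical third directory
]

for path_a, path_b in pairs_across_directories(example_paths):
    print(path_a, path_b)
```

Compared with filtering all combinations, this never generates same-directory pairs at all, which may matter when directories hold many files.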