Question

I have two sentences stored as word lists, A = [dog bit dog null] and B = [hund bet hund]. I want to find all possible alignments from list B to list A, for example:

  C =  [(hund = dog, bet = bit, hund = dog),
        (hund = dog, bet = bit, hund = bit),
        (hund = dog, bet = bit, hund = null),
        (hund = dog, bet = dog, hund = dog),
        (hund = dog, bet = dog, hund = bit),
        etc.. ]

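I think there are 64 different alignments between the two sentences (each of the 3 words in B can map to any of the 4 entries in A, so 4^3 = 64). I am using IBM Model 1 for word translation.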

Answer 1

[(i, j) for i in A for j in B]

You can't have that structure in a list; you would need a dictionary. Here I use a tuple to associate the values.
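For reference, here is a small sketch of what that comprehension yields when the lists are named A and B as in the question. The `candidates` dictionary is only a hypothetical illustration of the dictionary idea, not the answerer's code. Note that this produces individual word pairs, not complete alignments of B to A.

```python
A = "dog bit dog null".split()
B = "hund bet hund".split()

# Every (A word, B word) pair: 4 * 3 = 12 tuples.
pairs = [(i, j) for i in A for j in B]

# Hypothetical dictionary form: keys are the distinct words of B, so the
# two occurrences of "hund" collapse into a single key.
candidates = {j: list(A) for j in B}

print(len(pairs))          # 12
print(sorted(candidates))  # ['bet', 'hund']
```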

Answer 2

If you want the 64 possibilities, you can use itertools.product:

>>> from itertools import product
>>> A = "dog bit dog null".split()
>>> B = "hund bet hund".split()
>>> product(A, repeat=3)
<itertools.product object at 0x1148fd500>
>>> len(list(product(A, repeat=3)))
64
>>> list(product(A, repeat=3))[:5]
[('dog', 'dog', 'dog'), ('dog', 'dog', 'bit'), ('dog', 'dog', 'dog'), ('dog', 'dog', 'null'), ('dog', 'bit', 'dog')]

But note that this will generate quite a few duplicates, since dog appears in A twice:

>>> len(set(product(A, repeat=3)))
27

If you like, you can even get the associated triplets of pairs (on Python 3, wrap zip in list so the pairs are displayed):

>>> trips = [list(zip(B, p)) for p in product(A, repeat=len(B))]
>>> trips[:5]
[[('hund', 'dog'), ('bet', 'dog'), ('hund', 'dog')], [('hund', 'dog'), ('bet', 'dog'), ('hund', 'bit')], [('hund', 'dog'), ('bet', 'dog'), ('hund', 'dog')], [('hund', 'dog'), ('bet', 'dog'), ('hund', 'null')], [('hund', 'dog'), ('bet', 'bit'), ('hund', 'dog')]]
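Because dog occurs twice in A, a product over the words themselves cannot tell the two tokens apart. If the positions matter (an assumption here, though word-alignment models such as IBM Model 1 are usually described over positions), a minimal sketch is to take the product over indices into A instead. The helper name `show` is made up for illustration.

```python
from itertools import product

A = "dog bit dog null".split()
B = "hund bet hund".split()

# Each alignment picks one position in A for every position in B.
alignments = list(product(range(len(A)), repeat=len(B)))

def show(alignment):
    """Render an alignment as (B word, A word, A position) triples."""
    return [(b, A[i], i) for b, i in zip(B, alignment)]

print(len(alignments))       # 64
print(len(set(alignments)))  # 64, no duplicates since indices are unique
print(show(alignments[2]))
# [('hund', 'dog', 0), ('bet', 'dog', 0), ('hund', 'dog', 2)]
```

With indices, all 64 alignments stay distinct, and the words can still be recovered through A[i] whenever they are needed.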

Possible alignments of two lists using Python
