Question

I am trying to align the words in two lists:

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

Here is my code:

def identicalWordsIndex(self, sentence1, sentence2):
    identical_index = []
    for i in xrange(len(sentence1)):
        for j in xrange(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                idenNew1 = [i,j]
                identical_index.append(idenNew1)
            if sentence2[j] == sentence1[i]:
                idenNew2 = [j,i]
                identical_index.append(idenNew2)
    return identical_index

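What I want is to get the index pairs of the aligned words in sentence1 and sentence2.

The 1st result should hold the index pairs of aligned words going from sentence1 to sentence2. The 2nd should hold the index pairs going from sentence2 to sentence1.

But the code above produces this: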

1st : [[0, 1], [1, 0], [0, 3], [3, 0], [2, 2], [2, 2], [3, 4], [4, 3], [3, 5], [5, 3]]
2nd : [[0, 1], [1, 0], [0, 3], [3, 0], [2, 2], [2, 2], [3, 4], [4, 3], [3, 5], [5, 3]]

The result I expect is this:

1st : [[0,1],[2,2],[3,4]]
2nd : [[1,0],[2,2],[3,0],[4,3],[5,3]]

Can anyone help me solve this? Thanks.

Answer 1

You just need to add a break so the inner loop stops at the first match. Try this:

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

def identicalWordsIndex(sentence1, sentence2):
    identical_index = []
    # range works on both Python 2 and 3 (xrange exists only in Python 2)
    for i in range(len(sentence1)):
        for j in range(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                identical_index.append([i, j])
                # stop at the first match for this word
                break
    return identical_index

print(identicalWordsIndex(sentence1, sentence2))
print(identicalWordsIndex(sentence2, sentence1))

This prints:

[[0, 1], [2, 2], [3, 4]]

[[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
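
To make the effect of the break easier to see, here is a minimal sketch that runs the same nested loop with and without it. The function name `match_pairs` and the `first_only` flag are hypothetical, introduced only for this illustration; they are not part of the original answer.

```python
def match_pairs(src, dst, first_only=True):
    """Return [i, j] pairs where src[i] == dst[j].

    With first_only=True the inner loop stops at the first match for each
    word in src (the effect of the break); otherwise every match is kept.
    """
    pairs = []
    for i, word in enumerate(src):
        for j, other in enumerate(dst):
            if word == other:
                pairs.append([i, j])
                if first_only:
                    break
    return pairs


sentence1 = ['boy', 'motorcycle', 'people', 'play']
sentence2 = ['run', 'boy', 'people', 'boy', 'play', 'play']

# Without the break, repeated words in sentence2 match more than once.
print(match_pairs(sentence1, sentence2, first_only=False))
# [[0, 1], [0, 3], [2, 2], [3, 4], [3, 5]]

# With the break, only the first match per word is kept.
print(match_pairs(sentence1, sentence2))
# [[0, 1], [2, 2], [3, 4]]
```

The extra reversed pairs in the question's output come from its second `if`, which tests the same equality as the first and appends `[j, i]` on every match.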

Answer 2

You can try this solution, which uses a for loop together with list.index:

a = ['boy','motorcycle','people','play']
b = ['run','boy','people','boy','play','play']

def align_ab(a, b):
    indexed = []
    for k,v in enumerate(a):
        try:
            i = b.index(v)
            indexed.append([k,i])
        except ValueError:
            pass

    return indexed
# Align the words of a against b
print(align_ab(a,b))
# Align the words of b against a
print(align_ab(b,a))

Output:

>>> [[0, 1], [2, 2], [3, 4]]
>>> [[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
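
Since `list.index` returns the position of the first occurrence, it rescans the second list for every word. For longer lists, one possible variation is to record each word's first index in a dictionary once and look words up from there. This is only a sketch under the assumption that the words are hashable (plain strings are); the name `align_first_match` is hypothetical and not from the answer above.

```python
def align_first_match(src, dst):
    # Map each word in dst to the index of its first occurrence.
    first_index = {}
    for j, word in enumerate(dst):
        first_index.setdefault(word, j)  # later duplicates are ignored
    # Keep [i, j] only for words of src that also appear in dst.
    return [[i, first_index[word]]
            for i, word in enumerate(src)
            if word in first_index]


a = ['boy', 'motorcycle', 'people', 'play']
b = ['run', 'boy', 'people', 'boy', 'play', 'play']

print(align_first_match(a, b))  # [[0, 1], [2, 2], [3, 4]]
print(align_first_match(b, a))  # [[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
```

For lists as short as the ones in the question the difference is negligible, so the simpler loop is just as reasonable.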

Answer 3

See if this works for you. In the last two lines, the arguments are swapped to get the alignment in the other direction.

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

def identicalWordsIndex(sentence1, sentence2):
    identical_index = []
    for i in range(len(sentence1)):
        for j in range(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                identical_index.append([i, j])
                break
    return identical_index

print(identicalWordsIndex(sentence1, sentence2))
print(identicalWordsIndex(sentence2, sentence1))

Output:

>>>[[0, 1], [2, 2], [3, 4]]
>>>[[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]

Aligning identical words using for loops
