Question

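I have a problem that I find hard to explain, so I will use several examples to help you understand it and see whether you can help me.

Say I have two lists containing book titles as rated by two people. User 1's ratings are lstA and user 2's ratings are lstB: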

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

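User 1 thinks "Harry Potter" is better than "Dracula" (Harry Potter is at index 0, Dracula at index 3).

User 2 thinks "Harry Potter" is worse than "Dracula" (Harry Potter is at index 3, Dracula at index 1).

In this case, return the tuple ('Harry Potter', 'Dracula') [('Dracula', 'Harry Potter') is fine too].

User 1 also rated "50 Shades" better than "Dracula", and user 2 also rated "50 Shades" better than "Dracula" (indices 2, 3 and 0, 1 respectively). In this case, nothing happens.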
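
Both cases above come down to comparing positions in the two lists. As a minimal sketch of that single-pair test (the helper name disagree is hypothetical and not part of the original post; it assumes both books appear in both lists):

```python
lstA = ['Harry Potter', '1984', '50 Shades', 'Dracula']
lstB = ['50 Shades', 'Dracula', '1984', 'Harry Potter']

def disagree(a, b):
    """Return True if the two users rank books a and b in opposite order."""
    # Hypothetical helper: a lower index means a better rating.
    a_first_for_user1 = lstA.index(a) < lstA.index(b)
    a_first_for_user2 = lstB.index(a) < lstB.index(b)
    return a_first_for_user1 != a_first_for_user2

print(disagree('Harry Potter', 'Dracula'))  # True
print(disagree('50 Shades', 'Dracula'))     # False
```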

The final result of the program should be a list of tuples:

[('Harry Potter','50 Shades'), ('Harry Potter','Dracula'), ('Harry Potter','1984'), ('1984', '50 Shades'), ('1984','Dracula')]

Can someone point me in the right direction for an algorithm that produces all of these tuples?

Answer 1

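Start by stating your logic mathematically. For every combination of length 2, given the indices idx_a1, idx_a2 and idx_b1, idx_b2, record the combination if sign(idx_a1 - idx_a2) != sign(idx_b1 - idx_b2).

The code below is not efficient, but it shows one way to turn this logic into code: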

from itertools import combinations

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

def sign(x):
    """Return +1 if integer is positive, -1 if negative, 0 if zero"""
    return (x > 0) - (x < 0)

res = []
for a, b in combinations(lstA, 2):
    idx_a1, idx_a2 = lstA.index(a), lstA.index(b)
    idx_b1, idx_b2 = lstB.index(a), lstB.index(b)
    if sign(idx_a1 - idx_a2) != sign(idx_b1 - idx_b2):
        res.append((a, b))

[('Harry Potter', '1984'),
 ('Harry Potter', '50 Shades'),
 ('Harry Potter', 'Dracula'),
 ('1984', '50 Shades'),
 ('1984', 'Dracula')]

Answer 2

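One way to do this is to accumulate all the positive orderings from each list into a set, then take the difference of the two sets. The positive ordering is (a, b) when a precedes b in its list. This is the order guaranteed by itertools.combinations: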

from itertools import combinations

setA = set(combinations(lstA, 2))
setB = set(combinations(lstB, 2))

result = setA - setB

This simply discards any orderings that the two sets agree on. If both lists contain the same books, this is nearly identical to

result = setB - setA

The only difference is that all the tuples will be reversed.
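
As a quick check of that claim on the lists from the question, the following sketch reverses every tuple of one difference and compares it with the other (the names a_minus_b, b_minus_a and reversed_b_minus_a are illustrative only):

```python
from itertools import combinations

lstA = ['Harry Potter', '1984', '50 Shades', 'Dracula']
lstB = ['50 Shades', 'Dracula', '1984', 'Harry Potter']

setA = set(combinations(lstA, 2))
setB = set(combinations(lstB, 2))

a_minus_b = setA - setB
b_minus_a = setB - setA

# Reversing every tuple of one difference should give the other one
reversed_b_minus_a = {pair[::-1] for pair in b_minus_a}
print(reversed_b_minus_a == a_minus_b)  # True
print(len(a_minus_b))                   # 5
```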

If the lists contain different books, you need to add a couple of extra steps to clean up the duplicates and combine the two results:

resultA = setA - setB
# Drop orderings both lists agree on, then drop reversed duplicates of resultA
resultB = (setB - setA).difference(x[::-1] for x in resultA)
result = resultA | resultB

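The first step computes all the orderings in lstA that lstB does not agree with. The next step finds the orderings in lstB that are not reversed versions of ones we already have in resultA, since any disagreement about books present in both lists is guaranteed to show up reversed in the other set. I use the method set.difference here in preference to the - operator because that way there is no need to create a set object from the generator expression. Unfortunately, you can't just use symmetric_difference/^, because the elements are reversed. The third step just computes the union of the results.

IDEOne link: https://ideone.com/DuHTed. It demonstrates both the original case from the question and asymmetric lists.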
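
To make the three steps concrete, here is a hedged sketch on a pair of made-up lists where 'Dune' appears only in lstB. It is not taken from the linked demo. It assumes that orderings both users agree on should never be reported, so step 2 subtracts setA before removing the reversed duplicates. Pairs involving a book that only one user rated are still reported, because the other list has no ordering for them.

```python
from itertools import combinations

# Hypothetical lists: 'Dune' is rated only by user 2
lstA = ['Harry Potter', '1984', 'Dracula']
lstB = ['1984', 'Harry Potter', 'Dracula', 'Dune']

setA = set(combinations(lstA, 2))
setB = set(combinations(lstB, 2))

# Step 1: orderings in lstA that lstB does not share
resultA = setA - setB
# Step 2 (assumed variant): orderings in lstB that lstA does not share,
# minus the reversed copies of pairs already present in resultA
resultB = (setB - setA).difference(x[::-1] for x in resultA)
# Step 3: union of both results
result = resultA | resultB

print(sorted(result))
# [('1984', 'Dune'), ('Dracula', 'Dune'), ('Harry Potter', '1984'), ('Harry Potter', 'Dune')]
```

Here ('Harry Potter', '1984') is the only true disagreement; the agreed orderings ('1984', 'Dracula') and ('Harry Potter', 'Dracula') are left out.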

Answer 3

A more efficient version of @jpp's solution is as follows:

from itertools import combinations

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

bIndices = {b: i for i, b in enumerate(lstB)}
aPairs = [sorted(c) for c in combinations(enumerate(lstA), 2)]

mismatches = [(book1[1], book2[1]) for book1, book2 in aPairs if bIndices[book1[1]] > bIndices[book2[1]]]
print(mismatches)
# [('Harry Potter', '1984'), ('Harry Potter', '50 Shades'), ('Harry Potter', 'Dracula'), ('1984', '50 Shades'), ('1984', 'Dracula')]

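Note that aPairs holds the combinations of (index, book) tuples, and each combination is sorted by index. This guarantees that in each pair of books, the first is "better" than the second (for user A).

To find the ordering mismatches, we then only need to check whether the corresponding book indices in lstB preserve this ordering.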

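Edit

@MadPhysicist pointed out that combinations preserves the original order of the input in each generated tuple, so there is no need to build aPairs as a list of sorted (index, book) tuples. We can generate mismatches directly using only bIndices: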

from itertools import combinations

lstA = ['Harry Potter','1984','50 Shades','Dracula']
lstB = ['50 Shades','Dracula','1984','Harry Potter']

bIndices = {b: i for i, b in enumerate(lstB)}
mismatches = [(book1, book2) for book1, book2 in combinations(lstA, 2) if bIndices[book1] > bIndices[book2]]
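
The simplification relies on combinations emitting each pair in input order. A small sketch that checks this property on lstA (the variable name pairs is illustrative only):

```python
from itertools import combinations

lstA = ['Harry Potter', '1984', '50 Shades', 'Dracula']

pairs = list(combinations(lstA, 2))
print(pairs[:3])
# [('Harry Potter', '1984'), ('Harry Potter', '50 Shades'), ('Harry Potter', 'Dracula')]

# In every pair, the first book comes before the second one in lstA
print(all(lstA.index(first) < lstA.index(second) for first, second in pairs))  # True
```

Because user A's preferred book is always first in each pair, the single comparison against bIndices is enough to detect a disagreement.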

Answer 4

You can use iter and then compare the indices.
