Question

I have two lists, let's say:

keys1 = ['A', 'B', 'C', 'D', 'E',           'H', 'I']
keys2 = ['A', 'B',           'E', 'F', 'G', 'H',      'J', 'K']

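How do I create a merged list without duplicates that preserves the order of both lists, inserting the missing elements where they belong? Like so: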

merged = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

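Note that the elements can be compared for equality but cannot be ordered (they are complex strings). They cannot be sorted by comparing them; their order is defined only by where they occur in the original lists.

In case of a contradiction (different order in the two input lists), any output containing all elements is valid. Of course, bonus points if the solution shows "common sense" in preserving most of the order.

Again (as some comments still argue about it), the lists normally do not contradict each other in the order of their common elements. If they do, the algorithm needs to handle that error gracefully.

I started with a version that iterates over the lists with .next(), advancing only the list that contains the unmatched element, but .next() just doesn't know when to stop.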

merged = []
L = iter(keys1)
H = iter(keys2)
l = L.next()
h = H.next()

for i in range(max(len(keys1), len(keys2))):
  if l == h:
    if l not in merged:
      merged.append(l)
    l = L.next()
    h = H.next()

  elif l not in keys2:
    if l not in merged:
      merged.append(l)
    l = L.next()

  elif h not in keys1:
    if h not in merged:
      merged.append(h)
    h = H.next()

  else: # just in case the input is badly ordered
    if l not in merged:
      merged.append(l)
    l = L.next()
    if h not in merged:
      merged.append(h)
    h = H.next()   

print merged

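This obviously doesn't work, because .next() raises an exception (StopIteration) as soon as the shorter list is exhausted. I could update my code to catch that exception every time I call .next(), but the code is already quite un-Pythonic, and that would clearly burst the bubble.

Does anyone have a better idea of how to iterate over these lists to combine the elements?

Bonus points if I can do it for three lists in one go.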

Answer 1

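What you need is basically what any merge utility does: it tries to merge two sequences while keeping the relative order of each sequence. You can use Python's difflib module to diff the two sequences and then merge them: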

from difflib import SequenceMatcher

def merge_sequences(seq1,seq2):
    sm=SequenceMatcher(a=seq1,b=seq2)
    res = []
    for (op, start1, end1, start2, end2) in sm.get_opcodes():
        if op == 'equal' or op=='delete':
            #This range appears in both sequences, or only in the first one.
            res += seq1[start1:end1]
        elif op == 'insert':
            #This range appears in only the second sequence.
            res += seq2[start2:end2]
        elif op == 'replace':
            #There are different ranges in each sequence - add both.
            res += seq1[start1:end1]
            res += seq2[start2:end2]
    return res

Example:

>>> keys1 = ['A', 'B', 'C', 'D', 'E',           'H', 'I']
>>> keys2 = ['A', 'B',           'E', 'F', 'G', 'H',      'J', 'K']
>>> merge_sequences(keys1, keys2)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

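Note that the answer you expect is not necessarily the only possible one. For example, if we swap the order of the two sequences here, we get another answer that is just as valid: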

>>> merge_sequences(keys2, keys1)
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'I']
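One caveat worth spelling out: when the two lists do contradict each other, the merge can emit the same element twice, because an element that moved shows up once as a deletion and once as an insertion. A minimal sketch of one way to guard against that is below. It assumes a hypothetical wrapper named merge_unique (not part of the original answer) that simply drops repeats after merging:

```python
from difflib import SequenceMatcher

def merge_sequences(seq1, seq2):
    """Merge two lists while keeping each list's relative order (same logic as above)."""
    res = []
    for op, start1, end1, start2, end2 in SequenceMatcher(a=seq1, b=seq2).get_opcodes():
        if op in ('equal', 'delete'):
            res += seq1[start1:end1]
        elif op == 'insert':
            res += seq2[start2:end2]
        elif op == 'replace':
            res += seq1[start1:end1]
            res += seq2[start2:end2]
    return res

def merge_unique(seq1, seq2):
    """Hypothetical wrapper: merge, then keep only the first occurrence of each element."""
    merged = []
    for item in merge_sequences(seq1, seq2):
        if item not in merged:  # only an equality test, no ordering of elements needed
            merged.append(item)
    return merged
```

For inputs that agree on the order of their common elements this returns the same result as merge_sequences; for contradictory inputs it still returns every element exactly once.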

Answer 2

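I would use a Set (cf. the Python docs) and fill it with the elements of the two lists, one after the other.

When that is done, make a list from the Set.

Note that there is a contradiction/paradox in your question: you want to preserve the order of elements that cannot be compared (only tested for equality, because "they are complex strings", as you said).

EDIT: the OP is right in noticing that sets don't preserve insertion order.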
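Since a plain set loses the order, here is a small sketch of an order-preserving variant of the same idea. It relies on dict keeping insertion order (guaranteed from Python 3.7 on) and assumes the elements are hashable; the helper name concat_unique is made up for illustration. Note that it only removes duplicates from the concatenation. It does not interleave the missing elements where they belong, so it does not fully answer the question:

```python
def concat_unique(*lists):
    """Hypothetical helper: concatenate lists, keeping the first occurrence of each item."""
    # dict.fromkeys keeps the first-seen order of its keys (Python 3.7+)
    return list(dict.fromkeys(item for lst in lists for item in lst))

keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
keys2 = ['A', 'B', 'E', 'F', 'G', 'H', 'J', 'K']
print(concat_unique(keys1, keys2))
# ['A', 'B', 'C', 'D', 'E', 'H', 'I', 'F', 'G', 'J', 'K']  (F and G land after I)
```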

Answer 3

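I suspect that you may be asking for a solution to the shortest common supersequence problem, which I believe is NP-hard in the general case of an arbitrary number of input sequences. I'm not aware of any libraries for solving this problem, so you might have to implement one by hand. Probably the quickest way to get working code would be to take interjay's answer using difflib and then use reduce (functools.reduce in Python 3) to run it on an arbitrary number of lists (make sure to specify the empty list as the 3rd argument to reduce).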
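The reduce suggestion can be sketched roughly as follows. The merge_sequences function is copied from interjay's answer so the snippet runs on its own; merge_all is a name I made up for the wrapper. This is a pairwise left-to-right merge, not a true shortest-common-supersequence solver, and with contradictory inputs it may still contain repeated elements:

```python
from difflib import SequenceMatcher
from functools import reduce

def merge_sequences(seq1, seq2):
    """Two-list merge from interjay's answer."""
    res = []
    for op, start1, end1, start2, end2 in SequenceMatcher(a=seq1, b=seq2).get_opcodes():
        if op in ('equal', 'delete'):
            res += seq1[start1:end1]
        elif op == 'insert':
            res += seq2[start2:end2]
        elif op == 'replace':
            res += seq1[start1:end1]
            res += seq2[start2:end2]
    return res

def merge_all(lists):
    """Hypothetical wrapper: fold merge_sequences over any number of lists."""
    # The empty list as initial value makes an empty input return []
    # instead of raising TypeError.
    return reduce(merge_sequences, lists, [])

keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
keys2 = ['A', 'B', 'E', 'F', 'G', 'H', 'J', 'K']
keys3 = ['B', 'G', 'K', 'L']
print(merge_all([keys1, keys2, keys3]))
```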

Answer 4

Using only lists, you can achieve this with a few simple for loops and .copy():

def mergeLists(list1, list2):
    # Exit if list2 is empty
    if not len(list2):
        return list1
    # Copy the content of list2 into merged list
    merged = list2.copy()

    # Create a list for storing temporary elements
    elements = []
    # Create a variable for storing previous element found in both lists
    previous = None

    # Loop through the elements of list1
    for e in list1:
        # Append the element to "elements" list if it's not in list2
        if e not in merged:
            elements.append(e)

        # If it is in list2 (is a common element)
        else:

            # Insert all the stored elements, in their original order, right after
            # the previous common element (or at the start if there is none yet)
            pos = merged.index(previous) + 1 if previous is not None else 0
            merged[pos:pos] = elements
            # Save new common element to previous
            previous = e
            # Empty temporary elements
            del elements[:]

    # If no more common elements were found but there are elements still stored
    if len(elements):
        # Insert them, in their original order, after the previous match
        pos = merged.index(previous) + 1 if previous is not None else 0
        merged[pos:pos] = elements
    # Return the merged list
    return merged

In [1]: keys1 = ["A", "B",      "D",      "F", "G", "H"]
In [2]: keys2 = ["A",      "C", "D", "E", "F",      "H"]
In [3]: mergeLists(keys1, keys2)
Out[3]: ["A", "B", "C", "D", "E", "F", "G", "H"]

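English is not my first language, and this one is hard to explain, but if you care about the explanation, here is what it does:

There is a local list called elements that stores temporary elements.
There is a local variable called previous that stores the previous element found in both lists.
When it finds an element that is in list1 but NOT in list2, it appends that element to the elements list and continues the loop.
Once it hits an element that is in both lists, it goes through the elements list and inserts all of its elements after the previous element in the merged copy of list2.
The new match is then stored in previous, elements is reset to [], and the loop continues.
If the first or last element is not common to both lists, the beginning and the end of the lists are counted as common elements.

This way the result always follows this format:

Previous common element
Elements from list1, between the two common elements
Elements from list2, between the two common elements
New common element

For example: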

l1 = ["A", "B", "C",      "E"]
l2 = ["A",           "D", "E"]

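1. The previous common element A will be first in the merged list.
2. The elements of l1 between the previous common element A and the new common element E will be inserted right after A.
3. The elements of l2 between the previous common element A and the new common element E will be inserted right after the elements from l1.
4. The new common element E will be the last element.
5. Back to step 1 if more common elements are found.

["A", "B", "C", "D", "E"]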

Answer 5

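I recently stumbled upon a similar problem while implementing a feature. I tried to clearly define the problem statement first. If I understand it correctly, here is the problem statement:

Problem Statement

Write a function merge_lists which will merge a list of lists with overlapping items, while preserving the order of the items.

Constraints

If item A comes before item B in all the lists where they occur together, then item A must also precede item B in the final list.
If item A and item B swap order in different lists, i.e. in some lists A precedes B and in some others B precedes A, then the order of A and B in the final list should be the same as their order in the first list where they occur together. That is, if A precedes B in l1 and B precedes A in a later list, then A should precede B in the final list.
If item A and item B do not occur together in any list, then their order must be decided by the position of the list in which each one first occurs. That is, if item A is in l1 and l3, and item B is in l2 and l6, then the order in the final list must be A and then B.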
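To make the constraints a bit more concrete, here is a rough sketch of one way to express them as a precedence graph plus a topological ordering. This is not the solution given further down in this answer, and the function name merge_by_precedence is made up. It assumes the items are hashable, lets earlier lists win when two lists disagree, and breaks ties by first appearance, so it only approximates the third constraint and its output can differ from the test case outputs listed in this answer:

```python
def merge_by_precedence(list_of_lists):
    """Sketch: order items so every non-conflicting 'A before B' pair is respected."""
    # Rank each item by its first appearance, scanning the lists in order.
    first_seen = {}
    for lst in list_of_lists:
        for item in lst:
            first_seen.setdefault(item, len(first_seen))

    # before[x] holds the items that must be placed before x.
    before = {item: set() for item in first_seen}

    def must_precede(a, b):
        """True if a is already (directly or transitively) required before b."""
        stack, seen = [b], set()
        while stack:
            for p in before[stack.pop()]:
                if p == a:
                    return True
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return False

    for lst in list_of_lists:
        for a, b in zip(lst, lst[1:]):
            # Skip a pair that contradicts what earlier lists established;
            # this keeps the graph free of cycles.
            if a != b and not must_precede(b, a):
                before[b].add(a)

    result, placed = [], set()
    remaining = sorted(first_seen, key=first_seen.get)
    while remaining:
        # Earliest-seen item whose predecessors have all been placed.
        nxt = next(x for x in remaining if before[x] <= placed)
        result.append(nxt)
        placed.add(nxt)
        remaining.remove(nxt)
    return result
```

Because a conflicting pair is never added to the graph, the selection loop always finds a candidate and terminates, even for contradictory input.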

Test Case 1:

Input:

l1 = ["Type & Size", "Orientation", "Material", "Locations", "Front Print Type", "Back Print Type"]

l2 = ["Type & Size", "Material", "Locations", "Front Print Type", "Front Print Size", "Back Print Type", "Back Print Size"]

l3 = ["Orientation", "Material", "Locations", "Color", "Front Print Type"]

merge_lists([l1, l2, l3])

Output:

['Type & Size', 'Orientation', 'Material', 'Locations', 'Color', 'Front Print Type', 'Front Print Size', 'Back Print Type', 'Back Print Size']

Test Case 2:

Input:

l1 = ["T", "V", "U", "B", "C", "I", "N"]

l2 = ["Y", "V", "U", "G", "B", "I"]

l3 = ["X", "T", "V", "M", "B", "C", "I"]

l4 = ["U", "P", "G"]

merge_lists([l1, l2, l3, l4])

Output:

['Y', 'X', 'T', 'V', 'U', 'M', 'P', 'G', 'B', 'C', 'I', 'N']

Test Case 3:

Input:

l1 = ["T", "V", "U", "B", "C", "I", "N"]

l2 = ["Y", "U", "V", "G", "B", "I"]

l3 = ["X", "T", "V", "M", "I", "C", "B"]

l4 = ["U", "P", "G"]

merge_lists([l1, l2, l3, l4])

Output:

['Y', 'X', 'T', 'V', 'U', 'M', 'P', 'G', 'B', 'C', 'I', 'N']

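Solution

I arrived at a reasonable solution that correctly handles all the data I had. (It may be wrong for some other data sets; I will leave that for others to comment on.) Here is the solution: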

def remove_duplicates(l):
    return list(set(l))

def flatten(list_of_lists):
    return [item for sublist in list_of_lists for item in sublist]

def difference(list1, list2):
    result = []
    for item in list1:
        if item not in list2:
            result.append(item)
    return result

def preceding_items_list(l, item):
    if item not in l:
        return []
    return l[:l.index(item)]

def merge_lists(list_of_lists):
    final_list = []
    item_predecessors = {}

    unique_items = remove_duplicates(flatten(list_of_lists))
    item_priorities = {}

    for item in unique_items:
        preceding_items = remove_duplicates(flatten([preceding_items_list(l, item) for l in list_of_lists]))
        # Iterate over a copy: removing from a list while looping over it skips items
        for p_item in list(preceding_items):
            if p_item in item_predecessors and item in item_predecessors[p_item]:
                preceding_items.remove(p_item)
        item_predecessors[item] = preceding_items
    print("Item predecessors ", item_predecessors)

    items_to_be_checked = difference(unique_items, item_priorities.keys())
    loop_ctr = -1
    while len(items_to_be_checked) > 0:
        loop_ctr += 1
        print("Starting loop {0}".format(loop_ctr))
        print("items to be checked ", items_to_be_checked)
        for item in items_to_be_checked:
            predecessors = item_predecessors[item]
            if len(predecessors) == 0:
                item_priorities[item] = 0
            else:
                if all(pred in item_priorities for pred in predecessors):
                    item_priorities[item] = max([item_priorities[p] for p in predecessors]) + 1
        print("item_priorities at end of loop ", item_priorities)
        items_to_be_checked = difference(unique_items, item_priorities.keys())
        print("items to be checked at end of loop ", items_to_be_checked)
        print()

    final_list = sorted(unique_items, key=lambda item: item_priorities[item])
    return final_list

I have also open-sourced the code as part of a library named toolspy, so you can just do:

pip install toolspy

from toolspy import merge_lists
lls=[['a', 'x', 'g'], ['x', 'v', 'g'], ['b', 'a', 'c', 'x']]
merge_lists(lls)

Answer 6

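Here is a C# solution I came up with, using an extension method. It covers the case where the two lists might not contain the same type of elements, so it takes a compare method and a selector method (which returns an object of the target type given a source object). In this version the first list ("me") is modified in place to contain the final result, but it could be changed to create a separate list instead.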

using System;
using System.Collections.Generic;
using System.Linq;

public static class ListExtensions
{
    /// <summary>
    /// Merges two sorted lists containing potentially different types of objects, resulting in a single
    /// sorted list of objects of type T with no duplicates.
    /// </summary>
    public static void MergeOrderedList<TMe, TOther>(this List<TMe> me, IReadOnlyList<TOther> other, Func<TMe, TOther, int> compare = null, Func<TOther, TMe> selectT = null)
    {
        if (other == null)
            throw new ArgumentNullException(nameof(other));
        if (compare == null)
        {
            if (typeof(TMe).GetInterfaces().Any(i => i == typeof(IComparable<TOther>)))
            {
                compare = (a, b) => ((IComparable<TOther>)a).CompareTo(b);
            }
            else
            {
                throw new ArgumentNullException(nameof(compare),
                    "A comparison method must be supplied if no default comparison exists.");
            }
        }

        if (selectT == null)
            if (typeof(TMe).IsAssignableFrom(typeof(TOther)))
            {
                selectT = o => (TMe)(o as object);
            }
            else
            {
                throw new ArgumentNullException(nameof(selectT),
                    $"A selection method must be supplied if the items in the other list cannot be assigned to the type of the items in \"{nameof(me)}\"");
            }

        if (me.Count == 0)
        {
            me.AddRange(other.Select(selectT));
            return;
        }

        for (int o = 0, m = 0; o < other.Count; o++)
        {
            var currentOther = other[o];
            while (compare(me[m], currentOther) < 0 && ++m < me.Count) {}

            if (m == me.Count)
            {
                me.AddRange(other.Skip(o).Select(selectT));
                break;
            }

            if (compare(me[m], currentOther) != 0)
                me.Insert(m, selectT(currentOther));
        }
    }
}

Note: I did write unit tests for this, so it is solid.

Interleave lists of different lengths, eliminating duplicates and preserving order
