Question

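I was asked this question in a technical interview, and I would like to know whether my answer was completely wrong.

The interviewer asked me to diff two lists. Here are the examples: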

[1, 2, 3, 4], [1, 2, 3] => [4]
[1, 2, 2, 2], [1, 2] => [2, 2]
[1, 2, 2, 2], [1, 2, 3, 3] => [2, 2]


def diff_two_list(list1, list2):
  hash_map1 = {}
  for i in list1:
    hash_map1[i] = hash_map1.get(i, 0) + 1

  hash_map2 = {}
  for j in list2:
    hash_map2[j] = hash_map2.get(j, 0) + 1

  result = []
  for i in hash_map1.keys():
    if i not in hash_map2:
      for _ in range(hash_map1[i]):
        result.append(i)
    else:
      remained_value = hash_map1[i] - hash_map2[i]
      if remained_value > 0:
        for _ in range(remained_value):
          result.append(i)

  return result

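I realize this is not the best code. I would like to know whether my solution is completely wrong, and what its time complexity is. I wanted to ask this on codereview.stackexchange.com, but they say the code must be correct before you can request a review, so I am asking here instead.

The time complexity I gave in my answer was 2o(n).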

Answer 1

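When you interview for a job, the interviewer is looking for a number of things in a candidate. Some examples:

Does the candidate ask the right questions about the problem?
Does the candidate have the skills I value?
Can the candidate think through the problem and come up with a reasonable answer?
Does the candidate write clean code that my team could maintain?

The interviewer asked you to diff two lists and gave you the expected results. I looked at the data and interpreted it as a positional comparison of the left array against the right array, where:

The result list contains the left-hand values that differ from the right-hand value at the same position
When the right-hand position is empty, the result list contains the left-hand value

If you stare at the test cases long enough, you may come up with other interpretations.

I would expect the best candidates to ask questions about the data set, or to explain the assumptions behind their interpretation of the problem and its edge cases:

The example lists appear to be sorted. Is that a coincidence?
This looks like a positional difference. Did I understand that correctly?
Is the right side always the same size or smaller?
If the right side is larger, should I include its extra elements?
What is the maximum size of the data set?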
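
To illustrate why these clarifying questions matter, here is a minimal sketch comparing two readings of the examples. The helper names `positional_diff` and `multiset_diff` are hypothetical and exist only for this illustration. Both readings agree on the three interview examples but disagree as soon as the input is unsorted.

```python
from collections import Counter


def positional_diff(left, right):
    """Keep left[i] when right has no element at i, or a different one."""
    return [
        value
        for index, value in enumerate(left)
        if index >= len(right) or value != right[index]
    ]


def multiset_diff(left, right):
    """Drop one occurrence from left for each matching element in right."""
    remaining = Counter(right)
    result = []
    for value in left:
        if remaining[value] > 0:
            remaining[value] -= 1  # matched by an element of right, drop it
        else:
            result.append(value)
    return result


examples = [
    ([1, 2, 3, 4], [1, 2, 3]),
    ([1, 2, 2, 2], [1, 2]),
    ([1, 2, 2, 2], [1, 2, 3, 3]),
]
for left, right in examples:
    # Both interpretations give the same answer on the interview examples.
    assert positional_diff(left, right) == multiset_diff(left, right)

# On unsorted input the two interpretations diverge.
print(positional_diff([2, 1], [1, 2]))  # [2, 1]
print(multiset_diff([2, 1], [1, 2]))    # []
```

Which reading the interviewer intended cannot be decided from the examples alone, which is the point of asking.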

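For a simple problem like this, I would expect the best candidates to write code that can be understood quickly, without a lot of effort spent working out what it does.

I would expect the best candidates to solve the problem in a time-efficient or space-efficient way, depending on their assumptions.

In this case, I would expect a candidate to be able to produce an O(n) solution.

As an interviewer, I find your answer hard to follow and inefficient. With the nested loops, your solution may not be O(n).

I probably would not spend much time working out the time complexity of your solution or whether it works. I would ask questions to make sure the problem was reasonably clear, and then move on to the next skill or fit question.

I would solve the problem as follows: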

test_cases = [
    [[1, 2, 3, 4], [1, 2, 3], [4]],
    [[1, 2, 2, 2], [1, 2], [2, 2]],
    [[1, 2, 2, 2], [1, 2, 3, 3], [2, 2]]
]


def left_diff_array(left, right):
    smallest_length = min(len(left), len(right))
    differences = []

    for x in range(smallest_length):  # start at 0 so the first position is compared too
        if left[x] != right[x]:
            differences.append(left[x])

    if len(left) > len(right):
        differences += left[len(right):]

    return differences


for test in test_cases:
    first, second, answer = test

    assert(left_diff_array(first, second) == answer)
    print(first, second, "=>", answer)

Answer 2

Assuming the lists are sorted, you can iterate over each list once, which is O(n):

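def diff_list(a, b):
    # i walks b and j walks a; m and n hold the current value from each.
    i, j = iter(b), iter(a)
    try:
        m, n = next(i), next(j)
        while True:
            if m == n:
                # Matched pair: drop both and advance both iterators.
                m, n = next(i), next(j)
                continue
            if m < n:
                # The value from b is too small to match; advance b only.
                try:
                    m = next(i)
                except StopIteration:
                    # b is exhausted, so the current value from a is unmatched.
                    yield n
                    raise
            else:
                # The value from a has no partner left in b.
                yield n
                n = next(j)
    except StopIteration:
        # One list ran out: whatever remains in a belongs to the diff.
        yield from j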

>>> list(diff_list([1, 2, 3, 4], [1, 2, 3]))
[4]
>>> list(diff_list([1, 2, 2, 2], [1, 2]))
[2, 2]
>>> list(diff_list([1, 2, 2, 2], [1, 2, 3, 3]))
[2, 2]
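
If the inputs are not guaranteed to be sorted, one option is to sort copies first and then run the same kind of merge. This is a sketch under that assumption, not part of the answer above; `diff_unsorted` is a hypothetical name, and the sort makes the overall cost O(n log n + m log m) rather than linear.

```python
def diff_unsorted(a, b):
    """Multiset difference a - b for lists of mutually comparable items."""
    a, b = sorted(a), sorted(b)  # copies, so the caller's lists are untouched
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Matched pair: consume one element from each list.
            i += 1
            j += 1
        elif a[i] < b[j]:
            # a[i] cannot appear later in b, so it belongs to the diff.
            result.append(a[i])
            i += 1
        else:
            # b[j] has no partner in a; skip it.
            j += 1
    result.extend(a[i:])  # whatever is left in a was never matched
    return result


print(diff_unsorted([2, 4, 1, 3], [3, 1, 2]))  # [4]
print(diff_unsorted([2, 1, 2, 2], [2, 1]))     # [2, 2]
```

Note that the result comes back in sorted order, not in the original order of the first list.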

Answer 3

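I think O(m+n) is probably what you meant by 2o(n) / o(2n). You have to iterate over both lists. A shorter version may make that clearer, even though it may not be optimal: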

from collections import Counter

def diff_two_list(list1, list2):
    c1, c2 = Counter(list1), Counter(list2)
    return [y for x in c1 for y in ([x] * (c1[x] - c2[x]))]

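The two Counter calls iterate over the two lists. The list comprehension that builds the result is bounded by the length of the first list (m), because the diff cannot contain more elements than the first list does. The dict set-item and get-item operations inside the loop are all O(1) (Time complexity of python dict).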
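
The bound can be checked with a rough instrumented sketch. The function `count_steps` below is hypothetical; it mirrors the counting approach and tallies one step per loop iteration, which is an assumption about what counts as a unit of work.

```python
from collections import Counter


def count_steps(list1, list2):
    """Return (diff, steps), where steps counts loop iterations."""
    steps = 0
    c1, c2 = Counter(), Counter()
    for x in list1:  # m iterations
        c1[x] += 1
        steps += 1
    for x in list2:  # n iterations
        c2[x] += 1
        steps += 1
    diff = []
    for x in c1:  # at most m distinct keys
        steps += 1
        for _ in range(c1[x] - c2[x]):  # at most m appends in total
            diff.append(x)
            steps += 1
    return diff, steps


list1 = [1, 2, 2, 2] * 1000
list2 = [1, 2, 3, 3] * 1000
diff, steps = count_steps(list1, list2)
m, n = len(list1), len(list2)
# Two passes over the inputs, plus at most 2 * m for keys and appends.
assert steps <= 3 * m + n
print(len(diff), steps)
```

The inner loop is nested syntactically, but its total number of iterations across all keys cannot exceed m, so the step count stays linear in m + n.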

Answer 4

This answer focuses only on the algorithm for the case where the lists are sorted.

At any given moment in the program you are looking at one value from each list, a and b. When a is less than b, you know that a cannot be equal to any other value in list_b, because the lists are sorted, so you add a to the difference and move on to the next value in list_a. When b is lower, the reverse applies.

An implementation using iterators might look like this:

def dif_two_list(list_a, list_b):
    diff = []
    it_a = iter(list_a)
    it_b = iter(list_b)

    def next_a():
        try:
            return next(it_a)
        except StopIteration:
            #diff.append(b) #uncomment if you want to keep the values in the second list
            raise

    def next_b():
        try:
            return next(it_b)
        except StopIteration:
            diff.append(a)
            raise

    try:
        a = b = None
        while True:
            if a == b:
                #discard both, they are the same
                a = next(it_a) #this ended up being the only one that didn't need its own try/except
                #if this raises the error we don't want to keep the value of b
                b = next_b() #however if this one raises an error we are interested in the 'a' value gotten right above
            elif a < b:
                #a is not in it_b
                diff.append(a)
                a = next_a()
            else:
                #b is not in it_a
                #diff.append(b) #uncomment if you are interested in the values in the second list
                b = next_b()
    except StopIteration:
        #when one is exhausted extend the difference by the other
        # (and the one just emptied doing nothing, easier than checking which one to extend by)
        diff.extend(it_a)
        #diff.extend(it_b) #uncomment if you are interested in the values in the second list
    return diff

I am not entirely sure how this relates to time complexity, but the number of calls to next is exactly len(list_a) + len(list_b), so I believe that makes it O(n+m).
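
One way to sanity-check the number of `next` calls is to wrap each iterator in a small counter. `CountingIterator` and `merge_diff` below are hypothetical names for this sketch; `merge_diff` is a compact restatement of the sorted-merge idea, not the exact code above. In this version each iterator can also see one extra call that signals exhaustion, so the assertion allows for two more calls than the combined length.

```python
class CountingIterator:
    """Wrap an iterable and count how many times next() is called on it."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.calls += 1
        return next(self._it)


_DONE = object()  # sentinel meaning "iterator exhausted"


def merge_diff(it_a, it_b):
    """Elements of sorted it_a not matched by elements of sorted it_b."""
    diff = []
    a, b = next(it_a, _DONE), next(it_b, _DONE)
    while a is not _DONE and b is not _DONE:
        if a == b:
            # Matched pair: advance both sides.
            a, b = next(it_a, _DONE), next(it_b, _DONE)
        elif a < b:
            # a cannot match anything later in it_b.
            diff.append(a)
            a = next(it_a, _DONE)
        else:
            b = next(it_b, _DONE)
    if a is not _DONE:
        # it_b ran out first: the current a and the rest of it_a are unmatched.
        diff.append(a)
        diff.extend(it_a)
    return diff


list_a, list_b = [1, 2, 2, 2], [1, 2, 3, 3]
it_a, it_b = CountingIterator(list_a), CountingIterator(list_b)
result = merge_diff(it_a, it_b)
total_calls = it_a.calls + it_b.calls
assert total_calls <= len(list_a) + len(list_b) + 2
print(result, total_calls)
```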

Answer 5

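I think this snippet should solve the problem in O(n + m) time and O(n) space, where n and m are the lengths of the two lists.

I think this code could be improved with a little extra effort. I avoided creating a second dictionary to reduce memory consumption, in case the size of the lists is a concern.

Code readability is essential when you apply for a programming position. It is good practice to think out loud while analyzing a technical problem. The interviewer may give you hints about what they expect.

I don't know whether this is allowed here, but I think these might help: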

Work at Google — Example Coding/Engineering Interview

Interview Cheat Sheet - Andrei Neagoie's - Data Structures + Algorithms

def diff_two_lists(list1, list2):
    dictionary = {}

    for item in list1:
        if item not in dictionary:
            dictionary[item] = 1
        else:
            dictionary[item] += 1

    for item in list2:
        if item in dictionary:
            dictionary[item] -= 1

    diff = []
    for key, value in dictionary.items():
        for i in range(value):
            diff.append(key)
    return diff


print(diff_two_lists([1, 2, 3, 4], [1, 2, 3]))
print(diff_two_lists([1, 2, 2, 2], [1, 2]))
print(diff_two_lists([1, 2, 2, 2], [1, 2, 3, 3]))
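
As a possible shortening of the single-dictionary idea, the standard library's `collections.Counter` offers `subtract` and `elements`. This is a sketch of an alternative, not the code above; `diff_with_counter` is a hypothetical name. One difference worth noting: `subtract` also records items that appear only in `list2` (with negative counts), so the counter can end up holding keys from both lists.

```python
from collections import Counter


def diff_with_counter(list1, list2):
    counts = Counter(list1)   # one pass over list1
    counts.subtract(list2)    # one pass over list2; counts may go negative
    # elements() repeats each key by its count and skips counts below one.
    return list(counts.elements())


print(diff_with_counter([1, 2, 3, 4], [1, 2, 3]))     # [4]
print(diff_with_counter([1, 2, 2, 2], [1, 2]))        # [2, 2]
print(diff_with_counter([1, 2, 2, 2], [1, 2, 3, 3]))  # [2, 2]
```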
