Question

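I download a list of IP addresses to a file and rename it Old_file. Over time, the devices get more IP updates (or removals), so I download the new list of IP addresses to another file named New_file.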

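I then want to compare the two files and see what does not match.

Old_file = [1.1.1.1, 1.1.1.2, 1.1.1.3, 1.1.1.4, 1.1.1.6]

new_file = [1.1.1.1, 1.1.1.2, 1.1.1.3, 1.1.1.5, 1.1.1.6]

The comparison needs to go as far as 1.1.1.4 and stop there. But it should never report from Old_file, e.g. 1.1.1.5 (we only need the results from New_file). I really hope this explains it.

Thanks in advance, Tony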

Answer 1

For a simple element-wise comparison, you could do

def get_first_unequal(s0, s1):   
    for e0, e1 in zip(s0, s1): # assumes sequences are of equal length!
        if e0 != e1:
            print(f"unequal elements: '{e0}' vs. '{e1}'!")
            return (e0, e1)
    return None # all equal

a = ['a', 'b', 'c']
b = ['a', 'b', 'd']             
get_first_unequal(a, b)            
# unequal elements: 'c' vs. 'd'!  
# ('c', 'd')

# --> to get a list of all unequal pairs, you could also use
# [(e0, e1) for (e0, e1) in zip(s0, s1) if e0 != e1]
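Since `zip` stops at the shorter sequence, the function above silently ignores extra trailing elements. A minimal sketch of one way around that, assuming you want a length mismatch reported as a difference, is to use `itertools.zip_longest` with a sentinel. The name `get_first_unequal_padded` and the `(index, e0, e1)` return shape are my own choices for illustration, not part of the original answer.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel meaning "no element at this position"

def get_first_unequal_padded(s0, s1):
    """Return (index, e0, e1) for the first mismatch, or None if all equal."""
    pairs = zip_longest(s0, s1, fillvalue=_MISSING)
    for i, (e0, e1) in enumerate(pairs):
        if e0 != e1:
            # Show a missing element as None so the result is easy to read.
            return (i,
                    None if e0 is _MISSING else e0,
                    None if e1 is _MISSING else e1)
    return None  # same length and all elements equal

print(get_first_unequal_padded(['a', 'b'], ['a', 'b', 'c']))
```

Note that with this sketch a real `None` element in the input cannot be told apart from a missing one in the returned tuple; the index still tells you where the sequences diverge.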

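If you want something more sophisticated, as mentioned in the comments, difflib might be the way to go. For example, run a comparison of two sequences (here, lists of the strings you would read from the two txt files you want to compare):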

import difflib
a = ['a', 'b', 'c']
b = ['s', 'b', 'c', 'd']
delta = difflib.context_diff(a, b)
for d in delta:
    print(d)

gives (file header lines omitted)

*** 1,3 ****
! a
  b
  c
--- 1,4 ----
! s
  b
  c
+ d
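Applied to the question, the same module can be used to pull out only the entries that appear in the new list. This is a rough sketch, assuming each file has already been read into a list of strings with one IP per element; the helper name `only_in_new` is made up for illustration. It relies on `difflib.ndiff` prefixing each output line with `"+ "` (only in the second sequence), `"- "` (only in the first), `"  "` (in both) or `"? "` (a hint line).

```python
import difflib

old_ips = ["1.1.1.1", "1.1.1.2", "1.1.1.3", "1.1.1.4", "1.1.1.6"]
new_ips = ["1.1.1.1", "1.1.1.2", "1.1.1.3", "1.1.1.5", "1.1.1.6"]

def only_in_new(old, new):
    """Return the entries that ndiff marks as present only in `new`."""
    return [line[2:]                      # strip the two-character prefix
            for line in difflib.ndiff(old, new)
            if line.startswith("+ ")]     # skip "- ", "  " and "? " lines

added = only_in_new(old_ips, new_ips)
print(added)
```

If the order of the entries does not matter to you, a plain membership test against `set(old_ips)` would probably be simpler than a diff.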

To check the differences between two strings, you can do something similar (borrowed from here):

a = 'string1'
b = 'string 2'
delta = difflib.ndiff(a, b)

print(f"a -> b: {a} -> {b}")
for i, d in enumerate(delta):
    if d[0] == ' ':  # no difference
        continue
    elif d[0] == '-':
        print(f"Deleted '{d[-1]}' from position {i}")
    elif d[0] == '+':
        print(f"Added '{d[-1]}' to position {i-1}")

gives

a -> b: string1 -> string 2
Deleted '1' from position 6
Added ' ' to position 6
Added '2' to position 7

Answer 2

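If you assume the two files should be exactly identical, you can iterate over the characters of the first file and compare them with the second file, i.e.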

# check that they're the same length first
if len(Old_file) != len(New_file):
    print('not the same!')
else:
    # lengths are equal here, so New_file[indx] cannot raise IndexError
    for indx, old_char in enumerate(Old_file):
        new_char = New_file[indx]
        # actually compare the characters; a plain `if` is used rather
        # than `assert`, because asserts are skipped under `python -O`
        if old_char != new_char:
            # the characters do not match
            print('not the same!')
            break  # kill the loop

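It's worth noting that there are faster ways to do this. You could consider computing a checksum, although it won't tell you which parts differ, only that they do. If the files are large, checking one character at a time will perform poorly; in that case you could try comparing chunks of data at a time.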
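Both ideas could look roughly like the sketch below. It assumes the two lists live in ordinary files on disk; the function names `file_sha256` and `first_differing_chunk`, and the 64 KiB default chunk size, are arbitrary choices of mine. `hashlib.sha256` stands in for "a checksum" here, and any other digest would do.

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so it never has to fit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def first_differing_chunk(path_a, path_b, chunk_size=65536):
    """Return the byte offset of the first chunk that differs, else None."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return offset      # mismatch lies somewhere in this chunk
            if not chunk_a:        # both files ended at the same point
                return None
            offset += chunk_size
```

Comparing `file_sha256("Old_file")` with `file_sha256("New_file")` answers "are they the same?" cheaply, and `first_differing_chunk` narrows down where to look if they are not.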

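Edit: Re-reading your original question, you could certainly do this with a while loop. If you do, I'd suggest essentially the same strategy of checking each character. In that case you would of course need to increment indx manually.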
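A while-loop version might look something like this. It is only a sketch: the function name `first_mismatch_index` is hypothetical, and returning the index rather than printing is my own choice so that the caller can decide what to do with it.

```python
def first_mismatch_index(old, new):
    """Return the index of the first difference, or None if identical."""
    indx = 0
    # walk both sequences until one of them runs out
    while indx < len(old) and indx < len(new):
        if old[indx] != new[indx]:
            return indx            # stop at the first mismatch
        indx += 1                  # manual increment, as noted above
    if len(old) != len(new):
        return indx                # one sequence is a prefix of the other
    return None

print(first_mismatch_index("string1", "string 2"))
```

This works the same way on two strings or on two lists of lines, since it only relies on indexing and `len`.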

Compare two files and show the missing results
