Question

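I'm trying to write a function that compares two txt files. If it finds new lines that are in one file but not in the other, it should add them to a list and also to the file that doesn't contain them. It fails to do this. Here is my function. What am I doing wrong?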

newLinks = []

def newer():
    with open('cnbcNewLinks.txt', 'w') as newL:
        for line in open('cnbcCleanedLinks.txt'):
            if line not in "cnbcNewLinks.txt":
                newLinks.append(line)
                newL.write(line)
            else:
                continue
cleaned = ''.join(newLinks)
print(cleaned)

Answer 1

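If the files are not large, read the data into lists, convert the two lists to sets, and use the built-in 'difference' method twice, once in each direction. Then append the differences to the files.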

Answer 2

I wrote out the Python code that @Alex suggested.

See the documentation for set.

I replaced your text file names with a.txt and b.txt to make it easier to read.

# First read the files and compare them using `set`
with open('a.txt', 'r') as newL, open('b.txt', 'r') as cleanL:
    a = set(newL)
    b = set(cleanL)
    add_to_cleanL = list(a - b)  # lines in newL that are not in cleanL
    add_to_newL = list(b - a)  # lines in cleanL that are not in newL

# Then open in append mode to add at the end of each file
with open('a.txt', 'a') as newL, open('b.txt', 'a') as cleanL:
    newL.write(''.join(add_to_newL))  # append the missing lines to the end of newL
    cleanL.write(''.join(add_to_cleanL))  # append the missing lines to the end of cleanL
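
One caveat with the set approach: sets do not keep the original line order, and a last line without a trailing newline compares as different from the same line with one. The following is only a sketch of one way to handle both, not part of the answer above. The helper names read_lines and append_missing are hypothetical, and it assumes both files already exist and are UTF-8 text.

```python
from pathlib import Path


def read_lines(path):
    """Return the file's lines without trailing newlines, skipping blanks."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line]


def append_missing(source_path, target_path):
    """Append lines found in source but not in target; return the added lines."""
    target_text = Path(target_path).read_text(encoding="utf-8")
    seen = set(target_text.splitlines())
    missing = []
    for line in read_lines(source_path):
        if line not in seen:   # set lookup, order of the source is kept
            seen.add(line)     # also skips repeats within the source file
            missing.append(line)
    if missing:
        with open(target_path, "a", encoding="utf-8") as target:
            # Start on a fresh line if the target does not end with a newline.
            if target_text and not target_text.endswith("\n"):
                target.write("\n")
            target.write("\n".join(missing) + "\n")
    return missing
```

Calling append_missing('a.txt', 'b.txt') and then append_missing('b.txt', 'a.txt') would bring both files up to date in both directions, with each call returning the list of lines it added.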

Answer 3

Here is code that works:

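newLinks = []

# 'a+' opens for reading and appending; the position starts at the end of the file
with open('cnbcNewLinks.txt', 'a+') as newL:
    newL.seek(0)  # rewind, otherwise read() returns an empty string
    new_file_lines = newL.read().splitlines()
    with open('cnbcCleanedLinks.txt') as cleanL:
        clean_file_lines = cleanL.read().splitlines()
    for line in clean_file_lines:
        if line not in new_file_lines:
            newLinks.append(line)
            line_to_append = '\n' + line  # writes in append mode always go to the end
            newL.write(line_to_append)

cleaned = '\n'.join(newLinks)  # the lines no longer carry '\n', so join with it
print(cleaned)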

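As @cdarke commented, you are not checking against the file's contents but against the file name string: line not in "cnbcNewLinks.txt" is a substring test on that literal. Opening cnbcNewLinks.txt in 'w' mode also empties the file before anything is compared, which is why the code above opens it with 'a+'.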
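
To make the difference concrete, here is a small illustration. The URL and the existing_lines list are made-up stand-ins, not data from the question.

```python
filename = "cnbcNewLinks.txt"
line = "https://www.cnbc.com/example\n"  # hypothetical link line

# `in` on a string is a substring test against the literal file name,
# so a whole URL line is never "in" it and this check is always True.
always_new = line not in filename

# What was intended: membership among the lines already stored in the file.
existing_lines = ["https://www.cnbc.com/example\n"]  # stand-in for the file's lines
really_new = line not in existing_lines

print(always_new, really_new)  # True False
```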

Also, using for line in open('filename') is bad practice, because the file is never explicitly closed. Use a 'with' statement instead.

newL.read().splitlines() neatly converts the file's contents into a list in which each element is one line, without its trailing newline.
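
As a quick illustration of why that matters for comparisons, the sketch below contrasts iterating over a file with read().splitlines(). An io.StringIO object stands in for an open text file, and the sample contents are invented.

```python
import io

# io.StringIO stands in for an open text file; the contents are made up.
sample = "first\nsecond\nthird"  # note: no newline after the last line

lines_iterated = list(io.StringIO(sample))
lines_split = io.StringIO(sample).read().splitlines()

print(lines_iterated)  # ['first\n', 'second\n', 'third']
print(lines_split)     # ['first', 'second', 'third']
```

When iterating, the last line only carries a '\n' if the file ends with one, so the same link can compare as different between two files. With splitlines() that inconsistency goes away.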

Check for new strings in a txt file
