Question

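Okay, I have a school assignment in which I need to compare two files with each other. It's simple: the program needs to show all the unique words that occur across the two files, for example: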

file1: This is a test

file2: This is not a test

Output: ["This", "is", "a", "test", "not"]

This is the output I would expect from this code:

def unique_words(file_1, file_2):
    unique_words_list = []
    for word in file_1:
        unique_words_list.append(word)
    for word in file_2:
        if word not in file_1:
            unique_words_list.append(word)
    return unique_words_list

But that doesn't happen. Sadly, this is the output:

['this\n', 'is\n', 'a\n', 'test', 'this\n', 'is\n', 'not\n', 'a\n', 'test']

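I have several functions that work in almost the same way and produce similar output. I know why the \n shows up, but I don't know how to get rid of it. If someone could help me get the right output, that would be a great help :)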

Answer 1

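Steampunkery's solution is not correct: (1) it does not handle files with more than one word per line, and (2) it does not account for repeated words in file1.txt (try a file1.txt in which "word" appears four times: you should get a single "word" in the output, but you get four). The for/if construct is also unnecessary.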
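To make the two failure cases concrete, here is a minimal sketch that applies the same kind of list-based logic to in-memory lines instead of files. The name `list_based_unique` and the sample inputs are made up for illustration; this is not the other answer's exact code.

```python
def list_based_unique(lines_1, lines_2):
    # Hypothetical stand-in for the list-based approach: keep the first
    # list as it is, then append entries of the second list that are new.
    result = list(lines_1)
    for entry in lines_2:
        if entry not in result:
            result.append(entry)
    return result


# Case 1: more than one word per line, so whole lines are treated as "words".
print(list_based_unique(["the cat"], ["the dog"]))
# ['the cat', 'the dog']

# Case 2: repeated entries in the first input are never removed.
print(list_based_unique(["word", "word", "word", "word"], ["word"]))
# ['word', 'word', 'word', 'word']
```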

Here is a compact and correct solution.

Contents of file1.txt:

the cat and the dog
the lime and the lemon

Contents of file2.txt:

the mouse and the bunny
dogs really like meat

Code:

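def unique(infiles):
    """Return the set of distinct whitespace-separated words in all files."""
    words = set()
    for infile in infiles:
        # The with block closes each file once it has been read.
        with open(infile, 'r') as f:
            for line in f:
                # split() with no argument also discards the trailing newline.
                words.update(line.split())
    return words

# print(...) with a single argument works in both Python 2 and Python 3.
print(unique(['file1.txt']))
print(unique(['file2.txt']))
print(unique(['file1.txt', 'file2.txt']))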

Output (the set([...]) form is what Python 2 prints; the order of elements in a set is arbitrary):

set(['and', 'lemon', 'the', 'lime', 'dog', 'cat'])
set(['and', 'like', 'bunny', 'the', 'really', 'mouse', 'dogs', 'meat'])
set(['and', 'lemon', 'like', 'mouse', 'dog', 'cat', 'bunny', 'the', 'really', 'meat', 'dogs', 'lime'])

Two lessons for Python learners:

Use the tools the language provides, such as set
Think about input conditions that would break the algorithm
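One caveat: a set does not keep the order in which words were first seen, while the expected output in the question is an ordered list. If order matters, a possible variant is to collect the words as dictionary keys. This is only a sketch, assuming Python 3.7 or later (where dicts preserve insertion order); the name `unique_in_order` is hypothetical, and it accepts any iterables of lines, including open file objects.

```python
def unique_in_order(*line_sources):
    # Dict keys are unique and, in Python 3.7+, keep their insertion order.
    seen = {}
    for lines in line_sources:
        for line in lines:
            # split() handles several words per line and drops the newline.
            for word in line.split():
                seen[word] = None  # re-assigning an existing key keeps its position
    return list(seen)


# In-memory stand-ins for the two files from the question.
file_1 = ["This\n", "is\n", "a\n", "test"]
file_2 = ["This\n", "is\n", "not\n", "a\n", "test"]
print(unique_in_order(file_1, file_2))
# ['This', 'is', 'a', 'test', 'not']
```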

Answer 2

Here is a small piece of code I wrote that reuses some of your code:

#!/usr/bin/env python3.6

with open('file1.txt', 'r') as file1, open('file2.txt', 'r') as file2:
    file_1 = file1.readlines()
    file_1 = [line.rstrip() for line in file_1]
    file_2 = file2.readlines()
    file_2 = [line.rstrip() for line in file_2]


def unique_words(file_1, file_2):
    unique_words_list = file_1
    for word in file_2:
        if word not in unique_words_list:
            unique_words_list.append(word)
    return unique_words_list


print(unique_words(file_1, file_2))

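This script assumes you have two files named file1.txt and file2.txt in the same directory as the script. Going by your example, it also assumes that each word is on its own line. Here is a walkthrough:

Open both files and read their lines into lists, stripping the newline characters with a list comprehension
Define a function that adds all the words from the first file to a list, then adds every word from the second file that is not already in that list
Print the output of that function, using the files we read earlier as input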
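The steps above can be tried without creating any files. The following is a minimal sketch that uses io.StringIO objects as stand-ins for file1.txt and file2.txt; the helper name `read_words` is made up for illustration, and, unlike the script above, the function copies the first list so the caller's data is not modified. It still assumes one word per line.

```python
import io


def read_words(handle):
    # Step 1: read the lines and strip the trailing newline from each one.
    return [line.rstrip() for line in handle]


def unique_words(words_1, words_2):
    # Step 2: start from a copy of the first list, then add unseen words.
    result = list(words_1)
    for word in words_2:
        if word not in result:
            result.append(word)
    return result


# In-memory stand-ins for the two files (assumption: one word per line).
file1 = io.StringIO("this\nis\na\ntest")
file2 = io.StringIO("this\nis\nnot\na\ntest")

# Step 3: print the result for the two inputs.
print(unique_words(read_words(file1), read_words(file2)))
# ['this', 'is', 'a', 'test', 'not']
```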
