Question

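I am writing a small program that has a function which reads a text file and returns the individual words from its sentences. However, even though I return them, I cannot get the words to print. Unless there is a big problem with my whitespace, I do not understand why. Can you help? For your information, I am only a beginner. The program asks the user for a filename, then reads the file inside the function, converts the file into a list, finds the individual words in that list, and stores them in that list.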

file_input = input("enter a filename to read: ")
#unique_words = []
def file(user): 
    unique_words = []
    csv_file = open(user + ".txt","w")
    main_file = csv_file.readlines()
    csv_file.close()


    for i in main_list:
            if i not in unique_words:
                    unique_words.append(i)


    return unique_words

#display the results of the file being read in

print (file(file_input))

Sorry, I am using Notepad:

check to see if checking works

Answer 1

It looks like each line of your file contains only one word.

def read_file(user): 
    with open(user + ".txt","r") as f:
        data = [ line.strip() for line in f.readlines() ]
    return list( set(data) )

--- Update --- If each line contains multiple words separated by spaces:

def read_file(user):
    with open(user + ".txt","r") as f:
        data = [ item.strip() for line in f.readlines() for item in line.split(' ')]
    return list( set(data) )
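
One caveat about the version above, shown as a minimal sketch: line.split(' ') splits on every single space, so doubled spaces (or blank lines) leave empty strings in the result, whereas line.split() with no argument splits on any run of whitespace. The sample line below is made up for illustration.

```python
# Hypothetical sample line with a doubled space and a trailing newline.
line = "check to  see if checking works\n"

# Splitting on a single space keeps an empty string for the doubled space.
with_space = [item.strip() for item in line.split(' ')]

# Splitting with no argument collapses any run of whitespace.
with_default = line.split()

print(with_space)    # ['check', 'to', '', 'see', 'if', 'checking', 'works']
print(with_default)  # ['check', 'to', 'see', 'if', 'checking', 'works']
```

If empty strings are not wanted in the unique list, the no-argument form is probably the simpler choice.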

Answer 2

Actually, I cannot reproduce your problem. Given a proper CSV input file^1), e.g.

a,b,c,d
e,f,g,h
i,j,k,l

your program prints this, which seems fine apart from the trailing '':

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', '']

However, you can simplify your code significantly.

Instead of adding "," to each line and then joining with "", just join with "," (this also gets rid of the trailing '').

Do the strip directly inside the join, using a generator expression:

main_string = ",".join(line.strip() for line in main_file)

Instead of join followed by split, use a list comprehension with two for clauses:

main_list = [word for line in csv_file for word in line.strip().split(",")]

Instead of doing everything by hand, use the csv module:

main_list = [word for row in csv.reader(csv_file) for word in row]

Assuming order does not matter, use a set to remove duplicates:
```
unique_words = set(main_list)
```

And if order does matter, you can (ab)use collections.OrderedDict:

unique_words = list(collections.OrderedDict((x, None) for x in main_list))

Use with to open and close the file.

Putting it all together:

import csv
def read_file(user): 
    with open(user + ".txt") as csv_file:
        main_list = [word for row in csv.reader(csv_file) for word in row]
        unique_words = set(main_list)  # or OrderedDict, see above
        return unique_words

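^1) Update: it does not work for the "example text..." file shown in your edit because that is not a CSV file. CSV means "comma-separated values", but the words in that file are separated by spaces, so you have to split on whitespace instead of commas: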

def read_file(user): 
    with open(user + ".txt") as text_file:
        main_list = [word for line in text_file for word in line.strip().split()]
        return set(main_list)
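
As a sketch of the order-preserving variant mentioned earlier: in Python 3.7 and later a plain dict also keeps insertion order, so dict.fromkeys can stand in for OrderedDict. The function name unique_words_in_order and the sample text are made up for illustration, and io.StringIO stands in for an opened text file.

```python
import io

def unique_words_in_order(text_file):
    """Return each whitespace-separated word once, in first-seen order."""
    words = [word for line in text_file for word in line.split()]
    # dict keys are unique and, from Python 3.7 on, keep insertion order.
    return list(dict.fromkeys(words))

# io.StringIO behaves like an opened text file for this example.
sample = io.StringIO("check to see if checking works\nsee if it works\n")
print(unique_words_in_order(sample))
# ['check', 'to', 'see', 'if', 'checking', 'works', 'it']
```

With a real file, the same function could be called inside a with open(...) block.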

Answer 3

If all you want is a list of every word that occurs in the text, you are doing too much work. You want something like this:

unique_words = []
all_words = []
with open(file_name, 'r') as in_file:
  text_lines = in_file.readlines() # Read in all lines from the file as a list.
for line in text_lines:
  all_words.extend(line.split()) # iterate through the list of lines, extending the list of all words to include the words in this line.
unique_words = list(set(all_words)) # reduce the list of all words to unique words.

Answer 4

You can use a set to simplify your code, since it only contains unique elements.

user_file = input("enter a filename to read: ")  # use raw_input on Python 2

#function to read any file
def read_file(user):
    unique_words = set()
    csv_file = open(user + ".txt","r")
    main_file = csv_file.readlines()
    csv_file.close()

    for line in main_file:
        line = line.split(',')
        unique_words.update([x.strip() for x in line])

    return list(unique_words)

#display the results of the file being read in
print (read_file(user_file))

The output for a file with the contents:

Hello, world1
Hello, world2

is

['world2', 'world1', 'Hello']
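
Note that a set has no defined order, so the order of the printed list may differ between runs. If a stable order is wanted, one option (a minimal sketch, with in-memory lines standing in for the file) is to sort the result:

```python
# Hypothetical in-memory lines standing in for the file contents above.
lines = ["Hello, world1\n", "Hello, world2\n"]

unique_words = set()
for line in lines:
    # Split on commas and strip surrounding whitespace from each piece.
    unique_words.update(x.strip() for x in line.split(','))

# sorted() returns a new list in a deterministic order.
result = sorted(unique_words)
print(result)  # ['Hello', 'world1', 'world2']
```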

Why are the individual words not printed?
