Question

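I need to be able to import and manipulate multiple text files passed as function arguments. I thought using *args in the function's parameter list would work, but I get an error about tuples and strings.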

def open_file(*filename): 
   file = open(filename,'r')
   text = file.read().strip(punctuation).lower()  
   print(text)

open_file('Strawson.txt','BigData.txt')
ERROR: expected str, bytes or os.PathLike object, not tuple

How do I do this correctly?

Answer 1

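When you use the *args syntax in a function's parameter list, you can call the function with any number of arguments, and they arrive inside the function as a tuple. So to run a process on each argument you need to loop over them. Like this: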

from string import punctuation

# Make a translation table to delete punctuation
no_punct = dict.fromkeys(map(ord, punctuation))

def open_file(*filenames):
    for filename in filenames:
        print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        print(text)
        print()

#test

open_file('Strawson.txt', 'BigData.txt')
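
To see why the original call failed, it can help to look at what a *args parameter actually receives. This is only a small sketch; describe_args is a made-up helper name, not part of the question or of any library.

```python
def describe_args(*filenames):
    # *filenames collects every positional argument into a single tuple
    return type(filenames).__name__, filenames

kind, names = describe_args('Strawson.txt', 'BigData.txt')
print(kind)
print(names)

# Handing the whole tuple to open() is what triggers the error in the question
try:
    open(names, 'r')
    error_name = None
except TypeError as err:
    error_name = type(err).__name__
    print('TypeError:', err)
```

Each element of the tuple is an ordinary string, which is why looping over it and opening one name at a time works.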

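I also added a dictionary, no_punct, which can be used to remove all punctuation from the text. And I used a with statement so each file gets closed automatically.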
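
As a quick illustration of why a translation table is used instead of strip: str.strip only removes characters from the two ends of a string, while str.translate with a table that maps code points to None deletes them everywhere. The sample string below is invented for the demonstration.

```python
from string import punctuation

# Map the code point of every punctuation character to None (meaning: delete it)
no_punct = dict.fromkeys(map(ord, punctuation))

sample = '"Hello, world!" she said.'

# strip() only trims punctuation at the start and end of the whole string
stripped = sample.strip(punctuation).lower()

# translate() removes punctuation wherever it occurs
translated = sample.translate(no_punct).lower()

print(stripped)
print(translated)
```

With strip, the comma, exclamation mark and inner quote survive; with translate they are all gone.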

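If you want the function to "return" the processed contents of each file, you can't just put return in the loop, because that tells the function to exit. You could save the file contents to a list and return that list at the end of the loop. But a better option is to turn the function into a generator. The Python yield keyword makes that simple. Here's an example to get you started.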

def open_file(*filenames):
    for filename in filenames:
        print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        yield text

def create_tokens(*filenames):
    tokens = [] 
    for text in open_file(*filenames):
        tokens.append(text.split())
    return tokens

files = '1.txt','2.txt','3.txt'
tokens = create_tokens(*files)
print(tokens)

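Note that I removed the word.strip(punctuation).lower() stuff from create_tokens: it's not needed because we've already removed all punctuation and folded the text to lowercase in open_file.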
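
For comparison, the list-based alternative mentioned earlier might look roughly like this. It is a sketch only: read_files is a hypothetical name, and the two temporary files with their contents are made up so the example can run on its own.

```python
import os
import tempfile
from string import punctuation

no_punct = dict.fromkeys(map(ord, punctuation))

def read_files(*filenames):
    """Hypothetical list-returning variant: collect every cleaned text, then return."""
    texts = []
    for filename in filenames:
        with open(filename) as file:
            text = file.read()
        texts.append(text.translate(no_punct).lower())
    # The return sits after the loop, so every file gets processed first
    return texts

# Create two throwaway files so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, content in [('a.txt', 'Hello, World!'), ('b.txt', 'Big Data?')]:
        path = os.path.join(tmp, name)
        with open(path, 'w') as f:
            f.write(content)
        paths.append(path)
    result = read_files(*paths)

print(result)
```

The difference from the generator version is that all the texts are held in memory at once, which is fine for a few small files but less so for many large ones.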

We don't really need two functions here. We can combine everything into one:
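
One way the merged function could look, offered as a sketch rather than the answer's exact code: the reading, cleaning and splitting all happen in a single generator. The temporary files and their contents are invented so the example runs by itself.

```python
import os
import tempfile
from string import punctuation

no_punct = dict.fromkeys(map(ord, punctuation))

def create_tokens(*filenames):
    # Read, clean and split each file, yielding one list of words per file
    for filename in filenames:
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        yield text.split()

# Throwaway input files (assumed content, for demonstration only)
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, content in [('1.txt', 'Hello, World!'), ('2.txt', 'Big Data?')]:
        path = os.path.join(tmp, name)
        with open(path, 'w') as f:
            f.write(content)
        paths.append(path)
    # A generator is lazy, so consume it while the files still exist
    tokens = list(create_tokens(*paths))

print(tokens)
```

Because create_tokens is now a generator, wrap the call in list() when you want all the token lists at once, or loop over it to handle one file at a time.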

