I have 349 text files. I use the following code to read and tokenize all of them.
import glob
path = "C:\\texts\\*.txt"
files = glob.glob(path)
for file in files:
    with open(file) as in_file, open("C:\\texts\\file_tokens.txt", 'w') as out_file:
        for line in in_file:
            words = line.split()
            for word in words:
                out_file.write(word)
                out_file.write("\n")
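This code saves the result (all tokens) in a single file (file_tokens.txt). How can I save the tokens of each file in its own new .txt file? I mean I want 349 output files, each containing the tokens of one input file.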
Answer 0: (score: 1)
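import glob
from os import path

base_path = "C:\\texts"  # RENAMED so it does not shadow os.path
files = glob.glob(path.join(base_path, "*.txt"))
for file in files:
    # Use only the file name without directory or extension, e.g. "a.txt" -> "a"
    name = path.splitext(path.basename(file))[0]
    with open(file) as in_file:
        # ATTENTION: one output file per input file, opened for writing
        with open(path.join(base_path, "%s_tokenized.txt" % name), 'w') as out_file:
            for line in in_file:
                words = line.split()
                for word in words:
                    out_file.write(word)
                    out_file.write("\n")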
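You create a new file whose name is specific to the file you are currently processing. In this example it is ($file_name)_tokenized.txt.
path.join is used to write the output file to the correct directory, e.g.: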
>>> path.join("~/Documents","out.txt")
'~/Documents/out.txt'
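To make the naming step concrete, here is a minimal sketch of how the output path could be derived from an input path. The helper name tokenized_path is hypothetical and not part of the original answer; it only uses os.path functions from the standard library.

```python
from os import path

def tokenized_path(in_file):
    """Return '<directory>/<stem>_tokenized.txt' for the given input path.

    Hypothetical helper for illustration only.
    """
    # Split "C:\\texts\\a.txt" into its directory and file name
    directory, filename = path.split(in_file)
    # Drop the extension so ".txt" does not end up in the middle of the name
    stem, _ext = path.splitext(filename)
    return path.join(directory, "%s_tokenized.txt" % stem)
```

Note that the separator path.join inserts depends on the operating system, so on Windows the joined path contains a backslash rather than the forward slash shown in the interpreter example.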
Answer 1: (score: 0)
Give each output file a different name.
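One way to do that, sketched here with pathlib instead of glob and os.path. The function name tokenize_folder, the "_tokens.txt" suffix, and the separate output directory are assumptions made for illustration, not something the question specifies. Writing to a separate directory keeps the generated files from being picked up as inputs on a later run.

```python
from pathlib import Path

def tokenize_folder(in_dir, out_dir):
    """Write one '<stem>_tokens.txt' file into out_dir per .txt file in in_dir.

    Hypothetical helper; returns the list of output paths it wrote.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # sorted() lists the inputs up front, before any output file is created
    for in_path in sorted(in_dir.glob("*.txt")):
        # The output name is derived from the input name, so it is unique per file
        out_path = out_dir / (in_path.stem + "_tokens.txt")
        with in_path.open() as in_file, out_path.open("w") as out_file:
            for line in in_file:
                for word in line.split():
                    out_file.write(word + "\n")
        written.append(out_path)
    return written
```

For the folder in the question, a call might look like tokenize_folder("C:\\texts", "C:\\texts\\tokens"), which should produce one token file per input file in the tokens subfolder.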