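The function I wrote updates a text file only once per call, but I need to apply many updates. To avoid repeatedly copying a temporary file over the target file, I would like to replace all the words in a single pass. How can I do that? Here is my Python code (it handles only one word per call):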
import io
from tempfile import mkstemp
from shutil import move
from os import remove, close

def replaceWords(source_file_path, old_word, cluster_labels):
    # One replacement per occurrence, in label order: old_word_<label>
    new_word_list = [old_word + "_" + str(label) for label in cluster_labels]
    fh, target_file_path = mkstemp()
    with io.open(target_file_path, mode='w', encoding='utf8') as target_file:
        with io.open(source_file_path, mode='r', encoding='utf8') as source_file:
            index = 0  # position of the next unused label
            for line in source_file:
                words = []
                for word in line.split():
                    if word == old_word:
                        words.append(word.replace(old_word, new_word_list[index]))
                        index += 1
                    else:
                        words.append(word)
                # split() drops the line ending, so add it back
                target_file.write(" ".join(words) + "\n")
    close(fh)
    remove(source_file_path)
    move(target_file_path, source_file_path)
For example:
First update:
Source file content: of anarchism have often been divided into the categories of social and individualist anarchism or similar dual classifications
old_word: 'of'
cluster_labels: [1,2]
After the update:
Target file content: of_1 anarchism have often been divided into the categories of_2 social and individualist anarchism or similar dual classifications
Second update:
old_word: 'anarchism'
cluster_labels: [1,2]
After the update:
Target file content: of_1 anarchism_1 have often been divided into the categories of_2 social and individualist anarchism_2 or similar dual classifications
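With my code I have to call the function twice and copy the file twice. When many words need updating, this becomes slow, and the repeated reading, writing, and copying is not I/O friendly.
So, is there an elegant way to handle this without so much reading, writing, and copying?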
Answer 0 (score: 0)
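There are many ways to do this. One approach that stays close to your inline method is to use *argv to accept the list of words to replace, and to check each word of the current line against that list. I have added some pseudocode here; it has not been tested for errors. Note two changes: 1. the function's input parameters; 2. an added for loop that iterates over the input arguments.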
#!/usr/bin/python
import io
from tempfile import mkstemp
from shutil import move
from os import remove, close
import logging

logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s -%(levelname)s - %(message)s')

def replaceWords(source_file_path, cluster_labels, *argv):
    # argv holds the words to replace; they all share the same cluster_labels.
    new_words = {old_word: [old_word + "_" + str(label) for label in cluster_labels]
                 for old_word in argv}
    # Position of the next unused label, tracked separately for each word.
    index = {old_word: 0 for old_word in argv}
    fh, target_file_path = mkstemp()
    logging.debug(new_words)
    with io.open(target_file_path, mode='w', encoding='utf8') as target_file:
        with io.open(source_file_path, mode='r', encoding='utf8') as source_file:
            for line in source_file:
                words = []
                for word in line.split():
                    for wordtochange in argv:
                        if word == wordtochange:
                            words.append(new_words[wordtochange][index[wordtochange]])
                            index[wordtochange] += 1
                            break
                    else:
                        # No word in argv matched: keep the word unchanged.
                        words.append(word)
                # split() drops the line ending, so add it back
                target_file.write(" ".join(words) + "\n")
    close(fh)
    remove(source_file_path)
    move(target_file_path, source_file_path)

# Words to replace are passed as separate positional arguments.
replaceWords('file.txt', [1, 2], 'of', 'anarchism')
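
If each word needs its own label list rather than one shared list, the same single-pass idea can be sketched with a dictionary lookup instead of the inner loop. This is only a rough sketch under a few assumptions that are not in the question: Python 3.3+ (for os.replace), a hypothetical function name replace_words_once, a hypothetical labels_by_word mapping from each word to its labels, and leaving a word unchanged once its labels run out instead of raising an error.

```python
import io
import os
from collections import defaultdict
from tempfile import mkstemp


def replace_words_once(source_file_path, labels_by_word):
    """Rewrite the file in one pass.

    labels_by_word is a hypothetical mapping such as
    {'of': [1, 2], 'anarchism': [1, 2]}; the n-th occurrence of a
    word receives the n-th label from that word's list.
    """
    seen = defaultdict(int)  # occurrences already replaced, per word
    # Create the temp file next to the source so the final swap stays
    # on the same filesystem.
    directory = os.path.dirname(os.path.abspath(source_file_path))
    fd, target_file_path = mkstemp(dir=directory)
    os.close(fd)  # only the path is needed; the file is reopened below
    with io.open(source_file_path, mode='r', encoding='utf8') as source_file, \
            io.open(target_file_path, mode='w', encoding='utf8') as target_file:
        for line in source_file:
            words = []
            for word in line.split():
                labels = labels_by_word.get(word)
                if labels is not None and seen[word] < len(labels):
                    words.append(word + "_" + str(labels[seen[word]]))
                    seen[word] += 1
                else:
                    # Not a target word, or its labels are used up.
                    words.append(word)
            # Like the original, this normalizes whitespace within a line.
            target_file.write(" ".join(words) + "\n")
    # Swap the rewritten file into place once, after all replacements.
    os.replace(target_file_path, source_file_path)
```

A call such as replace_words_once('file.txt', {'of': [1, 2], 'anarchism': [1, 2]}) would then read and write the file only once, however many words are listed. The dictionary lookup also avoids comparing every word in the file against every word to replace.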