Question

I am using os.walk to walk a directory tree looking for certain file types. Once a matching file type is found (e.g. .txt or .xml), I want to replace strings in the file (call them old) with strings from a dictionary (call them new), using this definition:

def multipleReplace(text, wordDict):
    for key in wordDict:
        text = text.replace(key, wordDict[key])
    return text

At first, I had this loop:

myDict = #dictionary with keys(old) and values(new)#
home = #some directory#
for dirpath, dirnames, filenames in os.walk(home):
    for Filename in filenames:
        filename = os.path.join(dirpath, Filename)
        if filename.endswith('.txt') or filename.endswith('.xml'):
            with fileinput.FileInput(filename,inplace=True,backup='.bak') as file:
                for line in file:
                    print(multipleReplace(line,myDict),end='')

This runs quickly and replaces the old strings with the new strings in every file where an old string is found. The problem is that my script creates a .bak file for every file, whether or not it found any old strings in it.

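I want to create a .bak file only for files that contain old strings (that is, only for files where a replacement was actually made). I tried reading all the files and collecting only those that contain old strings (using something like newFiles.append(re.findall('\\b'+old+'\\b',line))), so that I could apply the FileInput method to just those files, but the regex search takes forever.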

Answer 1

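I don't think a regular expression is needed here. The only missing piece is to check whether the file contains any of the old strings before the .bak file is created. So try the following: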

import fileinput
import os

def multipleReplace(text, wordDict):
    for key in wordDict:  # the keys are the old strings
        text = text.replace(key, wordDict[key])
    return text

myDict = #dictionary with keys(old) and values(new)#
home = #some directory#
for dirpath, dirnames, filenames in os.walk(home):
    for Filename in filenames:
        filename = os.path.join(dirpath, Filename)
        if filename.endswith(('.txt', '.xml')):
            with open(filename, 'r') as f:
                content = f.read()  # open and read file content
            # wordDict only exists inside multipleReplace, so test against myDict here
            if any(key in content for key in myDict):  # check whether any old string occurs
                with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
                    for line in file:
                        print(multipleReplace(line, myDict), end='')
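
The code above reads each matching file twice: once to check for old strings and once more through FileInput. As a possible variation (not part of the original answer), the sketch below reads each file once, applies the replacements in memory, and writes a backup only when the text actually changed. The names replace_in_file and replace_in_tree are hypothetical, shutil.copy2 stands in for FileInput's backup handling, and UTF-8 encoding is an assumption about the files.

```python
import os
import shutil


def multiple_replace(text, word_dict):
    """Replace every key of word_dict found in text with its value."""
    for old, new in word_dict.items():
        text = text.replace(old, new)
    return text


def replace_in_file(path, word_dict, backup_suffix=".bak"):
    """Rewrite path only if a replacement changes it; return True if it did."""
    # Assumption: files are UTF-8. newline="" leaves line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as f:
        original = f.read()
    updated = multiple_replace(original, word_dict)
    if updated == original:
        return False  # nothing changed: no backup, no rewrite
    shutil.copy2(path, path + backup_suffix)  # back up before overwriting
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(updated)
    return True


def replace_in_tree(home, word_dict, extensions=(".txt", ".xml")):
    """Walk home and return the paths of the files that were modified."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(home):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                if replace_in_file(path, word_dict):
                    changed.append(path)
    return changed
```

Calling replace_in_tree(home, myDict) would then leave a .bak file only next to the files it returns. One trade-off is that each file is held in memory in full, which the content = f.read() check above already does.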

