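I have a list of files and a list of strings that I need to capture in a YAML file. I want to write a function that takes this YAML file and performs a search-and-replace. This is what I have so far.
Two text files and a YAML file: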
txt_1.txt
aB123.Abc
AB345.aBC
ab123.ABC
Ab345.abc
txt_2.txt
ab123.Abc
AB345.ABC
current_date
yaml_file - cf_master.yml
input_files:
  - txt_1.txt
  - txt_2.txt
replacement_strings:
  string1:
    from: AB123.ABC
    to: XY000.XYZ
  string2:
    from: AB345.ABC
    to: XY001.ZYX
  string3:
    from: current_date
    to: '2018-04-07'
The goal is to replace every occurrence of each 'from' value with its 'to' value, ignoring case (case-insensitive).
import yaml
import re

with open('cf_master.yml') as f:
    dataMap = yaml.safe_load(f)

def string_replacer(dataMap):
    for files in dataMap['input_files']:
        with open(dataMap['input_files']) as f:
            input_h = f.read()
            for string in dataMap['replacement_strings']:
                output_h = input_h.replace(
                    dataMap['replacement_strings'][string]['from'],
                    dataMap['replacement_strings'][string]['to']
                )
        with open(output_dataMap[input_files],"w") as f:
            f.write(output_h)
    return output_dataMap[input_files]

string_replacer(dataMap)
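I don't understand how to correct this code. The input files, the YAML file, and the newly generated files are all in the same folder.
Answer 0: (score: 2)
You can simplify the YAML file. The replacement strings don't need to be keyed by index: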
input_files:
  - txt_1.txt
  - txt_2.txt
replacement_strings:
  - from: AB123.ABC
    to: XY000.XYZ
  - from: AB345.ABC
    to: XY001.ZYX
  - from: current_date
    to: '2018-04-07'
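As for the replacement itself, you probably want to do it in two passes: first replace each match with a temporary marker, then go back and replace the markers with the actual replacement values. This prevents the replacements from interacting with each other. For example, suppose you replace all 'a's with 'b's and all 'b's with 'c's. Without the intermediate marker step, the second replacement would hit all the original 'b's as well as every 'a' that had just been turned into a 'b'.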
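To make the interaction concrete, here is a minimal sketch comparing sequential replacement with the two-pass marker approach on the 'a'/'b'/'c' example. The function names naive_replace and two_pass_replace are made up for illustration, and the sketch assumes that no 'from' string matches part of the marker text itself.

```python
import re

def naive_replace(text, pairs):
    # Apply each rule in turn; later rules also see the output of earlier ones.
    for src, dst in pairs:
        text = re.sub(re.escape(src), lambda m, dst=dst: dst, text,
                      flags=re.IGNORECASE)
    return text

def two_pass_replace(text, pairs):
    # Pass 1: swap every match for a unique temporary marker.
    markers = {}
    for i, (src, dst) in enumerate(pairs):
        marker = '__$TEMP{}$__'.format(i)
        markers[marker] = dst
        text = re.sub(re.escape(src), lambda m, marker=marker: marker, text,
                      flags=re.IGNORECASE)
    # Pass 2: swap each marker for its final value (plain literal replace).
    for marker, dst in markers.items():
        text = text.replace(marker, dst)
    return text

pairs = [('a', 'b'), ('b', 'c')]
print(naive_replace('aabb', pairs))     # cccc
print(two_pass_replace('aabb', pairs))  # bbcc
```

With sequential replacement the original 'a's end up as 'c's, while the marker version keeps the two rules independent.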
import yaml
import re

with open('cf_master.yml') as f:
    data = yaml.safe_load(f)

for filepath in data['input_files']:
    with open(filepath, 'r') as f:
        txt = f.read()

    # First pass: replace each 'from' string (case-insensitively) with a unique marker.
    marker_d = dict()
    for i, d in enumerate(data['replacement_strings']):
        marker = '__$TEMP{}$__'.format(i)
        marker_d[marker] = d['to']
        txt = re.sub(re.escape(d['from']), marker, txt, flags=re.I)

    # Second pass: replace each marker with its final 'to' value.
    # str.replace is used so the 'to' value is inserted literally.
    for marker, s in marker_d.items():
        txt = txt.replace(marker, s)

    # Save file somewhere?
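One possible way to fill in the save step, wrapped as a function as the question asked for. This is only a sketch: the function name replace_in_files and the '.replaced' output suffix are assumptions, not something from the question, and it expects the already-loaded YAML data in the simplified list form shown above.

```python
import re

def replace_in_files(data, out_suffix='.replaced'):
    """Apply the two-pass replacement to each input file and save the result.

    data       -- dict loaded from the YAML file (list-style replacement_strings)
    out_suffix -- hypothetical naming choice for the output files
    """
    written = []
    for filepath in data['input_files']:
        with open(filepath, 'r') as f:
            txt = f.read()

        # Pass 1: case-insensitive match of each 'from' string -> unique marker.
        markers = {}
        for i, d in enumerate(data['replacement_strings']):
            marker = '__$TEMP{}$__'.format(i)
            # str() guards against YAML values that were not loaded as strings.
            markers[marker] = str(d['to'])
            txt = re.sub(re.escape(d['from']), lambda m, marker=marker: marker,
                         txt, flags=re.I)

        # Pass 2: marker -> final 'to' value.
        for marker, value in markers.items():
            txt = txt.replace(marker, value)

        # Write next to the input file so the original is left untouched.
        out_path = filepath + out_suffix
        with open(out_path, 'w') as f:
            f.write(txt)
        written.append(out_path)
    return written
```

Called as replace_in_files(data) after yaml.safe_load, this would write txt_1.txt.replaced and txt_2.txt.replaced into the same folder as the inputs and return those paths.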