Question

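I have a master .xml file generated by an external application, and I want to create several new .xml files from it by adapting and deleting certain lines with Python. The search strings and replacement strings for these adaptations are stored in an array, e.g.: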

replaceArray = [
[u'ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"',
u'ref_layerid_mapping="x4049" lyvis="on" toc_visible="on"'],
[u'<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="false"/>',
u'<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="true"/>'],
[u'<TOOL_SELECT_LINE RowID="106871" id_tool_base="3658" use="false"/>',
u'<TOOL_SELECT_LINE RowID="106871" id_tool_base="3658" use="true"/>']]

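So I'd like to iterate through my file and replace all occurrences of 'ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"' with 'ref_layerid_mapping="x4049" lyvis="on" toc_visible="on"', and so on. Unfortunately, the ID values of "RowID", "id_tool_base" and "ref_layerid_mapping" may occasionally change. So what I need is to search the master file for matches of the whole string, no matter which id value is between the quotes, and to replace only the substring that differs between the two strings of a replaceArray entry (e.g. use="true" instead of use="false"). I'm not very familiar with regular expressions, but I think I need something like this for my search?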

re.sub(r'<TOOL_SELECT_LINE RowID="\d+" id_tool_base="\d+" use="false"/>', "", sentence)

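I'd be grateful for any hint that points me in the right direction! If you need any further information, or if my question is unclear, please let me know.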

Answer 1

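One way to do this is to use a function as the replacement. The function receives the match object from re.sub and inserts the id that was captured from the string being replaced.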

import re

s = 'ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"'
# Capture the quoted id value, whatever it is, so it can be reused.
pat = re.compile(r'ref_layerid_mapping=("[^"]*") lyvis="off" toc_visible="off"')

def replacer(m):
    # m.group(1) is the captured id including its quotes, e.g. "x4049".
    return 'ref_layerid_mapping=' + m.group(1) + ' lyvis="on" toc_visible="on"'

re.sub(pat, replacer, s)

Output:

'ref_layerid_mapping="x4049" lyvis="on" toc_visible="on"'
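The same idea can be extended to the whole replaceArray from the question. The following is only a sketch under a few assumptions: the helper names build_rule and apply_rules, the ID_ATTRS tuple and the sample master text are made up for illustration, and the id-carrying attributes are assumed to be exactly the three named in the question. Each search string is turned into a pattern that accepts any value for those attributes, and the replacement function re-inserts whatever value was actually found.

```python
import re

# Assumption: these are the only attributes whose values may change.
ID_ATTRS = ("RowID", "id_tool_base", "ref_layerid_mapping")
ID_RE = re.compile(r'\b(%s)="[^"]*"' % "|".join(ID_ATTRS))

def build_rule(search, replace):
    """Return (pattern, replacer) for one [search, replace] pair."""
    parts, last = [], 0
    for m in ID_RE.finditer(search):
        # Literal text is escaped; each id attribute becomes a named group.
        parts.append(re.escape(search[last:m.start()]))
        parts.append('(?P<%s>%s="[^"]*")' % (m.group(1), m.group(1)))
        last = m.end()
    parts.append(re.escape(search[last:]))
    pattern = re.compile("".join(parts))

    def replacer(match):
        # Put the id attributes found in the file into the replacement text.
        return ID_RE.sub(lambda a: match.group(a.group(1)), replace)

    return pattern, replacer

def apply_rules(text, rules):
    for pattern, replacer in rules:
        text = pattern.sub(replacer, text)
    return text

replace_array = [
    ['ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"',
     'ref_layerid_mapping="x4049" lyvis="on" toc_visible="on"'],
    ['<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="false"/>',
     '<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="true"/>'],
]
rules = [build_rule(s, r) for s, r in replace_array]

# Hypothetical master content whose ids differ from those in replace_array.
master = ('ref_layerid_mapping="x9999" lyvis="off" toc_visible="off"\n'
          '<TOOL_BUFFER RowID="1" id_tool_base="2" use="false"/>')
result = apply_rules(master, rules)
print(result)
```

Note that once the ids are wildcarded, every element of that shape is changed, not only the one with the original id. That matches what the question asks for, but it is worth checking against the real file.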

Another way is to use backreferences in the replacement pattern. (See http://www.regular-expressions.info/replacebackref.html)

For example:

import re
s = "Ab ab"
re.sub(r"(\w)b (\w)b", r"\1d \2d", s)

Output:

'Ad ad'
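Applied to one of the strings from the question, the backreference approach might look like the sketch below. The variable names are illustrative, and the ids are assumed to be purely numeric, as in the question's own re.sub attempt. Everything that must stay unchanged is captured in groups, and only the use value is rewritten.

```python
import re

line = '<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="false"/>'

# Group 1: everything up to and including use= (ids may be any digits).
# Group 2: the closing part of the tag.
pattern = re.compile(r'(<TOOL_BUFFER RowID="\d+" id_tool_base="\d+" use=)"false"(/>)')

# \1 and \2 put the captured text back around the new value.
new_line = pattern.sub(r'\1"true"\2', line)
print(new_line)
```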
