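I'm trying to improve the matching in this code so that it tolerates whitespace before or after the string and also ignores case. The goal is to output the shortened state abbreviation.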
import re
s = "new South Wales "
for r in (("New South Wales", "NSW"), ("Victoria", "VIC"), ("Queensland", "QLD"), ("South Australia", "SA"), ("Western Australia", "WA"), ("Northern Territory", "NT"), ("Tasmania", "TAS"), ("Australian Capital Territory", "ACT")):
    s = s.replace(*r)
output = {'state': s}
print (output)
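I worked out a regular expression that does this:
(?i)(?<!\S)New South Wales(?!\S)
It matches whether or not there is whitespace on either side of the string, and it ignores case. Can anyone help me update the original code to use the new regex?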
Answer 0 (score: 1)
If I were you, I'd simply strip() the string before passing it in, and then use something like re.sub(), where we can pass flags=re.IGNORECASE as shown below.
import re
s = " new South Wales ".strip()
for r in (("New South Wales", "NSW"), ("Victoria", "VIC"), ("Queensland", "QLD"), ("South Australia", "SA"), ("Western Australia", "WA"), ("Northern Territory", "NT"), ("Tasmania", "TAS"), ("Australian Capital Territory", "ACT")):
    _regex = '{0}|{1}'.format(r[0], r[1])
    if re.match(_regex, s, flags=re.IGNORECASE):
        subbed_string = re.sub(r[0], r[1], s, flags=re.IGNORECASE)
        print({'state': subbed_string.upper()})
I've also added a check for a match before attempting the replacement. Otherwise, you may output incorrect results. For example:
('Tasmania', 'TAS') {'state': 'NEW SOUTH WALES'}
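If you would rather keep the lookaround pattern from the question, one possible sketch is to build a single case-insensitive pattern covering every state name and look up the abbreviation for whatever matched. The names STATES, _BY_LOWER, _PATTERN and abbreviate_state are hypothetical and chosen only for illustration; the state list is the one from the question.

```python
import re

# State names and abbreviations taken from the question.
STATES = {
    "New South Wales": "NSW",
    "Victoria": "VIC",
    "Queensland": "QLD",
    "South Australia": "SA",
    "Western Australia": "WA",
    "Northern Territory": "NT",
    "Tasmania": "TAS",
    "Australian Capital Territory": "ACT",
}

# Lowercase lookup so a match in any case maps back to its abbreviation.
_BY_LOWER = {name.lower(): abbr for name, abbr in STATES.items()}

# (?<!\S) and (?!\S) require whitespace or the string boundary on each side;
# re.escape guards against special characters in the names.
_PATTERN = re.compile(
    r"(?<!\S)(?:" + "|".join(re.escape(name) for name in STATES) + r")(?!\S)",
    flags=re.IGNORECASE,
)


def abbreviate_state(text):
    """Hypothetical helper: replace full state names with abbreviations."""
    replaced = _PATTERN.sub(lambda m: _BY_LOWER[m.group(0).lower()], text)
    return replaced.strip()


print({'state': abbreviate_state(" new South Wales ")})  # {'state': 'NSW'}
```

Because only text that actually matches is substituted, no separate match check is needed here, and input such as "Tasmanian" is left unchanged since the trailing lookaround fails.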