Problem with a case-insensitive regex pattern for re.sub()

Date: 2017-06-29 17:39:24

Tags: python regex dictionary case-insensitive

I am using regular expressions in Python to work with dictionaries. I want to remove 1dc.com, 1DC.com, 1dc.COM, and 1DC.COM from the dictionary items.

Sample dictionaries -

{'system_name': 'a1pvdb092', 'fdc_inv_sa_team': 'X2AIX_GBS'}
{'system_name': 'W00000001.1DC.com', 'fdc_inv_sa_team': 'LAA.BRAZIL.AAA.WINDOWS\n'}
{'system_name': 'a10000048', 'fdc_inv_sa_team': 'X2AIX_NSS'}
{'system_name': 'a10000049', 'fdc_inv_sa_team': 'X2AIX_NSS'}

Expected output -

['a1pvdb092']
['W00000001']
['a10000048']
['a10000049']

Script -

import re
from opswareConnect import data

for row in data:
    arg1 = [row["system_name"],]
    arg1 = re.sub('[.1DC.com]\\b', '', str(arg1))
    print arg1

Script output -

['a1pvdb092']
['WBPVAP001Dco']
['a10000048']
['a10000049']

1 answer:

Answer 0 (score: 4)

The regular expression

The regular expression is \.1dc\.com. The backslashes escape the dots; an unescaped dot matches any character, not just a literal period.
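To make the difference concrete, here is a small sketch comparing an unescaped pattern, the escaped pattern, and a bracketed character class like the one in the question. The names loose, strict, and char_class and the test strings are illustrative assumptions, not part of the original answer.

```python
import re

# Unescaped dots match ANY character, so this also matches "x1dcycom".
loose = re.compile(r'.1dc.com')

# Escaped dots match only a literal period.
strict = re.compile(r'\.1dc\.com')

# Square brackets define a character class: this matches ONE character
# out of '.', '1', 'D', 'C', 'c', 'o', 'm', not the whole suffix.
char_class = re.compile(r'[.1DC.com]')

print(loose.search('hostx1dcycom') is not None)   # unescaped dots accept 'x' and 'y'
print(strict.search('hostx1dcycom') is not None)  # no literal periods, so no match
print(char_class.findall('W1.com'))               # each hit is a single character
```

This is likely why the script in the question removed individual characters instead of the full suffix: the brackets turned the intended literal into a set of single characters.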

Use the re.IGNORECASE flag to make the search case-insensitive.

Use re.sub() to find and remove the target expression.
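The two steps can be combined in a small helper. This is only a sketch: the names DOMAIN_SUFFIX and strip_domain are hypothetical, and the trailing $ anchor is an assumption added here so that only a suffix at the end of the name is removed; the answer's own pattern has no anchor.

```python
import re

# Hypothetical module-level pattern, compiled once with the flag attached.
# The trailing $ is an added assumption: it limits the match to the end
# of the string.
DOMAIN_SUFFIX = re.compile(r'\.1dc\.com$', re.IGNORECASE)


def strip_domain(name):
    """Return name without a trailing .1dc.com, in any letter case."""
    return DOMAIN_SUFFIX.sub('', name)


for name in ['W00000001.1DC.com', 'w2.1dc.COM', 'a1pvdb092']:
    print(strip_domain(name))
```

When calling re.sub() directly instead of using a compiled pattern, pass the flag by keyword (flags=re.IGNORECASE). The fourth positional parameter of re.sub() is count, so a flag passed positionally would be interpreted as a replacement limit rather than as a flag.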

Complete solution

import re

data = [
    {'system_name': 'a1pvdb092', 'fdc_inv_sa_team': 'X2AIX_GBS'},
    {'system_name': 'W00000001.1DC.com', 'fdc_inv_sa_team': 'LAA.BRAZIL.AAA.WINDOWS\n'},
    {'system_name': 'a10000048', 'fdc_inv_sa_team': 'X2AIX_NSS'},
    {'system_name': 'a10000049', 'fdc_inv_sa_team': 'X2AIX_NSS'},
]

for row in data:
    sysname = row['system_name']
    print([re.sub(r'\.1dc\.com', '', sysname, flags=re.IGNORECASE)])

Output

['a1pvdb092']
['W00000001']
['a10000048']
['a10000049']