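How to replace a substring of dictionary values using re

Asked: 2019-01-03 16:24:02

Tags: python python-2.7 dictionary

I am trying to iterate over a dictionary and replace a substring in its values using re, but my dictionary ends up with empty values. My code is outlined below: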

mydict = {
    'Getting links from: https://www.foo.com/': 
    [
        '├─BROKEN─ http://www.broken.com/',
        '├─BROKEN─ http://www.set.com/',
        '├─BROKEN─ http://www.one.com/'
    ],
    'Getting links from: https://www.bar.com/': 
    [
        '├─BROKEN─ http://www.broken.com/'
    ]
}

val = "├─BROKEN─"

for k, v in mydict.iteritems():
  for i, s in enumerate(v):
      v[i] = re.sub(r'.*├─BROKEN─', '', val)

This code produces a dictionary whose values are all empty strings:

mydict = {
    'Getting links from: https://www.foo.com/': 
    [
        '',
        '',
        ''
    ],
    'Getting links from: https://www.bar.com/': 
    [
        ''
    ]
}

What I want is:

mydict = {
    'Getting links from: https://www.foo.com/': 
    [
        'http://www.broken.com/',
        'http://www.set.com/',
        'http://www.one.com/'
    ],
    'Getting links from: https://www.bar.com/': 
    [
        'http://www.broken.com/'
    ]
}

What am I missing?
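
The symptom can be reproduced in isolation. The following is a minimal sketch in Python 3 syntax, reusing the names val and s from the question. re.sub takes the string to search as its third argument, and the loop passes val (the marker itself) instead of the list element s. The pattern matches all of val, so every element is overwritten with an empty string.

```python
import re

val = "├─BROKEN─"
s = "├─BROKEN─ http://www.broken.com/"

# The third argument is the string being searched. Passing val searches the
# marker itself; the pattern matches all of it, so nothing is left.
wrong = re.sub(r'.*├─BROKEN─', '', val)

# Passing the list element s keeps everything after the marker.
right = re.sub(r'.*├─BROKEN─', '', s)

print(repr(wrong))
print(repr(right))
```

Note that the second result still starts with the space that followed the marker.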

2 answers:

Answer 0 (score: 3)

You don't need a regular expression here; it seems a bit expensive for this task. Use the string methods replace() and strip():

mydict = {
    'Getting links from: https://www.foo.com/': 
    [
        '├─BROKEN─ http://www.broken.com/',
        '├─BROKEN─ http://www.set.com/',
        '├─BROKEN─ http://www.one.com/'
    ],
    'Getting links from: https://www.bar.com/': 
    [
        '├─BROKEN─ http://www.broken.com/'
    ]
}

val = "├─BROKEN─"

for k, v in mydict.items():
    mydict[k] = [x.replace(val, '').strip() for x in v]

print(mydict)

# {'Getting links from: https://www.foo.com/': ['http://www.broken.com/', 'http://www.set.com/', 'http://www.one.com/'],
#  'Getting links from: https://www.bar.com/': ['http://www.broken.com/']}
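
One thing to keep in mind is that replace() removes the marker wherever it occurs in the string. If it should only be dropped when it leads the string, something like the sketch below may be closer to the intent. It assumes Python 3, and strip_marker is a hypothetical helper name, not part of the original answer. It also builds a new dictionary instead of modifying mydict in place.

```python
def strip_marker(line, marker="├─BROKEN─"):
    """Hypothetical helper: drop a leading marker and surrounding whitespace."""
    line = line.strip()
    if line.startswith(marker):
        line = line[len(marker):]
    return line.strip()


mydict = {
    'Getting links from: https://www.foo.com/': [
        '├─BROKEN─ http://www.broken.com/',
        '├─BROKEN─ http://www.set.com/',
        '├─BROKEN─ http://www.one.com/',
    ],
    'Getting links from: https://www.bar.com/': [
        '├─BROKEN─ http://www.broken.com/',
    ],
}

# Build a cleaned copy; the original dictionary is left untouched.
cleaned = {k: [strip_marker(x) for x in v] for k, v in mydict.items()}

print(cleaned)
```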

Answer 1 (score: 3)

Here is your code with the re.sub call fixed: the third argument must be the list element s, not val.

import re

mydict = {
    'Getting links from: https://www.foo.com/': 
    [
        '├─BROKEN─ http://www.broken.com/',
        '├─BROKEN─ http://www.set.com/',
        '├─BROKEN─ http://www.one.com/'
    ],
    'Getting links from: https://www.bar.com/': 
    [
        '├─BROKEN─ http://www.broken.com/'
    ]
}


for k, v in mydict.iteritems():
    for i, s in enumerate(v):
        # Search the list element s (not val) and remove the marker from it.
        v[i] = re.sub(r'\├─BROKEN─', '', s)

Output:

{'Getting links from: https://www.bar.com/': [' http://www.broken.com/'],
 'Getting links from: https://www.foo.com/': [' http://www.broken.com/',
                                              ' http://www.set.com/',
                                              ' http://www.one.com/']}

As mentioned in the comments, ├ is a special character.
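
The output above still carries the space that followed the marker. A possible variation, sketched here for Python 3 (so items() instead of iteritems()), lets the pattern consume the surrounding whitespace as well. MARKER_RE is a name chosen for this example only, and re.escape is used so the marker is always treated literally.

```python
import re

# Match the marker at the start of the string plus any whitespace around it.
MARKER_RE = re.compile(r'^\s*' + re.escape("├─BROKEN─") + r'\s*')

mydict = {
    'Getting links from: https://www.foo.com/': [
        '├─BROKEN─ http://www.broken.com/',
        '├─BROKEN─ http://www.set.com/',
        '├─BROKEN─ http://www.one.com/',
    ],
    'Getting links from: https://www.bar.com/': [
        '├─BROKEN─ http://www.broken.com/',
    ],
}

for k, v in mydict.items():
    # Replacing the value of an existing key is safe while iterating.
    mydict[k] = [MARKER_RE.sub('', s) for s in v]

print(mydict)
```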