Using re.sub to replace words in multiple lists with their corresponding matches (Python)

Time: 2017-12-04 20:58:22

Tags: python

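Using regular expressions, I want to annotate the sentiment-lexicon words in the given sentences with their corresponding polarity tags, so I wrote the following code.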

import re
vocab = ['good/POSI','bad/NEAG','strong/POSI','dirty/NEGA', 'never/SWIT']
sent = ["It is really good", "strong man never gets his body dirty"]

for token in vocab:
    word = re.sub(r'(\\w+)\\/[A-Z]+_[A-Z]+','\\1', token)
    TA = re.sub(str(word),str(token), str(sent))
print(TA)

I am trying to get a result like this:

["It is really good/POSI", "strong/POSI man never/SWIT gets his body dirty/NEGA"]

Unfortunately, I can't get it to work, and I don't know which lines are wrong. Is there a better way to do this annotation?

1 answer:

Answer 0 (score: 1)
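
As for which lines go wrong in the question's loop, here is a minimal sketch. It assumes the doubled backslashes shown in the post are an escaping artifact and that the intended pattern was `(\w+)/[A-Z]+_[A-Z]+`. That pattern requires an underscore inside the tag, which none of the entries contain, so `word` ends up equal to the whole token. In addition, `str(sent)` turns the whole list into a single string, and `TA` is rebuilt from the original `sent` on every pass, so at most the last token's replacement could survive.

```python
import re

token = 'good/POSI'

# The tag part demands an underscore (e.g. "POSI_X"), which no entry has,
# so nothing matches and the token comes back unchanged.
unchanged = re.sub(r'(\w+)/[A-Z]+_[A-Z]+', r'\1', token)
print(unchanged)  # good/POSI

# Dropping "_[A-Z]+" lets the pattern match, keeping only the bare word.
word = re.sub(r'(\w+)/[A-Z]+', r'\1', token)
print(word)  # good
```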

I suggest converting the vocab list into a dictionary:

>>> vocab = {v[:v.find('/')]: v for v in vocab}
>>> vocab
{'dirty': 'dirty/NEGA', 'good': 'good/POSI', 'never': 'never/SWIT', 'bad': 'bad/NEAG', 'strong': 'strong/POSI'}
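
One caveat with `v[:v.find('/')]`: for an entry without a `/`, `find` returns -1 and the slice silently drops the last character. A possible alternative, sketched here with a hypothetical helper named `build_lexicon` (not part of the original answer), uses `str.partition` and skips such entries instead:

```python
def build_lexicon(entries):
    """Map each bare word to its full "word/TAG" entry (hypothetical helper)."""
    lexicon = {}
    for entry in entries:
        word, sep, _tag = entry.partition('/')
        if not sep:
            # No '/' in this entry: skip it rather than guess a tag.
            continue
        lexicon[word] = entry
    return lexicon

vocab = ['good/POSI', 'bad/NEAG', 'strong/POSI', 'dirty/NEGA', 'never/SWIT']
lexicon = build_lexicon(vocab)
print(lexicon['good'])  # good/POSI
```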

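This way, you can replace each \w+ match with its value from the dictionary, leaving words that are not in the dictionary unchanged: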

result = []
for line in sent:
    # Swap each word for its annotated form if it is a key in vocab; otherwise keep the word as-is.
    line = re.sub(r'(\w+)', lambda w: vocab.get(w.group(), w.group()), line)
    result.append(line)
print(result)

This outputs what you want:

['It is really good/POSI', 'strong/POSI man never/SWIT gets his body dirty/NEGA']
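
For reference, the pieces above can be combined into one self-contained script. The sketch below is a variation rather than the answer's exact code: instead of running the callback on every `\w+` match, it compiles a single alternation of the lexicon words (escaped with `re.escape` and wrapped in `\b` word boundaries), so only words that are in the lexicon are matched. The names `lexicon`, `pattern`, and `annotate` are illustrative assumptions.

```python
import re

vocab = ['good/POSI', 'bad/NEAG', 'strong/POSI', 'dirty/NEGA', 'never/SWIT']
sent = ["It is really good", "strong man never gets his body dirty"]

# Map each bare word to its full "word/TAG" entry.
lexicon = {entry.partition('/')[0]: entry for entry in vocab}

# One pattern that matches any lexicon word as a whole word.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, lexicon)) + r')\b')

def annotate(sentence):
    # Every match is a lexicon key by construction, so direct indexing is safe.
    return pattern.sub(lambda m: lexicon[m.group()], sentence)

result = [annotate(s) for s in sent]
print(result)
# ['It is really good/POSI', 'strong/POSI man never/SWIT gets his body dirty/NEGA']
```

Note that the lookup is case-sensitive in both versions, so a capitalized "Good" would be left unannotated unless the text is normalized first.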