How do I add punctuation around a string?

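Time: 2018-09-04 01:20:46

Tags: python regex spacy

I want to add square brackets around the words tagged as NNS. I can identify them as individual words, but how do I join them back into the sentence?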

import spacy, re

nlp = spacy.load('en_core_web_sm')
s = u"The cats woke up but the dogs slept."

doc = nlp(s)
for token in doc:
    if (token.tag_ == 'NNS'):
        print ([token])

Current result:

[cats]
[dogs]

Expected result:

The [cats] woke up but the [dogs] slept.

3 answers:

Answer 0 (score: 3)

A common idiom is to collect the words in a list and then join them:

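# Assumes `nlp` and `s` are defined as in the question.
sentence = []
doc = nlp(s)
for token in doc:
    if token.tag_ == 'NNS':
        # token is a spaCy Token, not a str, so concatenate token.text
        sentence.append('[' + token.text + ']')
    else:
        sentence.append(token.text)

sentence = ' '.join(sentence)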

Answer 1 (score: 2)

@John Blart's answer is the right approach; the same thing can be written as a list comprehension:

import spacy

nlp = spacy.load('en_core_web_sm')
s = u"The cats woke up but the dogs slept."

doc = nlp(s)
print(' '.join(['[{}]'.format(token) if token.tag_ == 'NNS' else '{}'.format(token) for token in doc]))

Answer 2 (score: 0)

import spacy

nlp = spacy.load('en_core_web_sm')
s = u"The cats woke up but the dogs slept."
doc = nlp(s)
sentence = []
for token in doc:
    if token.tag_ == 'NNS':
        sentence.append('[' + token.text + ']')
    else:
        sentence.append(token.text)

sentence = ' '.join(sentence)
print(sentence)

Result:

The [cats] woke up but the [dogs] slept .
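
Note the space before the final period: ' '.join puts a space between every pair of tokens, including before punctuation, so this does not quite match the expected result in the question. One possible way around it is to reattach each token's own trailing whitespace, which spaCy exposes as token.whitespace_, and join on the empty string. The sketch below is only an illustration: the helper name bracket_tagged and its tag parameter are made up here, and it accepts any iterable of token-like objects that have .text, .tag_ and .whitespace_ attributes.

```python
def bracket_tagged(tokens, tag="NNS"):
    """Return the original text with every token tagged `tag` wrapped in [].

    `tokens` can be a spaCy Doc or any iterable of objects exposing
    `.text`, `.tag_` and `.whitespace_` (hypothetical helper, not part
    of spaCy itself).
    """
    pieces = []
    for token in tokens:
        # Bracket only the token text, never its trailing whitespace.
        text = "[" + token.text + "]" if token.tag_ == tag else token.text
        # whitespace_ is the trailing space (or '') from the original string.
        pieces.append(text + token.whitespace_)
    return "".join(pieces)
```

With the setup from the question, print(bracket_tagged(nlp(s))) should then keep the period attached to the last word, provided the model tags "cats" and "dogs" as NNS as it did above.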