Comparing sub-items of lists and making changes in Python

Time: 2014-12-17 14:27:21

Tags: python list part-of-speech

I have two lists produced by a part-of-speech tagger, as shown below:

pos_tags = [('This', u'DT'), ('is', u'VBZ'), ('a', u'DT'), ('test', u'NN'), ('sentence', u'NN'), ('.', u'.'), ('My', u"''"), ('name', u'NN'), ('is', u'VBZ'), ('John', u'NNP'), ('Murphy', u'NNP'), ('and', u'CC'), ('I', u'PRP'), ('live', u'VBP'), ('happily', u'RB'), ('on', u'IN'), ('Planet', u'JJ'), ('Earth', u'JJ'), ('!', u'.')]


pos_names = [('John', 'NNP'), ('Murphy', 'NNP')]

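I want to create a final list that updates pos_tags with the list items from pos_names. So basically I need to find John and Murphy in pos_tags and replace their POS tags with NNP.

3 answers:

Answer 0 (score: 0)

You can create a dictionary from pos_names that acts as a lookup table. You can then use get to search the table for a possible replacement, keeping the original tag if no replacement is found.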

d = dict(pos_names)
pos_tags = [(word, d.get(word, tag)) for word, tag in pos_tags]
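A self-contained sketch of the same lookup-table idea, run on a shortened sample. The 'NN' tags on the two names are an assumption made here so that the replacement is visible (in the question's data they are already NNP), and the names `lookup` and `updated` are illustrative only.

```python
# Shortened sample; the 'NN' tags on the names are assumed for illustration.
pos_tags = [('My', 'PRP$'), ('name', 'NN'), ('is', 'VBZ'), ('John', 'NN'), ('Murphy', 'NN')]
pos_names = [('John', 'NNP'), ('Murphy', 'NNP')]

# Build a word -> tag lookup table from the (word, tag) pairs.
lookup = dict(pos_names)

# Keep the original tag unless the word has an override in the table.
updated = [(word, lookup.get(word, tag)) for word, tag in pos_tags]
print(updated)
```

Because the comprehension builds a new list, the original order of pos_tags is preserved and every occurrence of a word gets the override.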

Answer 1 (score: 0)

Given

pos_tags = [('This', u'DT'), ('is', u'VBZ'), ('a', u'DT'), ('test', u'NN'), ('sentence', u'NN'), ('.', u'.'), ('My', u"''"), ('name', u'NN'), ('is', u'VBZ'), ('John', u'NNP'), ('Murphy', u'NNP'), ('and', u'CC'), ('I', u'PRP'), ('live', u'VBP'), ('happily', u'RB'), ('on', u'IN'), ('Planet', u'JJ'), ('Earth', u'JJ'), ('!', u'.')]

names = ['John', 'Murphy']

you can do this:

[next((subl for subl in pos_tags if name in subl)) for name in names]

which gives you:

[('John', u'NNP'), ('Murphy', u'NNP')]
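Note that `next` raises `StopIteration` when a name has no match, and `name in subl` also compares the name against the tag. A minimal sketch of a more defensive variant, assuming it is acceptable to get `None` for a missing name; the name 'Alice' and the variable `found` are hypothetical:

```python
pos_tags = [('name', 'NN'), ('is', 'VBZ'), ('John', 'NNP'), ('Murphy', 'NNP')]
names = ['John', 'Alice']  # 'Alice' is hypothetical and absent from pos_tags

# Compare against the word only, and pass a default so a missing name yields None.
found = [next((pair for pair in pos_tags if pair[0] == name), None) for name in names]
print(found)  # [('John', 'NNP'), None]
```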

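Answer 2 (score: 0)

I agree that a dictionary is a more natural solution to this problem, but if you need pos_tags to stay in order, a more explicit solution would be: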

for word, pos in pos_names:
    for i, (tagged_word, tagged_pos) in enumerate(pos_tags):
        if word == tagged_word:
            pos_tags[i] = (word, pos)

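(For a large number of words a dictionary will be faster, so you may want to consider storing the word order in a list and using a dictionary for the POS assignments.)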
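One possible reading of that suggestion, as a sketch rather than the answerer's own code: the sample data is shortened, the 'NN' tags on the names are assumed for illustration, and `word_order`, `tag_of` and `result` are hypothetical names.

```python
pos_tags = [('name', 'NN'), ('is', 'VBZ'), ('John', 'NN'), ('Murphy', 'NN')]
pos_names = [('John', 'NNP'), ('Murphy', 'NNP')]

# Word order lives in a list; tags live in a dictionary keyed by word.
word_order = [word for word, _ in pos_tags]
tag_of = dict(pos_tags)

# Each override is a single dictionary assignment instead of a scan of the list.
for word, pos in pos_names:
    if word in tag_of:
        tag_of[word] = pos

# Rebuild the ordered list of (word, tag) pairs.
result = [(word, tag_of[word]) for word in word_order]
print(result)
```

A dictionary keyed by word holds only one tag per word, so this layout fits only if a word never needs different tags at different positions in the text.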