Question

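I'm just learning to use Python's nltk, and I'm working with POS tagging. What I'd like to know is how to use the tags. For example, here is some pseudocode: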

import nltk

teststr = "George did well in the test."
tokens = nltk.word_tokenize(teststr)  # split the sentence into tokens
words = nltk.pos_tag(tokens)          # tag each token with its part of speech

I want to do something like this:

if words[i] == "proper noun":
    #do something

How do I check whether a word is a noun, a verb, or any other part of speech? Can anyone help me? Thanks.

Answer 1

If you look at the result of the pos_tag call, you'll see that it returns the following list:

[('George', 'NNP'), ('did', 'VBD'), ('well', 'RB'), ('in', 'IN'), ('the', 'DT'), ('test', 'NN'), ('.', '.')]

If you are iterating over the list and want to do something when an entry is a proper noun, you need this check:

if words[i][1] == 'NNP':
    # do something

NNP is the tag for a singular proper noun. Each entry in the list is a tuple: the first value is the word and the second value is its POS tag.
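
To cover nouns, verbs, and other parts of speech as well, here is a rough sketch of one way to do it. It assumes the tags are Penn Treebank tags like the ones in the output above, where the noun tags start with NN, the verb tags with VB, and so on. The coarse_pos helper is a made-up name for illustration and not part of nltk, and the tagged list is hard-coded from the output above so the snippet runs without downloading any nltk data.

```python
# Tagged output copied from the list above; normally this would come from
# nltk.pos_tag(nltk.word_tokenize(teststr)).
words = [('George', 'NNP'), ('did', 'VBD'), ('well', 'RB'), ('in', 'IN'),
         ('the', 'DT'), ('test', 'NN'), ('.', '.')]


def coarse_pos(tag):
    """Map a Penn Treebank tag to a coarse label (hypothetical helper, not in nltk)."""
    if tag in ('NNP', 'NNPS'):   # singular / plural proper noun
        return 'proper noun'
    if tag.startswith('NN'):     # NN, NNS
        return 'noun'
    if tag.startswith('VB'):     # VB, VBD, VBG, VBN, VBP, VBZ
        return 'verb'
    if tag.startswith('JJ'):     # JJ, JJR, JJS
        return 'adjective'
    if tag.startswith('RB'):     # RB, RBR, RBS
        return 'adverb'
    return 'other'


# Unpack each (word, tag) tuple directly instead of indexing with words[i][1].
labels = [(word, coarse_pos(tag)) for word, tag in words]
proper_nouns = [word for word, tag in words if coarse_pos(tag) == 'proper noun']
verbs = [word for word, tag in words if coarse_pos(tag) == 'verb']

for word, label in labels:
    print(word, label)
```

Checking the tag prefix keeps the test short, but it is a simplification: if you need to tell, say, past tense (VBD) from a gerund (VBG), compare against the full tag instead.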