I only need to print the words tagged 'NN' and 'VB' from the input sentence.
import nltk
import re
import time

var = raw_input("Please enter something: ")
exampleArray = [var]

def processLanguage():
    try:
        for item in exampleArray:
            tokenized = nltk.word_tokenize(item)
            tagged = nltk.pos_tag(tokenized)
            print tagged
            time.sleep(555)
    except Exception, e:
        print str(e)

processLanguage()
Answer 0 (score: 5)
How about changing
print tagged
to
print [(word, tag) for word, tag in tagged if tag in ('NN', 'VB')]
Answer 1 (score: 1)
You may need to compare only the first 2 characters of each POS tag; see NLTK - Get and Simplify List of Tags
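A minimal sketch of the prefix idea, assuming `tagged` is a list of (word, tag) pairs in the shape that `nltk.pos_tag` returns. The sample list is hand-written for illustration rather than real tagger output, and the helper name `filter_by_prefix` is hypothetical. Comparing only `tag[:2]` also keeps variants such as NNS, NNP, VBD and VBZ, which an exact match against 'NN' and 'VB' would drop.

```python
def filter_by_prefix(tagged, prefixes=("NN", "VB")):
    """Keep (word, tag) pairs whose tag starts with one of the 2-letter prefixes."""
    return [(word, tag) for word, tag in tagged if tag[:2] in prefixes]

# Hand-written sample (an assumption, not real tagger output) in the
# (word, tag) shape that nltk.pos_tag produces.
tagged = [
    ("The", "DT"), ("dogs", "NNS"), ("chased", "VBD"),
    ("a", "DT"), ("cat", "NN"), ("quickly", "RB"),
]

print(filter_by_prefix(tagged))
```

With real input, `tagged` would come from `nltk.pos_tag(nltk.word_tokenize(sentence))` as in the question.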
Answer 2 (score: 0)
You can try the following approach: add the parts of speech you want to a list named "selective_pos", and you can then select whichever words you like.
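A minimal sketch of this approach, assuming `selective_pos` is a plain list of tag strings and `tagged` holds (word, tag) pairs as returned by `nltk.pos_tag`. The helper name `select_words` and the sample data are hypothetical and hand-written for illustration.

```python
# POS tags to keep; add or remove entries to change the selection.
selective_pos = ["NN", "VB"]

def select_words(tagged, wanted):
    """Return the words whose POS tag appears in `wanted`."""
    wanted = set(wanted)  # set lookup is cheaper than scanning a list
    return [word for word, tag in tagged if tag in wanted]

# Hand-written sample (an assumption, not real tagger output).
tagged = [("I", "PRP"), ("want", "VBP"), ("to", "TO"), ("eat", "VB"), ("pizza", "NN")]

print(select_words(tagged, selective_pos))
```

Adding another tag, for example "VBP", to `selective_pos` would also select "want" from the sample.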