Selecting only the 'NN' and 'VB' words from NLTK pos_tag

Date: 2015-07-04 11:44:16

Tags: python string nlp nltk part-of-speech

I only need to print the words tagged 'NN' and 'VB' in the input sentence.

import nltk
import re
import time

var = raw_input("Please enter something: ")


exampleArray = [var]


def processLanguage():
    try:
        for item in exampleArray:
            tokenized = nltk.word_tokenize(item)
            tagged = nltk.pos_tag(tokenized)
            print tagged

            time.sleep(555)


    except Exception, e:
        print str(e)

processLanguage()

3 Answers:

Answer 0: (score: 5)

How about changing

    print tagged

to

    print [(word, tag) for word, tag in tagged if tag in ('NN', 'VB')]

Answer 1: (score: 1)

You may want to match on only the first 2 characters of the POS tag; see NLTK - Get and Simplify List of Tags
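The two-character prefix matters because the Penn Treebank tags that `nltk.pos_tag` returns by default split nouns and verbs into finer tags (`NNS`, `NNP`, `VBD`, `VBG`, and so on), so an exact match on 'NN' and 'VB' skips plurals and inflected verbs. Here is a minimal sketch of the prefix approach, written for Python 3. The helper name `keep_nouns_and_verbs` is hypothetical, and the `tagged` list is a hand-written sample in the shape `pos_tag` returns, not real tagger output.

```python
def keep_nouns_and_verbs(tagged, prefixes=("NN", "VB")):
    """Keep (word, tag) pairs whose tag begins with one of the 2-character prefixes."""
    return [(word, tag) for word, tag in tagged if tag[:2] in prefixes]


# Hand-written sample in the (word, tag) shape that nltk.pos_tag returns.
# In a real script this would be: tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
tagged = [
    ("The", "DT"),
    ("dogs", "NNS"),
    ("chased", "VBD"),
    ("a", "DT"),
    ("ball", "NN"),
    ("quickly", "RB"),
]

print(keep_nouns_and_verbs(tagged))
```

With an exact match on ('NN', 'VB'), only `ball` would be kept from this sample; the prefix test also keeps `dogs` and `chased`.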


Answer 2: (score: 0)

You can try the following approach: by adding the parts of speech you want to a "selective_pos" list, you can select whichever words you like.
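A minimal sketch of the "selective_pos" idea, written for Python 3. It is one possible reading of the approach, not the answerer's original code. The list name `selective_pos` comes from the answer; the helper `select_words` and the hand-written `tagged` sample are assumptions made for illustration.

```python
# Tags to keep; extend this list to select other parts of speech (e.g. "JJ" for adjectives).
selective_pos = ["NN", "VB"]


def select_words(tagged, selective_pos):
    """Return only the words whose tag appears exactly in selective_pos."""
    wanted = set(selective_pos)  # set lookup keeps the membership test cheap
    return [word for word, tag in tagged if tag in wanted]


# Hand-written sample in the (word, tag) shape that nltk.pos_tag returns.
# In a real script this would be: tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
tagged = [
    ("I", "PRP"),
    ("want", "VBP"),
    ("to", "TO"),
    ("read", "VB"),
    ("a", "DT"),
    ("book", "NN"),
]

print(select_words(tagged, selective_pos))
```

Because the match is exact, `want` (tagged `VBP`) is not selected here; adding "VBP" to `selective_pos` would include it.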