How to extract the offset of a WordNet synset, given a synset, in Python NLTK?

Asked: 2015-07-04 16:56:39

Tags: python nlp nltk semantics wordnet

A sense offset in WordNet is an 8-digit number followed by a POS tag. For example, the offset for the synset 'dog.n.01' is '02084071-n'. I tried the following code:

    from nltk.corpus import wordnet as wn

    ss = wn.synset('dog.n.01')
    offset = str(ss.offset)
    print (offset)

However, I get this output:

    <bound method Synset.offset of Synset('dog.n.01')>

How can I get the actual offset in this format: '02084071-n'?

1 Answer:

Answer 0: (score: 5)

    >>> from nltk.corpus import wordnet as wn
    >>> ss = wn.synset('dog.n.01')
    >>> offset = str(ss.offset()).zfill(8) + '-' + ss.pos()
    >>> offset
    u'02084071-n'
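
The bound-method output in the question comes from `ss.offset` being referenced without being called; the session above calls both `offset()` and `pos()`. If you need this in more than one place, the same formatting can be wrapped in a small helper. This is only a sketch: `synset_id` and `split_synset_id` are hypothetical names, not part of NLTK, and the code assumes an NLTK version where `offset` and `pos` are methods, as in the answer.

```python
def synset_id(synset):
    """Return a WordNet-style ID such as '02084071-n' for a synset.

    Assumption: `synset` exposes offset() and pos() as methods,
    as in the answer above. This helper is not part of NLTK.
    """
    # offset() returns an int, so zero-pad it to 8 digits before
    # appending the POS tag.
    return "{:08d}-{}".format(synset.offset(), synset.pos())


def split_synset_id(synset_id_string):
    """Split an ID such as '02084071-n' into (pos, offset_as_int)."""
    offset, pos = synset_id_string.split("-")
    return pos, int(offset)
```

With the WordNet corpus installed, `synset_id(wn.synset('dog.n.01'))` should give the same string as the session above. The `(pos, offset)` pair from `split_synset_id` is in the argument order that, as far as I know, recent NLTK releases accept in `wn.synset_from_pos_and_offset` for the reverse lookup; check your installed version before relying on that.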