Question

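I am trying to count, for a given noun, all of its hyponyms that themselves have no hyponyms (that is, the nouns below it that are terminal nodes in the noun hierarchy). For example, for "entity" (the highest noun in the hierarchy), the result should be the count of all nouns that have no hyponyms (all nouns that are terminal in the hierarchy). For a noun that is itself terminal, the number must be 1. I have a list of nouns. The output must give this count for each noun in the list.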
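To make the intended count concrete, here is a toy sketch. The dictionary `toy_hierarchy` and the function `terminal_count` are made up for illustration and are not WordNet data; the sketch also assumes every noun has a single parent, which is not always true in WordNet.

```python
# Hypothetical miniature hierarchy: each noun maps to its direct hyponyms.
toy_hierarchy = {
    "entity": ["animal", "plant"],
    "animal": ["dog", "cat"],
    "plant": ["tree"],
    "dog": [],
    "cat": [],
    "tree": [],
}


def terminal_count(noun):
    """Return the number of terminal nouns at or below `noun` in the toy tree."""
    children = toy_hierarchy.get(noun, [])
    if not children:
        # A noun without hyponyms is terminal and counts as 1.
        return 1
    return sum(terminal_count(child) for child in children)


for noun in ["entity", "animal", "dog"]:
    print(noun, terminal_count(noun))
# entity 3
# animal 2
# dog 1
```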

After a lot of searching and trial and error, this is the code I came up with (only the relevant part):

import nltk
from nltk.corpus import wordnet as wn

def get_hyponyms(synset): #function source:https://stackoverflow.com/questions/15330725/how-to-get-all-the-hyponyms-of-a-word-synset-in-python-nltk-and-wordnet?rq=1
    hyponyms = set()
    for hyponym in synset.hyponyms():
        hyponyms |= set(get_hyponyms(hyponym))
    return hyponyms | set(synset.hyponyms())

with open("list-nouns.txt", "r") as wordList1:  #the "U" mode flag was removed in Python 3.11
    myList1 = [line.rstrip('\n') for line in wordList1]
    for word1 in myList1:
        list1 = wn.synsets(word1, pos='n')
        countTerminalWord1 = 0  #counter for synsets without hyponyms
        countHyponymsWord1 = 0  #counter for synsets with hyponyms
        for syn_set1 in list1:
            syn_set11a = get_hyponyms(syn_set1)  #all hyponyms below this synset
            n = len(syn_set11a)  #number of hyponyms
            if n > 0:
                countHyponymsWord1 += n
            else:
                countTerminalWord1 += 1
            for syn_set11 in syn_set11a:
                syn_set111a = get_hyponyms(syn_set11)
                n = len(syn_set111a)
                if n > 0:
                    countHyponymsWord1 += n
                else: 
                    countTerminalWord1 += 1
                #...further iterates in the same way for the following levels
        print (countHyponymsWord1)
        print (countTerminalWord1)

(The code also tries to count all nouns that do have hyponyms, but that is not required.)

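The main problem is that I cannot repeat this code for the entire depth of the noun hierarchy, which is 19 levels. It quickly fails with 'SystemError: too many statically nested blocks'.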

Any help or suggestions on how to solve this would be greatly appreciated.
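One possible direction, sketched under assumptions rather than offered as a definitive solution: walk the hierarchy with an explicit stack instead of writing one nested loop per level, so the depth of the hierarchy no longer matters. Note that `get_hyponyms` above already returns hyponyms at every depth, so looping over its result and calling it again counts the same synsets more than once. The names `count_terminal_hyponyms` and `terminal_counts_for_word` are hypothetical helpers; the only NLTK calls used are `wn.synsets`, `Synset.hyponyms()` and `Synset.name()`. The sketch follows `hyponyms()` only, as the code above does, and counts a synset reachable through several parents once.

```python
def count_terminal_hyponyms(root, get_children):
    """Count distinct terminal nodes at or below `root`.

    `get_children` is any callable returning the direct hyponyms of a node.
    A root without children counts as 1.
    """
    seen = {root}      # nodes already scheduled, so shared hyponyms count once
    stack = [root]     # explicit stack replaces the nested loops
    terminals = 0
    while stack:
        node = stack.pop()
        children = get_children(node)
        if not children:
            terminals += 1
            continue
        for child in children:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return terminals


def terminal_counts_for_word(word):
    """Map each noun synset of `word` to its terminal-hyponym count."""
    # Imported here so the traversal above can be tried without NLTK installed.
    from nltk.corpus import wordnet as wn
    return {
        synset.name(): count_terminal_hyponyms(synset, lambda s: s.hyponyms())
        for synset in wn.synsets(word, pos=wn.NOUN)
    }


# Toy check on a made-up hierarchy where "d" has two parents.
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
print(count_terminal_hyponyms("a", lambda n: toy.get(n, [])))  # 2
```

With the WordNet corpus installed, `terminal_counts_for_word("dog")` would give one count per noun sense of the word; how to combine the senses into a single number per word is left open here.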

How to count the hyponyms of a noun that have no hyponyms with NLTK and WordNet?

0 answers: