How do I get sense definitions in NLTK's senseval module?

Asked: 2013-05-05 03:50:55

Tags: nltk word-sense-disambiguation

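In the NLTK senseval module, senses are labeled HARD1, HARD2, and so on (see the source here). However, there seems to be no way to get the actual definitions. I am trying to implement the Lesk algorithm, and I now want to check whether the sense it predicts is correct (using the definitions in WordNet).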

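The problem I am running into is how to reconcile the senseval answers (HARD1, HARD2) with WordNet definitions. Does anyone know how to turn a SENSEVAL sense into a definition, or where to look one up?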
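For context, here is a minimal sketch of how the labels in question can be inspected. It assumes the senseval corpus has been downloaded with `nltk.download('senseval')` and that `hard.pos` is one of its file IDs. The helper names `load_instances` and `count_sense_labels` are hypothetical and not part of NLTK.

```python
from collections import Counter


def load_instances(fileid="hard.pos"):
    """Load SENSEVAL instances from NLTK (assumes the corpus data is installed)."""
    from nltk.corpus import senseval  # imported lazily so the helper below works without NLTK
    return senseval.instances(fileid)


def count_sense_labels(instances):
    """Count how often each sense label (e.g. 'HARD1') occurs.

    Each instance is expected to expose a `senses` attribute holding a
    sequence of label strings, as NLTK's SensevalInstance does.
    """
    counts = Counter()
    for instance in instances:
        counts.update(instance.senses)
    return counts
```

Calling `count_sense_labels(load_instances("hard.pos"))` would list the labels that need definitions, but the labels themselves carry no gloss.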

1 Answer:

Answer 0: (score: 3)

I eventually found that these correspond to the senses in WordNet 1.7, which is very old (and does not seem easy to install on Mac OS X or Ubuntu 11.04).

I could not find an online version of WordNet 1.7.

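This site also has some useful information about these three corpora. For example, it says that the six senses of interest come from the Longman English Dictionary Online (circa 2001). See here.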

It describes the source of HARD as WordNet 1.7.

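In the end, I manually mapped the senses to their definitions in WordNet 3.0. Here is the dictionary, if you are interested. Note, though, that I am not an expert in linguistics, and the mappings are not exact.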

# A map of SENSEVAL senses to WordNet 3.0 senses.
# SENSEVAL-2 uses WordNet 1.7, which is no longer installable on most modern
# machines and is not the version that the NLTK comes with.
# As a consequence, we have to manually map the following
# senses to their equivalent(s).
SV_SENSE_MAP = {
    "HARD1": ["difficult.a.01"],    # not easy, requiring great physical or mental
    "HARD2": ["hard.a.02",          # dispassionate
              "difficult.a.01"],
    "HARD3": ["hard.a.03"],         # resisting weight or pressure
    "interest_1": ["interest.n.01"], # readiness to give attention
    "interest_2": ["interest.n.03"], # quality of causing attention to be given to
    "interest_3": ["pastime.n.01"],  # activity, etc. that one gives attention to
    "interest_4": ["sake.n.01"],     # advantage, advancement or favor
    "interest_5": ["interest.n.05"], # a share in a company or business
    "interest_6": ["interest.n.04"], # money paid for the use of money
    "cord": ["line.n.18"],          # something (as a cord or rope) that is long and thin and flexible
    "formation": ["line.n.01","line.n.03"], # a formation of people or things one beside another
    "text": ["line.n.05"],                 # text consisting of a row of words written across a page or computer screen
    "phone": ["telephone_line.n.02"],   # a telephone connection
    "product": ["line.n.22"],       # a particular kind of product or merchandise
    "division": ["line.n.29"],      # a conceptual separation or distinction
    "SERVE12": ["serve.v.02"],       # do duty or hold offices; serve in a specific function
    "SERVE10": ["serve.v.06"], # provide (usually but not necessarily food)
    "SERVE2": ["serve.v.01"],       # serve a purpose, role, or function
    "SERVE6": ["service.v.01"]      # be used by; as of a utility
}
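A map like this could be used roughly as follows. This is only a sketch: `SAMPLE_MAP` repeats three entries from the dictionary above so the snippet is self-contained, and `definitions_for` and `prediction_matches` are hypothetical helper names. The default lookup assumes the WordNet data is installed (`nltk.download('wordnet')`) and an NLTK 3 release, where `Synset.definition()` is a method; in older releases it was a plain attribute.

```python
# Excerpt of the map above, repeated here so the sketch runs on its own.
SAMPLE_MAP = {
    "HARD1": ["difficult.a.01"],
    "HARD2": ["hard.a.02", "difficult.a.01"],
    "HARD3": ["hard.a.03"],
}


def definitions_for(label, sense_map, lookup=None):
    """Return {synset_name: definition} for one SENSEVAL label.

    `lookup` maps a synset name to its gloss. If omitted, NLTK's WordNet
    reader is used (assumption: NLTK 3 with the wordnet corpus installed).
    """
    if lookup is None:
        from nltk.corpus import wordnet as wn

        def lookup(name):
            return wn.synset(name).definition()

    return {name: lookup(name) for name in sense_map.get(label, [])}


def prediction_matches(predicted_name, gold_labels, sense_map):
    """True if the predicted synset name is mapped from any gold SENSEVAL label."""
    return any(predicted_name in sense_map.get(label, []) for label in gold_labels)
```

For a Lesk implementation that returns a WordNet synset, `prediction_matches(synset.name(), instance.senses, SV_SENSE_MAP)` would then count a prediction as correct whenever it lands on one of the mapped synsets. Because some labels map to more than one synset, this is a lenient check rather than an exact one.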