Word sense disambiguation for a pair of words

Date: 2014-06-20 23:24:08

Tags: nlp artificial-intelligence word-sense-disambiguation

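Suppose I have a word A and a word B, where B serves as a hint to the intended meaning of A. For example, A = bass, B = music. Given this word pair, we as humans immediately know which sense of A is meant.

I know there are many algorithms that work on sentences. I would like to know whether any algorithm has been developed that performs WSD on just a pair of words.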

1 answer:

Answer 0: (score: 7)

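Word Sense Disambiguation (WSD) is the task of determining which sense of a word is intended, given a context sentence or document. In the case of a two-token phrase, the context is basically the other token.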
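To make the "context is the other token" idea concrete, here is a minimal sketch of a Lesk-style overlap count for a word pair. It is only an illustration: the `TOY_GLOSSES` dictionary, its sense labels, and the `disambiguate_pair` helper are hypothetical names made up for this example, and the glosses are hand-written rather than taken from WordNet. It is not how pywsd or NLTK implement Lesk internally.

```python
# Toy glosses written for this illustration; they are NOT taken from WordNet.
TOY_GLOSSES = {
    "bass": {
        "bass (fish)": "a freshwater or marine fish caught for food and sport",
        "bass (music)": "the lowest range of a musical instrument or voice in music",
    },
}


def disambiguate_pair(ambiguous_word, hint_word, glosses=TOY_GLOSSES):
    """Return the sense label whose gloss shares the most tokens with the hint.

    With a two-word input, the whole context is the single hint word.
    Returns None if the ambiguous word has no entry in the gloss dictionary.
    """
    context = {hint_word.lower()}
    best_label, best_overlap = None, -1
    for label, gloss in glosses.get(ambiguous_word.lower(), {}).items():
        # Overlap = number of context tokens that also appear in the gloss.
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_label, best_overlap = label, overlap
    return best_label


print(disambiguate_pair("bass", "music"))
```

With a single hint word the overlap can only be 0 or 1, so ties are common; in this sketch a tie simply falls back to the first sense listed. That limitation is one reason to prefer a real WSD tool over a bare gloss match.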

You can try out different WSD software; there is a list here: "Anyone know of some good Word Sense Disambiguation software?"

I'll use pywsd (https://github.com/alvations/pywsd) to give you an example:

$ wget https://github.com/alvations/pywsd/archive/master.zip
$ unzip master.zip
$ cd pywsd-master
$ python
Python 2.7.5+ (default, Feb 27 2014, 19:37:08) 
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lesk import simple_lesk
# disambiguating the word 'bass' given the context 'bass music'
>>> simple_lesk('bass music', 'bass') 
Synset('bass.n.07')
>>> disambiguated = simple_lesk('bass music', 'bass')
>>> disambiguated.definition
<bound method Synset.definition of Synset('bass.n.07')>
>>> disambiguated.definition()
u'the member with the lowest range of a family of musical instruments'

Alternatively, you can use the new module in NLTK (https://github.com/nltk/nltk/blob/develop/nltk/wsd.py), provided you have the latest version:

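from nltk.wsd import lesk
# Recent NLTK releases treat context_sentence as an iterable of tokens,
# so the phrase is split into words before being passed in.
disambiguated = lesk(context_sentence="bass music".split(), ambiguous_word="bass")
print(disambiguated.definition())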

(Disclaimer: I wrote pywsd and the lesk module in NLTK.)