Getting nouns and verbs from WordNet

Asked: 2010-10-04 17:57:04

Tags: java nlp wordnet

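I am trying to find out whether a word is a noun, a verb, and so on.

I found the MIT Java WordNet Interface (JWI), which has sample code like the following. When I use it, however, I get an error saying that Dictionary is an abstract class and cannot be instantiated.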

public void testDictionary() throws IOException {

    // construct the URL to the Wordnet dictionary directory
    String wnhome = System.getenv("WNHOME");
    String path = wnhome + File.separator + "dict";
    URL url = new URL("file", null, path);

    // construct the dictionary object and open it
    IDictionary dict = new Dictionary(url);
    dict.open();

    // look up first sense of the word "dog"
    IIndexWord idxWord = dict.getIndexWord("dog", POS.NOUN);
    IWordID wordID = idxWord.getWordIDs().get(0);
    IWord word = dict.getWord(wordID);
    System.out.println("Id = " + wordID);
    System.out.println("Lemma = " + word.getLemma());
    System.out.println("Gloss = " + word.getSynset().getGloss());
}
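
One possible explanation for the "abstract class" error, offered as an assumption because the snippet above does not show its imports: if the file imports `java.util.Dictionary`, the name `Dictionary` resolves to that JDK class, which is abstract, rather than to JWI's concrete `edu.mit.jwi.Dictionary`. The small check below uses only the JDK to confirm that `java.util.Dictionary` cannot be instantiated. The class name `DictionaryImportCheck` is made up for illustration.

```java
import java.lang.reflect.Modifier;

public class DictionaryImportCheck {

    // Returns true if the given class is declared abstract.
    public static boolean isAbstract(Class<?> type) {
        return Modifier.isAbstract(type.getModifiers());
    }

    public static void main(String[] args) {
        // java.util.Dictionary is the legacy JDK class that Hashtable extends.
        System.out.println("java.util.Dictionary abstract? "
                + isAbstract(java.util.Dictionary.class));
    }
}
```

If that turns out to be the cause, changing the import to `import edu.mit.jwi.Dictionary;` (or writing `new edu.mit.jwi.Dictionary(url)` explicitly) should let the sample compile.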

I also tried another Java interface to WordNet, danbikel's interface, but my query returns no results:

WordNet wn = new WordNet("/usr/share/wordnet");
Morphy m = new Morphy(wn);

System.out.println(m.morphStr("search", "NOUN").length);

The length of the returned array is always 0. What are the correct arguments for this method? Its Javadoc is below. What am I doing wrong?

public String[] morphStr(String origstr, String pos)
Tries several techniques on origstr to find possible base forms (lemmas).

Specified by:
morphStr in interface MorphyRemote
Parameters:
origstr - word or collocation, separated either by whitespace, '_' or '-', to find lemma of
pos - part of speech of origstr
Returns:
array of possible lemmas for origstr, possibly of length 0 if no lemmas could be found

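2 Answers:

Answer 0: (score: 0)

Personally, I recommend Yawni, which is the new name of the old JWordNet project. To get all the parts of speech for a search word, you can call FileBackedDictionary.synsets(yourQueryWord) and then iterate over the returned Synsets, calling getPOS() on each.

Answer 1: (score: 0)

Did you solve the problem? I have used JWI before as well. The only difference is that I declared my IDictionary variable as static; the rest is almost the same. To get the nouns, you have to iterate: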

  

final Iterator<IIndexWord> itr = dict.getIndexWordIterator(POS.NOUN);
while (itr.hasNext()) ...
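
To answer the original question for a single word, rather than walking every noun in the index, one option is to try the lookup once per part of speech and keep the ones that succeed. The sketch below is self-contained and uses stand-ins: the `PartOfSpeech` enum, the `Lookup` interface and the toy index are hypothetical and exist only so the example runs without WordNet installed. With JWI, the lookup would presumably be something like `(w, p) -> dict.getIndexWord(w, p) != null`, on the assumption that `getIndexWord` returns null when the word has no entry for that part of speech.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PosLookupSketch {

    // Hypothetical stand-in for a WordNet part-of-speech enum such as JWI's POS.
    public enum PartOfSpeech { NOUN, VERB, ADJECTIVE, ADVERB }

    // Hypothetical abstraction: does the dictionary list this lemma under this part of speech?
    public interface Lookup {
        boolean contains(String lemma, PartOfSpeech pos);
    }

    // Tries every part of speech in turn and returns those for which the lookup succeeds.
    public static List<PartOfSpeech> partsOfSpeech(String lemma, Lookup lookup) {
        List<PartOfSpeech> found = new ArrayList<>();
        for (PartOfSpeech pos : PartOfSpeech.values()) {
            if (lookup.contains(lemma, pos)) {
                found.add(pos);
            }
        }
        return found;
    }

    // Toy data standing in for a real WordNet index (illustrative only).
    public static Lookup toyLookup() {
        Map<String, Set<PartOfSpeech>> index = new HashMap<>();
        index.put("search", EnumSet.of(PartOfSpeech.NOUN, PartOfSpeech.VERB));
        index.put("dog", EnumSet.of(PartOfSpeech.NOUN, PartOfSpeech.VERB));
        index.put("quickly", EnumSet.of(PartOfSpeech.ADVERB));
        return (lemma, pos) ->
                index.getOrDefault(lemma, EnumSet.noneOf(PartOfSpeech.class)).contains(pos);
    }

    public static void main(String[] args) {
        Lookup lookup = toyLookup();
        System.out.println("search -> " + partsOfSpeech("search", lookup));
        System.out.println("quickly -> " + partsOfSpeech("quickly", lookup));
    }
}
```

Keeping the dictionary call behind a small interface means the same loop can be tested against toy data and then pointed at whichever WordNet library is actually in use.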