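I am using the Java JWI API to search WordNet for the synonyms of a word. The problem is that it returns only one result: the very word I am looking up. Please advise. Is it possible to get a list of all possible synonyms for a given word? My code is: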
public void searcher() {
    try {
        url = new URL("file", null, path);
        dict = new Dictionary(url);
        try {
            dict.open();
        } catch (IOException ex) {
            JOptionPane.showMessageDialog(null, "Dictionary directory does not exist\n" + ex + "\nClass:Meaning Thread", "Dictionary Not Found Error", JOptionPane.ERROR_MESSAGE);
        }
        IIndexWord idxWord = dict.getIndexWord("capacity", POS.NOUN);
        IWordID wordID = idxWord.getWordIDs().get(0);
        IWord word = dict.getWord(wordID);
        // Adding related words to the list of related words
        ISynset synset = word.getSynset();
        for (IWord w : synset.getWords()) {
            System.out.println(w.getLemma());
        }
    } catch (Exception e) {
    }
}
The output is only:
capacity
itself! The actual synonyms should be:
capability
capacitance
content
electrical capacitance
mental ability ... (and so on)
So what am I missing in my code, or can anyone tell me what the real problem is?
Thanks in advance.
Answer 0 (score: 4)
So, here is my answer for searching WordNet with Java JAWS. The steps are:
1- Download WordNet Dictionary from
2- Install WordNet
3- Go to the installation directory and copy the WordNet directory (in my case the WordNet folder was under C:\Program Files (x86))
4- Paste it into the Java project (under MyProject>WordNet)
5- Set the path to the dictionary directory:
File f=new File("WordNet\\2.1\\dict");
System.setProperty("wordnet.database.dir", f.toString());
6- Get the synonyms:
import java.io.File;
import java.util.LinkedHashSet;
import java.util.Set;

import edu.smu.tw.wordnet.jaws.Synset;
import edu.smu.tw.wordnet.jaws.WordNetDatabase;

public class TestJAWS {
    public static void main(String[] args) {
        String wordForm = "capacity";
        // Set the path to the WordNet dictionary directory
        File f = new File("WordNet\\2.1\\dict");
        System.setProperty("wordnet.database.dir", f.toString());
        WordNetDatabase database = WordNetDatabase.getFileInstance();
        // Get the synsets containing the word form "capacity"
        Synset[] synsets = database.getSynsets(wordForm);
        if (synsets.length > 0) {
            // A set drops the duplicates that appear when a word form belongs to several synsets
            Set<String> synonyms = new LinkedHashSet<String>();
            for (int i = 0; i < synsets.length; i++) {
                String[] wordForms = synsets[i].getWordForms();
                for (int j = 0; j < wordForms.length; j++) {
                    synonyms.add(wordForms[j]);
                }
            }
            // Print every distinct word form found across all synsets
            for (String synonym : synonyms) {
                System.out.println(synonym);
            }
        } else {
            System.err.println("No synsets exist that contain the word form '" + wordForm + "'");
        }
    }
}
You must have jaws-bin.jar on the classpath.
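Step 5 is the easiest place for this setup to go wrong, since a relative path depends on the directory the program is started from. A minimal sketch of checking the directory before setting the property follows. The class name WordNetDirConfig and the method configure are hypothetical helpers, not part of JAWS; only the property name wordnet.database.dir comes from the steps above.

```java
import java.io.File;

class WordNetDirConfig {
    // System property name taken from step 5 of this answer.
    static final String PROPERTY = "wordnet.database.dir";

    /**
     * Sets the property only if dictDir is an existing directory
     * and returns the absolute path that was stored.
     * (Hypothetical helper, not part of JAWS.)
     */
    static String configure(File dictDir) {
        if (!dictDir.isDirectory()) {
            throw new IllegalArgumentException(
                "WordNet dict directory not found: " + dictDir.getAbsolutePath());
        }
        String absolutePath = dictDir.getAbsolutePath();
        System.setProperty(PROPERTY, absolutePath);
        return absolutePath;
    }

    public static void main(String[] args) {
        // Assumed layout from step 4; pass another path as the first argument to override it.
        File dictDir = new File(args.length > 0 ? args[0] : "WordNet/2.1/dict");
        System.out.println(PROPERTY + " = " + configure(dictDir));
    }
}
```

Using the absolute path means the error message shows exactly where the program looked, which usually makes a misplaced WordNet folder obvious.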
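Answer 1 (score: 1)
What you are getting is "capacity#1", which has the meaning "capability to perform or produce", and that sense really does have only one word in its synset. (Use the PWN search page to see how WordNet organizes words into synsets.)
It sounds like what you are after is the union of all the synonyms across all the synsets? I think you either use getSenseEntryIterator(), or just put a loop around idxWord.getWordIDs().get(0);, replacing 0 with the loop counter, so that you do not get only the first item in the list.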
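The union this answer describes can be sketched without WordNet at all. In the sketch below each sense is just a list of words, and the sample data is a hypothetical stand-in rather than real WordNet output. The class name SynsetUnion and the method unionOfSenses are likewise made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SynsetUnion {
    /** Merges the words of every sense into one duplicate-free set, keeping first-seen order. */
    static Set<String> unionOfSenses(List<List<String>> senses) {
        Set<String> all = new LinkedHashSet<>();
        // Visiting every sense plays the role of replacing get(0) with a loop counter.
        for (List<String> synset : senses) {
            all.addAll(synset);
        }
        return all;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in data, not real WordNet output.
        List<List<String>> senses = Arrays.asList(
            Arrays.asList("capacity"),
            Arrays.asList("capacity", "content"),
            Arrays.asList("capacitance", "electrical capacity", "capacity"));
        System.out.println(unionOfSenses(senses));
    }
}
```

Reading only the first sense corresponds to looking at the first inner list, which here contains nothing but the query word itself.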
Answer 2 (score: 0)
If you want to use JWI and get more than one synonym, change your code starting from this exact spot:
IIndexWord idxWord = dict.getIndexWord(inputWord, POS.NOUN);
if (idxWord == null) {
    // getIndexWord returns null when the word is not in the dictionary for this part of speech
    System.out.println("No synonym found!");
} else {
    // Visit every sense. getTagSenseCount() counts only the senses that occur in
    // semantically tagged texts, so using it as the loop bound can skip senses.
    for (IWordID wordID : idxWord.getWordIDs()) {
        IWord word = dict.getWord(wordID);
        // Adding related words to the list of related words
        ISynset synset = word.getSynset();
        for (IWord w : synset.getWords()) {
            System.out.println(w.getLemma());
            // output.add(w.getLemma());
        }
    }
}
Works perfectly.
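A loop like this prints the lemmas as it finds them, so the query word and any lemma shared by several senses show up repeatedly. Below is a minimal sketch of tidying the collected list afterwards. It assumes multiword lemmas arrive with underscores in place of spaces, which is how WordNet's data files store them. The class name LemmaCleaner and the method clean are hypothetical and not part of JWI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LemmaCleaner {
    /**
     * Returns the lemmas without duplicates and without the query word itself,
     * with underscores turned into spaces. (Hypothetical helper, not part of JWI.)
     */
    static List<String> clean(String query, List<String> lemmas) {
        Set<String> seen = new LinkedHashSet<>();
        for (String lemma : lemmas) {
            String readable = lemma.replace('_', ' ');
            if (!readable.equalsIgnoreCase(query)) {
                seen.add(readable);
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for what the loop would have collected.
        List<String> collected = Arrays.asList(
            "capacity", "capacity", "content", "mental_ability", "content");
        System.out.println(clean("capacity", collected));
    }
}
```

The commented-out output.add(w.getLemma()) line in the loop is the natural place to build the list that a helper like this would receive.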