Calculating similarity measures

Time: 2013-01-27 15:39:14

Tags: java wordnet

I am using JWNL to compute WordNet-based similarity measures between strings. I run this code:

import java.io.FileInputStream;
import java.util.HashMap;
import java.util.Map;

import net.didion.jwnl.JWNL;
import shef.nlp.wordnet.similarity.SimilarityInfo;
import shef.nlp.wordnet.similarity.SimilarityMeasure;

public class wordnet
{

public static void main(String[] args) throws Exception 
{
    //Initialize WordNet - this must be done before you try
    //and create a similarity measure otherwise nasty things
    //might happen!
    JWNL.initialize(new FileInputStream("test/wordnet.xml"));

    //Create a map to hold the similarity config params
    Map<String,String> params = new HashMap<String,String>();

    //the simType parameter is the class name of the measure to use
    params.put("simType","shef.nlp.wordnet.similarity.JCn");

    //this param should be the URL to an infocontent file (if required
    //by the similarity measure being loaded)
    params.put("infocontent","file:test/ic-bnc-resnik-add1.dat");

    //this param should be the URL to a mapping file if the
    //user needs to make synset mappings
    params.put("mapping","file:test/domain_independent.txt");

    //create the similarity measure
    SimilarityMeasure sim = SimilarityMeasure.newInstance(params);


    //get a similarity that involves a mapping
    SimilarityInfo d=sim.getSimilarity("english", "english");
    System.out.println(d.getSynset1());
    System.out.println(d.getSynset2());
    System.out.println(d.getSimilarity());
    System.out.println(d);

}

}

But I don't understand why the similarity comes out as zero.

The output is:

  

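Jan 27, 2013 7:03:00 PM net.didion.jwnl.util.MessageLog doLog
  INFO: Installing dictionary net.didion.jwnl.dictionary.FileBackedDictionary@48fbc0
  [Synset: [Offset: 6074471] [POS: noun] Words: English -- (the discipline that studies the English language and literature)]
  [Synset: [Offset: 6074471] [POS: noun] Words: English -- (the discipline that studies the English language and literature)]
  0.0
  english#n#3 english#n#3 0.0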

Can you help me?

1 answer:

Answer 0 (score: 1)

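You have to make sure you are using the WordNet dictionary that matches your data files: try WordNet 2.0. Otherwise, try WS4J, which provides similarity measures between synsets and has WordNet 3.0 embedded.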
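
To see why a version mismatch could plausibly produce 0.0, here is a minimal sketch of the Jiang-Conrath formula. It is not the code of the shef.nlp.wordnet.similarity library. It assumes the information-content file is keyed by synset offsets from one specific WordNet version, and that a synset missing from that file is treated as having information content 0, which makes the score 0. The class JcnSketch, the methods lookup and jcn, and the sample offset 100 with value 7.5 are hypothetical names and made-up data used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class JcnSketch {

    // Hypothetical lookup: returns 0.0 when the offset is not in the table,
    // as would happen if the offsets come from a different WordNet version.
    public static double lookup(Map<Long, Double> ic, long offset) {
        return ic.getOrDefault(offset, 0.0);
    }

    // Jiang-Conrath similarity: 1 / (IC(a) + IC(b) - 2 * IC(lcs)).
    // Assumed conventions (not taken from any particular library):
    // a synset without information content gives 0.0, and a zero
    // distance between synsets that do have it gives positive infinity.
    public static double jcn(double icA, double icB, double icLcs) {
        if (icA == 0.0 || icB == 0.0) {
            return 0.0;
        }
        double distance = icA + icB - 2.0 * icLcs;
        if (distance == 0.0) {
            return Double.POSITIVE_INFINITY;
        }
        return 1.0 / distance;
    }

    public static void main(String[] args) {
        // Made-up table standing in for an information-content file.
        Map<Long, Double> ic = new HashMap<>();
        ic.put(100L, 7.5);

        // The offset printed in the question; it is absent from the table.
        double value = lookup(ic, 6074471L);

        // Same synset on both sides, so it is also its own subsumer.
        System.out.println(jcn(value, value, value));
    }
}
```

If the real library behaves along these lines, a 0.0 for two identical synsets would point to the lookup failing rather than to the words being unrelated, which is consistent with the advice to use a dictionary of the matching version.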