Comparing lists of pairs to find similarities

Time: 2015-04-28 16:48:57

Tags: java algorithm


In my tests I am comparing 2 movies, each with 20 unique words, stored as pairs of a word and its frequency:

Movie1{{'hello',5},{'foo',3}}
Movie2{{'hi',2},{'foo',2}}

2 answers:

Answer 0: (score: 1)

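Your bug is that the getWordsAndFrequency() method actually adds more entries to words every time it runs, so the word list grows longer with each call. To fix this, compute the words and frequencies once, add those Pairs to the list, and have getWordsAndFrequency() simply return the list instead of recomputing it on every call.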
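
The question's own classes are not shown, so the following is only a minimal sketch of that fix. The `Pair` and `Movie` classes, their fields, and the whitespace-based word splitting are all assumptions made for illustration; only the method name getWordsAndFrequency() comes from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the asker's pair type (not shown in the question).
class Pair {
    final String word;
    final int frequency;

    Pair(String word, int frequency) {
        this.word = word;
        this.frequency = frequency;
    }
}

// Hypothetical stand-in for the asker's movie class.
class Movie {
    private final List<Pair> words;

    Movie(String text) {
        // Count every word exactly once, at construction time.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        List<Pair> pairs = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            pairs.add(new Pair(entry.getKey(), entry.getValue()));
        }
        this.words = Collections.unmodifiableList(pairs);
    }

    // Returns the precomputed list; calling this repeatedly never adds entries.
    List<Pair> getWordsAndFrequency() {
        return words;
    }
}
```

Because the list is built in the constructor and wrapped as unmodifiable, its size stays the same no matter how often getWordsAndFrequency() is called.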

Answer 1: (score: 0)

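Could you store the data (currently held in an ArrayList of pairs) in a HashMap instead? You could then compute the intersection of the two movies' keyword sets and add up their scores.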

For example:

Map<String, Integer> keyWordsMovie1 = movie1.getWordsAndFrequency();
Map<String, Integer> keyWordsMovie2 = movie2.getWordsAndFrequency();
Set<String> intersection = new HashSet<String>(keyWordsMovie1.keySet()); // start with a copy of all keywords in movie1
intersection.retainAll(keyWordsMovie2.keySet()); // keep only the keywords that movie2 also contains

for (String keyWord : intersection){
    int freq1 = keyWordsMovie1.get(keyWord);
    int freq2 = keyWordsMovie2.get(keyWord);    
    //you now have the frequencies of the keyword in both movies
}
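
The answer does not say how the two frequencies should be combined into a score, so the sketch below makes one assumption: it simply sums both frequencies of every shared keyword. The `MovieSimilarity` class and its `similarity` method are hypothetical names introduced for illustration, and the maps are passed in directly rather than obtained from a movie object.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MovieSimilarity {
    // Assumed scoring rule: add both frequencies of each keyword the movies share.
    static int similarity(Map<String, Integer> keyWordsMovie1,
                          Map<String, Integer> keyWordsMovie2) {
        // Copy the key set first; retainAll on the map's own keySet() view
        // would remove entries from the map itself.
        Set<String> intersection = new HashSet<>(keyWordsMovie1.keySet());
        intersection.retainAll(keyWordsMovie2.keySet());

        int score = 0;
        for (String keyWord : intersection) {
            score += keyWordsMovie1.get(keyWord) + keyWordsMovie2.get(keyWord);
        }
        return score;
    }
}
```

With the sample data from the question, `foo` is the only word both movies share, so this rule yields 3 + 2 = 5. A different rule, such as taking the smaller of the two frequencies, would only change the line inside the loop.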