I have 2 files containing term weights, and my goal is to compute the cosine similarity: cos = dotproduct(weights1, weights2) / (euclidianDistance(weights1) * euclidianDistance(weights2));
Here is my code:
import java.io.*;
import java.util.*;

public class tp5
{
    private static BufferedReader br1;
    private static BufferedReader br2;

    public static double getSimilarity(File file1, File file2)
        throws IOException
    {
        br1 = new BufferedReader(new FileReader(file1));
        String line1;
        line1 = br1.readLine();
        ArrayList<String> words1 = new ArrayList<String>();
        for (String word : line1.split(" ")) {
            words1.add(word);
        }
        br2 = new BufferedReader(new FileReader(file2));
        String line2;
        line2 = br2.readLine();
        ArrayList<String> words2 = new ArrayList<String>();
        for (String word : line2.split(" ")) {
            words2.add(word);
        }
        int i;
        int j;
        int k;
        // Double [] temp = null;
        Double DotProduct = (double) 0 ;
        Double euclid1 = (double) 0;
        Double euclid2 = (double) 0;
        for (j = 0; j < words1.size(); j++) {
            DotProduct += Double.parseDouble(words1.get(j)) * Double.parseDouble(words2.get(j));
        }
        for (i = 0; i < words1.size(); i++) {
            euclid1 = Math.pow(Double.parseDouble(words1.get(i)), Double.parseDouble(words1.get(i)));
        }
        euclid1 = Math.sqrt(euclid1);
        for (k = 0; k < words1.size(); k++) {
            euclid2 = Math.pow(Double.parseDouble(words2.get(k)), Double.parseDouble(words2.get(k)));
        }
        euclid2 = Math.sqrt(euclid2);
        return DotProduct / (euclid1 * euclid2);
    }

    public static void main(String[] args)
        throws IOException
    {
        File file1 = new File("texte.95-1.poids");
        File file2 = new File("texte.95-2.poids");
        System.out.println(getSimilarity(file1, file2));
    }
}
The problem may be with how my weights are written, for example weight = 0.750305594399894
I get this exception at Double.parseDouble:
Exception in thread "main" java.lang.NumberFormatException: For input string: "" 0.750305594399894"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.lang.Double.parseDouble(Double.java:538)
What is the solution?
Answer 0: (score: 0)
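This exception is thrown when you try to parse a String as a number but the string is not a well-formed number. That can happen because of a comma (try a dot as the decimal separator), because the string is empty, or because it contains a letter.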
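The causes listed above can be guarded against when splitting the line into tokens. The following is only a sketch under assumptions: the class name WeightParser and the method parseWeights are hypothetical, and it assumes the weights are separated by whitespace and that a decimal comma should be read as a decimal point. It does not handle other stray characters, such as quotes.

```java
import java.util.ArrayList;
import java.util.List;

public class WeightParser {
    // Hypothetical helper: parses one line of whitespace-separated weights.
    static List<Double> parseWeights(String line) {
        List<Double> weights = new ArrayList<>();
        // trim() drops leading/trailing whitespace; "\\s+" treats a run of
        // spaces or tabs as one separator, unlike split(" ").
        for (String token : line.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // a blank line still yields one empty token
            }
            // Assumption: a decimal comma stands for a decimal point.
            weights.add(Double.parseDouble(token.replace(',', '.')));
        }
        return weights;
    }

    public static void main(String[] args) {
        // Extra spaces and a decimal comma are both tolerated here.
        System.out.println(parseWeights("  0.75   0,5 "));

        // An empty string is rejected by Double.parseDouble.
        try {
            Double.parseDouble("");
        } catch (NumberFormatException e) {
            System.out.println("empty token rejected");
        }
    }
}
```

With split(" "), two consecutive spaces in the file produce an empty token, which is one way to end up with an empty input string at Double.parseDouble.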
I hope this helps.
Have a nice day. :)
Answer 1: (score: 0)
I just used Double.valueOf(String number) and had no problems with your test case.
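Note that Double.valueOf(String) and Double.parseDouble(String) accept the same input format, so whichever is used, the tokens still have to be clean numbers. Separately from the exception, the norm loops in the question overwrite euclid1 and euclid2 on every iteration and compute Math.pow(x, x) rather than accumulating x * x. A minimal sketch of the intended computation follows. The class name CosineSketch and the methods parse and cosine are hypothetical, and it assumes both lines hold the same number of whitespace-separated weights.

```java
import java.util.Arrays;

public class CosineSketch {
    // Parses whitespace-separated weights, skipping empty tokens.
    static double[] parse(String line) {
        return Arrays.stream(line.trim().split("\\s+"))
                .filter(token -> !token.isEmpty())
                .mapToDouble(Double::parseDouble)
                .toArray();
    }

    // Cosine similarity: dot product divided by the product of the Euclidean norms.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double sumSqA = 0.0;
        double sumSqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sumSqA += a[i] * a[i]; // accumulate squares, do not overwrite
            sumSqB += b[i] * b[i];
        }
        // If either vector is all zeros, this division yields NaN.
        return dot / (Math.sqrt(sumSqA) * Math.sqrt(sumSqB));
    }

    public static void main(String[] args) {
        double[] w1 = parse("1.0 0.0 2.0");
        double[] w2 = parse("2.0 0.0 4.0");
        // The two vectors are parallel, so the result is close to 1.0.
        System.out.println(cosine(w1, w2));
    }
}
```

In the question's program, the two lines read from the .poids files would take the place of the literal strings passed to parse.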