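I want to compute the normalized Euclidean distance between two vectors of length 5. The simple approach with Apache Math and RealVector does not normalize the distance, so I tried Weka instead. I have the following Java code: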
Attribute one = new Attribute("one");
Attribute two = new Attribute("two");
Attribute three = new Attribute("three");
Attribute four = new Attribute("four");
Attribute five = new Attribute("five");
FastVector attributes = new FastVector();
attributes.addElement(one);
attributes.addElement(two);
attributes.addElement(three);
attributes.addElement(four);
attributes.addElement(five);
Instances wVector = new Instances("Vector", attributes, 0);
Instance firstInstance = new Instance(attributes.size());
firstInstance.setDataset(wVector);
firstInstance.setValue(one, 1.0);
firstInstance.setValue(two, 2.0);
firstInstance.setValue(three, 3.0);
firstInstance.setValue(four, 4.0);
firstInstance.setValue(five, 5.0);
Instance secondInstance = new Instance(attributes.size());
secondInstance.setDataset(wVector);
secondInstance.setValue(one, 10.0);
secondInstance.setValue(two, 20.0);
secondInstance.setValue(three, 30.0);
secondInstance.setValue(four, 40.0);
secondInstance.setValue(five, 50.0);
EuclideanDistance ed = new EuclideanDistance(wVector);
// Normalized distance
Double wDist = ed.distance(firstInstance, secondInstance);
// Switch normalization off and compute the raw distance
ed.setDontNormalize(true);
Double wDist1 = ed.distance(firstInstance, secondInstance);
Why is the non-normalized distance wDist1 computed correctly, while the normalized distance wDist comes out as NaN?
Answer 0 (score: 0)
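The distance is normalized using the ranges of the attribute values of the instances in the dataset that the distance function was created with.
Your wVector dataset does not contain any instances, so there are no value ranges to normalize against. You have to add the instances to it, like this: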
wVector.add(firstInstance);
wVector.add(secondInstance);
After that, it should work as expected.
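To illustrate why the ranges matter, here is a minimal plain-Java sketch of min-max normalized Euclidean distance. It does not use Weka and is not Weka's actual implementation; the class `NormalizedDistanceSketch` and its methods `ranges` and `distance` are hypothetical names chosen for this example. It only assumes that each attribute is rescaled by the minimum and maximum seen in the dataset, and shows one plausible way an empty dataset can lead to NaN.

```java
import java.util.ArrayList;
import java.util.List;

public class NormalizedDistanceSketch {

    // Per-attribute {min, max} over all rows of the dataset.
    // With no rows, min stays +Infinity and max stays -Infinity.
    static double[][] ranges(List<double[]> data, int numAttributes) {
        double[][] r = new double[numAttributes][2];
        for (double[] range : r) {
            range[0] = Double.POSITIVE_INFINITY; // min so far
            range[1] = Double.NEGATIVE_INFINITY; // max so far
        }
        for (double[] row : data) {
            for (int i = 0; i < numAttributes; i++) {
                r[i][0] = Math.min(r[i][0], row[i]);
                r[i][1] = Math.max(r[i][1], row[i]);
            }
        }
        return r;
    }

    // Euclidean distance after rescaling each attribute to [0, 1].
    // No guard for degenerate ranges: this is what lets NaN through.
    static double distance(double[] a, double[] b, double[][] r) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double width = r[i][1] - r[i][0];
            double na = (a[i] - r[i][0]) / width;
            double nb = (b[i] - r[i][0]) / width;
            sum += (na - nb) * (na - nb);
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] first = {1.0, 2.0, 3.0, 4.0, 5.0};
        double[] second = {10.0, 20.0, 30.0, 40.0, 50.0};

        // Empty dataset: the ranges are infinite, so the result is NaN.
        List<double[]> data = new ArrayList<>();
        System.out.println(distance(first, second, ranges(data, 5)));

        // Dataset containing both vectors: the ranges are well defined.
        data.add(first);
        data.add(second);
        System.out.println(distance(first, second, ranges(data, 5)));
    }
}
```

In this sketch the first call prints NaN, because (x - Infinity) / -Infinity is undefined. The second call rescales every attribute of the first vector to 0 and of the second to 1, so the distance is sqrt(5), about 2.236. The non-normalized distance never consults the ranges, which is consistent with wDist1 being correct in the question.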