Please take a look at this code; it is as far as I have got with the JCCD API. ^_^
// Open the source file and feed it to the Antlr3JavaLexer.
BufferedReader in = new BufferedReader(new FileReader(f.getFile()));
String filePath = f.getNama(); // name of the file
final Antlr3JavaLexer lexer = new Antlr3JavaLexer();
lexer.preserveWhitespacesAndComments = false; // skip whitespace and comments
try {
    lexer.setCharStream(new ANTLRReaderStream(in));
} catch (IOException e) {
    e.printStackTrace();
    return false;
}
// Collect the type of every token until the end of the file.
StringBuilder sbu = new StringBuilder();
while (true) {
    org.antlr.runtime.Token token = lexer.nextToken();
    if (token.getType() == org.antlr.runtime.Token.EOF) {
        break;
    }
    sbu.append(token.getType()); // types are appended without a separator
    System.out.println(token.getType());
}
For TestFileOne.java it gives this output:
876116423877916429791644323742916418167432388167444266238816449164291643016743444242877916429791641179164432310329164351674323742916420164432316461643016444426623164616430164444242881644442879010116429164164224143234242[]
and this for TestFileTwo.java:
876116423877916429791644323742916418167432388167444266238816449164291643016743444242877916429791641179164432310329164351674323742916420164432316461643016444426623164616430164444242881644442879010116429164164224143234242[]
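Now my question: can anyone give me a hint or suggestion on how to compute the Jaccard similarity from these results, for example so that the output is a similarity percentage? Thank you very much.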
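To make the goal concrete, here is a minimal sketch of the kind of computation I have in mind. It is not part of JCCD: the class `TokenJaccard`, its methods, and the sample token types are all hypothetical. It assumes the token types of each file are collected into a `List<Integer>` instead of being concatenated into one string (without a separator, the digits can no longer be split back into individual types). It compares the sets of k consecutive token types (shingles) from the two files, and `k = 2` is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenJaccard {

    // Turn a token-type sequence into the set of its k-grams (shingles),
    // so that token order contributes to the comparison.
    public static Set<List<Integer>> shingles(List<Integer> tokenTypes, int k) {
        Set<List<Integer>> result = new HashSet<>();
        for (int i = 0; i + k <= tokenTypes.size(); i++) {
            result.add(new ArrayList<>(tokenTypes.subList(i, i + k)));
        }
        return result;
    }

    // Jaccard similarity |A ∩ B| / |A ∪ B|, a value between 0.0 and 1.0.
    public static <T> double jaccard(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // convention chosen here: two empty sets count as identical
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<T> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Made-up token types, for illustration only.
        List<Integer> one = Arrays.asList(87, 61, 164, 23, 42);
        List<Integer> two = Arrays.asList(87, 61, 164, 29, 42);
        double similarity = jaccard(shingles(one, 2), shingles(two, 2));
        System.out.printf("Similarity: %.1f%%%n", similarity * 100);
    }
}
```

With `k = 1` this reduces to plain Jaccard over the set of distinct token types, which ignores order entirely; a larger `k` is stricter about matching sequences.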