Question

int queryVector = 1;
    double similarity = 0.0;
    int wordPower;
    String[][] arrays = new String[filename][2];
    int row;
    int col;


    for (a = 0; a < filename; a++) {
        int totalwordPower = 0;
        int totalWords = 0;
        try {
            System.out
                    .println(" _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _  ");
            System.out.println("\n");
            System.out.println("The word inputted : " + word2);
            File file = new File(
                    "C:\\Users\\user\\fypworkspace\\TextRenderer\\abc" + a
                            + ".txt");
            System.out.println(" _________________");

            System.out.print("| File = abc" + a + ".txt | \t\t \n");

            for (int i = 0; i < array2.length; i++) {

                totalCount = 0;
                wordCount = 0;

                Scanner s = new Scanner(file);
                {
                    while (s.hasNext()) {
                        totalCount++;
                        if (s.next().equals(array2[i]))
                            wordCount++;

                    }

                    System.out.print(array2[i] + " --> Word count =  "
                            + "\t " + "|" + wordCount + "|");
                    System.out.print("  Total count = " + "\t " + "|"
                            + totalCount + "|");
                    System.out.printf("  Term Frequency =  | %8.4f |",
                            (double) wordCount / totalCount);

                    System.out.println("\t ");

                    double inverseTF = Math.log10((float) numDoc
                            / (numofDoc[i]));
                    System.out.println("    --> IDF = " + inverseTF);

                    double TFIDF = (((double) wordCount / totalCount) * inverseTF);
                    System.out.println("    --> TF/IDF = " + TFIDF + "\n");

                    totalWords += wordCount;

                    wordPower = (int) Math.pow(wordCount, 2);

                    totalwordPower += wordPower;

                    System.out.println("Document Vector : " + wordPower);

                    similarity = (totalWords * queryVector)
                            / ((Math.sqrt((totalwordPower)) * (Math
                                    .sqrt(((queryVector * 3))))));



                }
            }
        } catch (FileNotFoundException e) {
            System.out.println("File is not found");
        }
        System.out.println("The total query frequency for this file is "
                + totalWords);
        System.out.println("The total document vector : " + totalwordPower);

        System.out.println("The similarity is " + similarity);
    }
}

}

Hi, I would like to sort the SIMILARITY SCORE computed by the code above. Below is sample output for 2 text files. I have 10 text files in total.

The word inputted : how are you

| File = abc0.txt |
how --> Word count = |0|  Total count = |1289|  Term Frequency = | 0.0000 |
    --> IDF = 1.0413926851582251
    --> TF/IDF = 0.0

Document Vector : 0
are --> Word count = |0|  Total count = |1289|  Term Frequency = | 0.0000 |
    --> IDF = 0.43933269383026263
    --> TF/IDF = 0.0

Document Vector : 0
you --> Word count = |0|  Total count = |1289|  Term Frequency = | 0.0000 |
    --> IDF = 0.1962946357308887
    --> TF/IDF = 0.0

Document Vector : 0
The total query frequency for this file is 0
The total document vector : 0
The similarity is NaN

The word inputted : how are you

| File = abc1.txt |
how --> Word count = |0|  Total count = |426|  Term Frequency = | 0.0000 |
    --> IDF = 1.0413926851582251
    --> TF/IDF = 0.0

Document Vector : 0
are --> Word count = |0|  Total count = |426|  Term Frequency = | 0.0000 |
    --> IDF = 0.43933269383026263
    --> TF/IDF = 0.0

Document Vector : 0
you --> Word count = |3|  Total count = |426|  Term Frequency = | 0.0070 |
    --> IDF = 0.1962946357308887
    --> TF/IDF = 0.0013823565896541458

Document Vector : 9
The total query frequency for this file is 3
The total document vector : 9
The similarity is 0.5773502691896257

Note: this is a sample run for two text files. I have 10 text files in total.

How can I sort the SIMILARITY scores from highest to lowest? Any suggestions?

Answer 1

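Add the SIMILARITY scores to a list and sort it with a library method. The sort is in ascending order, so you can read the list from the end to get the highest score first.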

ArrayList<Double> arrayList = new ArrayList<Double>();
// inside the loop over the files: arrayList.add(similarity);
Collections.sort(arrayList); // ascending order
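To make the "sort ascending, then read from the end" idea concrete, here is a minimal sketch. The class name `AscendingScores` and the method `highestFirst` are made up for illustration and are not part of the question's code. One assumption is labeled in the code: a NaN score (which the question's formula produces as 0/0 when no query word occurs in a file) is treated as 0.0. Without that step, `Collections.sort` would place NaN after every other value, because `Double.compareTo` orders NaN above all other doubles, so NaN would come out first when reading from the end.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AscendingScores {

    // Returns the scores from highest to lowest by sorting ascending
    // and then walking the sorted list from the last element to the first.
    static List<Double> highestFirst(List<Double> scores) {
        List<Double> sorted = new ArrayList<Double>();
        for (double s : scores) {
            // Assumption: a NaN score (0/0, no query word in the file) counts as 0.0.
            sorted.add(Double.isNaN(s) ? 0.0 : s);
        }
        Collections.sort(sorted); // ascending order

        List<Double> result = new ArrayList<Double>();
        for (int i = sorted.size() - 1; i >= 0; i--) {
            result.add(sorted.get(i)); // read from the end: highest first
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> scores = new ArrayList<Double>();
        scores.add(Double.NaN);         // abc0.txt in the sample output
        scores.add(0.5773502691896257); // abc1.txt in the sample output
        System.out.println(highestFirst(scores));
    }
}
```

In the question's code, the `scores.add(...)` calls would go at the end of the outer loop over the files, once `similarity` has been computed for that file.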

Alternatively, you can declare a comparator and use it, as shown below.

ArrayList<Double> arrayList = new ArrayList<Double>();
// inside the loop over the files: arrayList.add(similarity);
Comparator<Double> comparator = Collections.reverseOrder();
Collections.sort(arrayList, comparator); // descending order
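A plain list of doubles loses track of which file each score came from. As a rough sketch of one way around that, the comparator approach can be applied to small objects that pair a file name with its score. The names `RankedFiles`, `FileScore`, and `sortDescending` are hypothetical and only serve the example, and replacing NaN with 0.0 for abc0.txt is an assumption about how a file with no matches should rank.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class RankedFiles {

    // Hypothetical holder pairing a file name with its similarity score.
    static class FileScore {
        final String fileName;
        final double score;

        FileScore(String fileName, double score) {
            this.fileName = fileName;
            this.score = score;
        }
    }

    // Sorts the list in place, highest score first.
    static void sortDescending(List<FileScore> results) {
        Collections.sort(results, new Comparator<FileScore>() {
            @Override
            public int compare(FileScore a, FileScore b) {
                // Arguments are swapped on purpose to get descending order.
                return Double.compare(b.score, a.score);
            }
        });
    }

    public static void main(String[] args) {
        List<FileScore> results = new ArrayList<FileScore>();
        // In the question's loop this could be:
        // results.add(new FileScore("abc" + a + ".txt", similarity));
        results.add(new FileScore("abc0.txt", 0.0)); // assumption: NaN replaced by 0.0
        results.add(new FileScore("abc1.txt", 0.5773502691896257));

        sortDescending(results);
        for (FileScore r : results) {
            System.out.println(r.fileName + " : " + r.score);
        }
    }
}
```

With the two sample files, abc1.txt is listed before abc0.txt.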

HTH

Sorting scores calculated by a program
